This is the tale of three really simple and reliable tools, rsync, cron, and keychain, combining to create some confounding emergent complexity.
ssh-agent is a daemon that holds onto your credentials so that you only have to enter a password once per session. Once you’ve entered the password for your private key via ssh-add, ssh-agent decrypts the key and holds onto it. Whenever another program needs something done with the private key, it asks ssh-agent to do it. ssh-agent passes the result of that operation to the client program instead of handing over the unencrypted private key.
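To make the mechanism concrete, here is a minimal sketch that starts a throwaway agent in a subshell, prints the two values clients use to find it, and shuts it down again. The function name agent_demo is made up for illustration, and the commented-out key path is only an assumption about where a key might live.

```shell
# Runs in a subshell (note the parentheses) so the caller's own
# SSH_AUTH_SOCK and SSH_AGENT_PID are left untouched.
agent_demo() (
    command -v ssh-agent >/dev/null 2>&1 || { echo "ssh-agent not installed"; return 0; }

    # ssh-agent -s prints shell commands that export SSH_AUTH_SOCK and
    # SSH_AGENT_PID; eval applies them to this (sub)shell.
    eval "$(ssh-agent -s)" >/dev/null
    echo "agent pid: $SSH_AGENT_PID"
    echo "socket:    $SSH_AUTH_SOCK"

    # A fresh agent holds no keys yet, so this reports no identities.
    ssh-add -l || true

    # Hypothetical key path; this would prompt once for the passphrase:
    # ssh-add ~/.ssh/id_ed25519

    # Kill the agent started above.
    eval "$(ssh-agent -k)" >/dev/null
)

agent_demo
```

Any program that inherits those two variables talks to the same agent; a program that does not inherit them has no way to find it.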
Some limitations of ssh-agent:
- When the user logs out, ssh-agent may shut down, and the user will have to run ssh-add again.
- Cron jobs and other automated processes don’t have access to the agent, and thus need to find another way to provide credentials for their tasks.
On many Linux distros, there is a keychain program that keeps the same ssh-agent running across logins and also publishes information about it so that other programs (including those running in cron jobs) can get at it.
Scheduling authentication-requiring tasks with cron
I was trying to get cron to run a bash script that ran rsync to copy a directory on a remote machine to a local directory.
This didn’t work. In the context of the cron job, I kept getting Permission denied (publickey) when the script ran rsync.
After sshing into the remote machine to trigger a login and get my password into ssh-agent, I ran the script directly in the terminal, and it worked without prompting.
Something I forgot (for a while) is that I did not do this in a login shell. In order to simulate the cron environment, I ran sh, and ran the script in that shell.
It still failed via cron job, though. I thought that ssh-agent would have the unencrypted private key it needed at that point, but clearly something was still wrong.
In my script, I had a line that ran ~/.keychain/machinename-sh as a script, which I thought would let my script get access to ssh-agent. That file, which is generated by keychain, is a shell script that exports two environment variables: SSH_AUTH_SOCK and SSH_AGENT_PID.
The process ID and the socket are what programs need to communicate with ssh-agent. When I echoed those variables in my script, they were empty. When the script ran rsync, I got prompted for credentials. (This is a problem because you can’t be around to enter credentials for most automated tasks.)
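A check like the one below can go near the top of a script to log what it actually sees. It is a sketch, not what my script contained: check_agent_env is a made-up name, and SSH_AUTH_SOCK and SSH_AGENT_PID are the standard variable names that ssh-agent clients read.

```shell
# Report whether the agent variables are set and whether the socket exists.
# Returns non-zero when the environment cannot possibly reach an agent.
check_agent_env() {
    if [ -z "${SSH_AUTH_SOCK:-}" ]; then
        echo "SSH_AUTH_SOCK is empty"
        return 1
    fi
    # -S is true only if the path exists and is a socket.
    if [ ! -S "$SSH_AUTH_SOCK" ]; then
        echo "SSH_AUTH_SOCK=$SSH_AUTH_SOCK is not a socket"
        return 1
    fi
    echo "SSH_AUTH_SOCK=$SSH_AUTH_SOCK SSH_AGENT_PID=${SSH_AGENT_PID:-}"
}
```

Under cron there is no terminal, so the output is only useful if it is redirected to a log file or left for cron to mail.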
When I updated the script to use source ~/.keychain/machinename-sh, those variables were filled out and pointed to an existing process and an existing sock file. (I don’t have a good explanation for why source worked, but running the keychain file as a script didn’t work.)
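One likely explanation is general shell behavior rather than anything specific to keychain: a file run as a script executes in a child process, and a child cannot change its parent's environment, whereas a sourced file runs line by line in the current shell. The sketch below shows the difference with a throwaway file standing in for the keychain file; DEMO_AGENT_PID and the temp file are stand-ins, not anything keychain produces.

```shell
unset DEMO_AGENT_PID

# A throwaway file that exports one variable, like the keychain file does.
envfile="$(mktemp)"
cat > "$envfile" <<'EOF'
DEMO_AGENT_PID=12345; export DEMO_AGENT_PID
EOF

# 1. Run the file as a script: the export happens in a child process
#    and is gone as soon as that child exits.
sh "$envfile"
after_run="${DEMO_AGENT_PID:-}"
echo "after running:  '$after_run'"      # prints: after running:  ''

# 2. Source the file: its lines run in this shell, so the variable stays.
#    "." is the POSIX spelling; "source" is the bash spelling.
. "$envfile"
after_source="${DEMO_AGENT_PID:-}"
echo "after sourcing: '$after_source'"   # prints: after sourcing: '12345'

rm -f "$envfile"
```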
However, it still failed with “Permission denied” when I ran it via cron. I was able to log those two environment variables, and they still referred to things that existed.
After hours of failed attempts to flush out information that could tell me what was going on, I ran ps aux | grep ssh-agent again. I noticed that the PID of the ssh-agent I had used in the successful manual run was NOT the PID in SSH_AGENT_PID. The ssh-agent named in that env var seemed not to have my key.
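In hindsight, a quicker way to find this is to ask the agent that the environment points at what it holds, and compare its PID with the agents that are actually running. The following is a sketch: describe_agent is a made-up name, and it relies on the exit status of ssh-add -l (0 when keys are loaded, 1 when the agent has none, 2 when no agent can be contacted).

```shell
# Describe the agent that SSH_AUTH_SOCK currently points at.
describe_agent() {
    rc=0
    ssh-add -l >/dev/null 2>&1 || rc=$?
    case "$rc" in
        0) echo "agent reachable, keys loaded" ;;
        1) echo "agent reachable, no keys" ;;
        *) echo "no agent reachable" ;;
    esac
}

echo "SSH_AGENT_PID=${SSH_AGENT_PID:-} -> $(describe_agent)"

# List every running agent so the PIDs can be compared with SSH_AGENT_PID.
# The [s] keeps grep from matching its own command line.
ps aux | grep '[s]sh-agent' || echo "no ssh-agent processes found"
```

Running this once in the interactive shell and once from the cron-run script would have shown two different PIDs, with only one of the agents reporting keys.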
So, I logged out of the host machine in all terminals (I had three), then logged back in, and keychain ran when I logged in. I then ran ssh-add to add my key. I made sure there were no other ssh-agent processes running.
Then, the script ran via the cron job, and the rsync command worked. The PID that the script logged matched that of the single ssh-agent process.
What I think happened was this:
- The first time I logged in to the host machine, keychain ran, but I did not, directly or indirectly, add my key to the ssh-agent it managed. keychain recorded that ssh-agent’s PID and socket in ~/.keychain/machinename-sh.
- I then logged into another terminal, but then started an sh shell, a non-login shell which did not run keychain.
- There, a new ssh-agent was started.
- My key was added to that one.
- The script ran successfully (when run directly, outside of cron), using that ssh-agent.
- I fixed the script’s importing of the keychain environment variables, but they referred to an ssh-agent (the first one) that did not have the key added to it.
- When run via cron, the script used that first ssh-agent to log into the remote server, but the key was in a different ssh-agent process, not that one. So, it failed with Permission denied (publickey).
So, make sure you track your ssh-agents (try to have only one) and pay attention to which one keychain is pointing clients at.
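One way to act on that advice is a guard at the top of the cron-run script that refuses to continue unless the agent keychain points at actually holds a key. This is a sketch of the idea, not the script I ended up with; load_agent_or_fail is a made-up name.

```shell
# Source the keychain env file, then verify that the agent it points at
# is reachable and has at least one key loaded.
# Usage: load_agent_or_fail /path/to/keychain-env-file
load_agent_or_fail() {
    envfile="$1"
    if [ ! -r "$envfile" ]; then
        echo "keychain file $envfile is not readable" >&2
        return 1
    fi

    # Must be sourced, not executed, so the exports land in this shell.
    . "$envfile"

    rc=0
    ssh-add -l >/dev/null 2>&1 || rc=$?
    if [ "$rc" -ne 0 ]; then
        echo "agent pid ${SSH_AGENT_PID:-unknown} has no usable key (ssh-add -l exit $rc)" >&2
        return 1
    fi
}
```

In the script, a line such as load_agent_or_fail "$HOME/.keychain/machinename-sh" || exit 1 before the rsync call turns the confusing Permission denied (publickey) into a message that names the agent at fault.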
#linux #unix #auth #keychain #ssh #cron