Imagine this crazy situation:
I have two servers. One has the files I want and the other is empty. I need to move all the files from A to B. I have SSH access to both servers.
Easy, right? Not so much.
rsync – this is a great little tool for syncing two servers over SSH. The whole thing is just dreamy, with one glaring exception: when it creates the new directories on server B it doesn’t do the equivalent of mkdir -p, so it can only create one missing directory level at a time.
i.e.: If I am syncing /opt/git/repos/test to /opt/ on B and server B has nothing in /opt, rsync will attempt to create the new directory git/repos/test in one step and will fail, because it can’t make three nested directories at once.
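Roughly what that failing run looks like, pushing from A to B (hostnames, usernames, and paths here are just placeholders):

```bash
# Push the repo tree from server A to server B.
# If /opt/git and /opt/git/repos don't already exist on B, rsync bails out,
# because it only creates the final directory component, not the parents.
rsync -avz -e ssh /opt/git/repos/test user@serverB:/opt/git/repos/
```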
scp – this is another great tool for transferring files securely over SSH. It easily solves the rsync problem: it can create whatever nested directories it needs, however deep. So, what’s the catch? It doesn’t check for existing files, which means you have to transfer every single file, every single time. When I’m transferring over 20GB of files, that just won’t cut it.
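A minimal sketch of that, again with placeholder host and paths:

```bash
# Recursively copy the whole tree; scp creates the nested directories
# under the destination as it recurses, so a bare /opt on B is enough.
# The downside: it blindly re-sends every file on every run.
scp -r /opt/git user@serverB:/opt/
```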
wget – as bad as it sounds, wget started looking like a great option for what I need to do. wget will FTP into a server and recursively download everything. The entire process is fairly quick and it’s good about skipping files if they already exist locally. The first time I ran this I believed I had the solution I needed. Unfortunately, after working with server B for a while I started to notice that some files were missing. It turns out that, despite apparently completing successfully, the command missed hundreds of files. Running it again continues to find more files that weren’t transferred, but the logs don’t explain why it’s stopping prematurely.
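This assumes server A also exposes the files over FTP; something along these lines, run from server B (host, credentials, and paths are illustrative):

```bash
# Recursive FTP download: -r recurses, -nc (--no-clobber) skips files that
# already exist locally, -nH drops the hostname directory, and -P / makes
# the remote /opt/git/repos path land in the same place on B.
wget -r -nc -nH -P / "ftp://USER:PASS@serverA/opt/git/repos/"
```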
unison – unison is a tool I installed based on its promise of fixing the limitations of rsync and other tools. Unfortunately, after installing it and reading through the user manual, it appears that unison is a heavier tool than I wanted or needed: it keeps its own record of which files have previously been transferred, which have changed, and so on. I may have missed something here, but that logic was far from what I was looking for “out of the box”, so I moved on.
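For completeness, a basic two-way invocation looks roughly like this (host and paths are placeholders):

```bash
# Sync a local tree against the same path on server B over SSH;
# -batch skips the interactive prompts. Unison keeps per-replica archives
# of what it has seen before, which is the state-tracking overhead
# mentioned above.
unison -batch /opt/git/repos ssh://user@serverB//opt/git/repos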
The Solution?
If you handle the first transfer differently from the later ones, the whole thing makes much more sense:
- First Run – Use SCP to securely transfer all the files from Server A to Server B the first time.
- Future Runs – Use rsync to compare the two servers and only transfer new/changed files (a sketch of both steps follows below).
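Put together, the two steps look something like this (hostnames and paths are placeholders):

```bash
# First run: push everything with scp so the full directory tree
# gets created on server B.
scp -r /opt/git user@serverB:/opt/

# Future runs: let rsync compare the trees and only send new or
# changed files.
rsync -avz -e ssh /opt/git/ user@serverB:/opt/git/
```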
The only downside to this approach is that if you create two new child directories (e.g. uploads/new/new2/), rsync will still fail when it tries to create both of them at once. Alternatively, if the files are not substantial in size, you can tar them up and transfer them as one large file.
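The tar route can be piped straight through SSH, something like this (placeholder host and path):

```bash
# Stream the whole tree through SSH as a single tar archive; tar on the
# receiving end recreates every directory level itself, so nothing has
# to exist on B beforehand.
tar czf - /opt/git/repos | ssh user@serverB 'tar xzf - -C /'
```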
What am I missing? There must be a better solution somewhere. All I’m asking for is rsync with a mkdir -p flag to create all the parent directories before it creates a child directory.