rsync is a utility for efficiently transferring and synchronizing files between a computer and an external hard drive and across networked computers by comparing the modification times and sizes of files. It is commonly found on Unix-like operating systems.
In other words, it's the Linux hobbyist's best friend when it comes to efficient networked data transfer between SSH-enabled hosts.
Over the years I've gathered quite a few tips and tricks for (ab)using the power of rsync.
Before proceeding with actual transfers, or when using the dangerous --delete flag, it's useful to get a preview of the operations rsync will perform.
List files present on SRC but not on DEST
This offers an accurate preview of what will get transferred over the wire:
The -n flag is the shorthand for --dry-run.
It is wise to always use it for testing commands which
List files that would be transferred from
Sometimes it's useful to mirror a local directory structure using hard links.
A good example is wanting to use a backup tool that does not yet support advanced include/exclude/filter logic.
We can piggyback on rsync to do that for us, then run the tool against the filtered "mirror".
ROOT/.rsync_mirror using a
ROOT/.rsync_mirror using a
To avoid cycles, make sure the exclude or filter file references itself, as well as the mirror directory:
Sometimes, due to limited computing capacity on the receiver, or simply because we're dealing with compressed binary files, it's useful to skip the checksum checks and act solely based on the file size.
This can be achieved using the --size-only flag.
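A small local sketch of size-based skipping, assuming the --size-only flag and made-up file names. The two files have the same size but different content and timestamps, so only the size comparison can make them look identical.

```shell
SRC=$(mktemp -d)
DEST=$(mktemp -d)
printf 'AAAA' > "$SRC/clip.bin"
printf 'BBBB' > "$DEST/clip.bin"          # same size, different content
touch -t 200101010000 "$DEST/clip.bin"    # and a different mtime

# --size-only: a file whose size matches is treated as up to date.
rsync -a --size-only "$SRC/" "$DEST/"
cat "$DEST/clip.bin"                      # content was not re-sent
```

This also shows the risk: a file that changed without changing size is silently left stale.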
Other times, we're not interested in all the stuff that the -a archive mode would transfer.
We can easily exclude a bunch of stuff:
When computing power permits, force checksum-based skipping even when the mtime and the size of a file match by using the --checksum (-c) flag.
Use the -N flag to transfer the creation time of files. Good for those special cameras that don't include timestamps in the file names.
Transfer only the directory structure
Tell rsync to filter in everything that looks like a directory and filter out everything else:
Alternatively, using the
Filter based on file prefixes
Use different SSH keys and/or parameters
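The -e flag sets the remote shell command, so any ssh option can ride along. The sketch below wraps it in a helper; the function name, key path, port, user and host are all hypothetical placeholders.

```shell
# Hypothetical helper: key path and port are placeholders to adapt.
sync_with_key() {
    src=$1
    dest=$2
    # -e replaces the default remote shell with a customised ssh call.
    rsync -a -e "ssh -i $HOME/.ssh/id_backup -p 2222" "$src" "$dest"
}

# Example call (not run here, the host is made up):
#   sync_with_key ~/media/ backup@nas.example.com:media/
```

Longer-lived settings arguably belong in a Host entry in ~/.ssh/config, which rsync picks up through ssh without any extra flags.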
Change ownership & permissions during transfer
Congratulations on making it this far. Let the fun stuff begin!
Detect file moves & renames
Sometimes we get in that special mood of moving files and directories around in an effort to take control of the festering pile of bytes that make up our hard-acquired digital hoards.
We proceed with the re-org, only to realize, with a certain degree of horror, that we now have to sync the changes to a redundant remote hoard. Why the horror? Because rsync will transfer the entire content of moved files, unable to detect complex move operations. (No, the --fuzzy flag doesn't help.)
So what gives?
BEFORE the re-org, make a hard-linked copy of the working tree, either by using rsync itself or a simple
Now do the re-org in the ~/media/photo-work dir: renaming, moving, adding and deleting as you see fit, but DO NOT touch the tree in
When done with the re-org:
Finalize by swapping the original and -work trees on both machines.
We already know how to preview file transfers.
Let's keep the filenames only and use split in streaming (round-robin) mode to create equal work logs for a bunch of
The -n r/8 flag tells split to use round-robin distribution to populate 8 output files, splitting the input at line boundaries. Since it's impossible to know the length of standard input data in advance, this is the only viable splitting strategy.
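To see the round-robin behaviour in isolation, here is a sketch that feeds sixteen numbered lines to GNU split. The seq output stands in for a list of file names, and the chunk. prefix is an arbitrary choice.

```shell
WORK=$(mktemp -d)

# seq stands in for the file list; "-" makes split read standard input.
seq 1 16 | split -n r/8 - "$WORK/chunk."

ls "$WORK"               # chunk.aa through chunk.ah
cat "$WORK/chunk.aa"     # every eighth line, starting with the first
```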
Use parallel and the --files-from flag to start the actual transfers:
The -j 8 flag matches the number of files we've generated with split.
Caveat: the method outlined here does not guarantee an even distribution of transferred data between workers.