Background
rsync is a powerful and easy-to-use utility for keeping files in sync between two servers. Its basic command structure pushes files from point A to point B.
How to transfer a very large file (> GBs)
If you want to transfer a very large file from one server to another, rsync is your friend.
Method I
rsync --archive --partial --inplace --progress --compress \
  /home/bigfiles/ MYMIRROR::mybackups/bigfiles/
From: https://www.system-rescue.org/manual/Backup_and_transfer_your_data_using_rsync
Method II
Here is an example:
Server A: 50 GB file
Server B: NVMe mounted disk on remote host ready to do qm importdisk
Here is a transcript:
[[email protected]:/var/lib/libvirt/images]> date;rsync -avz --partial --progress iis.bigfile.qcow2 [email protected]:/mnt/fastdisk/;date
Wed Jul 10 17:12:21 SAST 2024
sending incremental file list
iis.bigfile.qcow2
 50,627,477,504  90%   11.50MB/s    0:07:05
In this example, we're wrapping the command in date; commands to mark the start and end times, although --progress also helps to see what's going on. The key number to monitor here is the 11.50MB/s. That is excruciatingly slow: at that rate the 50 GB file takes about 70 minutes to transfer. The main reason is that the source disk hosts many other VMs and is a mirror (RAID 1), so its performance is terrible.
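To estimate how long such a transfer takes, divide the file size by the rate that --progress reports. The sketch below assumes that rsync's MB/s means 1,048,576 bytes per second and that the rate stays constant; the function name eta_minutes is only illustrative.

```shell
# Estimate transfer time in whole minutes from a size in bytes and a rate in MB/s.
# Assumption: 1 MB = 1,048,576 bytes and the rate stays constant throughout.
eta_minutes() {
  size_bytes=$1
  rate_mb_per_s=$2
  awk -v size="$size_bytes" -v rate="$rate_mb_per_s" \
    'BEGIN { printf "%.0f\n", size / (rate * 1048576) / 60 }'
}

# The 50 GB file from the transcript at the observed rate:
eta_minutes 50627477504 11.5   # prints 70
```

The same function makes it easy to see what a faster source disk would buy you before starting the copy.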
What is also notable about this transfer is that the "standard" -avz switches are used: verbose, compress, and archive, which bundles a number of other defaults such as recursion that you would typically use.
The only thing missing from this command is a trailing & to run it in the background, which you would add (together with nohup, as in Method III) if you worry that your session might be interrupted.
Note: This transfer was done on a running server. To get consistency, you would have to switch off the server and then do a final rsync.
Another super important flag to note here is --partial. It allows us to recover if rsync aborts. Normally, when rsync aborts, the partially transferred file is deleted. In the case of a very large file, you would want to keep it!
Method III – Large File from Slow Disk
After experimenting with Method II, we arrived at Method III. Here it is:
nohup rsync -vP super-large.example.com.qcow2 [email protected]:/mnt/fastdisk/ &
This time we're leaving out the famous -a flag and just using -P, which combines --partial (keep partially transferred files) and --progress.
nohup makes the command ignore hangup signals and, just as usefully, writes the output to a log file (nohup.out) as it goes. Unfortunately, the transfer may still break, as can be seen here:
cat nohup.out
super-large.example.com.qcow2
206,964,523,008  19%  113.84MB/s    2:03:56
Killed by signal 1.
rsync error: unexplained error (code 255) at rsync.c(638) [sender=3.1.2]
rsync: [sender] write error: Broken pipe (32)
When running rsync as a background job, you’ll see the following in the process list:
ps -ef | grep rsync
root 9192 1 0 09:32 ? 00:00:00 rsync -vP super-large.example.com.qcow2 [email protected]:/mnt/fastdisk/
root 9193 9192 0 09:32 ? 00:00:00 ssh -l root a.b.c.d rsync --server -ve.LsfxC --partial . /mnt/fastdisk/
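A background transfer like this can be wrapped so that the output goes to a named log file instead of nohup.out and the PID is kept for later checks. This is a minimal sketch; the function name run_in_background, the BG_PID variable and the log file name are hypothetical.

```shell
# Start a command with nohup, send all output to a named log file,
# and remember the PID so the job can be checked later.
# Usage: run_in_background LOGFILE COMMAND [ARGS...]
run_in_background() {
  log_file=$1
  shift
  nohup "$@" > "$log_file" 2>&1 &
  BG_PID=$!
  echo "started PID $BG_PID, logging to $log_file"
}

# Example (hypothetical log name; host taken from the process list above):
# run_in_background rsync-transfer.log rsync -vP super-large.example.com.qcow2 root@a.b.c.d:/mnt/fastdisk/
# tail -f rsync-transfer.log          # follow the progress output
# kill -0 "$BG_PID" && echo running   # check whether the job is still alive
```

Keeping one log per transfer avoids several jobs appending to the same nohup.out.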
The unfortunate thing about -P is that it doesn't simply pick up where a large file left off. When restarted, rsync starts from the beginning, first comparing the partial file on the destination against the source, and only then continues sending the rest. So if you have some kind of reliability issue, it's still not the best solution.
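One way to cope with an unreliable link is to rerun rsync until it exits successfully. The sketch below is an assumption on our part and was not used in the transfers above. It relies on --append-verify (available in rsync 3.0.0 and later), which continues an existing partial file by appending to it and then verifies the complete file with a checksum. The function name and the retry parameters are hypothetical.

```shell
# Rerun rsync until it succeeds or the attempts run out.
# Usage: retry_rsync MAX_ATTEMPTS DELAY_SECONDS SOURCE DEST
retry_rsync() {
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    # --partial keeps the interrupted file; --append-verify continues it
    # and checksums the complete file at the end.
    if rsync -v --partial --progress --append-verify "$@"; then
      return 0
    fi
    echo "rsync attempt $attempt failed, retrying in ${delay}s" >&2
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  return 1
}

# Example (host taken from the process list above):
# retry_rsync 10 30 super-large.example.com.qcow2 root@a.b.c.d:/mnt/fastdisk/
```

Appending is only safe when the source file has not changed since the interrupted attempt, so this fits a powered-off VM image better than a live one.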
Why use rsync instead of SCP?
To copy files between servers you can also use SCP. The problem with SCP, however, is that it's not as flexible. For example, it doesn't have a "do not overwrite if exists" mode, nor a "do not overwrite if date is newer" mode. Instead, you can use this example:
rsync -au ~/public_html/uploads/ [email protected]:/home/forge/dev.example.com/public/uploads/
The u flag means do not overwrite if the target is newer.
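The two modes that SCP lacks can be sketched as small wrappers. -u is the flag used above; --ignore-existing is a standard rsync option that this article does not otherwise cover, and it skips any file that already exists on the destination. The function names and the USER@HOST placeholder are hypothetical.

```shell
# "Do not overwrite if the target is newer": -u (--update).
sync_keep_newer() {
  rsync -au "$1" "$2"
}

# "Do not overwrite if it exists": --ignore-existing skips every file
# that is already present on the destination, whatever its date.
sync_skip_existing() {
  rsync -a --ignore-existing "$1" "$2"
}

# Example (USER@HOST is a placeholder):
# sync_keep_newer ~/public_html/uploads/ USER@HOST:/home/forge/dev.example.com/public/uploads/
# sync_skip_existing ~/public_html/uploads/ USER@HOST:/home/forge/dev.example.com/public/uploads/
```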
Another example of an rsync command
rsync -vzru /home/location_A/ [email protected]:/home/location_B/
In the command above, files from location_A on the local server are pushed to location_B on server2.
The flags mean the following:
v = verbose
z = compress
r = recurse into directories
u = skip files that are newer on the receiver
The r is really useful because it means all directories and subdirectories will be copied.
u means that files which are already newer on the destination are skipped, so newer copies on the receiver are not overwritten.
Rsync Example with Custom SSH Port and Preserve Directory Permissions and Attributes
rsync -vrzua -e 'ssh -p 34229' /home/location_A/ [email protected]:/home/location_B/
In the above example, two additional flags were added, namely a and -e. These mean:
-e 'ssh -p PORT' = use SSH on the given port as the remote shell
a = archive, which preserves permissions
The a switch is extremely useful as it will copy all the attributes, e.g. directory permissions and so on.
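The custom-port command can be wrapped in a small function so the port becomes a parameter. This is only a sketch: the function name push_over_port and the USER@HOST placeholder are hypothetical. Because -a already implies -r, the sketch leaves out the separate r.

```shell
# Push a directory over SSH on a non-default port, preserving attributes.
# -a already implies -r, so -r does not need to be repeated.
# Usage: push_over_port PORT SOURCE DEST
push_over_port() {
  port=$1
  shift
  rsync -avzu -e "ssh -p $port" "$@"
}

# Example (USER@HOST is a placeholder):
# push_over_port 34229 /home/location_A/ USER@HOST:/home/location_B/
```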
Using rsync in Real Time
Typically rsync commands are stored in a cron job that runs every XX minutes.
If you want continuous rsync, and you are well aware of what it's going to do to your network traffic, then you need to use a utility called flock, because starting one rsync before the previous one has completed will cause problems.
Here is an example of the flock command:
flock -n lock_file -c "rsync ..."
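The same idea can be written without -c by passing the command and its arguments directly to flock. In the sketch below, the function name run_exclusive, the lock file path, the script path and the five-minute interval are all assumptions made for illustration.

```shell
# Run a command only if nobody else holds the lock file; with -n,
# flock gives up immediately instead of waiting for the lock.
# Usage: run_exclusive LOCK_FILE COMMAND [ARGS...]
run_exclusive() {
  lock_file=$1
  shift
  flock -n "$lock_file" "$@"
}

# Example crontab entry (interval and paths are hypothetical):
# */5 * * * * flock -n /tmp/rsync-uploads.lock /usr/local/bin/sync-uploads.sh
```

When the previous run is still holding the lock, flock -n exits with a non-zero status and the new rsync is simply skipped until the next cron tick.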
References
man rsync produces:
-a, --archive archive mode; equals -rlptgoD (no -H,-A,-X)
Other references
- https://unix.stackexchange.com/questions/12198/preserve-the-permissions-with-rsync
- https://stackoverflow.com/questions/9390134/rsync-cronjob-that-will-only-run-if-rsync-isnt-already-running/9976937
- https://www.atlantic.net/vps-hosting/how-to-use-rsync-copy-sync-files-servers/
- https://unix.stackexchange.com/questions/85993/why-rsync-attempts-to-copy-file-that-is-already-up-to-date
- https://www.tecmint.com/rsync-local-remote-file-synchronization-commands/
- https://unix.stackexchange.com/questions/14191/scp-without-replacing-existing-files-in-the-destination
See also
- https://kb.vander.host/operating-systems/how-to-get-rsync-to-not-overlap-because-the-previous-job-has-not-completed/