Using rsync to Create Backups and Sync Data



  • Applies to: DV
  • Difficulty: Medium
  • Time Needed: Variable
  • Tools Required: SSH

Overview

This article explains how to use the rsync command to create backups and sync data across hosts. Rsync is a very useful command that is often used to copy data, make backups, migrate hosts, and bridge the gap between site staging and production environments.

This article is provided as a courtesy. Installing, configuring, and troubleshooting third-party applications is outside the scope of support provided by Media Temple. Please take a moment to review the Statement of Support.  

Before you begin:

  • The OS used in this tutorial is Ubuntu 16.04, though most Unix-like systems (including macOS) support rsync and ship with it preinstalled.
  • You can use this tutorial with your Media Temple VPS with Plesk or cPanel.
  • You’ll need to be familiar with the Linux CLI and basic server administration. 

Instructions

Begin by connecting to your server via SSH and performing an update. If you are using Plesk, avoid causing conflicts by updating from within the Plesk control panel instead of over SSH.

Ubuntu:

apt-get update && apt-get upgrade

CentOS:

yum update


Create local backups

We’ll start with the most basic rsync format for copying (syncing) files. While connected to your server, execute the following command, replacing the file path info with yours:

rsync -av /path/to/directory1/ /path/to/directory2/

The above command copies the contents of directory1 into directory2. The trailing slash (/) on the source path matters here: with it, the contents of directory1 are copied into directory2, but directory1 itself is not created inside directory2. To copy the directory itself, so that the result is /path/to/directory2/directory1, drop the slash after directory1:

rsync -av /path/to/directory1 /path/to/directory2/
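
The trailing-slash rule is easy to check on throwaway data. The sketch below assumes rsync is installed and works only inside a temporary directory created with mktemp; the names work, with_slash, and without_slash are made up for the demo.

```shell
# Demo of how the trailing slash on the source changes the result.
# Every path here is a throwaway temp directory, not real data.
work=$(mktemp -d)
mkdir -p "$work/directory1" "$work/with_slash" "$work/without_slash"
echo "hello" > "$work/directory1/file.txt"

# Trailing slash on the source: copy the CONTENTS of directory1.
rsync -a "$work/directory1/" "$work/with_slash/"

# No trailing slash on the source: copy directory1 ITSELF into the destination.
rsync -a "$work/directory1" "$work/without_slash/"

ls "$work/with_slash"      # file.txt
ls "$work/without_slash"   # directory1
```

The trailing slash on the destination makes no difference in either case; only the source path changes the outcome.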

The flags:
-a: Copies files recursively and preserves users, groups, symbolic links, file permissions, and timestamps.
-v: As with many other commands, this option asks for verbose output. This is especially useful when copying large amounts of data.
--delete: This flag isn’t used here but it is a common feature of rsync. This option deletes any files or folders in the destination that aren’t at the source. Use with extreme caution!
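--help: Prints a help page with useful information about using rsync. Running rsync -h with no other arguments does the same, but when combined with other options -h means --human-readable (friendlier number formatting).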
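
Because --delete removes files, it can help to preview the result first. The following is a minimal sketch, assuming rsync is installed; it uses the --dry-run option (short form -n) on temporary directories, and the file names keep.txt and stale.txt are invented for the demo.

```shell
# Preview what --delete would remove before running it for real.
# The temp directories and file names below are demo assumptions.
demo=$(mktemp -d)
mkdir -p "$demo/source" "$demo/destination"
echo "keep" > "$demo/source/keep.txt"
echo "keep" > "$demo/destination/keep.txt"
echo "old"  > "$demo/destination/stale.txt"   # exists only at the destination

# --dry-run reports the planned changes without touching any files.
rsync -av --delete --dry-run "$demo/source/" "$demo/destination/"
after_dry_run=$(ls "$demo/destination")

# Same command without --dry-run: stale.txt is now actually deleted.
rsync -av --delete "$demo/source/" "$demo/destination/"
after_real_run=$(ls "$demo/destination")

echo "after dry run:  $after_dry_run"
echo "after real run: $after_real_run"
```

The dry run lists stale.txt as a file it would delete, which gives you a chance to catch a mistyped path before anything is lost.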

Check out this doc for information about rsync's many other useful options.  

Why rsync?

You may be wondering why you would use rsync here instead of the simpler ‘cp’ command. Often, cp is perfectly adequate, but rsync has a few advantages. One is that rsync skips files that are already up to date at the destination (by default it compares size and modification time), and when copying over a network it sends only the changed portions (the delta) of files that differ, which can save large amounts of time and bandwidth. Another is that it can compress the data as it’s being transferred. These differences are particularly meaningful when making regular backups, copying large sites or applications, and especially when sending that data over a network.
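
To see the "only copy what changed" behavior in practice, the sketch below runs the same sync twice on temporary data. It assumes rsync is installed and uses the -i (--itemize-changes) option to list each change; the directory names live and backup are made up for the demo.

```shell
# Show that a second rsync run has nothing left to copy.
# Temp paths are demo assumptions; -i lists each item rsync changes.
site=$(mktemp -d)
mkdir -p "$site/live" "$site/backup"
echo "<h1>home</h1>" > "$site/live/index.html"

first_run=$(rsync -ai "$site/live/" "$site/backup/")
second_run=$(rsync -ai "$site/live/" "$site/backup/")

echo "first run:  $first_run"    # lists index.html as newly transferred
echo "second run: $second_run"   # index.html is not listed: nothing changed
```

A plain cp -r would copy every file again on each run, which is the difference that matters for regular backups.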

Combine rsync with cron to create scheduled backups

Cron is a tool for scheduling commands to run automatically. If you’d like to learn more about cron, check out this Media Temple community article.

To get started, open the crontab so that you can create a new job.

crontab -e

You may be asked to select an editor by pressing either 1 or 2. Make your selection and scroll to the end of the file. Cron’s syntax can seem confusing at first, but it's simple once you break it down.

The scheduling works like this:

* * * * * command

Each of the five asterisks (*) corresponds to a specific block of time:
Minute (0-59) Hour (0-23) Day of month (1-31) Month (1-12) Weekday (0-6, where 0 is Sunday) command

You use numbers in place of asterisks to dictate when the specified command will run. For instance, let’s assume that you want to schedule your backup to run every Monday at 11:15pm:

15 23 * * 1 rsync -av /path/to/directory1 /path/to/directory2/

That’s the 15th minute of the 23rd hour on weekday 1 (Monday; cron counts Sunday as 0) of each week.

Or perhaps you need it to run each evening at 8:30pm:

30 20 * * * rsync -av /path/to/directory1 /path/to/directory2/

That’s the 30th minute of the 20th hour of each day.
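
If the field order is hard to remember, it can help to label each field of an entry. The snippet below is only an illustration: it splits the Monday example into named shell variables. The variable names are assumptions chosen for readability; cron itself only cares about field position.

```shell
# Label the fields of a crontab entry. The variable names are
# illustrative; cron only looks at the position of each field.
entry='15 23 * * 1 rsync -av /path/to/directory1 /path/to/directory2/'

# read splits on whitespace; the last variable receives the rest of the line.
read -r minute hour day_of_month month weekday job <<EOF
$entry
EOF

echo "minute:       $minute"         # 15
echo "hour:         $hour"           # 23
echo "day of month: $day_of_month"   # *
echo "month:        $month"          # *
echo "weekday:      $weekday"        # 1
echo "command:      $job"
```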

To help you get the hang of it, there’s a tool located here that breaks the syntax down nicely.

Sending data to a different host

Now that we know how to move data around locally with rsync and create scheduled backups with cron, let’s look at what rsync does best: moving data across networks. Perhaps you need to migrate hosts, create remote backups of files on your computer, or move a website from a staging or testing environment to a public-facing web server. In any case where you need to move large amounts of data across a network, rsync is a great utility to know about.

Remote backups - It's always a good idea to have off-site backups of data. This helps protect you from events like hardware failure. 

Migrating Hosts - A common method used to complete a host migration is to download your site’s data and then upload it to the new host using S/FTP. With a simple rsync command, you can move that data directly from the old host to the new host, possibly saving large amounts of time.

Staging > Production - A common practice in web development is to create a staging site where changes are tested before being pushed to the production server. This helps you avoid introducing errors into your website, and can save you a lot of downtime. Rsync is great for pushing this data. Of course, this setup requires two separate server environments, which may not be practical for some. For WordPress users, Media Temple has made this process incredibly easy using our Managed WordPress Hosting service.

To move data from one host to another, use the structure in the command below. If you use macOS or Linux, you can use this structure to move files from your computer as well.

Don’t forget to replace ‘user’ with your user, and the IP address with the address of the remote host that you’ll be transferring data to. Unless you’re using SSH keys to connect to the remote server, you’ll be prompted for the remote user’s password. That remote user must also have write permissions for the target directory. If you're sending data to or from your Media Temple Grid, be sure to use your access domain in place of the IP address:

rsync -avz path/to/local/directory1 user@123.456.789.234:/path/to/remote/directory2/

To download from a remote directory, simply reverse the order:

rsync -avz user@123.456.789.234:/path/to/remote/directory1 path/to/local/directory2/

  • You may want to add the --delete option when using rsync to make updates to a site. This will remove any files or folders from the destination that are not at the source. Use caution! A misstep could erase needed files.
  • The -z option compresses data during the transfer. This is useful when sending large amounts of data across the network.
  • If a transfer fails, for example because of a dropped connection, rerunning the same command skips the files that already arrived intact. To also keep partially transferred files so that they can be resumed, use the --partial option. The --append option resumes by appending data to shorter files at the destination, but it assumes the data already there is correct, so use it only when the source files have not changed since the failed attempt. Resuming is especially useful for very large transfers, which are more likely to be interrupted.
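
For unreliable connections, one option is to retry automatically. The sketch below assumes rsync is installed; sync_with_retry is a hypothetical helper name, not an rsync feature, and the demo call uses local temporary directories in place of a real remote host.

```shell
# Hypothetical helper: retry an rsync transfer a few times.
# --partial keeps partly transferred files so a later attempt can reuse them.
sync_with_retry() {
    # $1 = source, $2 = destination, $3 = max attempts (default 3)
    attempts=${3:-3}
    n=1
    while [ "$n" -le "$attempts" ]; do
        if rsync -az --partial "$1" "$2"; then
            return 0
        fi
        echo "attempt $n of $attempts failed" >&2
        n=$((n + 1))
        sleep 1   # short pause before the next attempt
    done
    return 1
}

# Demo on throwaway local data; for a real transfer the destination
# would look like user@host:/path/to/remote/directory2/
retry_demo=$(mktemp -d)
mkdir -p "$retry_demo/src" "$retry_demo/dst"
echo "payload" > "$retry_demo/src/big.file"
sync_with_retry "$retry_demo/src/" "$retry_demo/dst/" 3
retry_status=$?
echo "exit status: $retry_status"
```

Unless SSH keys are set up, each attempt against a remote host prompts for the password again, so a helper like this fits best with key-based logins.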


If you’re using a port other than the default 22 for SSH on the remote host (a good security practice), specify it in the command with the -e option:

rsync -avz -e 'ssh -p 1234' path/to/directory1 user@123.456.789.234:/path/to/directory2/

Rsync works very well for migrating data from one host to another, but if you’re creating remote backups that you won’t need to access on a regular basis, you may want to consider archiving the files using the ‘tar’ command or a similar utility before sending them out. Compressing the archive can significantly reduce the amount of storage space and bandwidth used:

tar -zcvf backup1.tar.gz path/to/files/
rsync -avz --remove-source-files backup1.tar.gz user@123.456.789.234:/path/to/backups/

The ‘--remove-source-files’ option deletes the local tar file once it has been transferred successfully, so that you don’t end up with several unneeded local backups.

You can create dated backups by adding the info to the command. The tar command below names the compressed file based on the current date and rsync grabs the file and syncs it with your backup server.

tar -zcvf "$(date '+%y-%m-%d').tar.gz" path/to/files/
rsync -avz --remove-source-files "$(date '+%y-%m-%d').tar.gz" user@123.456.789.234:/path/to/backups/
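
A small variation is to compute the date once and reuse it, so that tar and rsync always agree on the file name even if the job runs across midnight. The sketch below assumes tar and rsync are installed; the temporary directories stand in for path/to/files/ and for the backup server, and the variable names are made up for the demo.

```shell
# Build a dated archive once, then hand it to rsync.
# The temp directories stand in for path/to/files/ and the backup server.
base=$(mktemp -d)
mkdir -p "$base/files" "$base/backups"
echo "content" > "$base/files/page.html"

# Compute the date a single time so both commands use the same file name.
stamp=$(date '+%y-%m-%d')
archive="$base/$stamp.tar.gz"

# -C changes into $base first so the archive stores relative paths.
tar -zcf "$archive" -C "$base" files

# List the archive contents as a sanity check before the local copy is removed.
listing=$(tar -tzf "$archive")
echo "$listing"

# For a real remote backup the destination would be user@host:/path/to/backups/
rsync -av --remove-source-files "$archive" "$base/backups/"
```

After the rsync step the archive exists only in the backups directory, which mirrors what happens to the local tar file in the remote case.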


It’s also possible to create automated remote backups over SSH. To do that, generate an SSH key pair that doesn’t require a password to use. If you need help doing this, Media Temple has a helpful article that can have you set up in less than 20 minutes. Once you’ve created your keys, add the command to your crontab, replacing the asterisks with the schedule you want (five asterisks run the command every minute):


crontab -e
* * * * * rsync -avz path/to/directory1/ user@123.456.789.234:/path/to/directory2/
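
Cron jobs run unattended, so it is worth recording whether each backup succeeded. The following is a rough sketch of a wrapper that a crontab entry could call instead of a bare rsync command. It assumes rsync is installed; LOG and run_backup are hypothetical names, and the demo call uses local temporary directories rather than a real remote host.

```shell
# Hypothetical wrapper for a cron job: run rsync and log the outcome.
# LOG and run_backup are illustrative names, not part of rsync or cron.
LOG=$(mktemp)

run_backup() {
    # $1 = source, $2 = destination (local path or user@host:/path)
    if rsync -az "$1" "$2" >>"$LOG" 2>&1; then
        echo "$(date '+%F %T') OK $1 -> $2" >>"$LOG"
    else
        status=$?   # exit status of the failed rsync command
        echo "$(date '+%F %T') FAILED (exit $status) $1 -> $2" >>"$LOG"
        return "$status"
    fi
}

# Demo on throwaway local data.
cron_demo=$(mktemp -d)
mkdir -p "$cron_demo/directory1" "$cron_demo/directory2"
echo "data" > "$cron_demo/directory1/file.txt"
run_backup "$cron_demo/directory1/" "$cron_demo/directory2/"

# Show the most recent log line.
tail -n 1 "$LOG"
```

In a real setup the log would live at a fixed path of your choosing rather than a temp file, and the crontab entry would call the script by its absolute path.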


These examples should get you started with rsync, but it is one of the more versatile commands and can be adapted to many more specific uses. If you have any questions or need additional guidance, please feel free to contact Media Temple's award-winning 24/7 support.