Rsync - Backup All Your Data With a Single Terminal Command
Posted 05/19/2009 at 6:37:20pm | by Jason Schroeder

A few years ago, I was faced with the daunting task of manually migrating information from one SAN to another, without the luxury of an automated migration system. Anyone who deals with terabytes of data has, at one time or another, had to archive, transfer, or back up thousands of files at once.

I was introduced to a command line program called rsync to accomplish this task. Little did I know this one utility would become a permanent part of my toolset for managing large amounts of data. If you can't copy all of your data in one session, and you need to break it up into several days (or weeks) of data copying, rsync's usefulness becomes all the more apparent.

Rsync is a command line program. There are GUI variants built on the CLI base, but for the most part, running rsync is so simple you shouldn't need a GUI to use it. We are going to set up a simple local rsync run to show how it works.

1. Go into your Documents folder. Create a test folder called "Data" and another folder called "DataBackup". For the purposes of this test, put a few files in the "Data" folder that aren't very big. A few image files or documents should be fine.

For this basic exercise, we are going to assume your hard drive is named "Macintosh HD" and your user name is "Joe". Please treat these as placeholders when reviewing the following examples, and put your own information in as required.

Note: Spaces in the command line are handled with a \ preceding the space. For example: Macintosh HD = Macintosh\ HD

2. Navigate to Applications>Utilities>Terminal.app.

3. Type the following command:

rsync -avx --progress /Volumes/Macintosh\ HD/Users/Joe/Documents/Data/ /Volumes/Macintosh\ HD/Users/Joe/Documents/DataBackup/

You should see output similar to this:

building file list ...
3 files to consider
./
Picture 1.png
      409559 100%  179.67MB/s    0:00:00 (xfer#1, to-check=1/3)
self_portrait_two_sides.jpg
     4721398 100%   15.74MB/s    0:00:00 (xfer#2, to-check=0/3)

sent 5131825 bytes  received 70 bytes  10263790.00 bytes/sec
total size is 5130957  speedup is 1.00


--

Now, what exactly did rsync do? It would appear, by virtue of the output, that two files were simply copied. Actually, rsync parsed the source and the target folders, and copied over files that didn't exist on the target.

The real power of rsync is evident in my next execution. For this example, I added two more files to my "Data" folder, and ran the exact same rsync command:

building file list ...
5 files to consider
./
Army.jpg
       29993 100%    0.00kB/s    0:00:00 (xfer#1, to-check=3/5)
Manny375.jpg
       56776 100%   54.15MB/s    0:00:00 (xfer#2, to-check=2/5)

sent 87057 bytes  received 70 bytes  174254.00 bytes/sec
total size is 5217726  speedup is 59.89



Notice how the number of files "to consider" increased, but only two additional files were copied? Rsync saw that the other files already matched the originals in my "Data" directory (by default it compares file size and modification time), so it did not take the time to copy them. Now that you have an idea of rsync's mechanisms, here is a breakdown of the command and how it was used in this example:

rsync [options] [source] [target]

The options I used in my example are the defaults I use for most of my rsyncing. Here is a breakdown of the options I used and how they affect the outcome of the rsync:

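-a  - "archive" rsync, recurses into folders and preserves permissions, ownership, modification times, and symbolic links; extremely useful for moving large volumes of data and keeping AD/OD/POSIX permissions intact. Extended attributes and resource forks are not covered by -a itself; the rsync 2.6.9 build bundled with Mac OS X uses a separate -E option for those.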

-v - "verbose" gives the user more information on the rsync display

-x  - prevents crossing filesystem boundaries

--progress  - combined with the "-v" option, gives you the best in-terminal display of rsync's progress


---

Now that you've run a very basic rsync, here are a few tips to make this software easier to use. Not only will these tips speed up your rsync use, some of them will help you feel more comfortable on the command line overall.

First, rsync has an option that allows a "dry run", so that you can test an rsync execution without actually moving any data. I always include this option in my first run of an archive to make sure my directories are in order. This option is:

-n  - "test run", "dry run", shows output but doesn't actually copy anything

Very often this just translates to adding an "n" to your option string, so instead of typing

-avx

you will type

-avxn

 

The second tip for efficient command line execution of rsync is how your "Tab" key operates when typing out directory names. If you think you need to type out "/Volumes/Macintosh\ HD/Documents/blahbalhbalbhalbahaba" every single time, you are wrong! Here are a few CLI shortcuts to help you avoid excessive typing:

up arrow / down arrow - cycle through commands previously typed


TAB  - autofill known directory names, for example, if I type "/Volu" and hit TAB the CLI will auto-fill the rest and display "/Volumes".

Lastly, spaces in command line directory names can be a little frustrating. Make use of the backslash to indicate where a space exists:

directory name:        /Volumes/MacHD/Users/Joe/Stuff I Like/

is typed as

directory name:        /Volumes/MacHD/Users/Joe/Stuff\ I\ Like/
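
Quoting the whole path is another way to handle spaces, and some people find it easier to read than backslashes. The sketch below uses a scratch folder named "Stuff I Like" (its parent folder is made up for the example) and shows that both spellings resolve to the same directory.

```shell
# Scratch parent folder; only the "Stuff I Like" name comes from the example above.
work=$(mktemp -d "${TMPDIR:-/tmp}/spaces-demo.XXXXXX")
mkdir "$work/Stuff I Like"

# A backslash before each space...
escaped=$(cd "$work"/Stuff\ I\ Like && pwd)
# ...or double quotes around the whole path.
quoted=$(cd "$work/Stuff I Like" && pwd)

# Both lines should print the same directory.
echo "$escaped"
echo "$quoted"
```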

 

TAGS:  Terminal
COMMENTS
How to automate?

This is great! How do I create something automated so I can run this once a week to back up my entire hard drive, or better yet, create this to run automatically once a week?

Automate with cron/CronniX

Another Unix utility, cron, schedules commands to run at given intervals. A free GUI utility, CronniX, will help you schedule cron runs for any command line or script. Just type the rsync command into CronniX and set the interval for running it.
