160

I'm trying to understand what the difference is between two options

rsync --size-only

and

rsync --ignore-times

It is my understanding that by default rsync compares both the timestamps and the file sizes to decide whether a file should be synchronized. The options above allow the user to influence this behavior.

Both options seem, at least verbally, to result in the same thing: comparing by size only.

Am I missing something subtle here?

1
  • 24
    This would probably fit better on something like SuperUser.com or Unix.SE, since it's about using an existing (non-programming related) tool rather than anything directly related to writing code. Commented Dec 8, 2012 at 15:32

4 Answers

147

The short answer is that --ignore-times does more than its name implies. It ignores both the time and size. In contrast, --size-only does exactly what it says.


The long answer is that rsync has three ways to decide if a file is outdated:

  1. Compare the size of source and destination.
  2. Compare the timestamp of source and destination.
  3. Compare the static checksum of source and destination.

These checks are performed before transferring data. Notably, this means the static checksum is distinct from the stream checksum: the latter is computed while transferring data.

By default, rsync uses only 1 and 2. Both 1 and 2 can be acquired together by a single stat, whereas 3 requires reading the entire file (this is independent from reading the file for transfer). Assuming only one modifier is specified, that means the following:

  • By using --size-only, only 1 is performed - timestamps and checksum are ignored. A file is copied unless its size is identical on both ends.

  • By using --ignore-times, none of 1, 2, or 3 is performed. A file is always copied.

  • By using --checksum, 3 is used in addition to 1, but 2 is not performed. A file is copied unless size and checksum match. The checksum is only computed if size matches.
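
The three cases above can be sketched as a small decision function. This is only a model of the rules as described in this answer, not rsync's actual code; `FileInfo`, `needs_transfer`, and the mode strings are hypothetical names made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileInfo:
    """Hypothetical stand-in for what is known about one side of a file."""
    size: int
    mtime: int
    checksum: str  # placeholder for a whole-file ("static") checksum


def needs_transfer(src: FileInfo, dst: FileInfo, mode: str = "default") -> bool:
    """Return True if the file would be selected for transfer under `mode`."""
    if mode not in ("default", "size-only", "ignore-times", "checksum"):
        raise ValueError(f"unknown mode: {mode}")
    if mode == "ignore-times":
        return True  # no comparison at all: always selected
    if src.size != dst.size:
        return True  # check 1 fails, which settles it in every other mode
    if mode == "size-only":
        return False  # check 1 passed and nothing else is consulted
    if mode == "checksum":
        return src.checksum != dst.checksum  # check 3, only reached when sizes match
    return src.mtime != dst.mtime  # default: check 2


# Same size and content, different timestamp (e.g. a file that was only touched).
touched = (FileInfo(1595, 100, "abc"), FileInfo(1595, 200, "abc"))
for mode in ("default", "size-only", "ignore-times", "checksum"):
    print(mode, needs_transfer(*touched, mode))
```

In this model, a file whose timestamp changed but whose content did not is selected under the default rules and under --ignore-times, and skipped under --size-only and --checksum.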

3
  • 9
    --checksum is exactly what I was looking for. I was copying build output that only had the time change for most of the files. Adding --checksum meant it ignored the time differences but made sure they were identical bit for bit. It was what I expected --ignore-times to do so thank you for additional info. Commented Jan 25, 2019 at 16:02
  • 3
    This answer is so clear and easy to understand. Thank you! I try and be "good" and start with the man pages and existing documentation... More times than not they are a thick haze of super deep detail level stuff and I end up back on SO for some excellent content like this. Commented Nov 21, 2020 at 4:26
  • That is probably the best explanation of rsync (and also one of the shortest) I've ever read. Bravo! if --ignore-times always copies a file, it would seem that a better described flag, such as, say, --always-copy, would be less perplexing
    – Bleakley
    Commented Jun 25 at 22:08
144

There are several ways rsync compares files -- the authoritative source is the rsync algorithm description: https://www.andrew.cmu.edu/course/15-749/READINGS/required/cas/tridgell96.pdf. The Wikipedia article on rsync is also very good.

For local files, rsync compares metadata and if it looks like it doesn't need to copy the file because size and timestamp match between source and destination it doesn't look further. If they don't match, it cp's the file. However, what if the metadata do match but files aren't actually the same? Then rsync probably didn't do what you intended.

Files that are the same size may still have changed. One simple example is a text file where you correct a typo -- like changing "teh" to "the". The file size is the same, but the corrected file will have a newer timestamp. --size-only says "don't look at the time; if size matches assume files match", which would be the wrong choice in this case.

On the other hand, suppose you accidentally did a big cp -r A B yesterday, but you forgot to preserve the time stamps, and now you want to do the operation in reverse rsync B A. All those files you cp'ed have yesterday's time stamp, even though they weren't really modified yesterday, and rsync will by default end up copying all those files, and updating the timestamp to yesterday too. --size-only may be your friend in this case (modulo the example above).
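
The cp -r scenario can be reproduced in miniature without rsync. The sketch below assumes that `shutil.copy` behaves like a plain cp (fresh modification time) and `shutil.copy2` like a copy that preserves timestamps; the helper names and file names are made up, and the two comparison functions are simplified stand-ins for a size+time check and a size-only check.

```python
import os
import shutil
import tempfile


def same_size(a, b):
    """Size-only comparison, roughly what --size-only looks at."""
    return os.stat(a).st_size == os.stat(b).st_size


def same_size_and_mtime(a, b):
    """Size plus modification time (whole seconds), roughly the default check."""
    sa, sb = os.stat(a), os.stat(b)
    return sa.st_size == sb.st_size and int(sa.st_mtime) == int(sb.st_mtime)


def copy_demo():
    with tempfile.TemporaryDirectory() as d:
        src = os.path.join(d, "A.txt")
        plain = os.path.join(d, "B_plain.txt")
        kept = os.path.join(d, "B_kept.txt")
        with open(src, "w") as f:
            f.write("unchanged content\n")
        # Give the source an old modification time so the difference is visible.
        os.utime(src, (1_000_000_000, 1_000_000_000))
        shutil.copy(src, plain)  # timestamps not preserved: destination mtime is "now"
        shutil.copy2(src, kept)  # metadata preserved, including mtime
        return {
            "plain copy, size+mtime match": same_size_and_mtime(src, plain),
            "plain copy, size matches": same_size(src, plain),
            "preserving copy, size+mtime match": same_size_and_mtime(src, kept),
        }


for label, result in copy_demo().items():
    print(label, result)
```

The plain copy fails the size+time comparison even though the content is identical, which is why a default rsync run would transfer it again, while a size-only comparison treats it as up to date.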

--ignore-times says to compare the files regardless of whether the files have the same modify time. Consider the typo example above, but then not only did you correct the typo but you used touch to make the corrected file have the same modify time as the original file -- let's just say you're sneaky that way. Well --ignore-times will do a diff of the files even though the size and time match.
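
The "sneaky" case is easy to reproduce. The following sketch builds two files of the same size, forces them to the same modification time, and shows that a size+time comparison calls them equal while a content hash does not. The function names are hypothetical, and `quick_check_equal` is only a simplified stand-in for rsync's default comparison, not its implementation.

```python
import hashlib
import os
import tempfile


def quick_check_equal(a, b):
    """Simplified size + mtime comparison (whole seconds)."""
    sa, sb = os.stat(a), os.stat(b)
    return sa.st_size == sb.st_size and int(sa.st_mtime) == int(sb.st_mtime)


def content_equal(a, b):
    """Compare full contents via SHA-256; this requires reading both files."""
    def digest(path):
        with open(path, "rb") as f:
            return hashlib.sha256(f.read()).hexdigest()
    return digest(a) == digest(b)


def sneaky_edit_demo():
    with tempfile.TemporaryDirectory() as d:
        old = os.path.join(d, "old.txt")
        new = os.path.join(d, "new.txt")
        with open(old, "w") as f:
            f.write("teh cat\n")  # original, with the typo
        with open(new, "w") as f:
            f.write("the cat\n")  # corrected, same length
        stamp = 1_000_000_000
        os.utime(old, (stamp, stamp))
        os.utime(new, (stamp, stamp))  # the "touch" that hides the edit
        return quick_check_equal(old, new), content_equal(old, new)


quick, content = sneaky_edit_demo()
print("size+mtime says equal:", quick)
print("content actually equal:", content)
```

Any comparison based on metadata alone misses this edit; catching it requires either transferring the file regardless or comparing contents.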

1
  • 1
    This answer and the answer by @MisterMiyagi both sound good, but say different things about --ignore-times. I believe that @MisterMiyagi is right and --ignore-times ignores size and times and doesn't do a diff or checksum at all (transferring all files). For a checksum, you need --checksum. Commented Jul 18, 2022 at 14:07
56

You are missing that rsync can also compare files by checksum.

--size-only means that rsync will skip files that match in size, even if the timestamps differ. This means it will synchronise fewer files than the default behaviour. It will miss any file with changes that don't affect the overall file size. If you have something that changes the dates on files without changing the files, and you don't want rsync to spend lots of time checksumming those files to discover they haven't changed, this is the option to use.

--ignore-times means that rsync will checksum every file, even if the timestamps and file sizes match. This means it will synchronise more files than the default behaviour. It will include changes to files even where the file size is the same and the modification date/time has been reset to the original value. Checksumming every file means it has to be entirely read from disk, which may be slow. Some build pipelines will reset timestamps to a specific date (like 1970-01-01) to ensure that the final build file is reproducible bit for bit, e.g. when packed into a tar file that saves the timestamps.

4
  • 4
    "resetting the date/time is unlikely to be done in practise, but it could happen" -- For example when using software that, in the name of reproducible builds, forcibly resets every file to 1970-01-01 instead of the date and time of the actual creation / modification.
    – user743382
    Commented May 17, 2015 at 20:03
  • 12
    Actually, I think you need the -c option if you want checksums to be used. Without it, --ignore-times will copy all files unconditionally. Commented Jan 13, 2017 at 1:19
  • 1
    The -a option may override these options. In my case I was using --compare-dir= and --size-only and getting unexpected results. Changing -a to -r solved the problem.
    – dbagnara
    Commented Jun 21, 2017 at 22:02
  • 1
    @dbagnara I confirmed today that --size-only "sits on top of" -a, or "overrides" -a. I had a drive that for whatever reason mounted with all modification times increased by a month. Rsync to backup was copying every file (with -a ON). Adding --size-only fixed the problem and led to the desired results (so -a --size-only). So I conclude that size-only overrides archive.
    – Tommy
    Commented Apr 12, 2020 at 19:20
1

On a Scientific Linux 6.7 system, the man page on rsync says:

--ignore-times          don't skip files that match size and time

I have two files with identical contents, but with different modification times:

[root@windstorm ~]# ls -ls /tmp/master/usercron /tmp/new/usercron
4 -rwxrwx--- 1 root root 1595 Feb 15 03:45 /tmp/master/usercron
4 -rwxrwx--- 1 root root 1595 Feb 16 04:52 /tmp/new/usercron

[root@windstorm ~]# diff /tmp/master/usercron /tmp/new/usercron
[root@windstorm ~]# md5sum /tmp/master/usercron /tmp/new/usercron
368165347b09204ce25e2fa0f61f3bbd  /tmp/master/usercron
368165347b09204ce25e2fa0f61f3bbd  /tmp/new/usercron

With --size-only, the two files are regarded the same:

[root@windstorm ~]# rsync -v --size-only -n  /tmp/new/usercron /tmp/master/usercron

sent 29 bytes  received 12 bytes  82.00 bytes/sec
total size is 1595  speedup is 38.90 (DRY RUN)

With --ignore-times, the two files are regarded different:

[root@windstorm ~]# rsync -v --ignore-times -n  /tmp/new/usercron /tmp/master/usercron
usercron

sent 32 bytes  received 15 bytes  94.00 bytes/sec
total size is 1595  speedup is 33.94 (DRY RUN)

So it does not look like --ignore-times has any effect at all.

1
  • 2
    --ignore-times would have copied the files even if their timestamps were the same. Commented Jun 4, 2017 at 8:08
