Canonical Answer
Regarding rdiff, the librsync 2.0.1 post is a good read for clarifying what the command does, so I've quoted it below to preserve the content in this answer if nothing else.
It's important to get a good understanding of the three rdiff steps for updating a file (signature, delta, and patch), as described on the rdiff man page. I've also found an rdiff command example script on GitHub that's helpful, which I'll reference and quote.
Essentially...
- Take a "starting" or base file [file1] and create a signature file from it
  - This is usually much smaller than the base/original file itself
- Compare the signature file against another file [file2] that is similar to your base file but different (e.g. recently updated), and create a delta file containing just the differences between the two files
- Apply the "differences only" or delta file to your base file [file1] to generate a new file that matches the other file [file2].
rdiff signature file1 signature-file ## create a signature of base file1
rdiff delta signature-file file2 delta-file ## record how file2 differs from file1, using only the signature
rdiff patch file1 delta-file gen-file ## apply the delta to file1 to create gen-file, which matches file2
rdiff-example.sh
# $ rdiff --help
# Usage: rdiff [OPTIONS] signature [BASIS [SIGNATURE]]
# [OPTIONS] delta SIGNATURE [NEWFILE [DELTA]]
# [OPTIONS] patch BASIS [DELTA [NEWFILE]]
# Options:
# -v, --verbose Trace internal processing
# -V, --version Show program version
# -?, --help Show this help message
# -s, --statistics Show performance statistics
# Delta-encoding options:
# -b, --block-size=BYTES Signature block size
# -S, --sum-size=BYTES Set signature strength
# --paranoia Verify all rolling checksums
# IO options:
# -I, --input-size=BYTES Input buffer size
# -O, --output-size=BYTES Output buffer size
# create signature for old file
rdiff signature old-file signature-file
# create delta using signature file and new file
rdiff delta signature-file new-file delta-file
# generate new file using old file and delta
rdiff patch old-file delta-file gen-file
# test
diff -s gen-file new-file
# Files gen-file and new-file are identical
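The script above expects old-file and new-file to already exist. As a rough, self-contained sketch (not part of the quoted script), the following generates its own test data in a temporary directory, runs the same three steps, and prints the file sizes so you can see how small the signature and delta are. The file contents and the 1 MiB size are arbitrary assumptions for illustration, and it assumes rdiff is installed and that head supports -c.

```shell
#!/bin/sh
# Illustrative sketch only: file names and sizes are made up for the demo.
set -eu
command -v rdiff >/dev/null 2>&1 || { echo "rdiff not installed; skipping"; exit 0; }

workdir=$(mktemp -d)
cd "$workdir"

# Build a 1 MiB "old" file of random bytes.
head -c 1048576 /dev/urandom > old-file

# Build a "new" file: the same content with a short string appended.
cp old-file new-file
printf 'a few extra bytes\n' >> new-file

# The three rdiff steps: signature, delta, patch.
rdiff signature old-file signature-file
rdiff delta signature-file new-file delta-file
rdiff patch old-file delta-file gen-file

# cmp exits non-zero if the files differ, which aborts the script under set -e.
cmp gen-file new-file && echo "gen-file matches new-file"

# Compare sizes: the signature and delta should be far smaller than the inputs.
wc -c old-file new-file signature-file delta-file
```

For the 1GB files in the question the commands are the same; only the run time and the signature size grow.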
rdiff is a program to compute and apply network deltas. An rdiff delta
is a delta between binary files, describing how a basis (or old) file
can be automatically edited to produce a result (or new) file.
Unlike most diff programs, librsync does not require access to both of
the files when the diff is computed. Computing a delta requires just a
short "signature" of the old file and the complete contents of the new
file. The signature contains checksums for blocks of the old file.
Using these checksums, rdiff finds matching blocks in the new file,
and then computes the delta.
rdiff deltas are usually less compact and also slower to produce than
xdeltas or regular text diffs. If it is possible to have both the old
and new files present when computing the delta, xdelta will generally
produce a much smaller file. If the files being compared are plain
text, then GNU diff is usually a better choice, as the diffs can be
viewed by humans and applied as inexact matches.
rdiff comes into its own when it is not convenient to have both files
present at the same time. One example of this is that the two files
are on separate machines, and you want to transfer only the
differences. Another example is when one of the files has been moved
to archive or backup media, leaving only its signature.
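To make the archive case concrete, here is a small sketch (my own illustration, not from the quoted documentation) in which the old file is moved away before the delta is computed, so the delta step only ever sees the signature. The names report.bin, report-v2.bin and the archive directory are hypothetical, and it assumes rdiff is installed.

```shell
#!/bin/sh
# Illustrative sketch only: all file and directory names are hypothetical.
set -eu
command -v rdiff >/dev/null 2>&1 || { echo "rdiff not installed; skipping"; exit 0; }

work=$(mktemp -d)
cd "$work"
mkdir archive

# The old file and an updated version of it.
head -c 262144 /dev/urandom > report.bin
cp report.bin report-v2.bin
printf 'appended in version 2\n' >> report-v2.bin

# Take the signature, then move the old file to "archive media".
rdiff signature report.bin report.sig
mv report.bin archive/

# The old file is no longer here, yet the delta can still be computed
# from the signature and the new file alone.
rdiff delta report.sig report-v2.bin report.delta

# Later, wherever the archived copy lives, the delta rebuilds version 2.
rdiff patch archive/report.bin report.delta restored.bin
cmp restored.bin report-v2.bin && echo "restored.bin matches report-v2.bin"
```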
Symbolically
signature(basis-file) -> sig-file
delta(sig-file, new-file) -> delta-file
patch(basis-file, delta-file) -> recreated-file
Use patterns
A typical application of the rsync algorithm is to transfer a file A2
from a machine A to a machine B which has a similar file A1. This can
be done as follows:
- B generates the rdiff signature of A1. Call this S1. B sends the signature to A. (The signature is usually much smaller than the file
it describes.)
- A computes the rdiff delta between S1 and A2. Call this delta D. A sends the delta to B.
- B applies the delta to recreate A2.
In cases where A1 and A2 contain runs of identical bytes, rdiff should give a significant space
saving.
source
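The A/B exchange described in the quoted "Use patterns" section can be tried on a single machine. This is a sketch under stated assumptions: two local directories (machineA and machineB, names I made up) stand in for the two hosts, and cp stands in for whatever transfer you would really use (scp, rsync, a message queue, and so on). The files S1 and D follow the naming in the quoted text.

```shell
#!/bin/sh
# Illustrative sketch only: directories simulate the two machines.
set -eu
command -v rdiff >/dev/null 2>&1 || { echo "rdiff not installed; skipping"; exit 0; }

work=$(mktemp -d)
mkdir "$work/machineA" "$work/machineB"

# B holds A1; A holds A2, a lightly modified copy of A1 (a line is prepended).
head -c 524288 /dev/urandom > "$work/machineB/A1"
{ printf 'new header line\n'; cat "$work/machineB/A1"; } > "$work/machineA/A2"

# Step 1: B generates signature S1 of A1 and "sends" it to A.
rdiff signature "$work/machineB/A1" "$work/machineB/S1"
cp "$work/machineB/S1" "$work/machineA/S1"

# Step 2: A computes delta D from S1 and A2, then "sends" D to B.
rdiff delta "$work/machineA/S1" "$work/machineA/A2" "$work/machineA/D"
cp "$work/machineA/D" "$work/machineB/D"

# Step 3: B applies D to A1 to recreate A2.
rdiff patch "$work/machineB/A1" "$work/machineB/D" "$work/machineB/A2"

cmp "$work/machineA/A2" "$work/machineB/A2" && echo "B now has an identical copy of A2"

# Bytes that crossed the "network" (S1 and D) versus the size of A2 itself.
wc -c "$work/machineA/S1" "$work/machineA/D" "$work/machineA/A2"
```

Only S1 and D are copied between the two directories, which is the point of the pattern: neither side ever needs both full files at once.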
rdiff would be valuable for future reference. Example: Let's say file1 and file2 are two similar files of 1GB each. 1) How to compute the rdiff? 2) How to save this rdiff into a patch file? 3) How to apply this patch file to file1 to recover file2?