
There is a bdiff(1) command in Solaris, which allows you to diff(1) files that are larger than your RAM (documentation).

Is there something like that in Linux? I tried googling, but I couldn't find which Ubuntu package provides bdiff.


1 Answer


bdiff appears to be available on Linux (at least as part of the Heirloom Toolchest).
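If installing bdiff is not an option, its general approach (split both files into fixed-size line segments and run diff on corresponding segments) can be roughly approximated with split and diff. The following is only a minimal sketch under that assumption: chunked_diff is a hypothetical helper name, the 3500-line default is an arbitrary choice, and the line numbers in each diff are relative to the segment, not to the whole file. As with any segment-based comparison, an inserted or deleted line shifts every later segment out of alignment, so the output is not necessarily a minimal diff.

```shell
#!/bin/sh
# chunked_diff FILE_A FILE_B [SEGMENT_LINES]
# Hypothetical helper (not part of any package): compares two large text
# files segment by segment so that diff only ever sees two segments at once.
# Exits 0 if no segment differs, 1 if at least one does, 2 on setup errors.
chunked_diff() (
    file_a=$1
    file_b=$2
    seg_lines=${3:-3500}   # arbitrary default segment size, in lines

    workdir=$(mktemp -d) || exit 2

    # Cut both files into segments named a.aaaaaa, a.aaaaab, ... and b.aaaaaa, ...
    split -l "$seg_lines" -a 6 "$file_a" "$workdir/a." || exit 2
    split -l "$seg_lines" -a 6 "$file_b" "$workdir/b." || exit 2

    status=0
    offset=1
    # Walk the union of segment suffixes so a longer file is still covered.
    for suffix in $(ls "$workdir" | sed 's/^[ab]\.//' | sort -u); do
        seg_a="$workdir/a.$suffix"
        seg_b="$workdir/b.$suffix"
        [ -f "$seg_a" ] || seg_a=/dev/null
        [ -f "$seg_b" ] || seg_b=/dev/null
        if ! out=$(diff "$seg_a" "$seg_b"); then
            printf '== segment starting at line %d ==\n%s\n' "$offset" "$out"
            status=1
        fi
        offset=$((offset + seg_lines))
    done

    rm -rf "$workdir"
    exit "$status"
)
```

For example, chunked_diff bigfileA bigfileB 100000 compares the files in 100000-line segments. Note that the segments are written to a temporary directory, so this needs free disk space roughly equal to the combined size of both files.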

diff

I would probably just use regular old diff with this switch however:

diff --speed-large-files bigfileA bigfileB

Why doesn't it work?

See the comment by @EvanTeitelman: --speed-large-files doesn't affect how files are loaded into memory.

This can be confirmed with the following command:

fallocate -l 10G testa; fallocate -l 10G testb && \
        diff --speed-large-files -a testa testb

bsdiff

This is hard to confirm, but I found a tool called bsdiff, which appears to derive from bdiff. I've confirmed that it is packaged for Ubuntu: simply run apt-get install bsdiff.

Why might it work?

Again, thanks to @EvanTeitelman in the comments: bsdiff is a diff tool for binary files. It can deal with large files, though it's unclear just how large. See the following links to a thread that discusses its use.

rdiff

I think you could also use rdiff for this. rdiff is able to deal with very large files.

  1. Create a signature of one file:

    rdiff signature A sigs.txt
    
  2. Use generated signature file sigs.txt and the other big file B to create the delta:

    rdiff delta sigs.txt B deltaAB.txt
    
  3. The delta file deltaAB.txt contains all the information needed to recreate file B from A alone.

    To recreate B, run:

    rdiff patch A deltaAB.txt B
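The three steps above can be sketched as a single round-trip check. This is only an illustration, assuming rdiff (from librsync) is installed: rdiff_roundtrip is a hypothetical helper name, and the intermediate file names are made up. It writes everything into a fresh temporary directory, since some rdiff versions refuse to overwrite existing output files, and it rebuilds B under a different name so the original B is never touched.

```shell
#!/bin/sh
# rdiff_roundtrip OLD NEW
# Hypothetical helper: builds a delta from OLD to NEW with rdiff, applies it
# to OLD, and checks that the result is byte-identical to NEW.
# Exits 0 if the rebuilt file matches NEW, 1 otherwise.
rdiff_roundtrip() (
    old=$1
    new=$2

    workdir=$(mktemp -d) || exit 2

    if rdiff signature "$old" "$workdir/old.sig" &&
       rdiff delta "$workdir/old.sig" "$new" "$workdir/old-to-new.delta" &&
       rdiff patch "$old" "$workdir/old-to-new.delta" "$workdir/new.rebuilt" &&
       cmp -s "$new" "$workdir/new.rebuilt"
    then
        status=0
    else
        status=1
    fi

    rm -rf "$workdir"
    exit "$status"
)
```

Calling rdiff_roundtrip A B confirms only that the delta reproduces B. The delta itself is a binary format, so it is not something you can read the way you read diff output.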
    

Why does it work?

I found this blog post titled: A Better diff Or What To Do When GNU diff Runs Out Of Memory ("diff: memory exhausted"), which reports that an rdiff of 4.5GB files consumed only ~66MB of RAM.

lfhex

lfhex is an application for viewing and editing files in hex, octal, binary, or ASCII text. Its main strength is its ability to work with files much larger than system memory. It's a GUI tool, however.

screenshot

[screenshot of lfhex]


7
  • Although this isn't always documented, GNU diff allows you to use -H as a synonym for --speed-large-files.
    – user26112
    Commented May 27, 2013 at 12:10
  • 2
    bsdiff is a binary diff tool, not a large-file diff tool.
    – user26112
    Commented May 27, 2013 at 12:17
  • 3
    Unfortunately, it seems as though the --speed-large-files flag doesn't affect the manner in which GNU diff loads files into memory. Try running fallocate -l 10G testa; fallocate -l 10G testb && diff --speed-large-files -a testa testb to confirm this. (Or take a look at the source code.)
    – user26112
    Commented May 27, 2013 at 12:25
  • 2
    I managed to build bdiff from the Heirloom Toolchest after replacing /sbin/sh by /bin/sh in the makefiles. Now when I try to execute it in place, I get bdiff: Can not execute '/usr/5bin/diff'. Sorry, I do not want to install anything to /usr/5bin/. This is not a viable solution. The other options mentioned here do not work for me because I want to eyeball the differences as text. Commented Feb 2, 2015 at 17:57
  • 1
    PS: Surprisingly, lfhex -c file1 file2 works well for me after setting View –> Editing base –> ASCII for both panes. Commented Feb 2, 2015 at 18:13
