
I have two large (2-3MB) text files that differ by only a few hundred characters. Normally diff would work fine for finding the differences. But although they are text files, there are NO newlines at all in them, which makes "diff" useless. Each file is one gigantic line, so even text editors hate it.

The files use : (colons) very frequently, so maybe if I could insert a newline after each : character, diff might produce something meaningful. (Inserting a newline every N characters isn't going to work because every line after the first differing character would be different, so I think the split needs to be based on some common pattern or character.)

How do I use sed to do this?

  • (1) "so maybe if I could insert a newline after each : character" – How about a newline instead of each : character? tr can easily do this. Can you take it from here? (2) "The entire file is one gigantic line" – Is this line properly terminated by a newline character? "NO newlines at all" suggests it's not properly terminated. So it's an incomplete line, right?
    Commented Aug 14, 2022 at 21:09
  • Are both files the same size and do you just wish to know what is different? If they are the same size, you could use od and then compare the difference of the od output.
    – cup
    Commented Aug 14, 2022 at 21:45
  • Hmm, not familiar with tr but replacing the : chars would probably work. Good point, I expect it has a newline at the end. man page for tr is kinda vague.
    – Manius
    Commented Aug 14, 2022 at 21:49
  • They're different size files, a few thousand characters different, and I would guess mostly additions rather than changes.
    – Manius
    Commented Aug 14, 2022 at 21:50
  • Try WinMerge 64-bit. You need a Windows machine for this.
    – anon
    Commented Aug 14, 2022 at 21:56
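
The tr suggestion from the comments could look roughly like the sketch below. It replaces every : with a newline and then runs diff on the results. The file names old.txt and new.txt, their tiny contents, and the helper name split_on_colon are all made up for illustration; the real files would take their place.

```shell
# split_on_colon is a hypothetical helper: it turns every ':' into a newline
split_on_colon() {
    tr ':' '\n' < "$1"
}

# Scratch directory so the demo does not clutter the current one
workdir=$(mktemp -d)

# Tiny stand-ins for the real single-line files (hypothetical contents)
printf 'a:b:c:d\n' > "$workdir/old.txt"
printf 'a:b:X:c:d\n' > "$workdir/new.txt"

split_on_colon "$workdir/old.txt" > "$workdir/old.split"
split_on_colon "$workdir/new.txt" > "$workdir/new.split"

# diff exits with status 1 when the files differ, so guard it with || true
diff "$workdir/old.split" "$workdir/new.split" || true
```

If the colons should stay in the output, GNU sed accepts `sed 's/:/:\n/g'`, which inserts the newline after each colon instead of replacing it. Other sed implementations may not treat `\n` in the replacement as a newline, so that variant is an assumption about the sed in use.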
