I have two log files generated from decoded binary data. The decoders are slightly different, and I am trying to isolate the differences in their output. To do this, I am diffing the two log files, which works pretty well except that the time stamps differ on every line. The differences in the time stamps are not relevant here, so I want diff to ignore them.

Because the log files follow a specific format, I can simply exclude the last ~40 characters of each line to ignore the time stamps. For example:

Line A:

[T9] | ENTRY NAME                       varA             = 0000012B  varB             = 00000000 | 000015.508.107.113s | file.cpp              :738

Line B:

[T9] | ENTRY NAME                       varA             = 0000012B  varB             = 00000000 | 000015.508.107.163s | file.cpp              :738

These lines should be treated as identical in my case.

How can I tell diff to only include the first n characters from each line, or exclude the last m characters from each line?

1 Answer

In bash, you can use process substitution.

To remove last 40 characters, you can use

diff <(sed 's/.\{40\}$//' file1) \
     <(sed 's/.\{40\}$//' file2)
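If the trailing fields do not have the same width on every line, a fixed character count can miss part of the time stamp. As a variation on the command above (not part of the original suggestion), the time stamp field can be matched by pattern and replaced with a placeholder instead. This is only a sketch: the sample log lines, the `mask` helper, and the `<time>` placeholder are made up for illustration, and the pattern assumes the field always looks like `| digits-and-dots followed by s |`.

```shell
#!/usr/bin/env bash
# Hypothetical sample logs: the two lines differ only in the time stamp field.
tmp=$(mktemp -d)
printf '%s\n' '[T9] | ENTRY NAME varA = 0000012B | 000015.508.107.113s | file.cpp :738' > "$tmp/a.log"
printf '%s\n' '[T9] | ENTRY NAME varA = 0000012B | 000015.508.107.163s | file.cpp :738' > "$tmp/b.log"

# mask (illustrative helper): replace the "| <digits and dots>s |" field
# with a fixed placeholder so that diff sees the same text in both files.
mask() { sed 's/| [0-9.][0-9.]*s |/| <time> |/' "$1"; }

# Plain diff reports a difference because of the time stamps.
if diff -q "$tmp/a.log" "$tmp/b.log" > /dev/null; then raw="identical"; else raw="different"; fi

# Diff of the masked streams ignores the time stamps.
if diff <(mask "$tmp/a.log") <(mask "$tmp/b.log") > /dev/null; then
    masked="identical"
else
    masked="different"
fi

echo "raw: $raw, masked: $masked"
rm -rf "$tmp"
```

Matching by pattern keeps the file name and line number in the comparison, which trimming a fixed-width suffix would throw away.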

To select the first 40 characters, you can use

cut -c1-40 file
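The `cut` form plugs into `diff` through process substitution in the same way as the `sed` form. The sketch below uses made-up sample logs and a hypothetical prefix width `n=40`; the right width depends on the actual log format. Line 1 differs only in the time stamp, while line 2 also differs inside the compared prefix.

```shell
#!/usr/bin/env bash
# Hypothetical sample logs for illustration only.
tmp=$(mktemp -d)
printf '%s\n' \
    '[T9] | ENTRY ONE   varA = 0000012B varB = 00000000 | 000015.508.107.113s | file.cpp :738' \
    '[T9] | ENTRY TWO   varA = 0000012C varB = 00000000 | 000015.508.110.002s | file.cpp :740' \
    > "$tmp/fileA.log"
printf '%s\n' \
    '[T9] | ENTRY ONE   varA = 0000012B varB = 00000000 | 000015.508.107.163s | file.cpp :738' \
    '[T9] | ENTRY TWO   varA = 0000012D varB = 00000000 | 000015.508.110.052s | file.cpp :740' \
    > "$tmp/fileB.log"

n=40   # assumed prefix width; adjust to where the time stamp starts

# Compare only the first n characters of each line.
# diff exits with 1 when differences are found, so capture the status.
status=0
out=$(diff <(cut -c "1-$n" "$tmp/fileA.log") <(cut -c "1-$n" "$tmp/fileB.log")) || status=$?

printf '%s\n' "$out"
rm -rf "$tmp"
```

Only the second line should be reported, since the first lines are identical within the first 40 characters.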
Comment:

I combined both parts to do the following: diff <(cut -c -97 fileA.txt) <(cut -c -97 fileB.txt) > log.patch
– TheBat, Aug 18, 2016 at 16:00
