Why does diff3 on inputs from process substitution find a difference, while diff3 on the same data in files finds none?

Question

diff3 on files finds no difference:

$ grep -P '\[\[.*?\]\]' -o intro.tex | sort > A.txt
$ grep -P '\[\[.*?\]\]' -o intro.tex | sort | uniq > B.txt
$ grep '\\pnum %% \[\[' intro.tex | sed 's/\\pnum %% //' | sort > C.txt
$ diff3 A.txt B.txt C.txt | wc -l
0

diff3 on process substitutions running the same commands finds a difference:

$ diff3 \
  <(grep -P '\[\[.*?\]\]' -o intro.tex | sort) \
  <(grep -P '\[\[.*?\]\]' -o intro.tex | sort | uniq) \
  <(grep '\\pnum %% \[\[' intro.tex | sed 's/\\pnum %% //' | sort) | wc -l
95

Why? Any ideas?

A minimal reproducer:

$ echo test > a
$ diff3 a a a
$ diff3 <(cat a) <(cat a) <(cat a)
====1
1:1c
  test
2:0a
3:0a

What do your eyes tell you? Is there a difference?
— White Owl, Commented May 12, 2023 at 11:36

Stéphane Chazelas · Accepted Answer · 2023-05-12 14:17:08Z

If we run diff3 under strace -f:

$ strace -qqfe execve -e signal=none diff3 a b c
execve("/usr/bin/diff3", ["diff3", "a", "b", "c"], 0x7ffef2bb2d78 /* 53 vars */) = 0
[pid 13360] execve("/usr/bin/diff", ["diff", "--horizon-lines=100", "--", "b", "c"], 0x7ffc019c6d50 /* 53 vars */) = 0
[pid 13361] execve("/usr/bin/diff", ["diff", "--horizon-lines=100", "--", "a", "c"], 0x7ffc019c6d50 /* 53 vars */) = 0

As you can see, diff3 calls diff twice, and the third file (c here) is one of the operands of both invocations.

With ksh-style <(...) process substitution, the file is a pipe.

cmd1 <(cmd2)

is the same as:

cmd2 | cmd1 /dev/fd/0

except that a fd other than 0 is used.
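
To see what the command actually receives, here is a minimal bash sketch that prints the argument and tests whether it is a pipe. The function name show_arg is made up for illustration, and the exact path (often something like /dev/fd/63 on Linux) depends on the system.

```shell
#!/usr/bin/env bash
# show_arg is a hypothetical helper: it reports the file name it was given
# and whether that name refers to a pipe.
show_arg() {
  printf 'argument: %s\n' "$1"
  if [ -p "$1" ]; then
    echo "is a pipe"
  else
    echo "not a pipe"
  fi
}

# The shell replaces <(...) with a path that the function receives as $1.
show_arg <(echo hello)
```

The command never sees the text "echo hello", only a file name whose contents can be read as they are produced.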

So, for the third argument to diff3, the first diff consumes the whole input and the second has nothing left to read, so to the second diff the file appears empty.
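
The same effect can be reproduced without diff3. The sketch below assumes bash on a system such as Linux where <(...) is backed by a pipe. The helper read_twice is hypothetical: it simply reads the same file name twice in a row, much as the two diff processes do with the third argument.

```shell
#!/usr/bin/env bash
# read_twice is a hypothetical helper: it reads the same path twice,
# the way diff3's two diff children each read the third file.
read_twice() {
  first=$(cat "$1")
  second=$(cat "$1")
  printf 'first=[%s] second=[%s]\n' "$first" "$second"
}

tmp=$(mktemp) || exit 1
echo test > "$tmp"

read_twice "$tmp"           # regular file: both reads can see the data
read_twice <(cat "$tmp")    # pipe: the first read drains it

rm -f "$tmp"
```

With a regular file both reads should return test. With the process substitution, the second read should come back empty, which is what the second diff sees.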

So the third argument at least cannot be a pipe¹.

If using zsh, you can use the =(...) form of process substitution that uses a temporary file instead of a pipe:

diff3 <(cmd1) <(cmd2) =(cmd3)

(the first 2 can still be pipes).

In your case:

$ diff3 \
  <(grep -Po '\[\[.*?\]\]' intro.tex | sort) \
  <(grep -Po '\[\[.*?\]\]' intro.tex | sort -u) \
  =(sed -n 's/\\pnum %% \[\[/[[/p' intro.tex | sort)

(I've also removed the superfluous grep and uniq and moved the -o option before the non-options).

You could even do some factorisation with:

(){ diff3 $1 <(uniq<$1) $2; } =(grep -Po '\[\[.*?\]\]' intro.tex | sort) \
                              =(sed -n 's/\\pnum %% \[\[/[[/p' intro.tex | sort)

¹ If the third argument is -, which diff3 interprets as reading stdin, then it's the second argument that cannot be a pipe, as that's the one that it passes to both diff invocations in that case. You'll notice that the manual says: "At most one of these three file names may be -, which tells diff3 to read the standard input for that file."

Re: "In your case": in my case (bash) diff3 <(cmd1) <(cmd2) =(cmd3) leads to -bash: syntax error near unexpected token `('. Any ideas / solutions for bash? — pmor, Commented May 15, 2023 at 9:43
@pmor, yes, as I said, that's for zsh. In bash, you'll have to create the temp files by hand with mktemp or equivalent as bash has no builtin support for creating temp files (older versions used temp files for << and <<< which you'd have been able to use on Linux/Cygwin, but that has been replaced with pipes in newer versions). — Stéphane Chazelas, Commented May 15, 2023 at 10:22
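
For bash, a minimal sketch of the mktemp approach mentioned above could look like this. It uses the minimal reproducer from the question rather than the real pipelines. The names workdir and third are chosen only for illustration, and the cat into the temp file stands in for whatever cmd3 is.

```shell
#!/usr/bin/env bash
# Sketch of a bash workaround: give diff3 a regular file as its third
# argument, since that file name is handed to two separate diff processes.
workdir=$(mktemp -d) || exit 1
trap 'rm -rf "$workdir"' EXIT        # clean up the temporary directory on exit

echo test > "$workdir/a"

# Write the third input to a regular file first; in the real case this
# would be the output of the third pipeline (cmd3).
cat "$workdir/a" > "$workdir/third"

# The first two inputs can stay as process substitutions (pipes).
out=$(diff3 <(cat "$workdir/a") <(cat "$workdir/a") "$workdir/third")
status=$?

printf 'exit status: %s\n' "$status"
printf 'output: [%s]\n' "$out"
```

Since only the third file is read twice, only that one needs to be a regular file. Both the output and the exit status should then match what diff3 a a a gives on real files.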
