I'm currently writing a bash script in which I use a lot of named pipes. I assumed this would add some overhead compared to using pipes directly, and I'm okay with that; I just wanted some numbers to find out how much overhead I'm dealing with. I therefore ran these two commands 50 times each and recorded the times so I could average them:
time seq 1000000 | sort | head;
time seq 1000000 | cat >a | cat a | sort | head; #a was created with mkfifo
This is not the actual way I'll be using named pipes.
To write down the times, I used this command:
for i in `seq 50`; do { time seq 1000000 | sort | head; } 2>&1 | grep real | cut -c8-12 >> normal_pipe; done
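For reference, here is a minimal sketch of how the average and standard deviation could be computed from a file of recorded times such as normal_pipe. The function name stats is a hypothetical helper, and it assumes one number per line and uses the sample (n - 1) standard deviation, which may not be exactly how the figures below were obtained:

```shell
# stats FILE: print the mean and sample standard deviation of the numbers
# in FILE (one per line). "stats" is a hypothetical helper name.
stats() {
    awk '{ sum += $1; sumsq += $1 * $1; n++ }
         END {
             if (n < 2) { print "need at least 2 samples"; exit 1 }
             mean = sum / n
             var = (sumsq - n * mean * mean) / (n - 1)
             if (var < 0) var = 0   # guard against floating-point rounding
             printf "mean=%.4f stddev=%.4f\n", mean, sqrt(var)
         }' "$1"
}
```

It would be called as stats normal_pipe once the loop above has finished.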
To my astonishment, I discovered these results:
Normal pipe:
Average: 1.712 sec
stddev: 0.0157 sec
Named pipe:
Average: 1.644 sec
stddev: 0.0339 sec
My questions are now:
- Why is the named pipe faster?
- Is my benchmarking setup flawed?
- Or is the difference probably just due to other processes running in the background?
I'm guessing that, since sort can only start working once it has all the input (right?), this is about how quickly the pipe spits out the EOF...
seq 100 | cat > a | cat a | sort
can be written as
seq 100 > a & sort a
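As a rough sketch of that equivalence, the following creates a FIFO in a temporary directory and checks that the named-pipe form produces the same output as the plain pipeline. The variable names (tmpdir, fifo, named, anon) are illustrative only, and the FIFO lives in a temp directory rather than in a file called a so the example cleans up after itself:

```shell
# Compare an anonymous pipe with the equivalent named-pipe (FIFO) form.
tmpdir=$(mktemp -d)
fifo="$tmpdir/a"
mkfifo "$fifo"

# The writer runs in the background; opening the FIFO for writing
# blocks until a reader (sort, below) opens it for reading.
seq 100 > "$fifo" &
named=$(sort "$fifo" | head)
wait    # reap the background writer

# The same work through an ordinary anonymous pipe.
anon=$(seq 100 | sort | head)

rm -r "$tmpdir"

if [ "$named" = "$anon" ]; then
    echo "outputs match"
else
    echo "outputs differ"
fi
```

This only confirms that both forms yield the same data; it says nothing about their relative speed.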