
I'm trying to pipe extremely high-speed data from one application to another on 64-bit CentOS 6. I ran the following benchmarks with dd and found that it is the pipes holding me back, not the algorithm in my program. My goal is to achieve somewhere around 1.5 GB/s.

First, without pipes:

dd if=/dev/zero of=/dev/null bs=8M count=1000
1000+0 records in
1000+0 records out
8388608000 bytes (8.4 GB) copied, 0.41925 s, 20.0 GB/s

Next, a pipe between two dd processes:

dd if=/dev/zero bs=8M count=1000 | dd of=/dev/null bs=8M
1000+0 records in
1000+0 records out
8388608000 bytes (8.4 GB) copied, 9.39205 s, 893 MB/s

Are there any tweaks I can make to the kernel, or anything else, that will improve the performance of running data through a pipe? I have tried named pipes as well and gotten similar results.


2 Answers


Have you tried with smaller blocks?

When I try this on my own workstation, I see a steady improvement as I lower the block size. It is only in the realm of 10% in my test, but it is still an improvement; you are looking for 100%.

As it turns out, on further testing, really small block sizes seem to do the trick:

I tried

dd if=/dev/zero bs=32k count=256000 | dd of=/dev/null bs=32k
256000+0 records in
256000+0 records out
256000+0 records in
256000+0 records out
8388608000 bytes (8.4 GB) copied, 1.68052 s, 5.0 GB/s
8388608000 bytes (8.4 GB) copied, 1.67965 s, 5.0 GB/s

And with your original

dd if=/dev/zero bs=8M count=1000 | dd of=/dev/null bs=8M
1000+0 records in
1000+0 records out
1000+0 records in
1000+0 records out
8388608000 bytes (8.4 GB) copied, 6.25782 s, 1.3 GB/s
8388608000 bytes (8.4 GB) copied, 6.25203 s, 1.3 GB/s

5.0 / 1.3 ≈ 3.8, so that is a sizable factor.
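If you want to try the same experiment from inside a program rather than with dd, here is a minimal C sketch (my own, not from the question or either answer; the default 32 KiB block and the roughly 8 GiB total are just illustrative values) that times zero-filled blocks pushed through a pipe to a child process that discards them:

/* Sketch: time pushing zero-filled blocks of a chosen size through a pipe.
 * Usage: ./a.out [block_size_in_bytes]  (values here are only illustrative) */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>
#include <sys/time.h>

int main(int argc, char **argv)
{
    size_t block = argc > 1 ? (size_t)atol(argv[1]) : 32 * 1024; /* bytes per write */
    size_t total = (size_t)8 * 1024 * 1024 * 1024;               /* ~8 GiB, like the dd runs */
    int fds[2];

    if (pipe(fds) < 0) { perror("pipe"); return 1; }

    pid_t pid = fork();
    if (pid == 0) {                        /* child: the reader, discards everything */
        char *rbuf = malloc(block);
        close(fds[1]);
        while (read(fds[0], rbuf, block) > 0)
            ;                              /* throw the data away, like dd of=/dev/null */
        _exit(0);
    }

    char *buf = calloc(1, block);          /* zero-filled source, like /dev/zero */
    close(fds[0]);

    struct timeval t0, t1;
    gettimeofday(&t0, NULL);
    for (size_t done = 0; done < total; ) {
        ssize_t n = write(fds[1], buf, block);
        if (n < 0) { perror("write"); break; }
        done += (size_t)n;                 /* a pipe write larger than PIPE_BUF may be partial */
    }
    close(fds[1]);                         /* EOF for the reader */
    wait(NULL);
    gettimeofday(&t1, NULL);

    double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_usec - t0.tv_usec) / 1e6;
    printf("block %zu: %.2f GB/s\n", block, total / secs / 1e9);
    return 0;
}

Running it with different block-size arguments (for example 32768 versus 8388608) should show the same block-size effect as the dd runs above.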

  • Thanks for figuring this out! I did some additional follow-on tests and found that it's really only the write speed that matters.
    – KyleL
    Commented Sep 27, 2012 at 23:56
  • IMO, the title doesn't match the question text nor the answer. I want to learn the answer to the actual question, myself. :D
    Commented May 6, 2017 at 23:48

It seems that Linux pipes only yield up 4096 bytes at a time to the reader, regardless of how large the writer's writes were.

So trying to stuff more than 4096 bytes into an already-full pipe in a single write(2) system call will just cause the writer to stall until the reader has invoked the multiple reads needed to pull that much data out of the pipe and done whatever processing it has in mind to do.
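If you want to see what a single read actually hands back on your kernel, a small sketch like the following (my own illustration, not part of the answer; it assumes the default pipe capacity is at least 64 KiB, so the one large write does not block) prints how many bytes one read(2) returns after one large write(2):

/* Sketch: one large write into an empty pipe, then one large read,
 * to observe how many bytes a single read(2) actually returns. */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    enum { CHUNK = 64 * 1024 };            /* fits the usual default pipe capacity */
    int fds[2];
    char *buf = calloc(1, CHUNK);

    if (pipe(fds) < 0) { perror("pipe"); return 1; }

    ssize_t wrote = write(fds[1], buf, CHUNK);   /* one large write */
    ssize_t got   = read(fds[0], buf, CHUNK);    /* one large read  */

    printf("wrote %zd bytes in one write(), got %zd bytes from one read()\n",
           wrote, got);
    return 0;
}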

This tells me that on multi-core or multi-threaded CPUs (does anyone still make a single-core, single-thread CPU?), one can get more parallelism, and hence shorter elapsed wall-clock times, by having each writer in a pipeline write only 4096 bytes at a time before going back to whatever data processing or production it can do towards making the next 4096-byte block.
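A sketch of that pattern might look like the following (the function name is mine, not from the answer): a helper that never hands the pipe more than PIPE_BUF bytes per write(2), leaving room between chunks for the producer to prepare the next block:

/* Sketch: write len bytes to fd, at most PIPE_BUF bytes per write(2) call,
 * following the chunking pattern described above.
 * Returns 0 on success, -1 on write error. */
#include <limits.h>     /* PIPE_BUF (4096 on Linux) */
#include <unistd.h>

static int write_in_pipe_buf_chunks(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        size_t chunk = len < PIPE_BUF ? len : PIPE_BUF;
        ssize_t n = write(fd, buf, chunk);
        if (n < 0)
            return -1;
        buf += n;
        len -= (size_t)n;
        /* ...between chunks the producer could do other work towards the next block... */
    }
    return 0;
}

Whether this actually wins anything depends on how much useful work the writer can overlap with the reader draining the pipe.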

  • 4096 bytes, or 4 KB, is the default page size on most systems; that's probably the reason for this cap. Maybe if you can increase the page size on your system, it will make the pipe read and write more data.
    – The Fool
    Commented May 23, 2021 at 23:05
  • @TheFool no, PIPE_BUF is a constant, and on Linux it is 4096; POSIX only mandates that it be at least 512 bytes. There is no way to increase it other than by patching the kernel, and you would probably have a hard time, because that is the size up to which writes are atomic; the synchronization is probably the bottleneck. (A quick way to check these values is sketched below these comments.)
    – user11877195
    Commented Mar 11, 2022 at 16:12
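For reference, a quick way (my own sketch, not from the comments above) to print both the compile-time PIPE_BUF constant and the value the kernel reports for an actual pipe descriptor:

/* Sketch: print the compile-time PIPE_BUF and the per-pipe value
 * reported by fpathconf(). */
#include <limits.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int fds[2];
    if (pipe(fds) < 0) { perror("pipe"); return 1; }

    printf("PIPE_BUF (compile time)  : %d\n", PIPE_BUF);
    printf("_PC_PIPE_BUF (this pipe) : %ld\n", fpathconf(fds[0], _PC_PIPE_BUF));
    return 0;
}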
