I am comparing the speed of writing to a file with the speed of writing to a pipe. Please look at this code, which writes to a file when run with no command-line arguments and writes to a pipe otherwise:
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <iostream>
#include <chrono>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

using namespace std;

// Writes "Hello world!" to fd 50000 times and prints the elapsed time.
void do_write(int fd)
{
    const char* data = "Hello world!";
    int to_write = strlen(data), total_written = 0;
    int x = 0;
    auto start = chrono::high_resolution_clock::now();
    while (x < 50000)
    {
        int written = 0;
        while (written != to_write)
        {
            // write() may write fewer bytes than requested, or return -1 on error
            ssize_t n = write(fd, data + written, to_write - written);
            if (n == -1)
            {
                perror("write");
                exit(EXIT_FAILURE);
            }
            written += n;
        }
        total_written += written;
        ++x;
    }
    auto end = chrono::high_resolution_clock::now();
    auto diff = end - start;
    cout << "Total bytes written: " << total_written << " in "
         << chrono::duration<double, milli>(diff).count()
         << " milliseconds, " << endl;
}

int main(int argc, char *argv[])
{
    //
    // Write to file if we have not specified any extra argument
    //
    if (argc == 1)
    {
        int fd = open("test.txt", O_WRONLY | O_TRUNC | O_CREAT, 0644);
        if (fd == -1) return -1;
        do_write(fd);
        close(fd);
        return 0;
    }

    //
    // Otherwise, write to pipe
    //
    int the_pipe[2];
    if (pipe(the_pipe) == -1) return -1;

    pid_t child = fork();
    switch (child)
    {
        case -1:
        {
            return -1;
        }
        case 0:
        {
            // Child: read from the pipe until EOF (or a read error)
            char buf[128];
            int bytes_read = 0, total_read = 0;
            close(the_pipe[1]);
            while (true)
            {
                if ((bytes_read = read(the_pipe[0], buf, sizeof(buf))) <= 0)
                    break;
                total_read += bytes_read;
            }
            cout << "Child: Total bytes read: " << total_read << endl;
            break;
        }
        default:
        {
            // Parent: write to the pipe; the write end is closed on exit
            close(the_pipe[0]);
            do_write(the_pipe[1]);
            break;
        }
    }
    return 0;
}
Here is my output:
$ time ./LinuxFlushTest pipe
Total bytes written: 600000 in 59.6544 milliseconds,
real 0m0.064s
user 0m0.020s
sys 0m0.040s
Child: Total bytes read: 600000
$ time ./LinuxFlushTest
Total bytes written: 600000 in 154.367 milliseconds,
real 0m0.159s
user 0m0.028s
sys 0m0.132s
Both the time output and my own C++ timing show that writing to the pipe is about 2.6 times faster than writing to the file (roughly 60 ms versus 154 ms).
Now, as far as I know, when we call write() the data is copied into a kernel buffer (the page cache), and a pdflush-style thread later flushes it from the page cache to the underlying file. I am not forcing this flush in my code, so there should be no disk seek delay.
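To make that assumption concrete, here is a minimal sketch (not part of the test program above) that times the same small writes with and without an fsync() after each one. The helper name timed_writes and its parameters are hypothetical, chosen only for illustration. It is meant as a way to check whether the unsynced path really avoids waiting for the storage device, not as a definitive benchmark.

```cpp
#include <chrono>
#include <cstring>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper (not part of the test program above).
// Writes `count` copies of a short message to `path` and returns the elapsed
// time in milliseconds, or -1.0 on error. If sync_each is true, fsync() is
// called after every write to force the data to the storage device.
double timed_writes(const char* path, int count, bool sync_each)
{
    const char* data = "Hello world!";
    const size_t len = strlen(data);

    int fd = open(path, O_WRONLY | O_TRUNC | O_CREAT, 0644);
    if (fd == -1) return -1.0;

    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < count; ++i)
    {
        size_t written = 0;
        while (written < len)
        {
            // write() may be partial, so loop until the whole message is out
            ssize_t n = write(fd, data + written, len - written);
            if (n == -1) { close(fd); return -1.0; }
            written += static_cast<size_t>(n);
        }
        if (sync_each && fsync(fd) == -1) { close(fd); return -1.0; }
    }
    auto end = std::chrono::steady_clock::now();

    close(fd);
    return std::chrono::duration<double, std::milli>(end - start).count();
}
```

Calling timed_writes("test.txt", 50000, false) corresponds to the file case in my program. Passing true for the last argument (with a much smaller count) shows what the cost looks like when every write has to reach the device.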
What I don't know (and can't seem to find out; yes, I've looked at the kernel code, but I get lost in it, so no comments like "look at the code" please) is what happens differently when writing to a pipe. Is a pipe not just a block of memory in the kernel that the child can read from? If so, why is it so much faster than the seemingly identical process of writing to a file?
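The mental model I have of a pipe can be sketched as follows: a short message is written into a pipe before anything reads from it, and is then read back in the same process. The helper name pipe_round_trip is hypothetical, and the sketch assumes the message is small enough to fit in the pipe's buffer, since otherwise write() would block with no reader running.

```cpp
#include <string>
#include <unistd.h>

// Hypothetical helper: writes msg into a fresh pipe and reads it back in the
// same process. Returns the bytes read, or an empty string on error.
// Assumes msg is short enough to fit in the pipe's kernel buffer.
std::string pipe_round_trip(const std::string& msg)
{
    int fds[2];
    if (pipe(fds) == -1) return "";

    // No reader is active yet; write() returns once the bytes have been
    // copied into the pipe's buffer inside the kernel.
    ssize_t n = write(fds[1], msg.data(), msg.size());
    close(fds[1]);  // closing the write end lets read() report EOF afterwards
    if (n != static_cast<ssize_t>(msg.size()))
    {
        close(fds[0]);
        return "";
    }

    std::string out;
    char buf[128];
    ssize_t r;
    while ((r = read(fds[0], buf, sizeof(buf))) > 0)
        out.append(buf, static_cast<size_t>(r));
    close(fds[0]);

    return r == 0 ? out : "";
}
```

If this model is right, pipe_round_trip("Hello world!") hands back the same 12 bytes without any file or disk being involved, which is why I expected the two cases to cost about the same.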