1

I've been trying to write file data as fast as possible.

  • I've increased the buffer size to reduce I/O operations.
  • I've tested with both fstream and fopen.

For some reason fstream is faster than fopen.

  • with a 64-byte buffer it's ~1.3 times faster
  • with an 8192-byte buffer it's ~4.8 times faster

I keep hearing that C's file I/O is faster, which would make sense since <fstream> includes <stdio.h>, yet I can't get fopen to perform as fast.

NOTE (old questions):

  • my fopen was 2 times slower than fstream because I used fprintf (thanks jamesdlin)
  • the fstream buffer wasn't changing, since you have to set it before opening the file (thanks Paul Sanders)

I also realized that fstream.put(char) is faster than fstream << char
(with <<, fopen is faster than fstream if the buffer is smaller than ~256 bytes).

Here's my testing:

#include <cstdio>   // fopen, setvbuf, fputc, fclose
#include <ctime>
#include <fstream>
#include <iostream>
#include <vector>

int filesize; // total bytes (individually "put" in buffered stream)
int buffsize; // buffer size

void writeCPP(){
    std::ofstream file;
    std::vector<char> buffer(buffsize);                                 // runtime-sized array (VLAs are not standard C++)
    file.rdbuf()->pubsetbuf(buffer.data(),buffsize);                    // set buffer (before opening)
    file.open("test.txt",std::ios::binary);                             // open file
    for(int i=0; i<filesize; i++) file.put('a');                        // write bytes
    file.close();                                                       // close (buffer is still alive here)
}

void writeC(){
    FILE* file=fopen("test.txt","wb");                                  // open file
    if(!file) return;                                                   // nothing to do if the open failed
    std::vector<char> buffer(buffsize);                                 // must stay alive until fclose
    setvbuf(file,buffer.data(),_IOFBF,buffsize);                        // set buffer (before any I/O)
    for(int i=0; i<filesize; i++) fputc('a',file);                      // write bytes
    fclose(file);                                                       // close
}

#define getTime() double(clock())/CLOCKS_PER_SEC // good enough

double start;

void test(int s){ // C++ vs C (same filesize / buffsize)
    buffsize=s;
    std::cout<<"  buffer: "<<buffsize<<"\t"<<std::flush;

    start=getTime();
    writeCPP();
    std::cout<<"  C++: "<<getTime()-start<<",\t"<<std::flush;

    start=getTime();
    writeC();
    std::cout<<" C: "<<getTime()-start<<std::endl;
}

#define MB (1024*1024)

int main(){
    filesize=10*MB;
    std::cout<<"size: 10 MB"<<std::endl;

    // C++ fstream faster
    test(64);   // C++ 0.86 < C 1.11 (1.29x faster)
    test(128);  // C++ 0.44 < C 0.79 (1.80x faster) (+0.51x)
    test(256);  // C++ 0.27 < C 0.63 (2.33x faster) (+0.53x)
    test(512);  // C++ 0.19 < C 0.56 (2.94x faster) (+0.61x)
    test(1024); // C++ 0.15 < C 0.52 (3.46x faster) (+0.52x)
    test(2048); // C++ 0.14 < C 0.51 (3.64x faster) (+0.18x)
    test(4096); // C++ 0.12 < C 0.49 (4.08x faster) (+0.44x)
    test(8192); // C++ 0.10 < C 0.48 (4.80x faster) (+0.72x)
}
15
  • 1
    I did not read how you did your measurements - You did them wrong.
    – Ted Lyngmo
    Commented Jul 5, 2019 at 22:26
  • 4
    @TedLyngmo That comment is borderline incomprehensible.
    – melpomene
    Commented Jul 5, 2019 at 22:41
  • 1
    Also, finding C++ streams to be slow is not an "attack".
    – melpomene
    Commented Jul 5, 2019 at 22:42
  • 1
    Er, isn't the question claiming that C++ streams are faster than C streams? How are C++ streams "under attack" in any sense?
    – jamesdlin
    Commented Jul 5, 2019 at 22:47
  • 2
    @jamesdlin thank you! that solved it. lol finally someone actually trying to help!
    – Puddle
    Commented Jul 5, 2019 at 22:50

3 Answers

3

In writeCPP, you have to set the buffer before opening the file, like so:

std::ofstream file;
char buffer[BUFF]; file.rdbuf()->pubsetbuf(buffer, BUFF);   // set buffer
file.open ("test.txt", std::ios::binary);                   // open file

Then you get the sort of results that you might expect (times are for writing 20MB with the buffer sizes shown):

writeCPP, 32: 2.15278
writeCPP, 128: 1.21372
writeCPP, 512: 0.857389

I also benchmarked writeC with your change from fprintf to fputc and got the following (again writing 20MB):

writeC, 32: 1.41433
writeC, 128: 0.524264
writeC, 512: 0.355097

Test program is here:

https://wandbox.org/permlink/F2H2jcrMVsc5VNFf

2
  • 1
    Interestingly enough, on my rig (MacOS, Apple LLVM version 10.0.1 (clang-1001.0.46.4), -O2), the performance of cpp still outshines c considerably (by roughly the same ratio the OP is seeing). Expected results on wandbox here, actual results on my rig (same code) here
    – WhozCraig
    Commented Jul 6, 2019 at 0:14
  • @WhozCraig Looks like libc++'s basic_filebuf uses a FILE* under the hood, and that FILE has its own buffer that you can't change. That means you'll get odd results with any buffer size that's not a multiple of FILE's default buffer size (4096 on my copy).
    Commented Jul 6, 2019 at 0:51
2

fprintf has extra overhead since it needs to scan its input string for format specifiers, so you're not quite doing an apples-to-apples comparison.

A better comparison would be to use fputs instead of fprintf, or to use fputc in the C version and file << 'a' in the iostream version.
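The difference between the formatted and unformatted calls can be sketched roughly as follows. This is only an illustration, not the asker's benchmark: the helper names (write_fprintf, write_fputc, write_stream_op, file_size) and their parameters are made up for the example. Each variant writes the same number of 'a' bytes, so that only the per-character call differs.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>

// All helper names below are hypothetical, chosen for this sketch.

// Formatted C output: fprintf must scan the format string on every call.
bool write_fprintf(const char* path, long n) {
    assert(n >= 0);
    std::FILE* f = std::fopen(path, "wb");
    if (!f) return false;
    for (long i = 0; i < n; ++i) std::fprintf(f, "a");
    return std::fclose(f) == 0;   // fclose flushes; 0 means success
}

// Unformatted C output: fputc writes one character with no parsing.
bool write_fputc(const char* path, long n) {
    assert(n >= 0);
    std::FILE* f = std::fopen(path, "wb");
    if (!f) return false;
    for (long i = 0; i < n; ++i) std::fputc('a', f);
    return std::fclose(f) == 0;
}

// iostream counterpart: operator<< with a single char.
bool write_stream_op(const char* path, long n) {
    assert(n >= 0);
    std::ofstream file(path, std::ios::binary);
    if (!file) return false;
    for (long i = 0; i < n; ++i) file << 'a';
    file.close();
    return static_cast<bool>(file);   // false if a write or the close failed
}

// Size of a file in bytes, or -1 if it cannot be opened.
long file_size(const char* path) {
    std::FILE* f = std::fopen(path, "rb");
    if (!f) return -1;
    std::fseek(f, 0, SEEK_END);
    long size = std::ftell(f);
    std::fclose(f);
    return size;
}
```

Timing write_fprintf against write_fputc isolates the cost of format parsing, while write_fputc against write_stream_op compares the two libraries on roughly equal terms.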

1
  • thanks again. i've updated the question to only be about the buffer now. (since i was still unable to get better performance by changing that)
    – Puddle
    Commented Jul 5, 2019 at 23:01
0

The only standard-defined behavior for std::basic_filebuf::setbuf is that setbuf(0, 0) sets the stream to unbuffered output, and even then I wouldn't count on it.
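As a rough sketch of that one guaranteed case, the snippet below asks for unbuffered output with pubsetbuf(nullptr, 0) before any I/O has happened. The function names write_unbuffered and bytes_on_disk are invented for illustration. Placing the call before open is an assumption that suits some implementations only; others may ignore or reject the request when no file is open yet.

```cpp
#include <cassert>
#include <fstream>

// Hypothetical helper: writes n bytes of 'a' after requesting an unbuffered stream.
bool write_unbuffered(const char* path, long n) {
    assert(n >= 0);
    std::ofstream file;
    // setbuf(0, 0) before any I/O is the standard's "unbuffered" request.
    // Assumption: the implementation honours it when called before open().
    file.rdbuf()->pubsetbuf(nullptr, 0);
    file.open(path, std::ios::binary);
    if (!file) return false;
    for (long i = 0; i < n; ++i) file.put('a');
    file.close();
    return static_cast<bool>(file);   // false if a write or the close failed
}

// Size of a file in bytes, or -1 if it cannot be opened.
long long bytes_on_disk(const char* path) {
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in) return -1;
    return static_cast<long long>(in.tellg());
}
```

If the request is honoured, each put may turn into its own write call, so expect this to be much slower than any buffered run. The data written is the same either way.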

The actual behavior of setbuf varies wildly from implementation to implementation:

  • libstdc++: setbuf only works if called before the file is opened. Beyond that, it does what you would probably expect. Every time the buffer fills you'll get one call to the underlying write syscall.
  • libc++: setbuf can be called after opening a file, but before any I/O is done. Every time the buffer fills you get a call to fwrite on an underlying FILE*. That means that the output is still buffered using the FILE's internal buffer. There's no way to access that internal FILE* to call setbuf or setvbuf on it, so you're stuck with the default buffer size (currently 4096 bytes, in glibc's implementation at least).
  • MSVCRT: basic_filebuf shares its buffers with its underlying FILE object. setbuf just passes the buffer you give it on to a call to setvbuf on an underlying FILE*. May be called at any time, but will discard any previously-buffered data.
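One way to sidestep these differences, which is a swapped-in technique and not something proposed above, is to skip setbuf and batch the bytes yourself, handing each full chunk to ofstream::write. The sketch below assumes a hypothetical helper write_chunked with made-up parameters. The library's own buffer still sits underneath, and how it treats a large write is itself implementation-specific.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <vector>

// Hypothetical helper: writes `total` bytes of 'a', collecting them in a
// caller-owned chunk and passing each full chunk to ofstream::write.
bool write_chunked(const char* path, long long total, std::size_t chunk_size) {
    assert(total >= 0);
    assert(chunk_size > 0);
    std::ofstream out(path, std::ios::binary);
    if (!out) return false;

    std::vector<char> chunk;
    chunk.reserve(chunk_size);
    for (long long i = 0; i < total; ++i) {
        chunk.push_back('a');
        if (chunk.size() == chunk_size) {   // chunk is full: hand it over
            out.write(chunk.data(), static_cast<std::streamsize>(chunk.size()));
            chunk.clear();
        }
    }
    if (!chunk.empty()) {                   // write the final partial chunk
        out.write(chunk.data(), static_cast<std::streamsize>(chunk.size()));
    }
    out.close();
    return static_cast<bool>(out);          // false if a write or the close failed
}

// Size of a file in bytes, or -1 if it cannot be opened.
long long written_size(const char* path) {
    std::ifstream in(path, std::ios::binary | std::ios::ate);
    if (!in) return -1;
    return static_cast<long long>(in.tellg());
}
```

For example, write_chunked("test.txt", 10 * 1024 * 1024, 8192) would mirror the 10 MB, 8192-byte case from the question. Whether it beats pubsetbuf on a given library is something to measure, not assume.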
