
I have two python programs (one is a subprocess) that need to communicate with each other. Currently I am doing that through stdin and stdout. However, writing to the subprocess's stdin seems painfully slow.

a.py, a program that takes an arbitrary line of input and prints the time:

from time import time, sleep
from sys import stdout, stdin
while True:
    stdin.readline()
    stdout.write('%f\n' % time())
    stdout.flush()

b.py, a program that runs a.py and times how long it takes to write to the subprocess's stdin and read from its stdout:

from time import time
from subprocess import PIPE, Popen
from threading import Thread
stdin_times = []
stdout_times = []
p = Popen(['python', 'a.py'], stdin=PIPE, stdout=PIPE)
for i in range(100000):
    t1 = time()
    p.stdin.write(b'\n')
    p.stdin.flush()
    t2 = float(p.stdout.readline().strip().decode())
    t3 = time()
    stdin_times.append(t2 - t1)
    stdout_times.append(t3 - t2)
p.kill()
print('stdin (min/ave):', min(stdin_times), sum(stdin_times) / len(stdin_times))
print('stdout (min/ave):', min(stdout_times), sum(stdout_times) / len(stdout_times))

Sample output:

stdin (min/ave): 1.69277191162e-05 0.000138891274929
stdout (min/ave): 1.78813934326e-05 2.09228754044e-05

I'm using Python 3.1.2 on Ubuntu 10.10.

Why is writing to a.py's stdin so much slower than reading from its stdout? Is there any way I can get these two programs to communicate faster?

  • How many times have you run this test to come up with those numbers? Perhaps between the time a.py and b.py were scheduled, crond may have started an updatedb task or ntpdate might have skewed your clock or...
    – sarnold
    Commented Mar 22, 2011 at 7:25
  • Updated to show averages Commented Mar 22, 2011 at 7:37
  • You don't want averages in this context, you want minimum times (since "best case" = "nothing else interfering").
    – ncoghlan
    Commented Mar 22, 2011 at 7:44
  • You also want to get all the extraneous junk out of your timing loop. Cache the method lookups, use b'\n' instead of calling encode, calculate t2 after recording t3. However, you're still going to be at the mercy of the OS scheduler. If it has decided the worker process is non-interactive, it may be optimising its scheduling for IO throughput rather than low latency.
    – ncoghlan
    Commented Mar 22, 2011 at 7:52
  • I've updated to show mins and they are comparable, but I don't think that really helps me. I'm not trying to prove the theoretical inferiority of stdin; I just need my process to be able to communicate faster (on average). Commented Mar 22, 2011 at 7:53
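Putting ncoghlan's suggestions together, the timing loop can be tightened so the measurement itself adds as little overhead as possible. The sketch below is not the original b.py: it inlines a.py via `-c` so it is self-contained, caches the bound methods outside the loop, writes raw bytes, and parses the child's timestamp only after both timestamps have been taken.

```python
import sys
from time import time
from subprocess import PIPE, Popen

# a.py from the question, inlined with -c so this sketch is self-contained.
child = ("import sys, time\n"
         "while True:\n"
         "    sys.stdin.readline()\n"
         "    sys.stdout.write('%f\\n' % time.time())\n"
         "    sys.stdout.flush()\n")
p = Popen([sys.executable, '-c', child], stdin=PIPE, stdout=PIPE)

# Cache the bound methods so attribute lookups stay out of the loop.
write, flush, readline = p.stdin.write, p.stdin.flush, p.stdout.readline
stdin_times, stdout_times = [], []
for _ in range(1000):           # fewer iterations than the question, for brevity
    t1 = time()
    write(b'\n')
    flush()
    line = readline()
    t3 = time()
    t2 = float(line.decode())   # parse only after both timestamps are taken
    stdin_times.append(t2 - t1)
    stdout_times.append(t3 - t2)
p.kill()
print('stdin  (min):', min(stdin_times))
print('stdout (min):', min(stdout_times))
```

Even with the loop tightened, the minimum times are the numbers to compare; the averages still include scheduler noise.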

2 Answers


I'd see whether you can reproduce this with buffering disabled on both input and output. I have a hunch the output is being (line) buffered by default, as it is in most languages: Perl, .NET, C++ iostreams.
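A minimal sketch of one way to test that hunch: pass bufsize=0 to Popen so the parent's ends of the pipes are unbuffered, and run the child with -u so its standard streams are unbuffered too. The echo child here is a hypothetical stand-in for a.py, just to keep the example self-contained.

```python
import sys
from subprocess import PIPE, Popen

# Minimal echo child (hypothetical stand-in for a.py from the question).
child = ("import sys\n"
         "for line in sys.stdin:\n"
         "    sys.stdout.write(line)\n"
         "    sys.stdout.flush()\n")

# bufsize=0 makes the parent's pipe ends unbuffered;
# -u asks the child interpreter for unbuffered standard streams.
p = Popen([sys.executable, '-u', '-c', child],
          stdin=PIPE, stdout=PIPE, bufsize=0)

p.stdin.write(b'ping\n')    # with bufsize=0 this goes straight to the pipe
line = p.stdout.readline()
p.kill()
print(line)
```

If the latency gap disappears with all buffering off, buffering was the culprit; if it doesn't, the cost is in the pipe round trip or the scheduler.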

  • python subprocess defaults bufsize to 0 <docs.python.org/release/3.1.3/library/subprocess.html>. Also, I'm obviously flushing every time the output needs to be flushed. Isn't flushing when I need to just as good as setting the buffer sizes to 0? Commented Mar 28, 2011 at 18:42
  • It's just as good, but it's not the same. Depending on what was costing the time, it may make the difference...
    – sehe
    Commented Mar 28, 2011 at 18:56

If I try this a few times, the variance of the numbers is very high.

Maybe you should set up a saner test to benchmark stdin and stdout, without a lot of other overhead, so you don't measure everything else going on on your CPU. You're also measuring string operations and float conversion.
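One way to strip the benchmark down as suggested: time a one-byte round trip over raw os.pipe() descriptors, with no string formatting or float parsing inside the loop. This is a POSIX-only sketch (it uses os.fork, which is fine on the Ubuntu system in the question), not a drop-in replacement for a.py/b.py.

```python
import os
from time import time

# Two pipes: parent -> child and child -> parent.
r1, w1 = os.pipe()
r2, w2 = os.pipe()

pid = os.fork()
if pid == 0:
    # Child: close the ends it doesn't use, then echo one byte forever.
    os.close(w1)
    os.close(r2)
    while True:
        b = os.read(r1, 1)
        if not b:            # parent closed its write end: we're done
            os._exit(0)
        os.write(w2, b)

# Parent: close the ends the child uses.
os.close(r1)
os.close(w2)

times = []
for _ in range(10000):
    t0 = time()
    os.write(w1, b'x')
    os.read(r2, 1)
    times.append(time() - t0)

os.close(w1)                 # child sees EOF and exits
os.waitpid(pid, 0)
print('round trip (min/ave): %f %f' % (min(times), sum(times) / len(times)))
```

This measures nothing but pipe latency and the context switch, so it gives a floor to compare the subprocess numbers against.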

  • That's not the opposite. stdin is still slower in your example. Also, I've updated my post to show averages Commented Mar 22, 2011 at 7:41
  • That's not actually reversed, it's just that both are being printed in the same output format -- the stdin time is still greater than the stdout time.
    – sarnold
    Commented Mar 22, 2011 at 7:41
