Timeout within a popen works, but popen inside a timeout doesn't?

Question

It's easiest to explain in code:

require 'timeout'

puts "this block will properly kill the sleep after a second"

IO.popen("sleep 60") do |io|
  begin
    Timeout.timeout(1) do
      while (line=io.gets) do
        output += line
      end
    end
  rescue Timeout::Error => ex
    Process.kill 9, io.pid
    puts "timed out: this block worked correctly"
  end
end

puts "but this one blocks for >1 minute"

begin
  pid = 0
  Timeout.timeout(1) do
    IO.popen("sleep 60") do |io|
      pid = io.pid
      while (line=io.gets) do
        output += line
      end
    end
  end
rescue Timeout::Error => ex
  puts "timed out: the exception gets thrown, but much too late"
end

My mental model of the two blocks is identical:

flow chart

So, what am I missing?

edit: drmaciver suggested on twitter that in the first case, for some reason, the pipe socket goes into non-blocking mode, but in the second it doesn't. I can't think of any reason why this would happen, nor can I figure out how to get the descriptor's flags, but it's at least a plausible answer? Working on that possibility.

this behavior occurs on at least 1.8.7 and 1.9.3. jruby blocks for all 60 on both blocks, which is the behavior I'd have guessed a priori. — llimllib, Commented Jun 21, 2013 at 15:46
Note that your puts("but this one...") between the two blocks for me waits until the first sleep is complete, because the first IO#popen block is dutifully in a call to waitpid(). If you don't want that, then your rescue logic needs to kill the child process. — pilcrow, Commented Jun 21, 2013 at 15:54
@pilcrow the real code does kill the proc, but I removed it for this code... I shouldn't have. Will re-add it, thanks — llimllib, Commented Jun 21, 2013 at 15:56
@llimllib Timeout accepts an arbitrary class as a second argument to override the default of Timeout::Error. If you plug in your own class that records system time when initialized, you should be able to determine if your problem is due to when the Timeout tries to raise, or to the way the Ruby VM propogates raised exceptions in the context of subprocesses. Not an answer, but maybe a path to one. — animal, Commented Jun 21, 2013 at 16:53

pilcrow · Accepted Answer · 2014-04-07 21:44:00Z

Aha, subtle.

There is a hidden, blocking ensure clause at the end of the IO#popen block in the second case. The Timeout::Error is raised raised timely, but you cannot rescue it until execution returns from that implicit ensure clause.

Under the hood, IO.popen(cmd) { |io| ... } does something like this:

def my_illustrative_io_popen(cmd, &block)
  begin
    pio = IO.popen(cmd)
    block.call(pio)      # This *is* interrupted...
  ensure
    pio.close            # ...but then control goes here, which blocks on cmd's termination
  end

and the IO#close call is really more-or-less a pclose(3), which is blocking you in waitpid(2) until the sleeping child exits.

You can verify this like so:

#!/usr/bin/env ruby

require 'timeout'

BEGIN { $BASETIME = Time.now.to_i }

def xputs(msg)
  puts "%4.2f: %s" % [(Time.now.to_f - $BASETIME), msg]
end

begin
  Timeout.timeout(3) do
    begin
      xputs "popen(sleep 10)"
      pio = IO.popen("sleep 10")
      sleep 100                     # or loop over pio.gets or whatever
    ensure
      xputs "Entering ensure block"
      #Process.kill 9, pio.pid      # <--- This would solve your problem!
      pio.close
      xputs "Leaving ensure block"
    end
  end
rescue Timeout::Error => ex
  xputs "rescuing: #{ex}"
end

So, what can you do?

You'll have to do it the explicit way, since the interpreter doesn't expose a way to override the IO#popen ensure logic. You can use the above code as a starting template and uncomment the kill() line, for example.

I was staring for so long at io.c, looking just a few lines above the ensure, without seeing the ensure or considering it at all. Great answer, thanks so much. — llimllib, Commented Jun 21, 2013 at 17:53

Jonathan Julian · Accepted Answer · 2013-06-21 15:47:08Z

1

In the first block, the timeout is raised in the child, killing it and returning control to the parent. In the second block, the timeout is raised in the parent. The child never gets the signal.

See io.c https://github.com/ruby/ruby/blob/trunk/io.c#L6021 and timeout.rb https://github.com/ruby/ruby/blob/trunk/lib/timeout.rb#L51

answered Jun 21, 2013 at 15:47

Jonathan Julian

12.2k2 gold badges43 silver badges48 bronze badges

I know it's not a detailed answer Bill, but that's the way I read the blocks.
– Jonathan Julian
Commented Jun 21, 2013 at 15:47
1

A block passed to IO#popen is executed in the context of the parent process. I'm not sure what you mean when you say that the child process could or could not "[get] the signal."
– pilcrow
Commented Jun 21, 2013 at 15:50
@JonathanJulian I've had those two files open in a split screen, trying to figure it out. As far as I can tell, the timeout is spawned from the main thread in both examples. Here's where popen runs the block it's passed: github.com/ruby/ruby/blob/trunk/io.c#L6075
– llimllib
Commented Jun 21, 2013 at 16:00
@JonathanJulian I instrumented it to show that the timeout is being spawned from the same thread both times: gist.github.com/llimllib/48da4b10d9b6bd6538d5
– llimllib
Commented Jun 21, 2013 at 16:03

Add a comment |

Collectives™ on Stack Overflow

Timeout within a popen works, but popen inside a timeout doesn't?

2 Answers 2

Not the answer you're looking for? Browse other questions tagged
ruby
multithreading
io
timeout
subprocess
or ask your own question.

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

Not the answer you're looking for? Browse other questions tagged rubymultithreadingiotimeoutsubprocess or ask your own question.

Related

Not the answer you're looking for? Browse other questions tagged
ruby
multithreading
io
timeout
subprocess
or ask your own question.