Why is it that when I suspend a while loop in bash, the loop stops after being resumed? Short example below.

$ while true; do echo .; sleep 1; done
.
.
^Z
[1]+  Stopped                 sleep 1
$ fg
sleep 1
$

I'm familiar with signals, and I'm guessing this may be the natural behaviour of bash here, but I'd like to better understand why it behaves in this particular way.

  • because it has to handle an interrupt and has to accurately reflect that in $? on return, and so true is not then true. probably. i think.
    – mikeserv
    Commented Nov 26, 2015 at 13:54
  •
    I hope this comment doesn't get flagged, but I will answer your question with another question, Unix koan-style: "Why does a student stop fighting on the playground after he gets suspended?" The answer being, because he is no longer at the playground where he has the ability to start fights. Thus, the behavior in question has simply been halted. Commented Nov 26, 2015 at 14:06
  • You stop the command, the loop is broken. Then you resume the single sleep 1 command, not the loop.
    – 123
    Commented Nov 26, 2015 at 14:53
  •
    Relevant
    – 123
    Commented Nov 26, 2015 at 15:01

2 Answers

This looks like a bug in several shells; it works as expected with ksh93 and zsh.

Background:

Most shells seem to run the while loop inside the main shell, but they differ in what happens on ^Z:

Bourne Shell suspends the whole shell if you type ^Z in a non-login shell.

bash suspends only the sleep and then abandons the while loop in favor of printing a new shell prompt.

dash makes this command unsuspendable.

With ksh93, things work very differently:

ksh93 does the same when the command is first started, but since sleep is a builtin in ksh93, ksh93 has a handler that causes the while loop to fork off from the main shell and then suspend at the moment you type ^Z.

If you later type fg in ksh93, the forked-off child that is still running the loop is continued.

You see the main difference when comparing the job control messages from bash and ksh93:

bash reports:

[1]+ Stopped sleep 1

but ksh93 reports:

^Z[1] + Stopped while true; do echo .; sleep 1; done

zsh behaves similarly to ksh93.

With both shells, you have a single process (the main shell) as long as you don't type ^Z, and two shell processes after you have typed ^Z.
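In bash, a similar effect can be approximated by hand: wrapping the loop in parentheses runs it in a subshell, so the loop itself lives in a separate process that ^Z can stop and fg can continue as one job. The sketch below is only an illustration of that idea; the function name ticker and its count and delay parameters are made up here so that the example terminates on its own.

```shell
#!/usr/bin/env bash
# ticker COUNT [DELAY]
# Hypothetical helper (not from the answer): prints COUNT dots, pausing
# DELAY seconds after each one. The parentheses make the whole loop run
# in a subshell, i.e. in a process separate from the main shell.
ticker() {
    count=$1
    delay=${2:-1}
    (
        i=0
        while [ "$i" -lt "$count" ]; do
            echo .
            sleep "$delay"
            i=$((i + 1))
        done
    )
}

# Bounded demonstration: three dots with no delay.
ticker 3 0
```

At an interactive prompt the unbounded form would be ( while true; do echo .; sleep 1; done ). After ^Z, the job reported as stopped should then be the parenthesized loop rather than sleep 1, and fg should resume the dots.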

  • doesn't dash actually wind up handling the signal when the loop terminates? in [d]?ash source code there are all of these macros for INTON and INTOFF dispersed throughout, and typically signals received while in an INTOFF state really are handled at (or around) INTON. anyway, i'm only curious cause i think you know better - it's a great answer. thank you.
    – mikeserv
    Commented Nov 26, 2015 at 16:07
  • I rarely use dash and I recently fetched and compiled it for performance comparisons with bash, ksh93 and my Bourne Shell. While doing these tests, I discovered that dash mainly seems to be fast because it does not include support for multi byte characters. A single sleep 100 can be suspended and resumed in dash, so it seems that dash knows about problems in this command and selectively disables job control.
    – schily
    Commented Nov 26, 2015 at 16:35
  • so, in your tests, you were able to equal dash's performance in other shells by dropping multibyte processing? and yes, dash does support job control, but the standard says an interactive shell should ignore TSTP, and running a while loop in the current shell at an interactive terminal is no less an interactive shell than any other.
    – mikeserv
    Commented Nov 26, 2015 at 16:37
  • I did not test this exactly, but I could see that while dash consumes more system CPU time than ksh93 or the Bourne Shell (mainly because it issues more fork() calls), it uses less user CPU time and this results in a similar total CPU time compared to my version of the Bourne Shell. From trying to reduce user CPU time in the Bourne Shell, I know that most of this time is spent in multibyte conversions.
    – schily
    Commented Nov 26, 2015 at 16:40

I wrote one of the co-authors of Bash about the issue, and here is his reply:

It's not really a bug, but it is certainly a caveat.

The idea here is that you suspend processes, which are a different unit of granularity than shell commands. When a process is suspended, it returns to the shell (with a non-zero status, which has consequences when you, say, stop a process that's the loop test), which has a choice: it can break out of or continue the loop, leaving the stopped process behind. Bash chooses -- and has always chosen -- to break out of loops when a job is stopped. Continuing the loop is rarely what you want.

Some other shells do things like fork a copy of the shell when a process gets suspended due to SIGTSTP, and stop that process. Bash hasn't ever done that -- it seems more complicated than the benefit warrants -- but if someone wants to submit that code as a patch, I'd take a look at incorporating the changes.

So if anyone wants to submit a patch, use the email addresses found in the man pages.
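The non-zero status mentioned in the reply can be inspected with echo $? right after the ^Z. By shell convention a status above 128 usually encodes a signal number, and kill -l translates such a status back into a signal name. The helper below is a minimal sketch of that decoding; describe_status is a hypothetical name, and the value 148 in the note after it assumes Linux, where SIGTSTP is signal 20.

```shell
#!/usr/bin/env bash
# describe_status STATUS
# Hypothetical helper: reports whether STATUS looks like a plain exit code
# or like "128 + signal number". A command can also exit with a value above
# 128 on its own, so this is a convention, not a guarantee.
describe_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        # kill -l with an exit status prints the matching signal name.
        echo "signal $(kill -l "$status")"
    else
        echo "exit $status"
    fi
}

# 137 is 128 + 9, and signal 9 is SIGKILL.
describe_status 137
```

After suspending sleep 1 with ^Z in interactive bash on Linux, describe_status $? would be expected to report TSTP, since the status there is typically 148. A loop whose test command is stopped this way sees that status as a failure.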
