control seq for errors in unnamed pipe

Question

$ awk -v f=<(cmdmayfail) -e 'BEGIN { r = getline < f ; print r }'
-bash: cmdmayfail: command not found
0

In the unnamed pipe example above, awk cannot tell that the command feeding the pipe failed: getline simply sees end-of-file and returns 0.

$ awk -v f=<(cmdmayfail || echo "control sequence" ) -e 'BEGIN { r = getline < f ; print r }'

To make awk aware of the error, I could use the code above, which sends a control sequence (and possibly some error information) when the command fails.

Given that the file command already knows many file types, is there a reasonable control sequence for this purpose, one that awk will not mistake for content from a legitimate file? Thanks.

Ljm Dullaart · Accepted Answer · 2019-09-28 11:51:13Z

If your cmdmayfail is a standard Unix command, prefixing the control sequence with a distinctive string of your own (for example __ERROR__CMDFAIL__:) should be enough for awk to tell the difference.

However, if you also run your own and/or proprietary software, it is hard to recommend a general string. It is possible (though unlikely) that one of your proprietary commands uses such a string. Look at the general format of their error messages and pick a string that is unlikely to occur.

If cmdmayfail is file, as the question suggests, it may be sufficient to use a string without :, since file normally prints lines of the form name: description.

Kamil Maciorowski · Accepted Answer · 2019-09-30 08:41:15Z

If the output from cmdmayfail is relatively small and the command itself terminates independently from other parts of your code, then you can store its output in a variable and pass the exit status as the very first line. The code in <() would be like:

out="$(cmdmayfail)"; printf '%s\n' "$?" "$out"

Your awk code should call getline<f once to get the exit status. Subsequent getline<f calls will read the actual output of cmdmayfail.
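The reading side could look roughly like this. It is only a sketch: cmdmayfail is replaced by a hypothetical shell function that prints two lines and fails with status 3, and the names status_first and read_status_first are made up for the illustration. Plain POSIX awk is enough, no gawk extensions are used.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for cmdmayfail: two lines of output, exit status 3.
cmdmayfail() { printf 'foo\nbar\n'; return 3; }

# Producer: exit status on the first line, buffered output after it.
# "local out" sits on its own line so that $? still belongs to $(cmdmayfail).
status_first() { local out; out="$(cmdmayfail)"; printf '%s\n' "$?" "$out"; }

# Consumer: the first getline yields the status, the rest is real output.
read_status_first() {
  awk -v f="$1" 'BEGIN {
    if ((getline status < f) <= 0) { print "no status line"; exit 2 }
    while ((getline line < f) > 0) print "data: " line
    print "exit status: " status
  }'
}

read_status_first <(status_first)
```

Because the status arrives before any data, awk can decide up front whether the output is worth parsing at all.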

Limitations:

- Variables in Bash cannot store null characters.
- $() will strip all trailing newline characters; then printf will add exactly one. A cumbersome trick to avoid this:
```
out="$(cmdmayfail; s="$?"; printf X; exit "$s")"; printf '%s\n%s' "$?" "${out%X}"
```
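Counting bytes makes the difference between the two variants visible. A minimal sketch, assuming a hypothetical cmdmayfail whose output ends with two extra empty lines and which exits with status 5; the function names plain and exact exist only in this example.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in: one line, two extra newlines, exit status 5.
cmdmayfail() { printf 'foo\n\n\n'; return 5; }

# Simple variant: $() strips every trailing newline, printf adds one back.
plain() { local out; out="$(cmdmayfail)"; printf '%s\n' "$?" "$out"; }

# Sentinel variant: the X protects the trailing newlines from $(),
# and ${out%X} removes it again before printing.
exact() {
  local out
  out="$(cmdmayfail; s="$?"; printf X; exit "$s")"
  printf '%s\n%s' "$?" "${out%X}"
}

plain | wc -c   # counts "5\n" plus "foo\n"
exact | wc -c   # counts "5\n" plus "foo\n\n\n"
```

Both variants still report the status 5 on the first line; only the second one reproduces the output of cmdmayfail byte for byte (null characters excepted).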

The fact you handle the output from cmdmayfail in the BEGIN block may indicate you expect cmdmayfail to terminate early. Maybe this solution will be enough.

In general cmdmayfail may even run "endlessly" (i.e. until you terminate it) and you may want to read from its output while processing the ("endless") stdin of awk. In such a case the above solution will not work.

You can prepend each line of the output from cmdmayfail with some fixed status line (e.g. OK) and finally add a line with the exit status of cmdmayfail. The code in <() would be like:

cmdmayfail | sed 's/^/OK\n/'; printf '%s\n' "${PIPESTATUS[0]}"

Example:

$ (printf '%s\n' foo "bar baz"; exit 7) | sed 's/^/OK\n/'; printf '%s\n' "${PIPESTATUS[0]}"
OK
foo
OK
bar baz
7

Then your awk code should getline<f and check if it's OK. If so, the next line (getline<f again) is from cmdmayfail for sure. Loop over the pairs until the line where you expect OK holds something else; that line is the exit status.
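A decoding loop for this OK-tagged stream might be sketched as follows. The assumptions: cmdmayfail is a hypothetical function failing with status 7, the helper names ok_tagged and decode_ok are invented here, and sed is GNU sed (the \n in the replacement is not portable to every sed).

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for cmdmayfail: two complete lines, exit status 7.
cmdmayfail() { printf 'foo\nbar baz\n'; return 7; }

# Producer from the answer: every line is preceded by OK, the status comes last.
ok_tagged() { cmdmayfail | sed 's/^/OK\n/'; printf '%s\n' "${PIPESTATUS[0]}"; }

# Consumer: read a tag; if it is OK, the following line is data,
# otherwise the tag itself is the exit status.
decode_ok() {
  awk -v f="$1" 'BEGIN {
    while ((getline tag < f) > 0) {
      if (tag != "OK") { status = tag; break }
      if ((getline line < f) <= 0) { print "truncated stream"; exit 2 }
      print "data: " line
    }
    print "exit status: " status
  }'
}

decode_ok <(ok_tagged)
```

Since data lines are only ever read right after an OK tag, output that happens to look like a number cannot be confused with the status line.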

This will work fine unless cmdmayfail may generate an incomplete line. Example:

$ (printf 'foo\nincomplete line'; exit 22) | sed 's/^/OK\n/'; printf '%s\n' "${PIPESTATUS[0]}"

Depending on the implementation of sed, the tool may

1. ignore the incomplete line altogether, or
2. process it and add the missing newline character, or
3. process it as-is.

In effect you will

1. miss some part of the output, or
2. not know the line was incomplete, or
3. get the line with the exit status attached to it.

In case of (3) printf '\n%s\n' "${PIPESTATUS[0]}" may help. It will generate an extra empty line if the last line from cmdmayfail is complete; this way your awk code will be able to tell.
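One way awk could use that extra empty line is to hold each data line back until the next tag shows whether it was complete. The sketch below assumes a sed that passes an incomplete last line through as-is (GNU sed does); the helpers tagged and decode and the two stand-in commands are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical stand-ins: one ends cleanly, one stops mid-line with status 22.
cmd_complete()   { printf 'foo\nbar\n'; return 0; }
cmd_incomplete() { printf 'foo\nincomplete line'; return 22; }

# Producer: as in the answer, but the status is preceded by a newline.
tagged() { "$@" | sed 's/^/OK\n/'; printf '\n%s\n' "${PIPESTATUS[0]}"; }

decode() {
  awk -v f="$1" 'BEGIN {
    while ((getline tag < f) > 0) {
      if (tag == "OK") {
        if (have) print "data: " held        # another line followed, so it was complete
        if ((getline held < f) <= 0) break
        have = 1
      } else if (tag == "") {
        if (have) print "data: " held        # empty marker: last line was complete
        getline status < f
        print "exit status: " status
        break
      } else {
        if (have) print "incomplete: " held  # status arrived where the marker should be
        print "exit status: " tag
        break
      }
    }
  }'
}

decode <(tagged cmd_complete)
decode <(tagged cmd_incomplete)
```

The price is one line of latency: awk never acts on a line until it has seen what follows it.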

Consider a case when cmdmayfail was forcefully terminated mid-line. You may not want to parse the incomplete line then. The problem is: to know in awk whether the line from cmdmayfail was complete or not, you need to test the next (status) line. Implementing useful logic for this in awk is inconvenient at best.

It's good to detect an incomplete line as soon as possible, and read in Bash can do this. The downside is that read is slow (and remember Bash variables cannot store null characters). Example solution:

# define this helper function in the main shell
encerr () { ( eval "$@" ) | (while IFS= read -r line; do printf 'C\n%s\n' "$line"; done; [ -n "$line" ] && printf 'I\n%s\n' "$line") ; printf 'E\n%s\n' "${PIPESTATUS[0]}"; }
# this part you want to put in <()
encerr cmdmayfail

Then you need to decode the custom protocol inside awk. Lines go in pairs. (See the examples down below to understand the protocol more intuitively.)

1. Read the first line from a pair (getline<f).
2. Store the first line in a variable (first=$0).
3. Read the second line from a pair (getline<f).
4. Analyze the first line (first, not $first: awk variables take no $ sign).
   - If it's C then the second one (current $0) is a complete line from cmdmayfail, you can parse it.
   - If it's I then the second one is an incomplete line from cmdmayfail, you may or may not want to parse it. Expect E in the next pair.
   - If it's E then the second one is the exit status from cmdmayfail. You should not expect further pairs.
5. Loop.
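The steps above could be sketched in awk like this. The encerr helper is copied from the answer; the decoder name decode_encerr and the output labels are assumptions made for the example, and getline var < f is used instead of going through $0.

```bash
#!/usr/bin/env bash
# Helper from the answer, unchanged.
encerr () { ( eval "$@" ) | (while IFS= read -r line; do printf 'C\n%s\n' "$line"; done; [ -n "$line" ] && printf 'I\n%s\n' "$line") ; printf 'E\n%s\n' "${PIPESTATUS[0]}"; }

# Hypothetical decoder: reads pairs and dispatches on the first line.
decode_encerr() {
  awk -v f="$1" 'BEGIN {
    while ((getline first < f) > 0) {            # first line of the pair: the tag
      if ((getline second < f) <= 0) {           # second line of the pair: the payload
        print "protocol error: missing second line"; exit 2
      }
      if (first == "C")      print "complete: " second
      else if (first == "I") print "incomplete: " second
      else if (first == "E") { print "exit status: " second; break }
      else { print "protocol error: unknown tag " first; exit 2 }
    }
  }'
}

decode_encerr <(encerr 'printf "foo bar\n89 baz"; exit 33')
```

A real consumer would replace the print statements with its own handling, for example discarding the I payload whenever the status that follows is non-zero.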

Note I used eval "$@" inside the function. What you write after encerr will be evaluated a second time, so usually you would like to run something like

encerr 'cmd1 -opt foo'

or

encerr "cmd1 -opt foo"

or even

encerr 'cmd1 -opt foo | cmd2'

Basically this is the form you use to run remote commands with ssh. Compare:

ssh a@b 'cmd1 -opt foo | cmd2'

Or you can build the function like this:

encerr () { "$@" | …

and call it like this:

encerr cmd1 -opt foo

Compare to sudo:

sudo cmd1 -opt foo
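For illustration, the second form could be completed along these lines. This is an assumption rather than the author's code: the reader loop is taken from the eval version and only the first pipeline stage is swapped; the name encerr_argv is hypothetical.

```bash
#!/usr/bin/env bash
# Variant that runs its arguments as one command, without a second evaluation.
encerr_argv () {
  "$@" | (while IFS= read -r line; do printf 'C\n%s\n' "$line"; done
          [ -n "$line" ] && printf 'I\n%s\n' "$line")
  printf 'E\n%s\n' "${PIPESTATUS[0]}"
}

# Arguments reach the command exactly as the calling shell parsed them,
# so the double space and the format string need no extra quoting layer.
encerr_argv printf '%s\n' 'two  spaces'
encerr_argv sh -c 'exit 7'
```

With this form a pipeline such as cmd1 | cmd2 can no longer be passed as a single string; it would have to be wrapped, e.g. in sh -c.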

Examples (using the original function with eval):

success with empty output
```
$ encerr true
E
0
```

failure with empty output

$ # I'm redirecting stderr so "command not found" doesn't obfuscate the example
$ encerr nonexisting-command-foo1231234 2>/dev/null
E
127

success after complete lines

$ encerr 'date; sleep 1; date'
C
Mon Sep 30 09:07:40 CEST 2019
C
Mon Sep 30 09:07:41 CEST 2019
E
0

failure after complete lines

$ encerr 'printf "foo\nbar\n"; false'
C
foo
C
bar
E
1

success after an incomplete line

$ encerr 'printf "foo bar\n89 baz"'
C
foo bar
I
89 baz
E
0

failure after an incomplete line

$ encerr 'printf "\nThe first line was empty and this one was interru"; exit 33'
C

I
The first line was empty and this one was interru
E
33
