If the question is how to ignore the warnings or errors that the shell outputs when you try to do something unsupported, such as here a command substitution with a command that outputs NULs in the GNU implementation of `sh` (`bash`), then as @GMan says, the best you can do is:
{ <potentially-unsupported-stuff>; } 2> /dev/null
The shell could also decide to abort, in addition to or instead of writing an error message. Using a subshell:
(<potentially-unsupported-stuff>) 2> /dev/null
might avoid that, but even then, if the `<unsupported-stuff>` is a syntax error, that won't help:
$ bash -c '( if ) 2> /dev/null; echo not reached'
bash: -c: line 1: syntax error near unexpected token `)'
bash: -c: line 1: `( if ) 2> /dev/null; echo not reached'
And of course in your case using a subshell won't do as you'll lose the value of the assignment.
Instead, you can do, POSIXly:
command eval '<potentially-unsupported-stuff>' 2> /dev/null
So here:
command eval 'var=$(command-that-outputs-non-text)' 2> /dev/null
POSIX requires `eval` to exit if it fails (which `bash` ignores when not in POSIX mode), but prefixing it with `command`¹ prevents that.
So this discards all errors output by the shell while evaluating the code passed to `eval`, as well as the errors from the commands run during that evaluation, and is also less likely to cause the shell to abort, while still not running a subshell.
Now, that still doesn't make it portable. Example:
$ cat test.sh
command eval 'var=$(printf "\61\200\62\0\63\12\12")' 2> /dev/null
printf %s "$var" | od -An -vto1 -tc
$ ARGV0=sh zsh ./test.sh
061 200 062 000 063
1 200 2 \0 3
$ ARGV0=sh dash ./test.sh
061 200 062 063
1 200 2 3
$ ARGV0=sh bash ./test.sh
061 200 062 063
1 200 2 3
$ ARGV0=sh ksh ./test.sh
061 200 062
1 200 2
$ ARGV0=sh yash ./test.sh
061
1
$ locale charmap
UTF-8
(where `ARGV0=sh` is my shell's (zsh's) way to pass `sh` as `argv[0]`).
It's simply not possible to store non-text in a `sh` variable portably.
NUL is a problem for all shells except zsh. Some shells remove NULs in command substitutions (some, such as bash, with a warning); others don't, but since they work internally with C-style NUL-delimited strings, they end up discarding the NUL and everything that follows it.
NUL is not the only problem, as seen in `yash`'s output: in a locale that uses UTF-8 as its charmap, that sequence of bytes cannot be decoded into text, and yash stops at the first decoding error.
And you can see that all of them strip the two trailing 012 bytes (the encoding of newline on ASCII-based systems), as required by POSIX.
What you can do is store some text encoding of that output.
In the POSIX tool chest, you can use `od` or `uuencode` for that, though to be able to use it later, `uuencode` is the more useful, as you can use `uudecode` to decode it:
$ var=$(printf '\61\200\62\0\63\12\12' | uuencode -)
$ printf '%s\n' "$var" | uudecode | od -An -vto1 -tc
061 200 062 000 063 012 012
1 200 2 \0 3 \n \n
See how all 7 bytes were preserved.
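Where `uuencode`/`uudecode` are not installed (they are often missing by default on GNU/Linux systems), `base64` from GNU coreutils, while not POSIX, can serve the same purpose. A minimal sketch, assuming a `base64` utility that supports `-d` for decoding:

```shell
# Encode arbitrary bytes (including NULs and trailing newlines) as text.
# base64 is not POSIX, but is widely available (GNU coreutils, busybox...).
var=$(printf '\61\200\62\0\63\12\12' | base64)

# Decode when needed; all 7 bytes come back intact:
printf '%s\n' "$var" | base64 -d | od -An -vto1
```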
Beware that `printf` might still fail, even for relatively small strings, with shells where `printf` is not builtin (such as ksh88, pdksh and most of its derivatives) on systems where there's a limit on the size of the arguments+environment passed to an executed command (most of them).
If the question is how to portably remove NULs from the output of a command without relying on specific shells doing it by themselves in their command substitutions, like bash does with a warning (as an extension to, or in conflict with, the standard; it's not clear to me what POSIX has to say about it), then yes:
cmd_output_without_NULs_and_trailing_newlines=$(
cmd | tr -d '\0'
)
Or:
file_contents_without_NULs_and_trailing_newlines=$(
<file tr -d '\0'
)
is the way to go, but note that it still removes trailing 0xA bytes on ASCII-based systems, and can still fail if the output/contents cannot be decoded as text in the current locale in some shells such as `yash`.
To preserve the trailing newlines (and the exit status), as usual:
file_contents_without_NULs=$(
<file tr -d '\0'
ret=$?
echo .
exit "$ret"
)
ret=$?
file_contents_without_NULs=${file_contents_without_NULs%.}
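For instance, with `printf` standing in for reading the file (a hypothetical stand-in, just to make the example self-contained), producing `a<NUL>b` followed by two newlines:

```shell
# The input has one NUL and two trailing newlines.
out_without_NULs=$(
  printf 'a\0b\n\n' | tr -d '\0'
  ret=$?
  echo .      # protect the trailing newlines from $(...) stripping
  exit "$ret"
)
ret=$?
out_without_NULs=${out_without_NULs%.}

# $out_without_NULs is now "ab" followed by both newlines:
printf %s "$out_without_NULs" | od -An -vto1
```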
If the question is how to POSIXly decode the output of a command made of several concatenated NUL-delimited strings such as the output of find -print0
into separate shell parameters, then since the 2024 edition of the POSIX standard, you can do:
cmd | {
set --
while IFS= read -rd '' var; do
set -- "$@" "$var"
done
# rest of the script that needs to process those strings in
# the positional parameters must go here, as this part runs
# in a subshell in some shells such as bash
printf 'There are %d strings and the first is "%s"\n' "$#" "$1"
}
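For instance, with `printf` standing in for `cmd`, outputting three NUL-delimited strings (one of which contains a newline), in a shell whose `read` supports `-d` such as bash:

```shell
# Three NUL-delimited strings: "a", "b<newline>c" and "d".
printf 'a\0b\nc\0d\0' | {
  set --
  while IFS= read -rd '' var; do
    set -- "$@" "$var"
  done
  printf 'There are %d strings and the last is "%s"\n' "$#" "$3"
}
```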
But not all shells are conforming to that yet. In particular `dash`, the `sh` implementation on many GNU/Linux systems, isn't, and its `read` doesn't support `-d` yet as of July 2024.
An alternative is to do:
eval "$(
cmd |
LC_ALL=C od -An -vtu1 |
LC_ALL=C awk -v q="'" '
BEGIN {
for (i = 1; i < 256; i++) {
c[i] = sprintf("%c", i)
if (c[i] == q) c[i] = q "\\" q q
}
printf "set --"
started = 0
}
{
for (i = 1; i <= NF; i++) {
if (!started) {
printf " " q
started = 1
}
n = $i
sub(/^0+/, "", n) # remove leading 0s that some od
# implementations add.
if (n == "") {
printf q
started = 0
} else printf "%s", c[n]
}
}
END {if (started) printf q}'
)"
That's still not POSIX, as POSIX allows systems where the encoding of `'` could vary between locales, or locales with charsets with a shift state (where a given byte or byte sequence can represent different characters depending on context). But those are not workable anyway, and you won't find them in any locale by default on GNU/Linux based systems (where in practice all locale charsets are supersets of ASCII, and locales with charsets with shift states are not enabled by default and not properly supported; not that it's possible to properly support them).
¹ Note that it doesn't work in `zsh` when not in `sh` emulation, where `command` (which predates POSIX') is for running an external command, rather than only bypassing functions and removing the specialness of special builtins.