
Concrete use case:

I'm trying to curl a URL with -v, output the verbose debug information to stderr as normal, output the response body to stdout as normal, but also check for the presence of a regexp in the response with grep, and exit nonzero if it's not found. The simplest form of the command is

curl -v "${url}" | grep -q "${pattern}"

but grep consumes the response body.

What I've tried:

  1. I've looked at “Convince grep to output all lines, not just those with matches”, which would suggest things like:

    curl -v "${url}" | grep --color -E "${pattern}|$"
    

    but this only color-highlights the match; it doesn't give you a nonzero exit code if the pattern is not present.

  2. I've also looked at “How to get grep exit code but print all lines?”, which suggests:

    curl -v "${url}" | tee /dev/stderr | grep -q "${pattern}"
    

    which exits with the exit code from grep and does output the response body; but it outputs the response body to stderr, and I want to keep curl's stderr and stdout streams separate and intact.

  3. Following “Direct output to pipe and stdout”, I tried

    curl -v "${url}" | tee >(grep -q "${pattern}")
    

    but while this outputs everything, and correctly separates the stdout and stderr streams, it discards the grep exit code.

Current workaround:

So far, the only thing I've been able to come up with is to stick the response body in a temporary file:

curl -v "${url}" | tee /tmp/response
grep -q "${pattern}" /tmp/response

This gets me the correct output and the correct exit code.

But surely there's a way to do this as a single pipeline, without the temporary file?

2 Answers


Your second attempt is on the right track.  Try

{ curl -v "${url}" | tee /dev/fd/3 | grep -q "${pattern}";} 3>&1

Brief explanation:

  • The 3>&1 duplicates file descriptor 1 (standard output) onto file descriptor 3 (the lowest one that doesn’t have a standard usage).  This refers to the standard output of the entire pipeline (i.e., the entire command line).  The new file descriptor (3) is in effect for the entire pipeline.

    You could use any number — at least any number up to 9.  The POSIX Shell Command Language specification, section 2.7 Redirection, says

    … all implementations shall support at least 0 to 9 …

    implying that a POSIX-compliant shell might not recognize numbers larger than 9.  And bash(1) says

    Redirections using file descriptors greater than 9 should be used with care …

  • tee /dev/fd/3 tells tee to write to file descriptor 3 — so it hooks tee up to the standard output of the pipeline.

    • You must use the same number here as you used in the n>&1 directive.
    • Filenames like /dev/fd/n might not work on non-Linux operating systems.

    Note that this happens independent of the standard output of tee, which is the pipe to the grep.
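As a minimal sketch of the file descriptor 3 technique that can be tried without network access, the snippet below replaces curl with a hypothetical `fake_curl` function that writes a short body to stdout and one debug line to stderr. The names `fake_curl` and `check_body` are made up for illustration, and the sketch assumes a system where /dev/fd/3 is available.

```shell
# Hypothetical stand-in for `curl -v "$url"` (an assumption, not the real command):
# the "response body" goes to stdout, the "verbose" output goes to stderr.
fake_curl() {
    printf 'verbose debug line\n' >&2
    printf 'alpha\nbeta\ngamma\n'
}

# check_body PATTERN
# Copies the body to the caller's stdout via fd 3 and returns grep's exit status.
check_body() {
    { fake_curl | tee /dev/fd/3 | grep -q "$1"; } 3>&1
}

# The body is captured here only so the demo can report the status separately.
if body=$(check_body 'beta'); then
    echo "pattern found; body was still passed through:" >&2
    printf '%s\n' "$body"
else
    echo "pattern not found" >&2
fi
```

The body is short enough to fit in a single write, so grep exiting early does not matter here.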

Stéphane Chazelas points out:

grep -q exits when the pattern is found, which would cause tee to die if it writes output after that.  The GNU implementation of tee at least has a -p for it to ignore the SIGPIPE and carry on writing to the targets that can still get output.  Also note that some shells only wait for the last component of pipelines so here, would carry on with the rest of the script as soon as grep -q has found the pattern …

As a formality I verified the first point (the pipeline can be terminated prematurely if the pattern is matched) [1].

I offer this enhanced solution to address both of the issues:

{ curl -v "${url}" | tee /dev/fd/3 | { grep -q "${pattern}" && cat > /dev/null;} } 3>&1

Notes:

  • If the data stream doesn’t match the pattern, then grep will read the entire stream (i.e., its input), and so the problem won’t arise.  In this case, grep will “fail”, and so, because of the &&, the cat will not run, and the compound command will return the exit code from grep: failure (no match).
  • If the data stream does match the pattern, then (as pointed out by Stéphane) grep will exit when it matches the pattern, and not read the entire input (i.e., the output from curl | tee).  But in this case, grep will “succeed”, and so, because of the &&, the cat will run, and it will soak up the rest of the data (until EOF).  This will ensure that tee will be able to write the entire data stream to the pipe, and the pipeline (i.e., the whole command line) won’t terminate until all of curl’s output has been processed.
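To see the draining behavior from the notes above without a real download, here is a rough sketch in which a hypothetical `big_body` function stands in for curl. It emits 200000 numbered lines, which is more than a pipe buffer typically holds, and the match is on the very first line. The function names are assumptions for illustration, and /dev/fd/3 is assumed to be available.

```shell
# Hypothetical stand-in for curl: a large body whose first line matches.
big_body() {
    awk 'BEGIN { for (i = 1; i <= 200000; i++) print i }'
}

# drained_check PATTERN
# Same pipeline shape as above: if grep matches and exits early,
# cat keeps reading so tee never writes into a closed pipe.
drained_check() {
    { big_body | tee /dev/fd/3 | { grep -q "$1" && cat > /dev/null; }; } 3>&1
}

# Count how many lines reach stdout even though grep matched on line 1.
lines=$(drained_check '^1$' | wc -l)
echo "lines passed through: $lines" >&2
```

If the draining works as described, the count should equal the number of lines produced, and a pattern that never matches should still yield a nonzero status.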

There’s technically still a problem with this: if the data stream matches the pattern and grep “succeeds”, then the pipeline’s exit status will be the exit status from cat.  I don’t know why (pipe) | cat > /dev/null would ever fail, but it’s theoretically possible.  To guard against that case, I offer:

{ curl -v "${url}" | tee /dev/fd/3 | if grep -q "${pattern}"; then cat > /dev/null; true; else false; fi; } 3>&1

which explicitly returns true if grep succeeds and false if it fails.
______________
[1] The -p option to tee was added in GNU coreutils release 8.24 (2015-07-03).


For any of the above variations, if you want to pipe the output from curl somewhere else, just add the pipe at the end of the command line, as you normally would:

{ curl …; fi; } 3>&1 | lpr

But, if you want to redirect it to a file, you must insert the output redirection before the 3>&1:

{ curl …; fi; } > output_file 3>&1

Recall that “pipe into a file” is incorrect terminology.
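The reason the order matters is that the shell processes redirections from left to right. The sketch below illustrates this with a hypothetical `fake_curl` stand-in and temporary files created with mktemp; all variable names are made up for the demo, and /dev/fd/3 is assumed to be available.

```shell
# Hypothetical stand-in for curl (an assumption for the demo).
fake_curl() { printf 'alpha\nbeta\n'; }

right_file=$(mktemp)
wrong_file=$(mktemp)

# Intended order: fd 1 is pointed at the file first, then fd 3 is copied from it,
# so tee's copy of the body lands in the file.
{ fake_curl | tee /dev/fd/3 | grep -q 'beta'; } > "$right_file" 3>&1
right_status=$?

# Reversed order: fd 3 is copied from the original stdout before fd 1 is moved,
# so the body bypasses the file. The $(...) wrapper just captures that stray copy.
leaked=$( { fake_curl | tee /dev/fd/3 | grep -q 'beta'; } 3>&1 > "$wrong_file" )

saved=$(cat "$right_file")
if [ -s "$wrong_file" ]; then wrong_is_empty=no; else wrong_is_empty=yes; fi
rm -f "$right_file" "$wrong_file"

echo "file written in intended order: $saved" >&2
echo "file left empty by reversed order: $wrong_is_empty" >&2
```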

  • grep -q exits when the pattern is found, which would cause tee to die if it writes output after that. The GNU implementation of tee at least has a -p for it to ignore the SIGPIPE and carry on writing to the targets that can still get output. Also note that some shells only wait for the last component of pipelines so here, would carry on with the rest of the script as soon as grep -q has found the pattern even if curl has not finished downloading. Commented Jan 12, 2022 at 8:48
  • @StéphaneChazelas: Any problems with this? Commented Jan 12, 2022 at 20:07
  • @G-ManSays'ReinstateMonica' Does /dev/fd/3 just get me an arbitrary third file descriptor, then? That's the piece I was missing, thanks. Commented Jan 13, 2022 at 0:40
  • I have added some explanation. Commented Jan 13, 2022 at 5:20
  • While opening /dev/fd/3 gets you a dup of fd 3 on most systems, on Linux-based systems or Cygwin, that reopens the file fd 3 points to anew, so it won't work if stdout goes to a socket, for instance; and if stdout is pointing to the end of a log file opened in append mode, you'll end up clobbering that log file. While it wouldn't help for sockets, with tee -a, you'd at least avoid clobbering files. See also { tee >(grep -q... && cat); wait "$!"; } in newer versions of bash. Commented Jan 13, 2022 at 11:34

If there's nothing special about grep here, maybe use awk to do the pattern matching instead:

curl -v "${url}" | awk -v p="$pattern" '$0 ~ p {found=1} {print} END {exit ! found}'
  • -v p="$pattern" sets the awk variable p to the value of the shell variable pattern
  • $0 ~ p {found=1} sets the awk variable found to 1 if the line matches the regex in the variable p.
  • {print} - prints all lines (since no condition is given for this block)
  • END {exit ! found} - at the end of input, exit with status of negated value of found (so 0 if the pattern was matched, 1 otherwise).

Awk and grep support different regexes, so you might need to change the pattern.
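A quick way to try the awk approach offline is sketched below, with a hypothetical `fake_curl` function standing in for curl and a made-up wrapper name `awk_check`. The awk program itself is the one shown above.

```shell
# Hypothetical stand-in for curl (an assumption for the demo).
fake_curl() { printf 'alpha\nbeta\ngamma\n'; }

# awk_check PATTERN
# Prints every line of the body and returns 0 only if some line matched.
awk_check() {
    fake_curl | awk -v p="$1" '$0 ~ p {found=1} {print} END {exit !found}'
}

awk_check 'beta'  && echo "beta: found" >&2
awk_check 'delta' || echo "delta: not found" >&2
```

Because awk is the last command in the pipeline, its exit status is the one the shell reports, and nothing needs to be duplicated onto another file descriptor.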

  • Avoid -v for passing regexps as it mangles backslashes. It's better to use the environment (P=$pattern awk '$0 ~ ENVIRON["P"]...') Commented Jan 12, 2022 at 8:52
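
The environment-based suggestion in the comment above might look like the following sketch. It assumes the rest of the awk program is the same as in the answer, and the wrapper name `env_check` is made up for illustration. The pattern `a\.b` contains a backslash, which reaches awk unchanged through ENVIRON.

```shell
# env_check PATTERN
# Reads the body on stdin, prints it, and returns 0 only if a line matched.
# The pattern travels in the environment variable P rather than via -v.
env_check() {
    P=$1 awk '$0 ~ ENVIRON["P"] {found=1} {print} END {exit !found}'
}

# An escaped dot should match a literal dot only.
if printf 'a.b\n' | env_check 'a\.b'; then echo "a.b: matched" >&2; fi
if printf 'axb\n' | env_check 'a\.b'; then :; else echo "axb: not matched" >&2; fi
```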
