3

Most of the time, when I see someone suggest using a pipe in a bash script, someone else points out that it should be avoided in favor of a single command.

Example:

find "$dir" -name "$pattern"

instead of

ls "$dir" | grep "$pattern"

Is there any reason other than appearance to avoid pipes?

3
  • You shouldn't parse the output of ls: mywiki.wooledge.org/ParsingLs
    – ymonad
    Commented Sep 6, 2017 at 8:19
  • You know that there's a difference between the two commands? find also searches subdirectories. ls does not.
    – fancyPants
    Commented Sep 6, 2017 at 8:19
  • @fancyPants Yes, the exact commands I meant would have been "ls $dir | grep -i $pattern" and "find $dir -maxdepth 1 -iname '$pattern' -exec basename \{} .po \;". With those you should get the same output, but the find command seems much more complicated. The question is more about pipes in general, though.
    – realape
    Commented Sep 6, 2017 at 8:26
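
For the single-directory, case-insensitive case discussed in these comments, a bash glob is one possible alternative that starts no external process at all. The following is only a sketch: the temporary directory, the file names, and the value of pattern are made up for illustration, and it assumes bash (nullglob and nocaseglob are bash options, not POSIX sh).

```shell
#!/usr/bin/env bash
# Hypothetical sample directory and pattern, created only for this sketch.
dir=$(mktemp -d)
touch "$dir"/Foo.po "$dir"/foo2.po "$dir"/bar.txt
pattern='foo'

# nullglob: an unmatched glob expands to nothing instead of itself.
# nocaseglob: match file names case-insensitively, like grep -i / find -iname.
shopt -s nullglob nocaseglob
matches=("$dir"/*"$pattern"*)
shopt -u nullglob nocaseglob

# Print only the file names, without the directory part.
for path in "${matches[@]}"; do
    printf '%s\n' "${path##*/}"
done

rm -rf "$dir"
```

Unlike the find command quoted above, this does not strip the .po suffix, and the pattern is matched literally rather than as a regular expression.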

2 Answers

3

There is nothing wrong with piping per se. What should be avoided is needless fork()ing, because starting a process is relatively expensive.

If something can be done in one process, that is usually better than using two processes for the same result.
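
As a rough illustration of that point, the sketch below counts matching lines in two ways: once with a three-process pipeline and once with a single grep. The temporary file and its contents are invented for the example; both variants should report the same count.

```shell
#!/usr/bin/env bash
# Hypothetical sample data, created only for this illustration.
tmpfile=$(mktemp)
printf '%s\n' alpha.pdf beta.txt gamma.pdf > "$tmpfile"

# Three processes: cat, grep, and wc, connected by two pipes.
count_pipeline=$(cat "$tmpfile" | grep 'pdf' | wc -l)

# One process: grep reads the file itself and counts matching lines with -c.
count_single=$(grep -c 'pdf' "$tmpfile")

# Some wc implementations pad their output with spaces; normalize it.
count_pipeline=$((count_pipeline))

echo "pipeline: $count_pipeline, single: $count_single"

rm -f "$tmpfile"
```

The pipeline is not wrong here, it just does with three processes what one can do.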

4
  • I would assume that the required exec is more costly than fork. Does anybody have numbers?
    – W.Mann
    Commented Sep 6, 2017 at 9:11
  • You can easily test that, but fork() involves creating a new process, copying the current process, using locking and whatnot, while execl simply replaces the current process.
    – marcolz
    Commented Sep 6, 2017 at 9:20
  • copying the current process? You have to duplicate the context, but not the memory itself. At least in Linux, fork is implemented based on copy-on-write.
    – W.Mann
    Commented Sep 6, 2017 at 11:08
  • @W.Mann Yes you are correct in that the writable pages are copy-on-write. The page tables linking to them though have to be copied, plus all open file descriptors, etc.
    – marcolz
    Commented Sep 6, 2017 at 12:17
2

Because each command in a pipeline runs as its own process. In your example, ls and grep are two processes, while find is one. Every additional stage in a pipeline adds process-startup overhead, which makes the command slower. One trivial example:

$ time find Downloads -name '*.pdf' &>/dev/null

real    0m0.019s
user    0m0.012s
sys     0m0.004s

$ time ls Downloads | grep pdf &>/dev/null

real    0m0.021s
user    0m0.012s
sys     0m0.004s
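
A single run this short is dominated by noise, so a fairer comparison repeats each variant many times. The following is a sketch under assumptions: it requires bash (it relies on the time reserved word applied to a for loop), the temporary directory and file names are hypothetical, and the run count of 200 is arbitrary. The two commands are also not strictly equivalent, since grep 'pdf' matches the string anywhere in a name.

```shell
#!/usr/bin/env bash
# Hypothetical test directory, created only for this measurement.
workdir=$(mktemp -d)
touch "$workdir"/report.pdf "$workdir"/notes.txt "$workdir"/paper.pdf

runs=200   # arbitrary repetition count to average out noise

# One external process per iteration.
time for ((i = 0; i < runs; i++)); do
    find "$workdir" -maxdepth 1 -name '*.pdf' > /dev/null
done

# Two external processes per iteration, connected by a pipe.
time for ((i = 0; i < runs; i++)); do
    ls "$workdir" | grep 'pdf' > /dev/null
done

rm -rf "$workdir"
```

The absolute numbers depend heavily on the machine and the file system cache, so only the relative difference between the two loops is meaningful.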
3
  • Correct in principle, but the semantics also differ. ls sorts its output alphabetically, while find just uses the order returned by the underlying system calls. Additionally, as already pointed out, find is recursive.
    – W.Mann
    Commented Sep 6, 2017 at 9:08
  • You are only timing ls here, not the grep.
    – marcolz
    Commented Sep 6, 2017 at 9:17
  • @W. Mann: Ah, indeed, for the bash time builtin, this is the case.
    – marcolz
    Commented Sep 6, 2017 at 11:59
