I have a directory with a large number of files.

./I_am_a_dir_with_many_subdirs/

Within a script, I'd like to find all subdirs in it, sort them, and store the result in a bash array. So I do:

SubdirsArray=(`find ./I_am_a_dir_with_many_subdirs/ -maxdepth 2 -mindepth 2 -type d | sort`)

Executing the script, I get the following error messages:

    sort: write failed: standard output: Broken pipe
    sort: write error

As explained in this post, sort probably executes and closes the pipe before find finishes writing to it. The write() call issued by find then fails with EPIPE ("Broken pipe"), and the OS sends find a SIGPIPE. Before the SIGPIPE reaches find, it prints the error message, then receives SIGPIPE and dies.

Questions:

  1. So, what does my SubdirsArray contain? The subdirs that find found but sort left unsorted?

  2. If so, then what would be the way around this issue with broken pipes? Make find write its results to a temporary file and then make sort read it?

    I don't understand why "it's also nothing to be concerned about" if it happens within a non-interactive shell. My SubdirsArray contains something unsorted, and further on in the script I assume that its elements are sorted?!

  3. I get two error messages:

    sort: write failed: standard output: Broken pipe
    sort: write error
    

In this thread it is suggested that sort doesn't have enough space in a temporary directory to sort all the input. But doesn't that mean that sort got something from find?!? I'm confused... Anyway, I tried to use

SubdirsArray=(`find ./I_am_a_dir_with_many_subdirs/ -maxdepth 2 -mindepth 2 -type d | sort -T /home/temp_dir`)

but it didn't help.

P.S.

I'm not sure whether it's important, but I use find|sort in a multi-processor script: several processes execute the same command at once in subshells.

  • sort can't produce any output before it has read the input in full. Besides, if sort were ending prematurely, it would be find reporting the broken pipe, not sort. The error in the other thread you mention looks very different and is indeed different.
    – Jan Hudec
    Commented Mar 14, 2014 at 6:44
  • @JanHudec thank you for pointing that out; I didn't pay attention to which command reported the problem. Commented Mar 14, 2014 at 7:10

1 Answer

sort: write failed: standard output: Broken pipe

The problem is not between find and sort. It is sort that has a problem with its output, which means the shell is not willing to read such a long list into a variable.

You'll have to process the input with while read…, storing it in a temporary file if you need it more than once. This has the added advantage that it splits on newlines only, so it correctly handles filenames with spaces, which the backtick approach does not.
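As a rough sketch of the while read approach (not necessarily what the answer's author had in mind), the sorted list can go into a temporary file and be read back line by line. The directory names, the `root` and `list` variables, and the `printf` standing in for the real per-directory work are all made up for illustration:

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the real directory tree; one name contains a space.
root=$(mktemp -d)
mkdir -p "$root/b/two" "$root/a/one" "$root/a/with space"

# Store the sorted list once so it can be read as often as needed.
list=$(mktemp)
find "$root" -mindepth 2 -maxdepth 2 -type d | sort > "$list"

count=0
# IFS= and -r keep leading whitespace and backslashes intact;
# read splits on newlines only, so "with space" stays one entry.
while IFS= read -r subdir; do
    count=$((count + 1))
    printf 'processing: %s\n' "${subdir#"$root"/}"
done < "$list"

# Clean up the temporary list and the demo tree.
rm -f "$list"
rm -r "$root"
```

Because the loop reads from a redirected file rather than from a pipe, it runs in the current shell and `count` is still set after `done`. Names that themselves contain newlines would still be split, which this sketch does not handle.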

Unfortunately, you don't say how you want to use the result, so I can't tell you exactly how to rewrite it.

Note that arrays are not part of the POSIX shell specification, and there are shells that are noticeably faster than bash but don't have them. That's why many people, including me, often avoid using them in scripts.

  • Jan, thank you for the answer and the comment. I'd like to use SubdirsArray in a for loop. So, I will implement your solution like: find ./I_am_a_dir_with_many_subdirs/ -maxdepth 2 -mindepth 2 -type d | sort > temp.txt; while read Subdir; do myFunction $Subdir; done; rm temp.txt In the end, I'd like to apply myFunction to all Subdirs. To do it faster, I try to parallelise my code and use N subshells with wait. Each subshell should take only its part of the Subdirs. I didn't want to send each subshell a long array of the Subdirs it should handle, but rather the first/last index of SubdirsArray. Commented Mar 14, 2014 at 7:22
  • @user1541776: Don't forget to redirect input into the loop. It could be done even without temporary file, but probably not if you want to split it first.
    – Jan Hudec
    Commented Mar 14, 2014 at 7:29
  • You mean find ./I_am_a_dir_with_many_subdirs/ -maxdepth 2 -mindepth 2 -type d | sort > temp.txt; while read Subdir; do myFunction $Subdir; done < "temp.txt" ; rm temp.txt ? Commented Mar 14, 2014 at 7:35
  • @user1541776: Yes, exactly. read just reads from standard input.
    – Jan Hudec
    Commented Mar 14, 2014 at 8:05
  • @user1541776: You can also pipe to the loop, but it will then run in a subshell. Or, in a shell that has it (like bash, but not ash/dash), you can use process substitution, i.e. like <(find ... | sort).
    – Jan Hudec
    Commented Mar 14, 2014 at 8:07
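
The difference between piping into the loop and using process substitution can be sketched as follows, assuming bash with its default settings (no `lastpipe`). The directory names and the `piped` and `substituted` arrays are hypothetical, chosen only for the demonstration:

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the real directory tree.
root=$(mktemp -d)
mkdir -p "$root/a/one" "$root/a/two" "$root/b/three"

# Pipe: in bash (default settings) the loop runs in a subshell,
# so anything appended to the array is lost when the loop ends.
piped=()
find "$root" -mindepth 2 -maxdepth 2 -type d | sort |
while IFS= read -r subdir; do
    piped+=("$subdir")
done

# Process substitution: the loop runs in the current shell,
# so the array is still filled after the loop.
substituted=()
while IFS= read -r subdir; do
    substituted+=("$subdir")
done < <(find "$root" -mindepth 2 -maxdepth 2 -type d | sort)

echo "piped: ${#piped[@]} entries, substituted: ${#substituted[@]} entries"

# Clean up the demo tree.
rm -r "$root"
```

The second form also fills a bash array one line at a time, without word-splitting a command substitution, so names with spaces stay intact.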
