
I have just tried to exclude a couple of directories while creating a tar archive. The directory structure is rather simple (CentOS 6, tar v1.23):

/test/t1
     /t2
     /t3
     ...

Each subdirectory (t1, t2, t3, ...) contains some txt files. Nothing unusual.

Fine, let's try this:

tar czvf test.tar.gz test/ --exclude={"t2"}

Failed, t2 subdirectory is included in the archive.

tar czvf test.tar.gz test/ --exclude={"t2",""}

Success, t2 is excluded - as expected.

I tried to reproduce the same situation on my laptop (Ubuntu 18.04, tar v1.29) with the same directory structure. There, both commands failed: the t2 directory was included in the created archive!

  1. Why does a single directory entry provided in {} not work?
  2. Why are the results different in different environments?

What is going on here? Is this something about the tar version? Is it Linux distro dependent? Neither the tar manual (current, v1.32) nor the tar changelog gave me an answer.

1 Answer


Let's start from the end.

Why are the results different in different environments?

In some versions of tar, --exclude=… applies only to paths (e.g. test/) that come after it on the command line. This means tar … test/ --exclude=t2 works as if there were no --exclude at all. GNU tar 1.29 on my Debian 9 definitely behaves like this.

You say that in 1.23 an --exclude placed after a path works (at least in some circumstances). I have no reason to doubt you. In fact, I tried GNU tar 1.26 in Debian 7 and it behaves as you described for 1.23.

Conclusion: place --exclude=… before paths like test/.
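
As a quick check of that ordering, here is a minimal sketch that rebuilds a tree like the one in the question inside a temporary directory and lists what ends up in the archive. The file names, the use of mktemp and the -C option are my own choices for illustration, not something taken from the question.

```shell
# Throwaway tree resembling the one in the question (file names are made up).
work=$(mktemp -d)
mkdir -p "$work/test/t1" "$work/test/t2" "$work/test/t3"
for d in t1 t2 t3; do
    echo "sample" > "$work/test/$d/file.txt"
done

# --exclude is placed BEFORE the path it is supposed to affect.
tar -czf "$work/test.tar.gz" -C "$work" --exclude=t2 test/

# List the archive members; nothing under test/t2 should show up.
listing=$(tar -tzf "$work/test.tar.gz")
printf '%s\n' "$listing"

# Clean up the temporary tree.
rm -rf "$work"
```

With the option in this position the result should not depend on which of the tar versions mentioned above you run.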


Why does a single directory entry provided in {} not work?

This one is not about tar, it's about the shell and brace expansion the shell performs (or not). Not all shells do it. Bash does, so does Zsh. Plain POSIX shell doesn't have such functionality.

In a shell that supports brace expansion invoke:

printf '<%s> ' --exclude={"t2"}; echo
printf '<%s> ' --exclude={"t2",""}; echo
printf '<%s> ' --exclude={foo,bar,"baz qux"}; echo

In each output whatever is inside <> is a separate argument printf got. You will see respectively:

<--exclude={t2}>
<--exclude=t2> <--exclude=>
<--exclude=foo> <--exclude=bar> <--exclude=baz qux>

I used <> so that it is clear when there are multiple arguments rather than a single argument containing spaces.

You may have thought --exclude={foo,bar,baz} makes {foo,bar,baz} get to tar and the tool interprets it as some kind of list. No. The shell expands this syntax and generates multiple words before tar is started. The tool gets these words as command line arguments, interprets them as options and is not aware there were some braces involved; exactly like printf got its arguments.

And now the quirk: for brace expansion to occur, there must be at least one comma (,) or dot-dot (.., which serves a different purpose) inside the braces. From the Bash manual:

A correctly-formed brace expansion must contain unquoted opening and closing braces, and at least one unquoted comma or a valid sequence expression. Any incorrectly formed brace expansion is left unchanged.

We may call --exclude={"t2"} an incorrectly-formed brace expansion. The string is left unchanged, tar gets it almost as it was typed (almost, because quotes are removed). The tool would exclude a file literally named {t2}. As you can see there is no reason to exclude t2 then; t2 is not {t2}.
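
To see this literal matching in action, here is a small sketch under my own assumptions: a temporary tree (the names are made up) holding both a directory t2 and a directory literally named {t2}. The pattern is single-quoted so that tar receives exactly the string {t2}, which is the same string it gets from --exclude={"t2"} in Bash.

```shell
# Throwaway tree with a directory t2 and a directory literally named {t2}.
work=$(mktemp -d)
mkdir -p "$work/test/t2" "$work/test/{t2}"
echo "sample" > "$work/test/t2/a.txt"
echo "sample" > "$work/test/{t2}/b.txt"

# tar receives the literal pattern {t2}, braces included.
tar -czf "$work/test.tar.gz" -C "$work" --exclude='{t2}' test/

# t2 stays in the archive; only the directory named {t2} is left out.
listing=$(tar -tzf "$work/test.tar.gz")
printf '%s\n' "$listing"

# Clean up the temporary tree.
rm -rf "$work"
```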

On the other hand --exclude={"t2",""} is a correctly-formed brace expansion and it expands to --exclude=t2 --exclude=. The latter excludes nothing but tar doesn't complain. The former excludes t2 as you wanted.

--exclude={foo,bar,baz,whatever} saves you typing --exclude= again and again thanks to the feature of the shell. But in the end tar will get

--exclude=foo --exclude=bar --exclude=baz --exclude=whatever

as if you typed four --exclude= statements.

If there is just foo to exclude then you must not shorten the "list" to --exclude={foo}, it should be just --exclude=foo. You discovered you can use --exclude={foo,} but this is quite cumbersome and serves no purpose. Remember these braces are not required by tar, they never get to it.
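
If the list of names is built by a script and may contain one entry or many, one way to avoid braces altogether is to generate the --exclude options in a loop. This is only a sketch of a different technique, not something tar or the shell requires; the variable names and the temporary tree are hypothetical, and it assumes the names contain no whitespace.

```shell
# Throwaway tree resembling the one in the question (file names are made up).
work=$(mktemp -d)
mkdir -p "$work/test/t1" "$work/test/t2" "$work/test/t3"
for d in t1 t2 t3; do
    echo "sample" > "$work/test/$d/file.txt"
done

# Names to exclude, separated by spaces; one entry works as well as several.
names="t2"

# Collect one --exclude=NAME option per name in the positional parameters.
set --
for name in $names; do
    set -- "$@" "--exclude=$name"
done

# The options go before the path, exactly as if they had been typed by hand.
tar -czf "$work/test.tar.gz" -C "$work" "$@" test/

listing=$(tar -tzf "$work/test.tar.gz")
printf '%s\n' "$listing"

# Clean up the temporary tree.
rm -rf "$work"
```

The loop produces the same separate --exclude= words that a well-formed brace expansion would, so a single name needs no special handling.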

Note if you ever want to exclude any file (of any type, including files of the type directory) literally named {foo,bar,baz,whatever} then you need to quote to prevent brace expansion from kicking in:

--exclude='{foo,bar,baz,whatever}'
  • Hmm? "... if you ever want to exclude a single file literally named {foo,bar,baz,whatever}"
    – Hannu
    Commented Jan 11, 2020 at 10:16
  • @Hannu I don't get the point of your comment. {foo,bar,baz,whatever} is a legitimate name for a file. Why "single"? Commented Jan 11, 2020 at 11:59
  • You have this one quoted: --exclude='{foo,bar,baz,whatever}' in a way that indicates a single file, still the text says 'files' i.e. plural, more than one. Confusion?
    – Hannu
    Commented Jan 12, 2020 at 17:48
  • @Hannu This will exclude any file literally named {foo,bar,baz,whatever} regardless of the directory the file is in. There may be at most one such file per directory. There may be many directories in the tree being archived. Therefore "files". Commented Jan 12, 2020 at 17:55
  • Ahh... same filename but in any dirs, didn't think of that. Maybe you can add THAT to the text; e.g. "note: may exist in more than one position in the directory tree"
    – Hannu
    Commented Jan 12, 2020 at 18:08
