General Question:

In Bash, I know that using the variable myvar can be done in two ways:

# Define a variable:
bash$ myvar="two words"

# Method one to dereference:
bash$ echo $myvar
two words

# Method two to dereference:
bash$ echo "$myvar"
two words

In the case above, the output is identical, because echo simply joins its arguments with a single space. With other Unix utilities, whether the words are grouped together with double-quotes makes a huge difference:

bash$ myfile="Cool Song.mp3"
bash$ rm "$myfile"            # Deletes "Cool Song.mp3".
bash$ rm $myfile              # Tries to delete "Cool" and "Song.mp3".

I am wondering what the deeper significance of this difference is. Most importantly, how can I view exactly what will be passed to the command, so that I can see if it is quoted properly?

Specific Odd Example:

I will just write the code with the observed behavior:

bash$ mydate="--date=format:\"%Y-%m-%d T%H\""
bash$ git log "$mydate"    # This works great.
bash$ git log $mydate
fatal: ambiguous argument 'T%H"': unknown revision or path not in the working tree.

Why do I need the double-quotes? What exactly is git-log seeing after the variable is dereferenced without double-quotes?

But now see this:

bash$ nospace="--date=format:\"%Y-%m-%d\""
bash$ git log $nospace        # Now THIS works great.
bash$ git log "$nospace"      # This kind of works, here is a snippet:

# From git-log output:
Date:   "2018-04-12"

Yuck, why does the printed output have double-quotes in it now? It looks as if double-quotes that are unnecessary do not get stripped: they are interpreted as literal quote characters if and only if they were not necessary.

What is Git being passed as arguments? I wish I knew how to find out.

To make matters more complex, I wrote a Python script using argparse that just prints all of the arguments (as Bash interpreted them, so with double-quote literals where Bash thinks they are part of the argument, and with words grouped or not grouped as Bash sees fit), and the Python argparse script behaves very rationally. Sadly, I think argparse may be silently fixing a known problem with Bash and thus obscuring the messed up stuff that Bash is passing to it. That's just a guess, I have no idea. Maybe git-log is secretly screwing up what Bash is passing to it.

Or maybe I just don't know what is going on at all.

Thanks.

Edited Edit: Let me say this now, before there are any answers: I know that I can maybe use single quotes around the whole thing and then not escape the double-quotes. This actually does work somewhat better for my initial problem using git-log, but I tested it in some other contexts and it is just about equally unpredictable and unreliable. Something weird is afoot with quoting inside variables. I'm not even going to post all the weird things that happened with single quotes.

Edit 2 - This also doesn't work: I just had this wonderful idea, but it doesn't work at all:

bash$ mydate="--date=format:%Y-%m-%d\ T%H"
bash$ git log "$mydate"

# Git log output has this:
Date:   2018-04-12\ T23

So it doesn't have quotes wrapping it, but it has a literal backslash character in the date string. Also, git log $mydate with no quotes errors out, with the backslash-space in the variable.

  • Is this Q about git only? Or whitespace?
    – Xen2050
    Commented Apr 13, 2018 at 6:52
  • @Xen2050 I'm honestly not sure whether the problem is related to Git. I am fairly certain it relates to Bash. It is possible that Git has broken something, or Python's argparse has fixed something, because they have divergent behavior. Also, the value I really want contains -, =, :, [space], %, and also either double-quote or single quote, so it is possibly a very hard value to use at all.
    – SerMetAla
    Commented Apr 13, 2018 at 7:15

2 Answers

Different approach:

When you run git log --format="foo bar", those quotes aren't interpreted by git – they're removed by the shell (and protect the quoted text from splitting). This results in a single arg:

  • --format=foo bar

However, when unquoted variables are expanded, the results go through word-splitting, but not through unquoting. So if your variable contains --format="foo bar", it is expanded into these args:

  • --format="foo
  • bar"

This can be verified using:

  • printf '%s\n' $variable

...as well as any simple script which prints its received arguments.

  • #!/usr/bin/env perl
    for $i (0..$#ARGV) {
        print(($i+1)." = ".$ARGV[$i]."\n");
    }
    
  • #!/usr/bin/env python3
    import sys
    for i, arg in enumerate(sys.argv[1:], start=1):
        print(i, "=", arg)
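
The same check can be done without leaving Bash. The following is a minimal sketch; `show_args` is a name made up for this illustration, not a builtin, and `variable` holds the example value discussed above.

```shell
#!/usr/bin/env bash
# show_args is a hypothetical helper: it prints how many arguments it
# received, then each argument on its own line wrapped in < >.
show_args() {
    printf '%d argument(s)\n' "$#"
    local arg
    for arg in "$@"; do
        printf '<%s>\n' "$arg"
    done
}

# Single quotes keep the double quotes as literal characters in the value.
variable='--format="foo bar"'

show_args $variable      # unquoted expansion: the value is word-split
show_args "$variable"    # quoted expansion: the value stays one argument
```

The angle brackets make leading or trailing whitespace and stray quote characters visible, which a plain echo would hide.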
    

If you always have bash available, the preferred workaround is to use array variables:

myvar=( --format="foo bar" )

With this, the usual parsing is done during assignment, not during expansion. You use this syntax to expand the variable's contents, each element getting its own arg:

git log "${myvar[@]}"
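
As a rough sketch of how the array behaves, the snippet below builds a two-option array, inspects it, and appends a third option. The particular options are only examples chosen for illustration.

```shell
#!/usr/bin/env bash
# Quotes are parsed once, at assignment time. Each array element then
# becomes exactly one argument when expanded as "${myvar[@]}".
myvar=( --format="foo bar" --date=format:"%Y-%m-%d T%H" )

# Inspect what a command would receive: one bracketed line per argument.
printf '%d elements\n' "${#myvar[@]}"
printf '<%s>\n' "${myvar[@]}"

# Further options can be appended later without any re-quoting.
myvar+=( --max-count=5 )
```

Running `git log "${myvar[@]}"` afterwards would pass three arguments after `log`, with the spaces inside the first two preserved.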
  • I have accepted this answer, it is much more helpful than the other one. Thank you.
    – SerMetAla
    Commented Apr 13, 2018 at 7:31
  • Please add this at the very top, because it is the crux of the answer: mydate="--date=format:%Y-%m-%d T%H" This doesn't use array variables (which are awesome), and it emphasizes the working solution that changes only one character from the original problem code. Thanks.
    – SerMetAla
    Commented Apr 13, 2018 at 7:33
  • @SerMetAla I'm just curious: what specifically is the problem with simply using the approach I showed, i.e. mydate="--date=format:%Y-%m-%d T%H" and git log "$mydate"?
    – slhck
    Commented Apr 13, 2018 at 7:34
  • @slhck I think the ideal answer will be this answer, that I have accepted, plus that one snippet from your answer, ideally prepended at the top of this answer. I actually will use that one snippet from your answer, thank you. This answer contains a correct explanation of how to see what git-log will actually see, using either printf in Bash itself (I tested) or Python (I tested) or Perl (I did not test, I trust it by extrapolation). This answer also emphasizes what is essentially going on in your useful snippet: Bash is adding the double-quote that it needs, so I shouldn't type it.
    – SerMetAla
    Commented Apr 13, 2018 at 7:46
  • I believe I already answered that in "Use arrays".
    Commented Apr 13, 2018 at 10:50

Why does your original command not work?

bash$ mydate="--date=format:\"%Y-%m-%d T%H\""
bash$ git log "$mydate"    # This works great.
bash$ git log $mydate
fatal: ambiguous argument 'T%H"': unknown revision or path not in the working tree.

You ask:

Why do I need the double-quotes? What exactly is git-log seeing after the variable is dereferenced without double-quotes?

If you don't use double quotes around $mydate, the variable will be expanded verbatim, and the shell line will be the following before being executed:

git log --date=format:"%Y-%m-%d T%H"
                      ^————————————^—————— literal quotes

Here, you (unnecessarily) added literal quotes by using \" in the variable assignment.

Since the command will undergo word splitting, git will receive three arguments: log, --date=format:"%Y-%m-%d, and T%H". It therefore complains that it cannot find any revision or path named T%H".
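
One way to see this split for the exact value from the question is to capture the unquoted expansion in an array, which goes through the same word splitting as a command line. This is only a sketch; the array name `args` is made up for the illustration.

```shell
#!/usr/bin/env bash
mydate="--date=format:\"%Y-%m-%d T%H\""

# The unquoted $mydate is word-split here just as it would be in
# `git log $mydate`; the escaped quotes stay in the value as plain characters.
args=( log $mydate )

printf '%d arguments\n' "${#args[@]}"
printf '<%s>\n' "${args[@]}"
```

The last element is the stray `T%H"` that git reports as an ambiguous argument.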


What is the correct approach?

If you want to keep an argument together even though it contains whitespace, you have to wrap it in quotes. As a general rule, always wrap variable expansions in double quotes.

This works even if there is a space inside the variable:

mydate="--date=format:%Y-%m-%d T%H"
git log "$mydate"

Now the second argument for git will be the value of $mydate, including the space that you originally specified. The double quotes around "$mydate" are stripped by the shell before the argument is passed to git.

You simply don't need the additional quoting. If all you want is for git to see one argument, wrap the expansion in double quotes when passing the variable: "$mydate".
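
A small sketch of the difference, using an argument counter in place of git so it can run anywhere. `count_args` is a hypothetical helper defined only for this example.

```shell
#!/usr/bin/env bash
# The value contains a space but no quote characters of its own.
mydate="--date=format:%Y-%m-%d T%H"

# count_args is a hypothetical stand-in for git: it reports how many
# arguments it was given.
count_args() {
    echo "$#"
}

count_args log "$mydate"   # quoted: log plus one argument
count_args log $mydate     # unquoted: the space splits the value in two
```

Only the quoted form delivers the whole format string, space included, as a single argument.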


Also, you ask:

bash$ nospace="--date=format:\"%Y-%m-%d\""
bash$ git log $nospace        # Now THIS works great.
bash$ git log "$nospace"      # This kind of works, here is a snippet:

# From git-log output:
Date:   "2018-04-12"

Your question:

Yuck, why does the printed output have double-quotes in it now?

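Because you've again included literal quotes in the value (by escaping them in the assignment). Quotes that come out of a variable expansion are never re-parsed by the shell, so they reach git as part of the format string and git prints them. This happens whether or not you quote $nospace: the value contains no whitespace, so both forms pass the same single argument, and the unquoted form only appears to work because nothing gets split. Using unquoted variables in shell commands usually just gets you into trouble, so the fix is to drop the escaped quotes from the assignment and keep the double quotes around the expansion.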
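
A sketch for checking which quote characters actually reach a command: it captures both forms of the expansion in arrays and prints them next to a value assigned without escaped quotes. The names `unquoted`, `quoted`, and `clean` are made up for this illustration.

```shell
#!/usr/bin/env bash
nospace="--date=format:\"%Y-%m-%d\""   # escaped quotes become part of the value
clean="--date=format:%Y-%m-%d"         # no quote characters stored in the value

unquoted=( $nospace )     # what `git log $nospace` would pass
quoted=( "$nospace" )     # what `git log "$nospace"` would pass

# One bracketed line per argument, so stray quote characters are visible.
printf '<%s>\n' "${unquoted[@]}" "${quoted[@]}" "$clean"
```

Comparing the printed lines shows whether the quote characters are part of the data or were consumed by the shell as syntax.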

PS: I know this is all confusing, but that is Bash, and it follows some clear rules. There's no bug here. A related essay about filenames in shell is also very revealing, as it touches upon the issue of whitespace handling in Bash.

  • You said some true things, however, the following command does work and it is what I want: git log --date=format:"%Y-%m-%d T%H" You said it as if it doesn't work, but it does work. Also, there is no other way to make it work.
    – SerMetAla
    Commented Apr 13, 2018 at 7:26
  • Yes, of course that command works when you type it directly. It only doesn't work when you first assign the (date format) argument to a variable containing literal quotes, and then expand that variable.
    – slhck
    Commented Apr 13, 2018 at 7:27
  • @SerMetAla The critical distinction is between syntactic quotes (which go around data, and are what you want) and literal quotes (which are part of the data, and are what you get once you've put the quotes in a variable's value).
    Commented Apr 13, 2018 at 16:29
