I'm on Ubuntu 16.04. Given this test.file content:

Hello \there

why does this (from the command line):

sed 's#\\there#where#' test.file

work, but this:

sed "s#\\there#where#" test.file

does not? Is it a matter of configuration?

The former successfully replaces the pattern, while the latter doesn't seem to find any match.
I need to use a variable within the replacement text in a script, so I (guess I) need double quotes around the sed command.

  • There is no reason for this that I can see. You will need to expand your question to show real file data and search patterns.
    – AFH
    Commented Apr 5, 2018 at 11:36
  • @AFH You're both right, it seems it is a matter of backslashes in the input file -- see my edit.
    – watery
    Commented Apr 5, 2018 at 11:46
  • Yes, single- and double-quotes handle back-slashes differently.
    – AFH
    Commented Apr 5, 2018 at 12:01
  • Plus, sed does its own back-slash interpretation.
    – AFH
    Commented Apr 5, 2018 at 12:09
  • "need double quotes around the sed command" -- Your shell should concatenate quoted strings. Consider this trick: sed 's#\\there#'"$foo"'#' test.file
    Commented Apr 5, 2018 at 12:52
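The concatenation trick from the last comment can be sketched as follows. This is only an illustration: the temporary file stands in for test.file, and the value of foo is a hypothetical replacement chosen for the example.

```shell
# Recreate the question's test file in a temporary location (assumption: mktemp is available).
tmp=$(mktemp)
printf '%s\n' 'Hello \there' > "$tmp"

foo=where   # hypothetical replacement text held in a variable

# The pattern stays in single quotes, so both backslashes reach sed untouched;
# only the variable is placed in double quotes. The shell joins the three pieces
# into a single argument.
result=$(sed 's#\\there#'"$foo"'#' "$tmp")
printf '%s\n' "$result"

rm -f "$tmp"
```

Note that sed still treats #, & and back-slashes inside the value of foo specially, so this is only safe when the variable holds plain text.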

1 Answer

In bash and other shells, the back-slash character is handled differently inside single quotes than inside double quotes.

When you type sed 's#\\there#where#' test.file, sed receives the script s#\\there#where# unchanged, because single quotes suppress all special-character and escape-sequence interpretation: even \' cannot be used to embed a single quote.

When you type sed "s#\\there#where#" test.file, sed receives the script s#\there#where#, because double quotes allow some escape sequences, and the shell has interpreted the first back-slash as escaping the second.
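One way to check this for yourself is to print the argument instead of passing it to sed. A minimal sketch; the variable names single and double are only for illustration:

```shell
# What the shell hands to sed as its first argument in each case.
single='s#\\there#where#'   # single quotes: both backslashes survive
double="s#\\there#where#"   # double quotes: \\ collapses to one backslash

# printf with %s prints the argument without interpreting it further.
printf 'single-quoted: %s\n' "$single"   # s#\\there#where#
printf 'double-quoted: %s\n' "$double"   # s#\there#where#
```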

A further complication is that sed (GNU sed, as shipped with Ubuntu) performs its own escape-sequence interpretation on the script it receives. In the first case (single-quoted), sed reads \\ as an escaped, literal back-slash, so the search pattern matches \there, as you want. In the second case (double-quoted), sed reads \t as a Tab, so the search pattern becomes a Tab followed by here, which does not occur in the file.
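The two cases can be reproduced side by side. This is a sketch that assumes GNU sed and uses a temporary file in place of test.file:

```shell
# Recreate the question's test file in a temporary location.
tmp=$(mktemp)
printf '%s\n' 'Hello \there' > "$tmp"

# Single quotes: sed receives \\there, i.e. an escaped (literal) backslash
# followed by "there", which matches the text in the file.
single=$(sed 's#\\there#where#' "$tmp")

# Double quotes: the shell reduces \\ to \, so sed receives \there.
# GNU sed reads \t as a Tab, so the pattern does not match the line.
double=$(sed "s#\\there#where#" "$tmp")

printf 'single: %s\n' "$single"   # Hello where
printf 'double: %s\n' "$double"   # line unchanged under GNU sed

rm -f "$tmp"
```

Other sed implementations may treat \t differently, but in no case does the double-quoted form ask sed for a literal back-slash.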

The following extract from the bash manual describes this behaviour:

   There are three quoting mechanisms: the escape character, single quotes, and double quotes.

   A non-quoted backslash (\) is the escape character. It preserves the literal value of the next character that
   follows, with the exception of <newline>. If a \<newline> pair appears, and the backslash is not itself
   quoted, the \<newline> is treated as a line continuation (that is, it is removed from the input stream and
   effectively ignored).

   Enclosing characters in single quotes preserves the literal value of each character within the quotes. A
   single quote may not occur between single quotes, even when preceded by a backslash.

   Enclosing characters in double quotes preserves the literal value of all characters within the quotes, with
   the exception of $, `, \, and, when history expansion is enabled, !. The characters $ and ` retain their
   special meaning within double quotes. The backslash retains its special meaning only when followed by one of
   the following characters: $, `, ", \, or <newline>. A double quote may be quoted within double quotes by
   preceding it with a backslash. If enabled, history expansion will be performed unless an ! appearing in
   double quotes is escaped using a backslash. The backslash preceding the ! is not removed.

   The special parameters * and @ have special meaning when in double quotes (see PARAMETERS below).

   Words of the form $'string' are treated specially. The word expands to string, with backslash-escaped
   characters replaced as specified by the ANSI C standard. Backslash escape sequences, if present, are decoded
   as follows:
          \a     alert (bell)
          \b     backspace
          \e
          \E     an escape character
          \f     form feed
          \n     new line
          \r     carriage return
          \t     horizontal tab
          \v     vertical tab
          \\     backslash
          \'     single quote
          \"     double quote
          \nnn   the eight-bit character whose value is the octal value nnn (one to three digits)
          \xHH   the eight-bit character whose value is the hexadecimal value HH (one or two hex digits)
          \uHHHH the Unicode (ISO/IEC 10646) character whose value is the hexadecimal value HHHH (one to four hex
                 digits)
          \UHHHHHHHH
                 the Unicode (ISO/IEC 10646) character whose value is the hexadecimal value HHHHHHHH (one to
                 eight hex digits)
          \cx    a control-x character

   The expanded result is single-quoted, as if the dollar sign had not been present.

   A double-quoted string preceded by a dollar sign ($"string") will cause the string to be translated according
   to the current locale. If the current locale is C or POSIX, the dollar sign is ignored. If the string is
   translated and replaced, the replacement is double-quoted.
  • @KamilMaciorowski - You are quite correct. I had made a couple of references to the shell interpretation of Tab, which I found untrue when I checked, but I mistakenly deleted only one of them. Answer updated. Thanks.
    – AFH
    Commented Apr 5, 2018 at 18:18
  • Way to go. +1 then.
    Commented Apr 5, 2018 at 18:46
  • Thank you. I guess I should have mentioned that I'm trying to match the text 127\.\d+\.\d+\.\d+|::1|0:0:0:0:0:0:0:1.
    – watery
    Commented Apr 6, 2018 at 7:34
  • @watery - If you use single-quotes, you will have no problems with this string; if you use double-quotes, you need to double the back-slashes, just as the sample string in your question needs to be written as sed "s#\\\\there#where#" test.file with double-quotes. I have had instances where I've needed to use eight back-slashes for each one in the string I am trying to define!
    – AFH
    Commented Apr 6, 2018 at 10:12
  • @watery - Note that the | will need to be escaped to specify an alternative match string if you are using basic regular expressions (no -r in the sed run string); otherwise, it will be treated as a literal. The reverse is true with -r (extended REs): | specifies alternative strings, while \| matches a literal |.
    – AFH
    Commented Apr 6, 2018 at 10:44
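The last two comments can be illustrated with a short sketch. It assumes GNU sed; the variable foo, the LOCAL marker and the sample addresses are made up for the example, and [0-9] is used for digits. The -E option is used here for extended REs, which GNU sed treats the same as -r.

```shell
# Recreate the question's test file in a temporary location.
tmp=$(mktemp)
printf '%s\n' 'Hello \there' > "$tmp"
foo=where   # hypothetical replacement text

# Inside double quotes, four backslashes reach sed as two,
# which sed then reads as one literal backslash.
doubled=$(sed "s#\\\\there#${foo}#" "$tmp")
printf '%s\n' "$doubled"
rm -f "$tmp"

# Alternation: \| in basic REs (a GNU extension), plain | with -E (extended REs).
addrs=$(printf '%s\n' '127.0.0.1' '::1' '10.0.0.1')
bre=$(printf '%s\n' "$addrs" | sed 's#^\(127\.[0-9]\+\.[0-9]\+\.[0-9]\+\|::1\)$#LOCAL#')
ere=$(printf '%s\n' "$addrs" | sed -E 's#^(127\.[0-9]+\.[0-9]+\.[0-9]+|::1)$#LOCAL#')
printf '%s\n' "$bre"
printf '%s\n' "$ere"
```

Both sed invocations mark the first two addresses as LOCAL and leave 10.0.0.1 alone. Because the patterns are single-quoted, no back-slash needs doubling for the shell.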
