
I use the following command to capture the text between the <Name> tags in .xml files. I also pipe the result through xargs to remove any surrounding spaces.

$> var=` find /tmp -name '*.xml' -exec sed -n 's/<Name>\([^<]*\)<\/Name>/\1/p' {} +  |  xargs `
$> echo $var
TOPIC
$>

So far it seems to be OK, but printf shows something else:

$> printf "%q\n" "$var"
$'TOPIC\r'
$>

Let's drill down:

$> [[ TOPIC == $var ]] && echo they are equal
$>

The message "they are equal" is never printed.

But when we echo $var we get:

$> echo $var
TOPIC
$>

The BIG BIG question is: how do I remove the extra characters ($, \r) from the variable?

$'TOPIC\r'

1 Answer


Neither the $ nor the literal \r is in the variable. They appear in the output because you told printf to format it that way with %q. The real extra character is a carriage return (CR), code 0x0D, whose escape sequence is \r.
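To see the stray character without relying on %q, here is a minimal sketch. The value assigned to var is a hypothetical stand-in for what your pipeline captured, built with printf so the snippet does not depend on any .xml file.

```shell
# Hypothetical stand-in for the captured value: the word plus a trailing CR.
# Command substitution strips trailing newlines only, so the \r survives.
var=$(printf 'TOPIC\r')

# The length is one more than the five visible letters.
echo "${#var}"

# od -c dumps the value character by character; the CR is shown as \r.
printf '%s' "$var" | od -c
```

The terminal hides the carriage return when you echo the variable, which is why the value looks correct on screen while the comparison fails.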

The root of your problem is that your .xml files seem to use CR+LF line endings from the DOS/Windows world. See this comparison on Wikipedia.

The document Extensible Markup Language (XML) 1.0 (Fifth Edition) says:

To simplify the tasks of applications, the XML processor must behave as if it normalized all line breaks in external parsed entities (including the document entity) on input, before parsing, by translating both the two-character sequence #xD #xA and any #xD that is not followed by #xA to a single #xA character.

Here #xD denotes CR, #xA denotes LF.
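The normalization rule quoted above could be approximated in the shell roughly like this. It is only a sketch: normalize_eol and the cr variable are names I made up for illustration, and a real XML processor does this internally.

```shell
# Hypothetical helper: normalize line breaks on stdin the way the quoted rule describes.
normalize_eol() {
    cr=$(printf '\r')                    # a literal carriage return (#xD)
    # Step 1: drop a CR that directly precedes a line feed (#xD #xA -> #xA).
    # Step 2: turn any remaining lone CR into a line feed (#xD -> #xA).
    sed "s/${cr}\$//" | tr '\r' '\n'
}

# Usage: mixed line endings in, LF-only lines out.
printf 'a\r\nb\rc\n' | normalize_eol
```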

In your case the whole find … | xargs statement is your XML processor (let's put problems like this aside). If you want to fully comply with the specification, you should pass every .xml file through dos2unix first.

But since the real problem is with the content of the variable, this may be enough in your case:

var=`find … | dos2unix | xargs`

If you don't have dos2unix, tr -d '\r' would work as a replacement in this context (thank you @GordonDavisson for pointing this out).
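A small sketch of the tr variant follows, with printf standing in for the output of your find pipeline (an assumption, so that it runs without any .xml file). The second option, stripping one trailing CR with parameter expansion, is not part of the answer above; it is an alternative for a variable that has already been assigned, and raw, fixed and cr are illustrative names.

```shell
# Option 1: delete every carriage return in the stream before xargs sees it.
var=$(printf 'TOPIC\r\n' | tr -d '\r' | xargs)

# Option 2 (alternative): strip a single trailing CR from an existing variable.
raw=$(printf 'TOPIC\r')
cr=$(printf '\r')
fixed=${raw%"$cr"}

# Both values now compare equal to the plain word.
[ "$var" = TOPIC ] && echo "they are equal"
[ "$fixed" = TOPIC ] && echo "they are equal"
```

Option 1 fits best when the carriage returns can appear anywhere in the input; option 2 only touches the end of the value.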

  • If you don't have dos2unix, tr -d '\r' would work as a replacement in this context. Commented Jan 17, 2018 at 19:53
