1

Is there any way to get the value (Yes or No) if I only have the tag ("answer_yes" or "answer_no")? Both are defined in an XML file, and I want to look them up using bash.

<string tag="answer_yes" value="Yes"/>
<string tag="answer_no" value="No"/>
  • BTW -- I assume there's a larger XML file that this is taken from? These two lines aren't a valid document as given, because they're not under a single root. Commented Jan 24, 2018 at 16:18

4 Answers

3

Use an XML-aware tool. For this simple query, xmllint is enough:

answer=answer_yes
xmllint --xpath "//string[@tag='$answer']/@value" file.xml | cut -f2 -d\"

It seems not to expand entities, though, so if your real strings contain quotes, you'll have to replace &quot;, &amp;, and &lt; by ", &, and <, respectively.
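The manual replacement described above could be sketched as a small filter. This is only an illustration: the function name decode_entities is made up here, it covers just the five predefined XML entities, and it does not handle numeric character references such as &#34;. Note that &amp; is decoded last, so that an input like &amp;lt; ends up as the literal text &lt; rather than <.

```shell
#!/usr/bin/env bash
# decode_entities (hypothetical helper): reads text on stdin and replaces the
# five predefined XML entities with the characters they stand for.
decode_entities() {
    # &amp; must come last; otherwise "&amp;lt;" would be decoded twice.
    # In the last expression, \& is a literal ampersand (a bare & would mean
    # "the matched text" in a sed replacement).
    sed -e 's/&quot;/"/g' \
        -e "s/&apos;/'/g" \
        -e 's/&lt;/</g' \
        -e 's/&gt;/>/g' \
        -e 's/&amp;/\&/g'
}

# Example: prints "Yes" & <No>
printf '%s\n' '&quot;Yes&quot; &amp; &lt;No&gt;' | decode_entities
```

It can be appended to the pipeline above, after the cut step.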

xsh handles the entities for you:

xsh -aC 'open file.xml; $answer={$ARGV[0]};
         echo //string[@tag=$answer]/@value' "$answer"
0
2

To extract the value attribute of all string elements with either tag="answer_yes" or tag="answer_no" in an XML document, XMLStarlet is an appropriate tool:

xmlstarlet sel -t -m '//string[@tag="answer_yes" or @tag="answer_no"]' -v '@value' -n file.xml

This works in situations where naive regex-based approaches don't: it recognizes comments and CDATA sections as such and doesn't try to parse them; it ignores answer_ content that isn't in the tag attribute of a string element; it resolves entities declared in your DTD; it properly changes &amp; to & in output; it's agnostic to whether tag or value comes first in the element; it doesn't care whether the whitespace separating the element name from its attributes is tabs, spaces, or newlines; and so forth.
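To make two of those points concrete, here is a rough sketch. The sample document, its answers root element, and the temporary file are all assumptions made up for the illustration. It contains a commented-out element and a real element whose attributes come in the other order and span two lines. A line-based regex picks up the commented-out value and misses the real one; the XMLStarlet query (run only if the tool is installed) is expected to print just Yes.

```shell
#!/usr/bin/env bash
# Hypothetical sample document, written to a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<?xml version="1.0"?>
<answers>
  <!-- <string tag="answer_yes" value="Old"/> -->
  <string value="Yes"
          tag="answer_yes"/>
  <string tag="answer_no" value="No"/>
</answers>
EOF

# Naive line-based regex: it matches the commented-out element and
# misses the real one, whose attributes are reversed and split across lines.
naive=$(grep -oE 'tag="answer_yes"[[:space:]]+value="[^"]+"' "$sample")
printf 'naive regex found: %s\n' "$naive"

# XML-aware query: skips the comment and ignores attribute order.
if command -v xmlstarlet >/dev/null 2>&1; then
    xmlstarlet sel -t -v '//string[@tag="answer_yes"]/@value' -n "$sample"
fi

rm -f "$sample"
```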

0

With sed, if your Input_file looks exactly like the sample shown, the following may help. It prints the value attribute of the line whose tag is answer_yes:

sed -n 's/.*tag="answer_yes" value="\([^"]*\)".*/\1/p' Input_file
0
regex='tag="answer_yes"[[:space:]]+value="([^"]+)"'

if [[ '<string tag="answer_yes" value="Yes"/>' =~ $regex ]]; then
    echo "${BASH_REMATCH[1]}"
fi

Feel free to expand on the regex for more accurate matching.
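One possible expansion, offered only as a sketch: wrap the match in a function that takes the tag and a file name and scans the file line by line. The name get_value is made up for this example. It assumes that each element sits on a single line with tag before value, and that the tag contains no regex metacharacters. The XML caveats about comments, CDATA, and namespaces still apply.

```shell
#!/usr/bin/env bash
# get_value TAG FILE (hypothetical helper): print the value attribute from the
# first line of FILE that contains tag="TAG" followed by value="...".
# Returns 1 if no line matches.
get_value() {
    local tag=$1 file=$2 line
    # Assumes TAG has no regex metacharacters; it is inserted into the pattern as-is.
    local regex="tag=\"$tag\"[[:space:]]+value=\"([^\"]*)\""
    # The "|| [[ -n $line ]]" part handles a last line without a trailing newline.
    while IFS= read -r line || [[ -n $line ]]; do
        if [[ $line =~ $regex ]]; then
            printf '%s\n' "${BASH_REMATCH[1]}"
            return 0
        fi
    done < "$file"
    return 1
}

# Usage (file.xml is a placeholder):
#   get_value answer_yes file.xml
```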

Sources:

  • Please don't link TLDP's documentation -- half of what we do in the Freenode #bash channel is helping people unlearn bad practices they picked up there. Commented Jan 24, 2018 at 16:15
  • (The bigger objection to this is that a regex can't understand XML syntax -- it won't ignore your string if it's in a comment; it won't understand that things inside CDATA sections are literal; it doesn't understand that an xmlns="http://example.com/" declaration makes something {example.com}string instead of string, etc). Commented Jan 24, 2018 at 16:20
  • (...back on documentation -- the Wooledge BashGuide was written, and is actively maintained, to be a more accuracy- and best-practices-focused alternative). Commented Jan 24, 2018 at 16:29
  • Oh, I didn't realize. Thanks for pointing me towards a better bash guide!
    – hjkatz
    Commented Jan 26, 2018 at 15:20
