Is there a way to get the value (Yes or No) if I only have the tag ("answer_yes" or "answer_no")? Both are described in an XML file, and I want to get them using bash.
<string tag="answer_yes" value="Yes"/>
<string tag="answer_no" value="No"/>
Use an XML-aware tool. For this simple query, xmllint is enough:
answer=answer_yes
xmllint --xpath "//string[@tag='$answer']/@value" file.xml | cut -f2 -d\"
It seems not to expand entities, though, so if your real strings contain quotes, you'll have to replace &quot;, &amp;, and &lt; by ", &, and <, respectively.
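If you go the manual route, the replacement could be done with a small sed filter along these lines. This is only a sketch: `decode_entities` is a made-up helper name, it covers just the five predefined XML entities (`&quot;`, `&apos;`, `&lt;`, `&gt;`, `&amp;`), and it does not touch numeric character references or entities defined in a DTD.

```bash
# Hypothetical helper (not part of xmllint): decode the five predefined
# XML entities in whatever arrives on stdin.
decode_entities() {
  sed -e 's/&quot;/"/g' \
      -e "s/&apos;/'/g" \
      -e 's/&lt;/</g' \
      -e 's/&gt;/>/g' \
      -e 's/&amp;/\&/g'   # &amp; goes last so "&amp;lt;" ends up as "&lt;", not "<"
}

# Intended use, assuming the xmllint pipeline shown above:
#   xmllint --xpath "//string[@tag='$answer']/@value" file.xml | cut -f2 -d\" | decode_entities
```

Note that `cut -f2 -d\"` in the pipeline still splits on literal double quotes only, so the decoding has to happen after it.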
xsh handles the entities for you:
xsh -aC 'open file.xml; $answer={$ARGV[0]};
echo //string[@tag=$answer]/@value' "$answer"
To extract the value attribute of all string elements with either tag="answer_yes" or tag="answer_no" in an XML document, XMLStarlet is an appropriate tool:
xmlstarlet sel -t -m '//string[@tag="answer_yes" or @tag="answer_no"]' -v '@value' -n
This will work in situations where naive regex-based approaches won't: It will recognize comments and CDATA as such and avoid trying to parse them; it will ignore answer_ content that isn't inside a string or a tag; it will recognize aliases brought in through your DTD; it will properly change &amp; to & in output; it's agnostic to whether the tag or the value is given first in the element; it doesn't care about whether the whitespace separating the element from its attributes is tabs/spaces/newlines/etc; and so forth.
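A few of those points can be tried out with a small sample file. The sketch below assumes xmlstarlet is installed; `get_value` is a hypothetical wrapper, the sample document is made up for illustration, and the tag is interpolated straight into the XPath, so it must not contain a single quote.

```bash
# Hypothetical wrapper: print the value attribute of the <string> element
# whose tag attribute equals $1, reading the XML file named in $2.
get_value() {
  local tag=$1 file=$2
  xmlstarlet sel -t -m "//string[@tag='$tag']" -v '@value' -n "$file"
}

# Made-up sample: a commented-out decoy, attributes in reverse order,
# and a newline between attributes.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<strings>
  <!-- <string tag="answer_yes" value="Decoy"/> -->
  <string value="Yes" tag="answer_yes"/>
  <string tag="answer_no"
          value="No"/>
</strings>
EOF

get_value answer_yes "$sample"   # the commented-out decoy is not matched
get_value answer_no "$sample"    # the newline between attributes is fine
```

A line-oriented regex would need extra work for each of these cases, whereas the XPath query stays the same.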
With sed, if your Input_file looks exactly like the sample shown, the following may help. Note that it prints the suffix of the tag (yes or no), not the contents of the value attribute:
sed 's/.*answer_//;s/".*//' Input_file
regex='tag="answer_yes"[[:space:]]+value="([^"]+)"'
if [[ '<string tag="answer_yes" value="Yes"/>' =~ $regex ]] ; then
echo "${BASH_REMATCH[1]}" ;
fi
Feel free to expand on the regex for more accurate matching.
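One way to expand on it is to wrap the match in a function that takes the tag and a file name. This is a sketch under the same assumptions as the regex above: the tag attribute comes immediately before value on the same line, and the tag contains no regex metacharacters. `lookup_value` is a hypothetical name, and none of the comment, CDATA, or namespace cases are handled.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the value for tag $1 found in file $2.
# Returns 1 if no line matches.
lookup_value() {
  local tag=$1 file=$2 line
  local regex="tag=\"$tag\"[[:space:]]+value=\"([^\"]*)\""
  # "|| [[ -n $line ]]" also processes a final line without a trailing newline.
  while IFS= read -r line || [[ -n $line ]]; do
    if [[ $line =~ $regex ]]; then
      printf '%s\n' "${BASH_REMATCH[1]}"
      return 0
    fi
  done < "$file"
  return 1
}
```

It could then be called as `lookup_value answer_yes file.xml`, with the exit status telling you whether the tag was found.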
Sources:
string if it's in a comment; it won't understand that things inside CDATA sections are literal; it doesn't understand that an xmlns="http://example.com/" declaration makes something {example.com}string instead of string, etc).
Commented
Jan 24, 2018 at 16:20