I need to select a particular set of tags that contain a particular value. For example, below is the source.XML file:

<main tag>
<subTag1>1298</subTag1>
<subTag2>fg</subTag2>
<subTag3>34</subTag3>
</main tag>

<main tag>
<subTag1>1299</subTag1>
<subTag2>cfinfo</subTag2>
<subTag3>43</subTag3>
</main tag>

<main tag>
<subTag1>1300</subTag1>
<subTag2>BBcycle</subTag2>
<subTag3>55</subTag3>
</main tag>

I need to copy the entire contents of every main tag whose subTag1 value is 1300 into another XML file. The expected output (Result.XML) when the subTag1 value is given as 1300 is below.

<main tag>
<subTag1>1300</subTag1>
<subTag2>BBcycle</subTag2>
<subTag3>55</subTag3>
</main tag>

Likewise, I need to select the main tag elements that match a set of subTag1 values. The expected output (Result.XML) when the subTag1 values are given as 1299 and 1300 is below.

<main tag>
<subTag1>1299</subTag1>
<subTag2>cfinfo</subTag2>
<subTag3>43</subTag3>
</main tag>

<main tag>
<subTag1>1300</subTag1>
<subTag2>BBcycle</subTag2>
<subTag3>55</subTag3>
</main tag>

PS: There are no line breaks between tags in the real file; I added them here for readability. In the real scenario there are many main tags, and I have a set of subTag1 values for which I need to fetch the corresponding main tag blocks, as in the example above, into a single resulting XML file. Ideally, the script user could supply the set of subTag1 values to search for in SOURCE.XML.

I thought of using grep, but it won't help in selecting a set of tags. I need to do this with UNIX shell scripting.

2 Answers

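You need an XML parsing tool. xmlstarlet is my favourite. After fixing up your invalid XML (renaming "main tag" to main_tag, since element names cannot contain spaces, and wrapping everything in a single root_tag element), we can delete every main_tag that does not match and keep the rest: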

$ xmlstarlet ed -d '//main_tag[subTag1 != 1300]' file.xml
<?xml version="1.0"?>
<root_tag>
  <main_tag>
    <subTag1>1300</subTag1>
    <subTag2>BBcycle</subTag2>
    <subTag3>55</subTag3>
  </main_tag>
</root_tag>

and

$ xmlstarlet ed -d '//main_tag[subTag1 != 1300 and subTag1 != 1299]' file.xml
<?xml version="1.0"?>
<root_tag>
  <main_tag>
    <subTag1>1299</subTag1>
    <subTag2>cfinfo</subTag2>
    <subTag3>43</subTag3>
  </main_tag>
  <main_tag>
    <subTag1>1300</subTag1>
    <subTag2>BBcycle</subTag2>
    <subTag3>55</subTag3>
  </main_tag>
</root_tag>

I find this page a helpful tutorial for xpath.
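
Since the question asks for a user-supplied set of values, here is a minimal sketch of how the XPath used above could be assembled from command-line arguments. The function name build_delete_xpath is hypothetical, and the sketch assumes the values are plain numbers (strings would need quoting inside the expression).

```shell
# Hypothetical helper: print an XPath matching every main_tag whose
# subTag1 is NOT one of the given values (i.e. the nodes to delete).
# Assumes numeric values; string values would need XPath quoting.
build_delete_xpath() {
    if [ "$#" -eq 0 ]; then
        echo "usage: build_delete_xpath VALUE..." >&2
        return 1
    fi
    cond=''
    for id in "$@"; do
        # Join the individual tests with " and "
        cond="${cond:+$cond and }subTag1 != $id"
    done
    printf '//main_tag[%s]\n' "$cond"
}
```

It could then be used as: xmlstarlet ed -d "$(build_delete_xpath 1299 1300)" file.xml > Result.xml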

  • Yes, this can be used, but it requires installation, which I don't have permission to do, nor can I ask for permission. Commented Jul 9, 2015 at 4:48

I would go with

grep -A 3 -B 1 '<subTag1>1300</subTag1>' infile.xml > outfile.xml

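-A 3 prints the three lines after each match; -B 1 prints the one line before it. This relies on every tag being on its own line and on each block having exactly this layout.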

which outputs nicely

<main tag>
<subTag1>1300</subTag1>
<subTag2>BBcycle</subTag2>
<subTag3>55</subTag3>
</main tag>
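
For input with no line breaks between tags, as the question describes, and for several subTag1 values at once, something along these lines might work with only POSIX awk. It is a rough sketch, not a real XML parser: the function name extract_blocks is hypothetical, and it assumes the blocks are delimited by the literal strings <main tag> and </main tag>, are not nested, and that the whole file fits in memory.

```shell
# Hypothetical helper: extract_blocks FILE VALUE...
# Prints each <main tag>...</main tag> block whose subTag1 equals one of
# the given values, one block per output line.
extract_blocks() {
    src=$1
    shift
    awk -v ids="$*" '
        BEGIN {
            # Turn the space-separated value list into a lookup table
            n = split(ids, list, " ")
            for (i = 1; i <= n; i++) want[list[i]] = 1
            open_tag = "<main tag>"
            close_tag = "</main tag>"
        }
        { buf = buf $0 }    # join all input lines into one string
        END {
            # Cut off one block at a time, up to each closing tag
            while ((e = index(buf, close_tag)) > 0) {
                block = substr(buf, 1, e + length(close_tag) - 1)
                buf = substr(buf, e + length(close_tag))
                s = index(block, open_tag)
                if (s == 0) continue
                block = substr(block, s)
                if (match(block, /<subTag1>[^<]*<\/subTag1>/)) {
                    # Strip the 9-char opening and 10-char closing tag
                    id = substr(block, RSTART + 9, RLENGTH - 19)
                    if (id in want) print block
                }
            }
        }
    ' "$src"
}
```

Example: extract_blocks SOURCE.XML 1299 1300 > Result.XML. The values are compared as exact strings, so 130 does not match 1300. The output is not wrapped in a root element.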
