I have looked around a lot on here but cannot find a solution to my problem, and I am a noob when it comes to regular expressions.

I am trying to change an XML file with tags that look like this:

<key>Date Modified</key><date>2014-09-09T16:18:44Z</date>
<key>Date Added</key><date>2014-09-09T18:06:23Z</date>

To tags that look like this:

<key>Date Modified</key><date>2014-09-09T16:18:44Z</date>
<key>Date Added</key><date>2014-09-??T18:06:23Z</date>

Basically, I want to change the Date Added field to 2014-09-?? for any entry whose Date Modified field matches

<key>Date Modified</key><date>2014-09-09T16:18:44Z</date>

However, the time portion (e.g. "T16:18:44Z") is always different; only the date is the same, i.e.:

<key>Date Modified</key><date>2014-09-09..........</date>

1 Answer

Regular expressions are the wrong tool for the job. See this famous diatribe on the subject: https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454

The right tool for the job is the XML transformation language XSLT. XSLT 2.0 can use regular expressions to manipulate the content of individual nodes, but it uses a proper XML parser to understand the markup. Here's a transformation rule you can include in your XSLT for this task:

<xsl:template match="date
                        [preceding-sibling::key[1]='Date Added']
                        [preceding-sibling::key[2]='Date Modified']
                        [starts-with(preceding-sibling::date[1],'2014-09-09')]">
  <date>
    <xsl:value-of select="concat(substring(., 1, 8), '??', substring(., 11))"/>
  </date>
</xsl:template> 

(The match pattern is so complex because the XML is so badly structured: there's no wrapper element that connects a key to its date, and none that connects the two key elements.)
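
Since a template rule on its own is not a runnable stylesheet, here is a minimal sketch of how it might be wrapped. It assumes that everything else in the file should be copied through unchanged, which is what the identity template does, and that the key and date elements are siblings in the order shown in the question. The rule only uses concat, substring and starts-with, so this sketch is declared as XSLT 1.0 and does not need a 2.0 processor.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copies every node and attribute unchanged.
       Assumption: nothing else in the file needs to change. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- More specific rule, so it wins over the identity template:
       a date whose nearest preceding key is 'Date Added', whose
       second-nearest preceding key is 'Date Modified', and whose
       nearest preceding date starts with 2014-09-09. -->
  <xsl:template match="date
                        [preceding-sibling::key[1]='Date Added']
                        [preceding-sibling::key[2]='Date Modified']
                        [starts-with(preceding-sibling::date[1],'2014-09-09')]">
    <date>
      <!-- Keep 'YYYY-MM-' (characters 1 to 8), replace the two-digit day
           with '??', then keep everything from character 11 (the 'T') on. -->
      <xsl:value-of select="concat(substring(., 1, 8), '??', substring(., 11))"/>
    </date>
  </xsl:template>

</xsl:stylesheet>
```

With an XSLT 1.0 processor such as xsltproc, this could be run as xsltproc fix-dates.xsl library.xml > library-fixed.xml, where all three file names are placeholders for your own.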
