I'm new to XSLT. I wonder if it is possible to select a substring of an item. I'm trying to parse an RSS feed. The description value has more text than I want to show. I'd like to get a substring of it based on the index of some substring. Basically, I want to show the result of a substring call, passing indexOf('some_substring') and a length as parameters. Is this possible?

From comments:

I want to select the text of a string that is located after the occurrence of a substring

  • Your question isn't clear. Do you want a substring of some length beginning from some matched substring?
    – user357812
    Commented Dec 1, 2010 at 17:58
  • I want to select the text of a string that is located after the occurrence of a substring. Commented Dec 1, 2010 at 18:27

6 Answers


It's not clear exactly what you want to do with the index of a substring [update: it is clearer now - thanks] but you may be able to use the function substring-after or substring-before:

substring-before('My name is Fred', 'Fred')

returns 'My name is '.
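For the case in the question (the text after a marker), substring-after is the mirror image. A minimal sketch as a complete XSLT 1.0 stylesheet; the literal string and the search term 'name' are only for illustration, and it can be run against any XML input:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <!-- Text before the first occurrence of 'name': 'My ' -->
    <xsl:value-of select="substring-before('My name is Fred', 'name')"/>
    <xsl:text>|</xsl:text>
    <!-- Text after the first occurrence of 'name': ' is Fred' -->
    <xsl:value-of select="substring-after('My name is Fred', 'name')"/>
  </xsl:template>
</xsl:stylesheet>
```

This should print "My | is Fred". Both functions return the empty string when the second argument does not occur in the first.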

If you need more detailed control, the substring() function can take two or three arguments: string, starting-index, length. Omit length to get the whole rest of the string.

There is no index-of() function for strings in XPath (only for sequences, in XPath 2.0). You can use string-length(substring-before($string, $substring))+1 if you specifically need the position.
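A small sketch of substring() and the position formula together, assuming XSLT 1.0 and the same illustrative string as above. Note that the formula yields 1 when the substring is absent, because substring-before then returns the empty string, so it is only meaningful when a match is known to exist:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:variable name="string" select="'My name is Fred'"/>
  <xsl:variable name="substring" select="'Fred'"/>
  <xsl:template match="/">
    <!-- 1-based position of the match: 11 characters precede 'Fred', so 12 -->
    <xsl:variable name="pos"
                  select="string-length(substring-before($string, $substring)) + 1"/>
    <xsl:value-of select="$pos"/>
    <xsl:text>|</xsl:text>
    <!-- Three-argument form: 4 characters starting at position 4 -->
    <xsl:value-of select="substring($string, 4, 4)"/>
    <xsl:text>|</xsl:text>
    <!-- Two-argument form: everything from $pos to the end -->
    <xsl:value-of select="substring($string, $pos)"/>
  </xsl:template>
</xsl:stylesheet>
```

This should print "12|name|Fred". Positions in XPath are 1-based, not 0-based.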

There is also contains($string, $substring). These are all documented here. In XPath 2.0 you can use regular expression matching.
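Since substring-after returns the empty string when the marker is missing, contains() is useful as a guard. A hedged XSLT 1.0 sketch, assuming an RSS 2.0 layout (rss/channel/item/description) and the 'Reusable HTML:' marker that the asker mentions in the comments; falling back to the full description is just one possible choice:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <xsl:for-each select="rss/channel/item">
      <xsl:choose>
        <!-- Marker present: keep only the text that follows it -->
        <xsl:when test="contains(description, 'Reusable HTML:')">
          <xsl:value-of select="substring-after(description, 'Reusable HTML:')"/>
        </xsl:when>
        <!-- Marker absent: fall back to the whole description -->
        <xsl:otherwise>
          <xsl:value-of select="description"/>
        </xsl:otherwise>
      </xsl:choose>
      <!-- One item per line -->
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```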

(XSLT mostly uses XPath for selecting nodes and processing values, so this is actually more of an XPath question. I tagged it thus.)

  • substring-after looks like the one I need. I just need to figure out how to use it. I tried "<span><xsl:value-of select="substring-after(rss/channel/item/description,"Reusable HTML:") disable-output-escaping="yes"/></span>". Doesn't work. Commented Dec 1, 2010 at 18:26
  • Got it. <span><xsl:value-of select="substring-after(description,'Reusable HTML:')" disable-output-escaping="yes"/></span> Commented Dec 1, 2010 at 18:38
  • @tou: why are you using disable-output-escaping? It might be appropriate, but more often it's a sign of using a chisel as a screwdriver, or some such. If a character is going to be escaped in the output, it's probably because it needs to be escaped. Unless you're trying to migrate markup from one level to another (e.g. trying to parse a raw string of HTML into a DOM).
    – LarsH
    Commented Dec 1, 2010 at 20:16
  • @LarsH: There is an index-of() function in XPath 2, only it operates on sequences. One can generalize contains() with index-of() and subsequence(). Commented Dec 1, 2010 at 20:47
  • @Sprotty: your point is valid. And yes, Alejandro's (user357812's) post covers this aspect of the situation. Whether it's actually a bug or just a limitation depends on the range of possible inputs the user may have.
    – LarsH
    Commented Jul 14, 2014 at 0:48

Here are some one-liner XPath 1.0 expressions for IndexOf($text, $searchString):

If you need the position of the FIRST character of the sought string, or 0 if it is not present:

contains($text,$searchString)*(1 + string-length(substring-before($text,$searchString)))

If you need the position of the first character AFTER the found string, or 0 if it is not present:

contains($text,$searchString)*(1 + string-length(substring-before($text,$searchString)) + string-length($searchString))

Alternatively if you need the position of the first character AFTER the found string, or length+1 if it is not present:

1 + string-length($text) - string-length(substring-after($text,$searchString))

That should cover most cases that you need.

Note: The multiplication by contains( ... ) causes the true or false result of the contains( ... ) function to be converted to 1 or 0, which elegantly provides the "0 when not found" part of the logic.
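To make the three expressions concrete, here is a minimal XSLT 1.0 sketch that evaluates them, using $text for the input string throughout. The sample values are illustrative only. The boolean-times-number trick relies on XPath 1.0 conversion rules; an XPath 2.0 processor outside backwards-compatible mode may need number(contains(...)) instead.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:variable name="text" select="'My name is Fred'"/>
  <xsl:variable name="searchString" select="'name'"/>
  <xsl:template match="/">
    <!-- Position of the first character of the match (0 if absent) -->
    <xsl:value-of select="contains($text,$searchString)*(1 + string-length(substring-before($text,$searchString)))"/>
    <xsl:text>|</xsl:text>
    <!-- Position just after the match (0 if absent) -->
    <xsl:value-of select="contains($text,$searchString)*(1 + string-length(substring-before($text,$searchString)) + string-length($searchString))"/>
    <xsl:text>|</xsl:text>
    <!-- Position just after the match (string-length($text) + 1 if absent) -->
    <xsl:value-of select="1 + string-length($text) - string-length(substring-after($text,$searchString))"/>
  </xsl:template>
</xsl:stylesheet>
```

With these values the output should be "4|8|8". Changing $searchString to a string that does not occur, such as 'xyz', should give "0|0|16".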

  • I believe there is an error in the implementation of your second formula. It should read: contains($text,$searchString)*(1 + string-length(substring-before($text,$searchString)) + string-length($searchString)) (the last parenthesis is in the wrong place). But thanks anyway, this really helped me! Commented Dec 9, 2017 at 13:40
  • Thank you @silentsurfer - well spotted!
    – MrWatson
    Commented Dec 11, 2017 at 16:29

I want to select the text of a string that is located after the occurrence of a substring

You could use:

substring-after($string,$match)

If you want a substring of the above with some length, then use:

substring(substring-after($string,$match),1,$length)

But problems begin if there is no occurrence of the matching substring. So, if you want a substring with a specific length located after the occurrence of a substring, or taken from the whole string if there is no match, you could use:

substring(substring-after($string,substring-before($string,$match)),
          string-length($match) * contains($string,$match) + 1,
          $length) 
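As a worked check of that last expression, here is a minimal XSLT 1.0 sketch that wraps it in a named template and calls it once with a match and once without. The template name take-after and the sample values are hypothetical, chosen only for illustration:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical wrapper around the expression above -->
  <xsl:template name="take-after">
    <xsl:param name="string"/>
    <xsl:param name="match"/>
    <xsl:param name="length"/>
    <xsl:text>[</xsl:text>
    <xsl:value-of select="substring(substring-after($string,substring-before($string,$match)),
                                    string-length($match) * contains($string,$match) + 1,
                                    $length)"/>
    <xsl:text>]</xsl:text>
  </xsl:template>

  <xsl:template match="/">
    <!-- 'name' occurs: the 3 characters after it -->
    <xsl:call-template name="take-after">
      <xsl:with-param name="string" select="'My name is Fred'"/>
      <xsl:with-param name="match" select="'name'"/>
      <xsl:with-param name="length" select="3"/>
    </xsl:call-template>
    <!-- 'xyz' does not occur: the first 3 characters of the whole string -->
    <xsl:call-template name="take-after">
      <xsl:with-param name="string" select="'My name is Fred'"/>
      <xsl:with-param name="match" select="'xyz'"/>
      <xsl:with-param name="length" select="3"/>
    </xsl:call-template>
  </xsl:template>
</xsl:stylesheet>
```

This should print "[ is][My ]": the match case starts right after 'name', and the no-match case starts at position 1 because contains() contributes 0 to the start index.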

The following is a complete example containing both XML and XSLT, in which substring-before and substring-after are used.

<?xml version="1.0" encoding="UTF-8"?>
<persons name="Group_SOEM">
    <person>
        <first>Joe Smith</first>
        <last>Joe Smith</last>
        <address>123 Main St, Anycity</address>
    </person>
</persons>    

The following XSLT changes the values of the first and last elements by splitting them at the space, so that after the transformation the first element has the value "Joe" and the last element the value "Smith".

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="first">
    <first>
        <xsl:value-of select="substring-before(.,' ')" />
    </first>
</xsl:template> 
<xsl:template match="last">
    <last>
        <xsl:value-of select="substring-after(.,' ')" />
    </last>
</xsl:template> 
<xsl:template match="@* | node()">
    <xsl:copy>
        <xsl:apply-templates select="@* | node()" />
    </xsl:copy>
</xsl:template>
</xsl:stylesheet>   
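For reference, applying this stylesheet to the XML above should produce roughly the following. This assumes a processor that keeps the whitespace-only text nodes of the input; the XML declaration and exact indentation may differ between processors:

```xml
<persons name="Group_SOEM">
    <person>
        <first>Joe</first>
        <last>Smith</last>
        <address>123 Main St, Anycity</address>
    </person>
</persons>
```

The third template is the usual identity transform, which is why the name attribute and the address element are copied through unchanged.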

There is a substring function in XSLT. Example here.


I wrote my own index-of function, inspired by strpos() in PHP.

<xsl:function name="fn:strpos">
    <xsl:param name="haystack"/>
    <xsl:param name="needle"/>
    <xsl:value-of select="fn:_strpos($haystack, $needle, 1, string-length($haystack) - string-length($needle))"/>
</xsl:function>

<xsl:function name="fn:_strpos">
    <xsl:param name="haystack"/>
    <xsl:param name="needle"/>
    <xsl:param name="pos"/>
    <xsl:param name="count"/>
    <xsl:choose>
        <xsl:when test="$count &lt; 0">
            <!-- Not found. Most common is to return -1 here (or maybe 0 in XSL?). -->
            <!-- But this way, the result can be used with substring() without checking. -->
            <xsl:value-of select="string-length($haystack) + 1"/>
        </xsl:when>
        <xsl:when test="starts-with(substring($haystack, $pos), $needle)">
            <xsl:value-of select="$pos"/>
        </xsl:when>
        <xsl:otherwise>
            <xsl:value-of select="fn:_strpos($haystack, $needle, $pos + 1, $count - 1)"/>
        </xsl:otherwise>
    </xsl:choose>
</xsl:function>
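The snippet above does not show its stylesheet wrapper. xsl:function needs XSLT 2.0, and the fn prefix has to be bound to a namespace of your own, since XSLT 2.0 treats the standard function namespace as reserved. As a sketch of how such a function can be declared and called, here is a complete stylesheet with a shorter, non-recursive variant that has the same "length + 1 when not found" behavior. It is built on substring-before instead of the recursion above; the my prefix and its namespace URI are made up for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="urn:example:strings">
  <xsl:output method="text"/>

  <!-- Hypothetical helper: 1-based position of $needle in $haystack,
       or string-length($haystack) + 1 when it does not occur -->
  <xsl:function name="my:strpos" as="xs:integer">
    <xsl:param name="haystack" as="xs:string"/>
    <xsl:param name="needle" as="xs:string"/>
    <xsl:sequence select="
      if (contains($haystack, $needle))
      then string-length(substring-before($haystack, $needle)) + 1
      else string-length($haystack) + 1"/>
  </xsl:function>

  <xsl:template match="/">
    <!-- The result can be passed straight to substring() -->
    <xsl:value-of select="substring('My name is Fred',
                                    my:strpos('My name is Fred', 'Fred'))"/>
  </xsl:template>
</xsl:stylesheet>
```

With an XSLT 2.0 processor this should print "Fred". For a needle that does not occur, substring() starts past the end of the string and returns the empty string, which is the convenience the original comment describes.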
