12

Perl regex and PCRE (Perl Compatible Regular Expressions), among others, have the \K shorthand, which drops everything matched to its left from the reported match (capturing groups are still kept). Java doesn't support it, so what is Java's equivalent?

5
  • Does left part contain variable-length patterns?
    – revo
    Commented Jul 23, 2017 at 13:37
  • There is no equivalent in Java. However, you can obtain what you want using capture groups. Using a lookbehind is sometimes possible, but most of the time it is less efficient. Commented Jul 23, 2017 at 14:00
  • @revo Usually it does. Commented Jul 23, 2017 at 16:58
  • @rautamiekka: Please check the answer below. Commented Jul 24, 2017 at 21:07
  • @WiktorStribiżew Ya, been aware of it. Commented Jul 25, 2017 at 19:12

2 Answers

8

There is no direct equivalent. However, you can always re-write such patterns using capturing groups.

If you take a closer look at the \K operator and its limitations, you will see that you can replace it with capturing groups.

See rexegg.com \K reference:

In the middle of a pattern, \K says "reset the beginning of the reported match to this point". Anything that was matched before the \K goes unreported, a bit like in a lookbehind.

The key difference between \K and a lookbehind is that in PCRE, a lookbehind does not allow you to use quantifiers: the length of what you look for must be fixed. On the other hand, \K can be dropped anywhere in a pattern, so you are free to have any quantifiers you like before the \K.

However, all this means that the pattern before \K is still a consuming pattern, i.e. the regex engine adds the matched text to the match value and advances its index while matching the pattern; \K only drops the already matched text from the reported match and keeps the index where it is. In that respect, \K is no better than a capturing group.
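
To make the point about consumption concrete, here is a minimal sketch that uses value\s*=\s*\K\d+ as the illustrative PCRE pattern and compares the span of the whole Java match with the span of group 1, which is the part \K would have reported. The class name KeepOutSpanDemo and the method keptSpan are hypothetical names chosen for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class KeepOutSpanDemo {
    // Java rewrite of the PCRE pattern value\s*=\s*\K\d+ (group 1 replaces \K)
    private static final Pattern P = Pattern.compile("value\\s*=\\s*(\\d+)");

    // Returns {start, end} of group 1, or null when there is no match.
    // This is the span a \K-based match would report in PCRE.
    static int[] keptSpan(String input) {
        Matcher m = P.matcher(input);
        if (!m.find()) {
            return null;
        }
        return new int[] { m.start(1), m.end(1) };
    }

    public static void main(String[] args) {
        String s = "Min value = 5000 km";
        Matcher m = P.matcher(s);
        if (m.find()) {
            // The prefix is consumed, so the full match starts at "value"
            System.out.println("full match: " + m.start() + "-" + m.end());   // full match: 4-16
            // Group 1 covers only the digits
            System.out.println("kept part:  " + m.start(1) + "-" + m.end(1)); // kept part:  12-16
        }
    }
}
```

Both spans end at the same index, which reflects that the prefix is consumed in either engine; only the reported start differs.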

So, a value\s*=\s*\K\d+ PCRE/Onigmo pattern would translate into this Java code:

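import java.util.regex.Matcher;
import java.util.regex.Pattern;

String s = "Min value = 5000 km";
// Capture the digits instead of using \K; group 1 holds what \K would have reported
Matcher m = Pattern.compile("value\\s*=\\s*(\\d+)").matcher(s);
if (m.find()) {
    System.out.println(m.group(1)); // 5000
}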
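
\K is also commonly used in substitutions, where only the part after \K gets replaced. A possible way to get the same effect in Java, sketched here under the assumption that the prefix should be kept verbatim, is to capture the prefix and write it back in the replacement string. The class name ReplaceKeptPart, the method replaceNumber, and the group name prefix are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReplaceKeptPart {
    // The named group "prefix" holds what would sit to the left of \K in PCRE
    private static final Pattern P = Pattern.compile("(?<prefix>value\\s*=\\s*)\\d+");

    // Replaces only the digits and writes the captured prefix back unchanged
    static String replaceNumber(String input, String replacement) {
        // quoteReplacement keeps '$' and '\' in the replacement text literal
        return P.matcher(input)
                .replaceAll("${prefix}" + Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        System.out.println(replaceNumber("Min value = 5000 km", "9000")); // Min value = 9000 km
    }
}
```

A named group is used rather than $1 so that a replacement starting with a digit cannot be misread as part of the group number.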

There is an alternative, but it only suits smaller, simpler patterns: a constrained-width lookbehind.

Java accepts quantifiers within lookbehind, as long as the length of the matching strings falls within a pre-determined range. For instance, (?<=cats?) is valid because it can only match strings of three or four characters. Likewise, (?<=A{1,10}) is valid.

So, this will also work:

    m = Pattern.compile("(?<=value\\s{0,10}=\\s{0,10})\\d+").matcher(s);
    if(m.find()) {
        System.out.println(m.group());
    }
    

See the Java demo.
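
The bounded quantifiers come at a price: input that exceeds the chosen bounds is no longer matched, whereas the capturing-group version has no such limit. The following sketch illustrates this with the {0,10} bounds from above; the class name LookbehindLimitDemo and its methods are hypothetical names for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LookbehindLimitDemo {
    // Bounded lookbehind: at most 10 whitespace characters on each side of '='
    private static final Pattern LOOKBEHIND =
            Pattern.compile("(?<=value\\s{0,10}=\\s{0,10})\\d+");
    // Capturing-group version: any amount of whitespace is accepted
    private static final Pattern GROUP =
            Pattern.compile("value\\s*=\\s*(\\d+)");

    // Returns the digits found via the lookbehind pattern, or null if none
    static String viaLookbehind(String input) {
        Matcher m = LOOKBEHIND.matcher(input);
        return m.find() ? m.group() : null;
    }

    // Returns the digits found via the capturing group, or null if none
    static String viaGroup(String input) {
        Matcher m = GROUP.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String normal = "Min value = 5000 km";
        // Build "value", 12 spaces, then "= 7" to exceed the {0,10} bound
        StringBuilder sb = new StringBuilder("value");
        for (int i = 0; i < 12; i++) {
            sb.append(' ');
        }
        String wide = sb.append("= 7").toString();

        System.out.println(viaLookbehind(normal)); // 5000
        System.out.println(viaGroup(normal));      // 5000
        System.out.println(viaLookbehind(wide));   // null
        System.out.println(viaGroup(wide));        // 7
    }
}
```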

0

Alternatively, if you need advanced features, you can use the PCRE2 engine in Java via https://pcre4j.org.
