5

I have a large string that contains multi-line substrings, each delimited by two constant marker strings that I can identify with a regex.

To keep things simple, I have named the markers abcdef and fedcba here:

abcdef Sed lobortis nisl sed malesuada bibendum. fedcba
...

abcdef Fusce odio turpis, accumsan non posuere placerat. 
1
2
3
fedcba

abcdef Aliquam erat volutpat. Proin ultrices fedcba

How can I extract all occurrences, including the markers, from the large string?

1
  • abcdef[\s\S]*?fedcba
    – SamWhan
    Commented Jan 11, 2017 at 15:01

2 Answers

6

Something like

Pattern r = Pattern.compile("abcdef[\\s\\S]*?fedcba");
Matcher m = r.matcher(sInput);
if (m.find( )) {
    System.out.println("Found value: " + m.group() );
}

where sInput is your string to search.

[\s\S]*? matches any run of characters, newlines included, up to the next fedcba. The trailing ? makes the quantifier non-greedy (lazy), so the match stops at the nearest fedcba instead of running on to the last one in the input, as a greedy match would. That is what gives you the separate strings.
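To make the greedy/non-greedy difference concrete, here is a minimal sketch that runs both variants of the pattern over the same input and counts the matches. The class name LazyVsGreedy, the helper findAll and the sample input are made up for illustration; only Pattern and Matcher come from the JDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LazyVsGreedy {
    // Hypothetical helper: collects every match of the regex in the input, in order.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // Made-up sample input with two marker-delimited blocks.
        String input = "abcdef one fedcba\n...\nabcdef two\n1\n2\nfedcba\n";

        // Lazy: stops at the nearest fedcba, so each block is its own match.
        System.out.println(findAll("abcdef[\\s\\S]*?fedcba", input).size()); // prints 2

        // Greedy: runs on to the last fedcba, so both blocks collapse into one match.
        System.out.println(findAll("abcdef[\\s\\S]*fedcba", input).size()); // prints 1
    }
}
```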

4
  • This is a good start, but seems to find only the first match (line).
    – yglodt
    Commented Jan 11, 2017 at 15:13
  • 1
    I really don't speak java (just a googler ;), but I think you could simply replace the if (m.find( )) { with while (m.find()) {. Check this answer: stackoverflow.com/a/16817458/2064981
    – SamWhan
    Commented Jan 11, 2017 at 15:16
  • There is one more specification which I forgot to mention. abcdef always must be the beginning of a line. How can I add that to the pattern?
    – yglodt
    Commented Jan 11, 2017 at 15:41
  • I found it: prepend (?m)^ to the pattern. See stackoverflow.com/questions/6143304/…
    – yglodt
    Commented Jan 11, 2017 at 15:50
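Putting the two suggestions from these comments together (loop with while (m.find()) and anchor the opening marker with (?m)^), a complete version might look like the sketch below. The class name MarkerBlocks and the method extract are hypothetical, and the input is the sample from the question; treat it as one possible arrangement rather than the definitive solution.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MarkerBlocks {
    // (?m) enables MULTILINE mode, so ^ matches at the start of every line,
    // not only at the start of the whole input.
    private static final Pattern BLOCK =
            Pattern.compile("(?m)^abcdef[\\s\\S]*?fedcba");

    // Returns every marker-delimited block, markers included, in input order.
    static List<String> extract(String input) {
        List<String> blocks = new ArrayList<>();
        Matcher m = BLOCK.matcher(input);
        while (m.find()) {          // while, not if: keep going after the first match
            blocks.add(m.group());
        }
        return blocks;
    }

    public static void main(String[] args) {
        String sInput = "abcdef Sed lobortis nisl sed malesuada bibendum. fedcba\n"
                + "...\n\n"
                + "abcdef Fusce odio turpis, accumsan non posuere placerat. \n"
                + "1\n2\n3\n"
                + "fedcba\n\n"
                + "abcdef Aliquam erat volutpat. Proin ultrices fedcba\n";

        List<String> blocks = extract(sInput);
        System.out.println(blocks.size()); // prints 3
        for (String block : blocks) {
            System.out.println("Found value: " + block);
        }
    }
}
```

Note that only the opening marker is anchored here; an abcdef that appears in the middle of a line is skipped, while fedcba may appear anywhere.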
0

REGEXP:

(?:\babcdef)(?:.*\n)*(?:\bfedcba)

JAVA:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        final String regex = "(?:\\babcdef)(?:.*\\n)*(?:\\bfedcba)";
        final String string = "patata\n"
             + "abcdef\n"
             + "Aliquam erat volutpat. Proin ultrices\n"
             + "Testing\n\n"
             + "test[](test)\n"
             + "Testing\n"
             + "fedcba\n"
             + "Testing\n\n\n\n";

        final Pattern pattern = Pattern.compile(regex, Pattern.MULTILINE);
        final Matcher matcher = pattern.matcher(string);

        // Print every match found in the input.
        while (matcher.find()) {
            System.out.println("Full match: " + matcher.group(0));
            // The pattern uses only non-capturing groups, so groupCount() is 0
            // and this loop prints nothing.
            for (int i = 1; i <= matcher.groupCount(); i++) {
                System.out.println("Group " + i + ": " + matcher.group(i));
            }
        }
    }
}

ORIGINAL TEXT:

patata
abcdef
Aliquam erat volutpat. Proin ultrices
Testing

test[](test)
Testing
fedcba
Testing

RESULT:

abcdef
Aliquam erat volutpat. Proin ultrices
Testing

test[](test)
Testing
fedcba

See: https://regex101.com/r/xXaLgN/5

Enjoy.

If this helped, don't forget to mark it as the accepted answer.
