221

In Java, I am trying to return all regex matches to an array but it seems that you can only check whether the pattern matches something or not (boolean).

How can I use a regex to build an array of all the substrings in a given string that match a regular expression?

2
  • 3
    Good question. The information you seek should be part of the Java docs on Regex and Matcher. Sadly, it isn't.
    – Cheeso
    Commented Oct 7, 2015 at 20:12
  • 4
    A real shame. This functionality seems to exist out of the box in nearly every other language (that has regular expression support).
    – Ray Toal
    Commented Apr 2, 2016 at 4:22

6 Answers

335

(4castle's answer is better than the below if you can assume Java >= 9)

You need to create a matcher and use that to iteratively find matches.

 import java.util.ArrayList;
 import java.util.List;
 import java.util.regex.Matcher;
 import java.util.regex.Pattern;

 ...

 List<String> allMatches = new ArrayList<String>();
 Matcher m = Pattern.compile("your regular expression here")
     .matcher(yourStringHere);
 while (m.find()) {
   allMatches.add(m.group());
 }

After this, allMatches contains the matches, and you can use allMatches.toArray(new String[0]) to get an array if you really need one.
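
As a rough illustration, the loop above can be wrapped in a small self-contained helper. This is only a sketch: the class name AllMatchesExample and the method findAll are hypothetical names chosen for this example, not part of the Java API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AllMatchesExample {
    // Hypothetical helper: collects every match of regex in input, in order.
    static String[] findAll(String regex, CharSequence input) {
        List<String> allMatches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            allMatches.add(m.group());
        }
        // Passing a typed array makes toArray return String[] rather than Object[].
        return allMatches.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] digits = findAll("\\d+", "aa11bb22");
        System.out.println(String.join(",", digits)); // prints 11,22
    }
}
```

If the same pattern is applied to many inputs, it is probably worth compiling the Pattern once and passing it in instead of the regex string.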


You can also use MatchResult to write helper functions to loop over matches since Matcher.toMatchResult() returns a snapshot of the current group state.

For example, you can write a lazy iterator that lets you write

for (MatchResult match : allMatches(pattern, input)) {
  // Use match, and maybe break without doing the work to find all possible matches.
}

by doing something like this:

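// Needs: java.util.Iterator, java.util.NoSuchElementException,
// java.util.regex.Matcher, java.util.regex.MatchResult, java.util.regex.Pattern
public static Iterable<MatchResult> allMatches(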
      final Pattern p, final CharSequence input) {
  return new Iterable<MatchResult>() {
    public Iterator<MatchResult> iterator() {
      return new Iterator<MatchResult>() {
        // Use a matcher internally.
        final Matcher matcher = p.matcher(input);
        // Keep a match around that supports any interleaving of hasNext/next calls.
        MatchResult pending;

        public boolean hasNext() {
          // Lazily fill pending, and avoid calling find() multiple times if the
          // clients call hasNext() repeatedly before sampling via next().
          if (pending == null && matcher.find()) {
            pending = matcher.toMatchResult();
          }
          return pending != null;
        }

        public MatchResult next() {
          // Fill pending if necessary (as when clients call next() without
          // checking hasNext()), throw if not possible.
          if (!hasNext()) { throw new NoSuchElementException(); }
          // Consume pending so next call to hasNext() does a find().
          MatchResult next = pending;
          pending = null;
          return next;
        }

        /** Required to satisfy the interface, but unsupported. */
        public void remove() { throw new UnsupportedOperationException(); }
      };
    }
  };
}

With this,

for (MatchResult match : allMatches(Pattern.compile("[abc]"), "abracadabra")) {
  System.out.println(match.group() + " at " + match.start());
}

yields

a at 0
b at 1
a at 3
c at 4
a at 5
a at 7
b at 8
a at 10
6
  • 4
    I wouldn't suggest using an ArrayList here since you don't know upfront the size and might want to avoid the buffer resizing. Instead, I would prefer a LinkedList -- though it's just a suggestion and doesn't make your answer less valid whatsoever.
    – Liv
    Commented May 16, 2011 at 16:33
  • 14
    @Liv, take the time to benchmark both ArrayList and LinkedList; the results may be surprising.
    Commented May 16, 2011 at 16:37
  • I hear what you're saying and I am aware of the execution speed and memory footprint in both cases; the problem with the ArrayList is that the default constructor creates a capacity of 10 -- if you go past that size with calls to add() you will have to bear with the memory allocation and array copy -- and that might happen a few times. Granted, if you expect just a few matches then your approach is the more efficient one; if however you find that the array "resizing" happens more than once I would suggest a LinkedList, even more so if you're dealing with a low latency app.
    – Liv
    Commented May 16, 2011 at 16:51
  • 12
    @Liv, if your pattern tends to produce matches with a fairly predictable size, and depending on whether the pattern matches sparsely or densely (based on the sum of the lengths of allMatches vs yourStringHere.length()), you can probably precompute a good size for allMatches. In my experience, the memory and iteration cost of LinkedList is not usually worth it, so LinkedList is not my default posture. But when optimizing a hot-spot, it is definitely worth swapping list implementations to see if you get an improvement.
    Commented May 16, 2011 at 16:57
  • 2
    In Java 9, you can now use Matcher#results to get a Stream which you can use to generate an array (see my answer).
    – 4castle
    Commented Oct 21, 2017 at 1:08
131

In Java 9, you can now use Matcher#results() to get a Stream<MatchResult> which you can use to get a list/array of matches.

import java.util.regex.Pattern;
import java.util.regex.MatchResult;
String[] matches = Pattern.compile("your regex here")
                          .matcher("string to search from here")
                          .results()
                          .map(MatchResult::group)
                          .toArray(String[]::new);
                    // or .collect(Collectors.toList())
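
If only a capturing group is wanted rather than the whole match, the same stream can map each MatchResult through group(int). The following is a minimal sketch assuming Java 9 or later; the class name ResultsGroupExample, the method groupsOf, and the sample input are made up for illustration.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class ResultsGroupExample {
    // Hypothetical helper: returns the text captured by the given group for every match.
    static List<String> groupsOf(Pattern pattern, CharSequence input, int group) {
        return pattern.matcher(input)
                      .results()                  // Stream<MatchResult>, Java 9+
                      .map(r -> r.group(group))   // group(0) would be the whole match
                      .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("#\\{(.*?)\\}");
        System.out.println(groupsOf(p, "Hi #{first} #{last}", 1)); // prints [first, last]
    }
}
```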
3
  • 1
    There is no results() method, please run this first
    – Bravo
    Commented Nov 29, 2018 at 3:21
  • 24
    @Bravo Are you using Java 9? It does exist. I linked to the documentation.
    – 4castle
    Commented Nov 29, 2018 at 3:54
  • 2
    :(( Is there any alternative for Java 8?
    – logbasex
    Commented May 8, 2020 at 6:51
26

Java makes regex too complicated and does not follow the Perl style. Take a look at MentaRegex to see how you can accomplish that in a single line of Java code:

String[] matches = match("aa11bb22", "/(\\d+)/g" ); // => ["11", "22"]
16

Here's a simple example:

Pattern pattern = Pattern.compile(regexPattern);
List<String> list = new ArrayList<String>();
Matcher m = pattern.matcher(input);
while (m.find()) {
    list.add(m.group());
}

(If you have capturing groups, you can refer to them by index as the argument of the group method, e.g. m.group(1). If you need an array, use list.toArray(new String[0]); the no-argument list.toArray() returns an Object[], not a String[].)
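
To illustrate group indexes: group() (equivalent to group(0)) is the whole match, while group(1), group(2), and so on are the parenthesized groups counted from the left. The sketch below is only an example; the class name GroupIndexExample, the method pairs, and the key=value pattern are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupIndexExample {
    // Hypothetical helper: turns every key=value pair in input into "key -> value".
    static String[] pairs(String input) {
        Pattern pattern = Pattern.compile("(\\w+)=(\\d+)");
        List<String> list = new ArrayList<>();
        Matcher m = pattern.matcher(input);
        while (m.find()) {
            // group(1) is the key, group(2) the value; group() would be "key=value".
            list.add(m.group(1) + " -> " + m.group(2));
        }
        return list.toArray(new String[0]);
    }

    public static void main(String[] args) {
        for (String s : pairs("width=10, height=20")) {
            System.out.println(s);
        }
        // prints:
        // width -> 10
        // height -> 20
    }
}
```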

2
  • pattern.matches(input) does not work. You have to pass your regex pattern (again!) --> WTF Java?! pattern.matches(String regex, String input); Do you mean pattern.matcher(input)?
    – El Mac
    Commented Mar 4, 2016 at 9:36
  • @ElMac Pattern.matches() is a static method, you shouldn't call it on a Pattern instance. Pattern.matches(regex, input) is simply a shorthand for Pattern.compile(regex).matcher(input).matches().
    – dimo414
    Commented Feb 25, 2017 at 21:04
5

From the Official Regex Java Trails:

        Pattern pattern = 
        Pattern.compile(console.readLine("%nEnter your regex: "));

        Matcher matcher = 
        pattern.matcher(console.readLine("Enter input string to search: "));

        boolean found = false;
        while (matcher.find()) {
            console.format("I found the text \"%s\" starting at " +
               "index %d and ending at index %d.%n",
                matcher.group(), matcher.start(), matcher.end());
            found = true;
        }

Use find() in a loop and add each matcher.group() result to your array, List, or other collection.

1
        Set<String> keyList = new HashSet<>();
        Pattern regex = Pattern.compile("#\\{(.*?)\\}");
        Matcher matcher = regex.matcher("Content goes here");
        while(matcher.find()) {
            keyList.add(matcher.group(1)); 
        }
        return keyList;
