30

I have this regex

(?:$|^| )(one|common|word|or|another)(?:$|^| )

which matches fine unless the two words are next to each other.

One one's more word'word common word or another word more another 

More and more years to match one or more other strings

And common word things and or

In the above, it matches one in line two but not the or right next to it. The same happens for common and word in the third line.

Live Example: http://regex101.com/r/hV3wQ3

I believe it's something to do with the non-capturing groups, but I am not sure how to achieve the end goal of matching every word in the list only when there are no other characters around it.

I do not want the one in one's or the word in word'word to be matched.

2 Answers

61

Your non-capturing groups each explicitly consume one character on either side of the word, so the pattern looks for space, word, space. The trailing space of one match has already been consumed, so the next word has no leading space left to match, and it fails.
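To see the consumed space for yourself, here is a minimal sketch in Python (assuming Python's `re` module; the sample string is made up for illustration and is not from the question). It prints each match with its span, so you can see that the first match already includes the space that the next word would need.

```python
import re

# The pattern from the question: the outer groups are non-capturing,
# but they still consume the character they match.
consuming = re.compile(r"(?:$|^| )(one|common|word|or|another)(?:$|^| )")

# Hypothetical sample text with two listed words side by side.
text = "match one or more"

# Record the captured word and the span of the whole match.
matches = [(m.group(1), m.span()) for m in consuming.finditer(text)]
print(matches)  # [('one', (5, 10))]
```

The span (5, 10) covers " one " including both spaces, so scanning resumes at the "o" of "or", where there is no space, line start, or line end left to satisfy the first group.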

In this case, since you don't want to match next to all the characters that word boundaries would catch (period, apostrophe, etc.), you need a bit of trickery with lookaheads, lookbehinds, and non-capturing groups. Try this:

(?:^|(?<= ))(one|common|word|or|another)(?:(?= )|$)

http://regex101.com/r/cM9hD8
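As a quick check outside regex101, here is a rough sketch that runs the lookaround version against the three sample lines from the question. It assumes Python's `re` module; `find_words` is just a helper name made up for this example.

```python
import re

# Lookbehind and lookahead assert the surrounding space without consuming it,
# so the same space can sit between two adjacent matches.
pattern = re.compile(r"(?:^|(?<= ))(one|common|word|or|another)(?:(?= )|$)")


def find_words(line: str) -> list[str]:
    """Return the listed words that are delimited by spaces or line edges."""
    return pattern.findall(line)


lines = [
    "One one's more word'word common word or another word more another ",
    "More and more years to match one or more other strings",
    "And common word things and or",
]

for line in lines:
    print(find_words(line))
# ['common', 'word', 'or', 'another', 'word', 'another']
# ['one', 'or']
# ['common', 'word', 'or']
```

Each line is searched separately here, so no multiline flag is needed. If you search the whole text as one string, you would probably want `re.MULTILINE` so that `^` and `$` match at every line edge.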

Word boundaries are still simpler to implement, so for reference, you could also do this (though it would also match words that sit next to ', ., etc.).

\b(one|common|word|or|another)\b
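For comparison, a small sketch of what the word-boundary version picks up, again assuming Python's `re` module and a made-up sample string. It shows why `\b` alone is not enough here: an apostrophe counts as a boundary.

```python
import re

# \b matches between a word character and a non-word character,
# and an apostrophe is a non-word character.
boundary = re.compile(r"\b(one|common|word|or|another)\b")

# Hypothetical sample containing the cases the question wants to exclude.
sample = "one's word'word someone"

found = boundary.findall(sample)
print(found)  # ['one', 'word', 'word']
```

The one inside someone is correctly skipped, but the one in one's and both halves of word'word are matched.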
2
  • I've updated the question. That's the reason I'm not using word boundary. It matches word'word and one's.
    – San
    Commented Jan 30, 2014 at 15:51
  • 1
    OK that makes sense - in that case I've updated the answer with a new expression and link for you. Commented Jan 30, 2014 at 16:56
4

You can use (?:[\s]|^)(one|common|word|or|another)(?=[\s]|$) instead.

It will not match one's, someone, etc.

Check DEMO

3
  • I don't want to match one's. That's why I don't use \b
    – San
    Commented Jan 30, 2014 at 6:29
  • Now the problem is word'word being matched. :(
    – San
    Commented Jan 30, 2014 at 15:50
It doesn't work if the words are next to each other. It matches common in common word but not word. Remus's answer does the job exactly. Thanks for your suggestions.
    – San
    Commented Jan 31, 2014 at 8:07
