Here is a way to do the job. It removes duplicate words even if they are not contiguous:
- Ctrl+H
- Find what: (?:^|\G)(\b\w+\b),?(?=.*\1)
- Replace with: LEAVE EMPTY
- check Wrap around
- check Regular expression
- DO NOT CHECK . matches newline
- Replace all
Explanation:
(?:^|\G) : non-capturing group, beginning of line or position of the last match
(\b\w+\b) : group 1, one or more word characters (i.e. [a-zA-Z0-9_]), surrounded by word boundaries
,? : optional comma
(?=.*\1) : positive lookahead, checks that the same word (captured in group 1) occurs somewhere later on the line
Given an input like:
dangerous,dangerous,hazardous,perilous,dangerous,dangerous,hazardous,perilous
We get:
dangerous,hazardous,perilous
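To try the same logic outside Notepad++, here is a minimal Python sketch. It is only an approximation under stated assumptions: Python's built-in re module has no \G, so the "start where the last match ended" anchoring is emulated with Pattern.match(line, pos), and the function name strip_leading_duplicates is made up for this illustration. It handles one line at a time.

```python
import re

# Same pattern as above, minus the (?:^|\G) anchor: Python's built-in re
# module has no \G, so the anchoring is emulated with Pattern.match(line, pos).
DUPLICATE_AHEAD = re.compile(r"(\b\w+\b),?(?=.*\1)")


def strip_leading_duplicates(line: str) -> str:
    """Drop words from the start of `line` while each one reappears later.

    Hypothetical helper, not part of Notepad++ or of any library.
    """
    pos = 0
    while True:
        # match() only succeeds exactly at `pos`, which plays the role of \G.
        m = DUPLICATE_AHEAD.match(line, pos)
        if m is None:
            break
        pos = m.end()
    return line[pos:]


if __name__ == "__main__":
    sample = ("dangerous,dangerous,hazardous,perilous,"
              "dangerous,dangerous,hazardous,perilous")
    print(strip_leading_duplicates(sample))  # dangerous,hazardous,perilous
```

In this sketch each match has to begin where the previous one ended, so the scan stops at the first word that has no later copy. For example, a,b,a,c,c comes out as b,a,c,c, with the trailing duplicate c left in place.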