I have several Word documents where line breaks (paragraph breaks) have been added purely for cosmetic reasons (probably by a human, but maybe by an OCR system or something similar). I want to remove these extra line breaks from the documents. Basically, an 'extra' line break is one that is surrounded by lower-case letters on either side (with optional whitespace). Unfortunately, though, a search that finds paragraph breaks in Word (^p) can't also use character classes to find only lower-case letters ([a-z]), and a search that uses character classes can't use ^p.

Basically I want to use a multiline regex on the document so I can find something like the following:


and replace the newline with a space. Is there any way I can search for both paragraph marks (^p in Word) and character classes (or just lower-case letters in general)?


This is some text.

would not match, but

this text is on one line and¶
goes on to the next line.

would match and the “¶” would be replaced by a space.

2 Answers


I can’t tell from what you’ve said whether you know that, if you click on More >> in the Microsoft Word “Find and Replace” dialog box, you get a “Search Options” panel that includes a “Use wildcards” option.  Note that it supports an arcane wildcard language, not regular expression notation.  To begin with this option, use [a-z]^13[a-z].  For some reason, you can’t use ^p in a wildcard search, but ^13 is the wildcard-enabled equivalent of ^p.

The whitespace is a little trickier.  The best I can come up with is that you have to do the search four times, using

  • [a-z]^13[a-z]
  • [a-z][^t ]{1,99}^13[a-z]
  • [a-z]^13[^t ]{1,99}[a-z] ,   and
  • [a-z][^t ]{1,99}^13[^t ]{1,99}[a-z]

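since, oddly enough, ^t works in wildcard mode.  \s and * don’t mean what they mean in regular expressions.  {n,m} does work, but n has to be positive.  And note that you can’t just replace matches with a space, since the last preceding letter and the first following letter are included in the match, and would get clobbered.  Instead, wrap each letter in parentheses, as in ([a-z])^13([a-z]), and put \1 \2 in the “Replace with” box so that the two letters are written back with a space between them.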
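
For comparison, here is roughly what those four searches add up to if the text is handled outside Word.  This is only a sketch in Python, under the assumption that the document text has been exported with each paragraph mark as a single carriage return (character 13, which is what ^13 refers to); the name join_soft_breaks is made up for the illustration.  The capture groups are what keep the two letters from being clobbered.

```python
import re

# Assumption: each paragraph mark in the exported text is one "\r"
# (character 13). Optional spaces/tabs may sit on either side of it.
EXTRA_BREAK = re.compile(r"([a-z])[ \t]*\r[ \t]*([a-z])")


def join_soft_breaks(text: str) -> str:
    """Replace a break that sits between two lower-case letters with a space."""
    # \1 and \2 write the captured letters back around the new space.
    return EXTRA_BREAK.sub(r"\1 \2", text)


sample = "this text is on one line and \r\tgoes on to the next line.\rThis is some text."
print(repr(join_soft_breaks(sample)))
```

The break after “line.” is left alone because it is not surrounded by lower-case letters.  One known limitation of this sketch: because each match consumes the letter on both sides, a one-letter line between two breaks would need a second pass.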

For extra credit: You might want to look for a - (hyphen) as the last printing character before the line break; but be sure to address these two (different) cases:

                                                          … surrounded by lower-¶
case letters on either side (with optional whitespace).  Unfor-¶
tunately, though, …
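
The two cases differ in whether the hyphen belongs to the word (“lower-case”) or was only added to split it (“Unfortunately”), and a pattern alone can’t tell them apart.  A minimal sketch of one way to decide, again in Python on exported text with "\r" as the paragraph mark: drop the hyphen only when the joined word appears in a word list.  KNOWN_WORDS here is a tiny hypothetical stand-in for a real dictionary, and join_hyphen_breaks is an invented name.

```python
import re

# Hypothetical word list; a real run would load a proper dictionary.
KNOWN_WORDS = frozenset({"unfortunately"})

# A word fragment, a hyphen, the break (with optional spaces/tabs),
# then a lower-case continuation.
HYPHEN_BREAK = re.compile(r"([A-Za-z]+)-[ \t]*\r[ \t]*([a-z]+)")


def join_hyphen_breaks(text: str, known_words=KNOWN_WORDS) -> str:
    """Remove breaks after a hyphen, keeping the hyphen unless the
    joined word is a known word."""
    def fix(match: re.Match) -> str:
        head, tail = match.group(1), match.group(2)
        if (head + tail).lower() in known_words:
            return head + tail          # hyphen was only a line-splitting aid
        return head + "-" + tail        # treat the hyphen as part of the word

    return HYPHEN_BREAK.sub(fix, text)


sample = "surrounded by lower-\rcase letters on either side.  Unfor-\rtunately, though"
print(join_hyphen_breaks(sample))
```

With this word list, “lower-case” keeps its hyphen and “Unfortunately” is rejoined.  A fuller dictionary changes the outcome for words like “lowercase” that are valid both ways, so the results deserve a manual check.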

  • ^13 was exactly what I was looking for. Thanks! So strange that it makes you change it when turning on wildcards. It would be nice if Word supported full regex, but this wildcard language will work for now. PS - do you have a link to a list of all those numbered wildcards? I couldn't easily find one.
    – Drewmate
    Commented Jun 28, 2013 at 15:19
  • "... an arcane wildcard language, not regular expression notation." Well put!
    – Sabuncu
    Commented May 19, 2015 at 17:15

“^13 is the wildcard-enabled equivalent of ^p.”

This is almost true, but note there is a slight difference between ^13 and ^p. Paragraph breaks written with ^13 in the replacement seem to lose the spacing between paragraphs that you get with a normal-style paragraph break in Word, so the first layout below turns into the second:

first paragraph¶

second paragraph¶

third paragraph¶


first paragraph¶
second paragraph¶
third paragraph¶

To solve this, be sure to use ^p paragraph marks in the replace portion of the find and replace dialog. The restriction on ^p with wildcards only applies to the find portion of the dialog.
