
Hi, I'm looking for a Notepad++ regex that matches a pattern in a text and applies some replacements, but what I have come up with so far does not seem to work.

Sample text:

1 blablablabla. blablnsnsnns, blalblblbl: blablaa; balal blala. 2 blblb'blbµµ77777µµlblb blblb, blslsµµ105µµnlsllsl.
3 blalblblbl: blablaa; balal blala. 4 blblb'blbµµ9999µµlblb . Blblb, blslsnlsllsl. 5 jsjjsjj; gggbqbqbq:   ghshhqhhqh !. Gsgsjjsskksk. 6 fshhhshs, nnsnnsns! nsnnsn. 7 blalallallal7600hhzhz ; nmmkzjzbzbzb34fspmmm :
blslslslsavccacac,
hkkdlfmfmmf56balalala.
hdfmmfm87kdkkkkfkf.
8 blalalallajhshduie.
9 bslslslslls :
blslsllsllls,
bslslllsllsls.
nsnsnnsnsnnsnns,
hsbbbslslsllsllsls.
10 bslsllsllsllsllslls à sbsbbsbbsb , snsnnsnnsnnsn.

Pattern search:
Each match should start at a number followed by a space, continue through the text after it, and stop just before the next number that is preceded by a space.

The grouping should look like:

  • group 1: a number
  • group 2: the space character following the number
  • group 3: some text, which can also include numbers, but these numbers are not between space characters; instead they are between (µµ).

Expected results:

<VERSETAG=1>blablablabla. blablnsnsnns, blalblblbl: blablaa; balal blala.</VERSETAG>
<VERSETAG=2> blblb'blbµµ77777µµlblb blblb, blslsµµ105µµnlsllsl.</VERSETAG>
<VERSETAG=3>blalblblbl: blablaa; balal blala.</VERSETAG>
<VERSETAG=4>blblb'blbµµ9999µµlblb . Blblb, blslsnlsllsl.</VERSETAG>
<VERSETAG=5>jsjjsjj; gggbqbqbq:   ghshhqhhqh !. Gsgsjjsskksk.</VERSETAG>
<VERSETAG=6>fshhhshs, nnsnnsns! nsnnsn.</VERSETAG>
<VERSETAG=7>blalallallal7600hhzhz ; nmmkzjzbzbzb34fspmmm :
blslslslsavccacac,
hkkdlfmfmmf56balalala.
hdfmmfm87kdkkkkfkf.</VERSETAG>
<VERSETAG=8>blalalallajhshduie.</VERSETAG>
<VERSETAG=9>bslslslslls :</VERSETAG>
<VERSETAG=10>bslsllsllsllsllslls à sbsbbsbbsb , snsnnsnnsnnsn.</VERSETAG>

Tested with this regex, thanks to @Toto (but so far it is not working as expected; see the result below):

Look for :
(?:^\D*|\G )(\d+)\s+(.+?)\R?(?=\s\d+\s|\z) 

replace with :
<VERSETAG=$1>$2</VERSETAG>\n

Result of the test :

<VERSETAG=1>blablablabla. blablnsnsnns, blalblblbl: blablaa; balal blala.</VERSETAG>
 2 blblb'blbµµ77777µµlblb blblb, blslsµµ105µµnlsllsl.
<VERSETAG=3>blalblblbl: blablaa; balal blala.</VERSETAG>
<VERSETAG=4>blblb'blbµµ9999µµlblb . Blblb, blslsnlsllsl.</VERSETAG>
<VERSETAG=5>jsjjsjj; gggbqbqbq:   ghshhqhhqh !. Gsgsjjsskksk.</VERSETAG>
<VERSETAG=6>fshhhshs, nnsnnsns! nsnnsn.</VERSETAG>
 7 blalallallal7600hhzhz ; nmmkzjzbzbzb34fspmmm :
blslslslsavccacac,
hkkdlfmfmmf56balalala.
hdfmmfm87kdkkkkfkf.
8 blalalallajhshduie.
9 bslslslslls :
<VERSETAG=10>bslsllsllsllsllslls à sbsbbsbbsb , snsnnsnnsnnsn.</VERSETAG>

Thank you very much in advance!

  • Could you tell us, in words, what your pattern is supposed to achieve?
    – harrymc
    Commented Nov 13, 2018 at 10:09
  • Hi. As described in the expected result: when it finds a number followed by a space, it should put an opening tag containing that number there (<VERSETAG=1>), and before the next number followed by a space it should put the closing tag (</VERSETAG>).
    – Pmicezjk
    Commented Nov 13, 2018 at 10:31
  • I just ran it with your new test case and it works fine. Have you checked ". matches newline"?
    – Toto
    Commented Nov 13, 2018 at 18:21
  • Yes, indeed, it is now working perfectly! Awesome! Great help!
    – Pmicezjk
    Commented Nov 13, 2018 at 18:34

1 Answer

Update according to comments:

  • Ctrl+H
  • Find what: (?:^\D*|\G )(\d+)\s+(.+?)\R?(?=\s\d+\s|\z)
  • Replace with: <VERSETAG=$1>$2</VERSETAG>\n
  • check Wrap around
  • check Regular expression
  • CHECK . matches newline
  • Replace all

Explanation:

(?:^\D*|\G )    # non-capturing group: beginning of line followed by 0 or more non-digits, OR the end of the previous match followed by a space
(\d+)           # group 1, 1 or more digits
\s+             # 1 or more whitespace characters
(.+?)           # group 2, 1 or more of any character (including newline when ". matches newline" is checked), not greedy
\R?             # any kind of line break, optional
(?=\s\d+\s|\z)  # positive lookahead: what follows must be a whitespace character, 1 or more digits and a whitespace character, OR the end of the file
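
For anyone who wants to try the same idea outside Notepad++, here is a minimal Python sketch. It is not the exact expression above: Python's `re` module has no `\G`, `\R` or `\z`, so the pattern is adapted (a `(?<!\S)` lookbehind replaces the `^\D*|\G ` alternation, and `\Z` replaces `\z`). The names `VERSE`, `tag_verses`, `dot_matches_newline` and the short `sample` string are made up for this illustration.

```python
import re

# Adapted pattern (an assumption, not the Notepad++ regex verbatim):
#   (?<!\S)   the number must be at the start or preceded by whitespace
#   (\d+)     group 1: the verse number
#   \s+       whitespace after the number
#   (.+?)     group 2: the verse text, not greedy
#   (?=...)   stop before "whitespace, digits, whitespace" or the end of text
VERSE = r"(?<!\S)(\d+)\s+(.+?)(?=\s+\d+\s|\s*\Z)"


def tag_verses(text: str, dot_matches_newline: bool = True) -> str:
    """Wrap every numbered verse in <VERSETAG=n>...</VERSETAG>, one per line."""
    # re.DOTALL plays the role of Notepad++'s ". matches newline" checkbox.
    flags = re.DOTALL if dot_matches_newline else 0
    return "\n".join(
        f"<VERSETAG={m.group(1)}>{m.group(2)}</VERSETAG>"
        for m in re.finditer(VERSE, text, flags)
    )


# Hypothetical shortened input: verse 3 spans two lines, and the digits
# inside µµ..µµ or inside words must not start a new verse.
sample = "1 aaa. 2 bbµµ77µµcc.\n3 dd7600ee :\nff56gg.\n4 hh."

print(tag_verses(sample))
print("---")
print(tag_verses(sample, dot_matches_newline=False))
```

With `dot_matches_newline=False`, `.` cannot cross a line break, so the two-line verse 3 in this sample is not tagged. That is the Python counterpart of leaving ". matches newline" unchecked.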

Result for given example:

<VERSETAG=1>blablablabla. blablnsnsnns, blalblblbl: blablaa; balal blala.</VERSETAG>
<VERSETAG=2>blblb'blbµµ77777µµlblb blblb, blslsµµ105µµnlsllsl.</VERSETAG>
<VERSETAG=3>blalblblbl: blablaa; balal blala.</VERSETAG>
<VERSETAG=4>blblb'blbµµ9999µµlblb . Blblb, blslsnlsllsl.</VERSETAG>
<VERSETAG=5>jsjjsjj; gggbqbqbq:   ghshhqhhqh !. Gsgsjjsskksk.</VERSETAG>
<VERSETAG=6>fshhhshs, nnsnnsns! nsnnsn.</VERSETAG>
<VERSETAG=7>blalallallal7600hhzhz ; nmmkzjzbzbzb34fspmmm:
blslslslsavccacac,
hkkdlfmfmmf56balalala.
hdfmmfm87kdkkkkfkf.</VERSETAG>
<VERSETAG=8>blalalallajhshduie.</VERSETAG>

  • Awesome, Toto! We are almost there, but it does not seem to work when a line starts with some text. For example, the following is not working: 1 blablablabla. blablnsnsnns, blalblblbl: blablaa; balal blala. 2 blblb'blbµµ77777µµlblb blblb, blslsµµ105µµnlsllsl. 3 blalblblbl: blablaa; balal blala. 4 blblb'blbµµ9999µµlblb . Blblb, blslsnlsllsl. 5 jsjjsjj; gggbqbqbq: ghshhqhhqh !. Gsgsjjsskksk. 6 fshhhshs, nnsnnsns! nsnnsn. dlflflf4532llflflf, vjjsjsjss8776nndndn54; qcqaataab. 7 hhhshhshhshhshhshhs, bslslslsllslsllsll.
    – Pmicezjk
    Commented Nov 13, 2018 at 15:53
  • @Pmicezjk: just replace (?:^|\G ) with (?:^\D*|\G ) at the beginning of regex. See my edit.
    – Toto
    Commented Nov 13, 2018 at 17:08
  • Still not working (check around VERSETAG 7 and 8). This is what I got when I replaced (?:^|\G ) with (?:^\D*|\G ): <VERSETAG=7>blalallallal7600hhzhz ; nmmkzjzbzbzb34fspmmm:</VERSETAG> blslslslsavccacac, hkkdlfmfmmf56balalala. hdfmmfm87kdkkkkfkf. <VERSETAG=8>blalalallajhshduie.</VERSETAG>
    – Pmicezjk
    Commented Nov 13, 2018 at 17:17
  • @Pmicezjk: I've completely rewritten the regex, not just added \D*
    – Toto
    Commented Nov 13, 2018 at 17:44
  • @Pmicezjk: Sorry, I don't follow; it seems to work for me. Have you checked ". matches newline"? Please edit your question and add some input lines that don't work for you.
    – Toto
    Commented Nov 13, 2018 at 18:00
