I am parsing a list of code numbers that follow a pattern like 12345.1211. They are space delimited. Sometimes a code is followed by a space and one to three additional digits, like: 1221.121 11 111.111111 874.95 1211

I have a regex: [0-9]+\.[0-9]+

It matches a decimal number like 12345.1211. I wrap the regex in ( and ) and use \1\n as the replacement to put each code on its own line.

I am using Notepad++ with find and replace. But the regex falls short for the codes that include a space: the extra digits end up on the same line as the next code.

Example:

1221.121 11 111.111111  874.95 1211 456.155

I got:

1221.121
11 111.111111
874.95
1211 456.155

Is there anything I can do to optionally include the extra numbers separated by a space?

  • 1
    The list is space delimited, yet you have codes that include spaces?
    – Excellll
    Commented Feb 1, 2012 at 14:28
  • 1
    If it is space delimited but the pattern needs to sometimes (but only sometimes) include spaces, then you will need to rigorously define when a space needs to be included and when it does not. If the definition is precise and accurate enough, then you could probably write a regex, but without that definition it isn't really possible to write one that would be 100% correct.
    – EBGreen
    Commented Feb 1, 2012 at 14:54
  • Also posting the exact regex that you are currently using will help us understand better what you are trying to do and where a fix might need to go.
    – EBGreen
    Commented Feb 1, 2012 at 14:57
  • The regex that I am using is listed and highlighted. The list is space delimited because it was copied from a webpage. In my example data sets, a valid CODE number shouldn't have a space in it. However, my employer has made exceptions to that rule. Any time there is a CODE number with a space, it typically has one space followed by two digits. There have been examples where it trailed with one or three digits, but those are even rarer.
    – TheSavo
    Commented Feb 1, 2012 at 16:39
  • I think I may have found the answer myself. If I run the regex " ([0-9]+) ", which leads with a space and trails with a space, it will only find a number pattern that has a space in front of it and behind it. Since valid code numbers have a decimal point, this only finds what would be the extra digits. I wrapped it in parens and use a back reference, leading it with an underscore and trailing it with a space: '_\1 '. That attaches the extra digits to the parent and delimits the whole string with a space. I then add an underscore to my regex pattern like so: "([0-9]+\.[0-9]+) "
    – TheSavo
    Commented Feb 1, 2012 at 17:44
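
For anyone who wants to try the two-pass idea from the last comment outside Notepad++, here is a minimal Python sketch. The function name `split_codes` is made up for illustration, and it deviates from the comment in one respect: it uses a lookahead instead of consuming the trailing space, so that extra digits at the very end of the input are also caught. It assumes, as the comments describe, that a bare run of digits without a decimal point always belongs to the code before it.

```python
import re


def split_codes(text):
    """Split space-delimited codes, keeping trailing extra digits attached."""
    # Pass 1: a digits-only token preceded by a space and followed by a space
    # (or end of input) is treated as "extra digits"; glue it to the previous
    # code with an underscore. Tokens with a decimal point never match,
    # because the lookahead rejects a following ".".
    glued = re.sub(r" ([0-9]+)(?= |$)", r"_\1", text)

    # Pass 2: the remaining whitespace separates whole codes. Turn the
    # underscore back into the original space for display.
    return [code.replace("_", " ") for code in glued.split()]


if __name__ == "__main__":
    sample = "1221.121 11 111.111111  874.95 1211 456.155"
    for code in split_codes(sample):
        print(code)
```

On the sample line from the question this keeps "11" with 1221.121 and "1211" with 874.95, each on its own output line.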

2 Answers

0

On your test data this regex matches all the numbers perfectly for me:

[0-9]+[.]?[0-9]+

0
  • Ctrl+H
  • Find what: \b\d+(?:\.\d+)?\K\h+
  • Replace with: \n (or \r\n for a Windows line break)
  • check Wrap around
  • check Regular expression
  • Replace all

Explanation:

\b              # word boundary
\d+             # 1 or more digits
(?:             # start non capture group
    \.          # a dot
    \d+         # 1 or more digits
)?              # end group, optional
\K              # forget all we have seen until this position
\h+             # 1 or more horizontal spaces
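
To check what this pattern does outside Notepad++, here is a rough Python port. Python's `re` module supports neither `\K` nor `\h`, so this sketch captures the number and writes it back with `\1`, and approximates `\h` with `[ \t]`; both substitutions are assumptions of the sketch, not part of the answer. The function name `break_after_numbers` is hypothetical.

```python
import re


def break_after_numbers(text):
    """Replace the horizontal whitespace that follows a number with a newline."""
    # \b(\d+(?:\.\d+)?) : a whole number with an optional decimal part (captured)
    # [ \t]+            : the spaces/tabs after it (stand-in for PCRE's \h+)
    # The replacement keeps the number and swaps the whitespace for "\n",
    # which is what \K achieves in the Notepad++ version.
    return re.sub(r"\b(\d+(?:\.\d+)?)[ \t]+", r"\1\n", text)


if __name__ == "__main__":
    sample = "1221.121 11 111.111111  874.95 1211 456.155"
    print(break_after_numbers(sample))
```

Note that the pattern treats every whitespace-delimited number the same way, so on the sample line each token, including "11" and "1211", ends up on its own line.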
