46

Say I have a string

"3434.35353"

and another string

"3593"

How do I write a single regular expression that matches both, without having to fall back to a different pattern when the first one fails? I know \d+ would match 3593, but it would not capture all of 3434.35353, while (\d+\.\d+) would only match the string with the decimal and find no match for 3593.

I expect m.group(1) to return:

"3434.35353"

or

"3593"

5 Answers

79

You can put a ? after a group of characters to make it optional.

You want a dot followed by one or more digits, \.\d+; grouped together, (\.\d+); and made optional, (\.\d+)?. Stick that in your pattern:

import re

# Raw strings keep the backslashes intact; print() is the Python 3 form.
print(re.match(r"(\d+(\.\d+)?)", "3434.35353").group(1))  # 3434.35353
print(re.match(r"(\d+(\.\d+)?)", "3434").group(1))        # 3434
0
5

This regex should work:

\d+(\.\d+)?

It matches one or more digits (\d+), optionally followed by a dot and one or more digits ((\.\d+)?).
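
As a minimal sketch of how this pattern behaves on the two strings from the question: the helper name `extract_number` is hypothetical, and using `re.fullmatch` is an assumption on my part so that strings with trailing junk are rejected rather than partially matched.

```python
import re

# Outer group captures the whole number; the inner group is the optional fraction.
NUMBER = re.compile(r"(\d+(\.\d+)?)")

def extract_number(text):
    """Return the number if the whole string is one, else None (hypothetical helper)."""
    m = NUMBER.fullmatch(text)
    return m.group(1) if m else None

print(extract_number("3434.35353"))  # 3434.35353
print(extract_number("3593"))        # 3593
print(extract_number("3593."))       # None: a dot must be followed by digits
```

With `re.match` instead of `re.fullmatch`, the last call would return "3593", because the optional group is simply skipped and the trailing dot is left unmatched.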

3

Use (?:<characters>|), replacing <characters> with the text to make optional. The empty alternative after the | lets the group match nothing. I tested this in the Python shell and got the following result:

>>> import re
>>> s = re.compile('python(?:3|)')
>>> s
re.compile('python(?:3|)')
>>> re.match(s, 'python')
<re.Match object; span=(0, 6), match='python'>
>>> re.match(s, 'python3')
<re.Match object; span=(0, 7), match='python3'>
1
  • 1
    I prefer this method when I don't want nested capture groups. For example, if I would like to capture all of the digits and '.' inside parentheses for '(1)' or '(333.333.333)', I want the entire string (and more things after). I can treat them both as one group for later iteration.
    – Clay
    Commented Oct 17, 2019 at 1:43
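
A small sketch of the point in the comment above, assuming the goal is to pull the dotted number out of the parentheses as a single group. The pattern is only an illustration, and it uses `(?:...)*` rather than `(?:...|)` so that several dotted parts are allowed; the sample strings come from the comment.

```python
import re

# (?:...) groups without capturing, so group(1) is the only group in the match.
# The * allows any number of ".digits" runs, including none.
pattern = re.compile(r"\((\d+(?:\.\d+)*)\)")

for text in ["(1)", "(333.333.333)"]:
    m = pattern.search(text)
    print(m.group(1), m.groups())
# 1 ('1',)
# 333.333.333 ('333.333.333',)
```

Because the inner group does not capture, `m.groups()` holds one entry per match, which is convenient when iterating over results later.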
2

Use the "one or zero" quantifier, ?. Your regex becomes: (\d+(\.\d+)?).

See Chapter 8 of the TextWrangler manual for more details about the different quantifiers available, and how to use them.

1

Read up on the Python RegEx library. The link answers your question and explains why.

However, to match one or more digits with an optional decimal part, you can use

re.compile(r"(\d+(\.\d+)?)")

In this example, the ? after the (\.\d+) capture group specifies that this portion is optional.

Example
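
The original example is not available here, so the following is only a sketch of what "optional" means for the match object, using the two strings from the question. When the optional group does not take part in the match, Python reports it as `None`.

```python
import re

pattern = re.compile(r"(\d+(\.\d+)?)")

m = pattern.match("3593")
print(m.group(1))  # 3593
print(m.group(2))  # None, because the optional group did not take part in the match

m = pattern.match("3434.35353")
print(m.group(1))  # 3434.35353
print(m.group(2))  # .35353
```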
