0

I am trying to swap the h and w in "hello world" to get "wello horld" using backreferences. I am able to capture the groups, but something goes wrong when I refer to them in the sub() method.

import re
st = "hello world"
t = re.compile(r"(\w).+\s(\w).+")
res = t.sub(r"\2 \1",st)
print(res)

I get "w h" as output instead of the desired string. What am I missing?

2
  • 5
    `r"\2 \1"` is the complete string that gets output, so you'll also need to capture the parts you're not swapping and output those. Commented Jul 3, 2022 at 9:46
  • 2
    So that would be t = re.compile(r"(\w)(.+\s)(\w)"), with r"\3\2\1" as the replacement. Commented Jul 3, 2022 at 9:47
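To make the point in the comments concrete, here is a minimal sketch, assuming only Python's standard re module. It first shows that the question's pattern matches the whole string (so sub() replaces all of it), then applies the three-group pattern suggested above. The names original, fixed, and m are just illustrative.

```python
import re

st = "hello world"

# The pattern from the question matches the entire string,
# so sub() replaces the entire string with the replacement text.
original = re.compile(r"(\w).+\s(\w).+")
m = original.match(st)
print(m.group(0))   # hello world
print(m.groups())   # ('h', 'w')

# The pattern from the comment also captures the text between the two
# letters, and it stops matching right after the second letter.
fixed = re.compile(r"(\w)(.+\s)(\w)")
res = fixed.sub(r"\3\2\1", st)
print(res)          # wello horld
```

Because the second pattern ends at the "w", the trailing "orld" is never part of the match and stays where it is.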

2 Answers

1

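The problem with your pattern is that it matches the entire string, so sub() replaces all of it with just the two captured letters. You could capture two groups for each word: the first character and the rest of the word. In fact, for the second word it is enough to capture only the first letter, because whatever is left unmatched stays in place.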

import re

st = "hello world"
output = re.sub(r'(\w)(\w*) (\w)', r'\3\2 \1', st)
print(output)  # wello horld
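To see what each group holds, a quick check along these lines may help. It is only a sketch using the same pattern and the standard re module, and the variable name m is illustrative.

```python
import re

st = "hello world"

# Inspect the match produced by the pattern used above.
m = re.search(r'(\w)(\w*) (\w)', st)
print(m.group(0))  # hello w
print(m.groups())  # ('h', 'ello', 'w')
```

The match ends after the "w", which is why the rest of the second word does not need a group of its own.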
-1

You are not capturing the first match correctly. The following will work for you:

import re st = "hello world" t = re.compile(r"(?<=\w)\.(?=\w)") res = t.sub(r"\2 \1",st) print(res)

Explanation: (?<=\w) is a positive look-behind assertion which ensures that we have at least one word character before us while matching any char but a whitespace or end of line, and similarly (?=\w) is a positive look-ahead assertion which makes sure that there's a word char after our current position. Also note how . matches both a literal dot as well as a special regex metacharacter. Hope it helps!

2
  • 3
    Why is all the code in the same line?
    – mkrieger1
    Commented Jul 3, 2022 at 9:49
  • 2
    There is no dot in hello world, why are you matching a dot? Commented Jul 3, 2022 at 10:37
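The objection in these comments can be checked directly. The following is a rough sketch, assuming Python 3's standard re module. It runs the pattern from this answer against the same input and records what happens. The name error_message is only illustrative.

```python
import re

st = "hello world"
t = re.compile(r"(?<=\w)\.(?=\w)")

# The pattern contains lookarounds but no capturing groups.
print(t.groups)      # 0

# \. matches only a literal dot, and the input contains none.
print(t.search(st))  # None

# The replacement refers to groups that do not exist, so sub() raises.
try:
    t.sub(r"\2 \1", st)
except re.error as exc:
    error_message = str(exc)
    print("re.error:", error_message)
```

So this pattern cannot produce "wello horld": it never matches the input, and the backreferences \1 and \2 have no groups to refer to.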
