Use Powershell to replace subsection of regex result

Question

Using Powershell, I know how to search a file for a complicated string using a regex, and replace that with some fixed value, as in the following snippet:

Get-ChildItem  "*.txt" |
Foreach-Object {
    $c = ($_ | Get-Content)
    $c = $c -replace $regexA,'NewText'
    [IO.File]::WriteAllText($_.FullName, ($c -join "`r`n"))
}

Now I'm trying to figure out how to replace a subsection of each match of a regex. Can this be done in one smooth step like above? Or do you have to extract each match of the larger regex, search and replace within it, and then somehow stick that result back into the original text?

To clarify with an example, suppose I want to find only the 14xx-numbered instances such as "TEST=*1404" in the following test text, and replace the 14xx with 16xx:

A 2180 1830 12 0 3 3 TEST=C1404
A 900 1830 12 0 3 3 TEST=R1413
A 400 1830 12 0 3 3 TEST=R1411
A 1090 1970 12 0 3 3 TEST=U1400
A 1090 1970 12 0 3 3 TEST=CSA1400
A 1090 1970 12 0 3 3 TEST=CSA1414
A 1090 1970 12 0 3 3 TEST=CSA140
A 1090 1970 12 0 3 3 TEST=CSA14001
A 1090 1970 12 0 3 3 TEST=CSA17001

I.e. I'd like the resulting text to be as follows, where you'll note that only the first 6 lines should change:

A 2180 1830 12 0 3 3 TEST=C1604
A 900 1830 12 0 3 3 TEST=R1613
A 400 1830 12 0 3 3 TEST=R1611
A 1090 1970 12 0 3 3 TEST=U1600
A 1090 1970 12 0 3 3 TEST=CSA1600
A 1090 1970 12 0 3 3 TEST=CSA1614 <- Second instance of '14' shouldn't change
A 1090 1970 12 0 3 3 TEST=CSA140 <- Shorter numbers shouldn't change
A 1090 1970 12 0 3 3 TEST=CSA14001 <- Longer numbers shouldn't change
A 1090 1970 12 0 3 3 TEST=CSA17001

The following regex seems to do the job of finding the larger strings where I need to make replacements, but I don't know what functionality in Powershell (replace?) to use to just replace the substring of the results. Also, feel free to suggest a better regex if that would help.

$regexA = "\bTEST=\b[A-Za-z]+14\d\d\r"

I'd rather not have to hard-code an exhaustive list of the stuff that can come between the '=' and the numbers, like 'R', 'C', "CSA", etc.

I've been working on something for an hour or so where I get all the matches for the regex, search within them to replace 14 with 16, then run replace on the original text with the old and new values, e.g. replace($myText,"TEST=CSA1400","TEST=CSA1600"), but this is not covering off the special cases very well, and it feels like I'm heading down the rabbit-hole.

Related posts: Replace substring in PowerShell, Partial String Replacement using PowerShell, and Replacing Part of String at Position in Powershell.
– RBT
Commented Apr 11, 2018 at 9:48

Ansgar Wiechers · Accepted Answer · 2019-09-13 11:27:00Z

34

You need to group the sub-expressions you want to preserve (i.e. put them between parentheses) and then reference the groups via the variables $1 and $2 in the replacement string. Try something like this:

$regexA = '( TEST=[A-Za-z]+)14(\d\d)$'

Get-ChildItem '*.txt' | ForEach-Object {
    $c = (Get-Content $_.FullName) -replace $regexA, '${1}16$2' -join "`r`n"
    [IO.File]::WriteAllText($_.FullName, $c)
}
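
As a quick check of the grouped pattern, the sketch below runs it over the nine sample lines from the question, held in an in-memory array rather than read from files (the `$sample` and `$result` names are only for illustration). The braces in `${1}` matter here: written as `$116`, the replacement would be read as a reference to group 116 instead of group 1 followed by the literal 16.

```powershell
# Sample lines from the question, held in memory instead of read from *.txt files.
$sample = @(
    'A 2180 1830 12 0 3 3 TEST=C1404'
    'A 900 1830 12 0 3 3 TEST=R1413'
    'A 400 1830 12 0 3 3 TEST=R1411'
    'A 1090 1970 12 0 3 3 TEST=U1400'
    'A 1090 1970 12 0 3 3 TEST=CSA1400'
    'A 1090 1970 12 0 3 3 TEST=CSA1414'
    'A 1090 1970 12 0 3 3 TEST=CSA140'
    'A 1090 1970 12 0 3 3 TEST=CSA14001'
    'A 1090 1970 12 0 3 3 TEST=CSA17001'
)

$regexA = '( TEST=[A-Za-z]+)14(\d\d)$'

# -replace applied to an array processes each element separately,
# so $ anchors at the end of every individual line.
$result = $sample -replace $regexA, '${1}16$2'
$result
```

Only the first six lines come back changed; `CSA140`, `CSA14001` and `CSA17001` do not match and pass through untouched.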

edited Sep 13, 2019 at 11:27

answered Nov 11, 2013 at 22:30


+1 I was scratching my head over escaping that replacement argument.
– mjolinor
Commented Nov 12, 2013 at 3:52
In your method, is there any danger to adding a -raw to the Get-Content and removing the join from the replacement part?
– SSilk
Commented Nov 12, 2013 at 14:54
@SSilk That should work, too, but you need to replace the $ in the regular expression with something like (\r|$).
– Ansgar Wiechers
Commented Nov 12, 2013 at 15:19
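
A sketch of the `-Raw` variant discussed in this exchange, under the assumption that the file uses CRLF line endings. With the whole file in one string, a plain `$` only matches at the very end, so this version turns on multiline mode with `(?m)`. It also uses a lookahead `(?=\r?$)` rather than a capturing `(\r|$)`, because a captured carriage return becomes part of the match and would have to be re-emitted in the replacement string. The in-memory `$text` stands in for `Get-Content $_.FullName -Raw`; `$regexRaw` and `$newText` are illustrative names.

```powershell
# Stand-in for: $text = Get-Content $_.FullName -Raw   (assumes CRLF line endings)
$text = @(
    'A 2180 1830 12 0 3 3 TEST=C1404'
    'A 1090 1970 12 0 3 3 TEST=CSA1414'
    'A 1090 1970 12 0 3 3 TEST=CSA14001'
) -join "`r`n"

# (?m) lets $ match at the end of every line, not only at the end of the string.
# (?=\r?$) checks for the line end without consuming the carriage return.
$regexRaw = '(?m)( TEST=[A-Za-z]+)14(\d\d)(?=\r?$)'

$newText = $text -replace $regexRaw, '${1}16$2'
$newText
```

Because nothing past the four digits is consumed, the original line endings survive and the result can be written back with `[IO.File]::WriteAllText` without a `-join`.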
Another follow-up question: what if the value I'm sticking in the middle is itself another variable, external to the regex, rather than a fixed value. E.g. if the 16 in your sample code were some variable $number, what do I need to do to get it recognized as such? When I try that with your code, it literally prints the variable name with $. Thanks.
– SSilk
Commented Nov 12, 2013 at 19:28
@SSilk If you want to use regular variables in the replacement string you need to use double quotes instead of single quotes, and escape the variables referencing the groups from the regular expression: "`${1}$number`$2" (the braces keep a numeric $number from being read as part of the group number).
– Ansgar Wiechers
Commented Nov 12, 2013 at 19:42
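
A minimal sketch of the variable case from this exchange. `$number` is the hypothetical external value and `$replacement` is an illustrative name. The backticks stop PowerShell from expanding the group references, while `$number` is expanded as usual. The braces around the first group number matter when `$number` is numeric: without them the replacement string would be `$116$2`, which refers to group 116 rather than group 1.

```powershell
$regexA = '( TEST=[A-Za-z]+)14(\d\d)$'
$number = 16   # hypothetical value supplied from outside the regex

# Backticks keep ${1} and $2 literal for the regex engine;
# $number is expanded by PowerShell before -replace ever sees the string.
$replacement = "`${1}$number`$2"
$replacement   # the string handed to -replace: ${1}16$2

$out = 'A 2180 1830 12 0 3 3 TEST=C1404' -replace $regexA, $replacement
$out
```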


mjolinor · Accepted Answer · 2013-11-12 04:28:03Z

Here's an example using a scriptblock delegate (sometimes called an evaluator):

$regex = [regex]'( TEST=\D+)14(\d{2})\s*$'
$evaluator = { '{0}16{1}' -f $args[0].Groups[1..2] }
filter set-number { $regex.Replace($_, $evaluator) }

foreach ($file in Get-ChildItem '*.txt') {
    ($file | Get-Content) | set-number | Set-Content $file.FullName
}

It's arguably more complex than the -replace operator, but lets you use powershell operators to construct the replacement text, so you can do anything you can put in a script block.
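
To illustrate what a script block allows that a fixed replacement string does not, here is a sketch that computes the new number instead of hard-coding it: every 14xx value has 200 added. The offset of 200, the lookbehind-based pattern, and the `$shift`, `$lines` and `$shifted` names are assumptions made for this example, not part of the answer above.

```powershell
# Matches the four-digit 14xx number at the end of a line, preceded by TEST=<letters>.
$regex = [regex]'(?<=TEST=[A-Za-z]+)14\d{2}$'

# Script block used as a MatchEvaluator: it receives the Match object and
# returns the replacement text. Here the number is shifted by 200 (illustrative).
$shift = [System.Text.RegularExpressions.MatchEvaluator] {
    param($match)
    [string]([int]$match.Value + 200)
}

$lines = 'A 900 1830 12 0 3 3 TEST=R1413', 'A 1090 1970 12 0 3 3 TEST=CSA14001'
$shifted = $lines | ForEach-Object { $regex.Replace($_, $shift) }
$shifted
```

The first line ends in `TEST=R1613`; the second does not match and is returned unchanged.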

Keith Hill · Accepted Answer · 2013-11-11 23:13:12Z

2

Try this:

Get-ChildItem  "*.txt" |
Foreach-Object {
  $c = $_ | Get-Content | Foreach {$_ -replace '(?<=TEST=\D+)14(?=\d{2}(\D+|$))','16'}
  $c | Out-File $_.FullName -Enc Ascii
}
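
The pattern above relies on lookarounds: `(?<=...)` and `(?=...)` assert what must come before and after the 14 without making that text part of the match, so the replacement is just '16' and no groups need to be re-emitted. A small sketch on three of the question's sample lines (the `$lookaround` and `$out` names are illustrative):

```powershell
# Lookbehind: TEST= followed by non-digits must precede the 14.
# Lookahead: exactly two more digits, then a non-digit or the end of the line.
$lookaround = '(?<=TEST=\D+)14(?=\d{2}(\D+|$))'

$out = @(
    'A 2180 1830 12 0 3 3 TEST=C1404'
    'A 1090 1970 12 0 3 3 TEST=CSA1414'
    'A 1090 1970 12 0 3 3 TEST=CSA140'
) -replace $lookaround, '16'
$out
```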

edited Nov 11, 2013 at 23:13

answered Nov 11, 2013 at 22:37


$f = $_.FullName; (Get-Content $f) -replace ... | Out-File $f ... is probably a more elegant approach.
– Ansgar Wiechers
Commented Nov 12, 2013 at 11:47
