
I'm trying to match lines whose first letter is "B" (in this example) and whose second column is "2". When a match is found, characters 38-41 of that line should be replaced with spaces.

Here is the data I'm trying to modify:

A1234A123 1 2 12345.12345 1234.1234.112341234

B1234A123 2 2 12345.12345 1234.1234.112341234

A1234A123 2 2 12345.12345 1234.1234.112341234

I can match the conditions with awk using:

awk '/^B/ && $2=="2" {print}' 

and I can modify the lines with sed using:

sed -r 's/^(.{37})(.{4})/\1    /' 

I'm trying to find the lines in the file that meet both conditions and modify those characters, while still printing, unchanged, every line that doesn't match. Can the two commands be combined to get some sort of if/then behaviour?

I've tried to combine the commands, but it edited all of the lines:

awk '/^B/ && $2=="2" {print}' | sed -r 's/^(.{37})(.{4})/\1    /' data
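
The combined command edits every line because sed is given `data` as a file operand: it reads that file and never looks at the pipe, so the awk filter has no effect (and awk, having no file operand, just waits on standard input). A minimal sketch of that behaviour, using a throwaway two-line file created with `mktemp` (the file and its contents are illustrative, not the real data):

```shell
# Throwaway sample file (illustrative data, not the real input).
tmp=$(mktemp)
printf 'A 1 x\nB 2 y\n' > "$tmp"

# sed has a file operand, so it ignores stdin; the awk filter upstream is irrelevant.
piped=$(printf 'B 2 y\n' | awk '/^B/ && $2=="2"' | sed 's/^./#/' "$tmp")

# The same sed command run directly on the file, with no pipe at all.
direct=$(sed 's/^./#/' "$tmp")

# Both variables hold the same text: every line of the file was edited.
printf '%s\n' "$piped"
rm -f "$tmp"
```

Since both runs give the same result, the filtering has to happen inside a single tool rather than across a pipe.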

Resulting data should look like this:

A1234A123 1 2 12345.12345 1234.1234.112341234

B1234A123 2 2 12345.12345 1234.1234.1    1234

A1234A123 2 2 12345.12345 1234.1234.112341234

Thanks in advance.

4
  • You never need to combine sed and awk (or grep and awk). sed is an excellent tool for simple substitutions on a single line; for any other text manipulation just use awk.
    – Ed Morton
    Commented Dec 29, 2013 at 12:37
  • OK @Ed, thanks for the advice and the corrections to others' posts. I was thinking the solution was more difficult than it ended up being. The more I read on AWK, the more I realize its potential. I'll keep studying! Thanks again.
    – fryman84
    Commented Dec 29, 2013 at 18:46
  • stackoverflow.com/questions/1632113/…
    – fryman84
    Commented Dec 29, 2013 at 19:09
  • The discussion on that link is all fine, but THE important thing to know about the 2 tools is that sed was invented before awk. Once awk was invented in the mid-1970s, most of sed's language constructs became obsolete, so today the only useful sed constructs are s, g, and p (with the -n option). Any time you're using hold space or pattern space or whatever other "space" sed supports, you are using the wrong tool. sed is an excellent tool for simple substitutions on a single line - that's it.
    – Ed Morton
    Commented Dec 29, 2013 at 19:17

4 Answers

3

You can do both in a single awk command:

awk '/^B/ && $2=="2"{$0=substr($0, 1, 37) "    " substr($0, 42)} 1' file
A1234A123 1 2 12345.12345 1234.1234.112341234
B1234A123 2 2 12345.12345 1234.1234.1    1234
A1234A123 2 2 12345.12345 1234.1234.112341234
4
  • Got it! Thanks! I actually just changed the character location of the last substr so that it would retain the last four characters. awk '/^B/ && $2=="2"{$0=substr($0, 1, 37) "    " substr($0, 42, 4)} 1' file
    – fryman84
    Commented Dec 29, 2013 at 7:36
  • You're welcome, yes substr($0, 42, 4) will also return 1234 in output.
    – anubhava
    Commented Dec 29, 2013 at 7:48
  • You don't need the , 4 arg in the last substr().
    – Ed Morton
    Commented Dec 29, 2013 at 12:29
  • Yes, when taking the rightmost part of a string, the length argument isn't needed in substr.
    – anubhava
    Commented Dec 29, 2013 at 12:34
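
The sample rows happen to hold 1234 in both characters 38-41 and 42-45, which hides the difference between the two offsets discussed in these comments. A small sketch with a made-up line (the letters WXYZ mark columns 38-41 and the digits 6789 mark columns 42-45; neither is from the original data) suggests why the tail should start at character 42:

```shell
line='B1234A123 2 2 12345.12345 1234.1234.1WXYZ6789'

# Tail taken from character 42 onward: columns 38-41 are blanked, the rest is kept.
fixed=$(printf '%s\n' "$line" |
  awk '/^B/ && $2=="2" {$0 = substr($0, 1, 37) "    " substr($0, 42)} 1')

# Tail taken from character 38: the blanked text comes back and the real tail is lost.
wrong=$(printf '%s\n' "$line" |
  awk '/^B/ && $2=="2" {$0 = substr($0, 1, 37) "    " substr($0, 38, 4)} 1')

printf '%s\n%s\n' "$fixed" "$wrong"
```

The first result ends in four spaces followed by 6789; the second ends in four spaces followed by WXYZ, so the characters that were meant to be erased reappear and the end of the line is dropped.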
2

You can tell sed to apply the substitution only to the matching lines by prepending an address regex (/^B\S*\s2\s/) to the s command:

sed -r '/^B\S*\s2\s/s/^(.{37}).{4}/\1    /' data
4
  • Watch out: 2 could match 20. No need to group the bit you are going to throw away.
    – potong
    Commented Dec 29, 2013 at 10:47
  • @potong you are absolutely correct. I just copied the OP's regex.
    – Tomas
    Commented Dec 29, 2013 at 10:48
  • Always dangerous to copy the code of the one person in the thread who you KNOW doesn't know how to solve the problem :-).
    – Ed Morton
    Commented Dec 29, 2013 at 13:29
  • @EdMorton not really dangerous in this case ;-)
    – Tomas
    Commented Dec 29, 2013 at 13:30
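
The point about 2 matching 20 can be checked directly. In this sketch the second row is invented (its second column is 20) to show that an address with no delimiter after the 2 also selects it; a literal space is used instead of `\s` so the commands do not rely on GNU extensions:

```shell
rows='B1234A123 2 2 12345.12345 1234.1234.112341234
B1234A123 20 2 12345.12345 1234.1234.112341234'

# Address without a trailing delimiter: "2" also matches the start of "20".
loose=$(printf '%s\n' "$rows" | sed -n '/^B[^ ]* 2/p' | wc -l | tr -d ' ')

# Address with a trailing space: only a second column of exactly "2" is selected.
strict=$(printf '%s\n' "$rows" | sed -n '/^B[^ ]* 2 /p' | wc -l | tr -d ' ')

echo "loose=$loose strict=$strict"
```

The loose address selects both rows, while the strict one selects only the row whose second column is exactly 2. Comparing the field itself, as in the awk condition `$2=="2"`, avoids the issue altogether.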
1

With GNU awk:

gawk '/^B/ && $2=="2" {$0 = gensub(/^(.{37}).{4}/, "\\1    ", 1)} 1' data
0

With GNU Awk version 4 you could try:

gawk 'BEGIN { FIELDWIDTHS = "1 9 1 26 4 20"; OFS="" }
$1=="B" && $3=="2" {
    $5="    "
} 1' file

with output:

A1234A123 1 2 12345.12345 1234.1234.112341234
B1234A123 2 2 12345.12345 1234.1234.1    1234
A1234A123 2 2 12345.12345 1234.1234.112341234
