Extract substring from a string using awk

Question

I have a string that can be in one of the following two formats:

dts12931212112 : some random message1 : abc, xyz
nodts : some random message2

I need to extract the substring that excludes the 'dts' part from each of these strings, i.e. it should return:

some random message1 : abc, xyz
some random message2

I need to do this inside a bash script.

Can you help me with an awk command that does this for both kinds of strings?

Avinash Raj · Accepted Answer · 2015-03-09 05:13:30Z

2

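Through awk's gsub function. The first command below leaves a leading space in the output; the second also consumes the blanks after the colon.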

$ awk '{gsub(/^[^:]*dts[^:]*:|:[^:]*dts[^:]*/, "")}1' file
 some random message1 : abc, xyz
 some random message2
$ awk '{gsub(/^[^:]*dts[^:]*:[[:blank:]]*|:[^:]*dts[^:]*/, "")}1' file
some random message1 : abc, xyz
some random message2

You could apply the same regex in sed as well, but you need to enable extended regular expressions with the -r (--regexp-extended) option.
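As a rough sketch of the sed variant, assuming a sed that accepts -E for extended regular expressions (GNU sed also spells this -r), the same pattern can be used in a substitution. The function name strip_dts_sed is only an illustrative helper, not something from the answer.

```shell
#!/usr/bin/env bash
# strip_dts_sed is a hypothetical helper name used only for illustration.
strip_dts_sed() {
  # -E enables extended regular expressions, so | means alternation.
  # The g flag mirrors awk's gsub: replace every match on the line.
  sed -E 's/^[^:]*dts[^:]*:[[:blank:]]*|:[^:]*dts[^:]*//g'
}

# Feed the two sample lines from the question through the filter.
printf '%s\n' \
  'dts12931212112 : some random message1 : abc, xyz' \
  'nodts : some random message2' | strip_dts_sed
```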

^ asserts that we are at the start of the line. [^:]* is a negated character class that matches any character except :, zero or more times. So ^[^:]*dts[^:]*: matches the substring at the start that contains dts, up to and including the first colon. It won't touch such a substring if it appears in the middle. The pattern :[^:]*dts[^:]* matches a middle or last substring that contains dts. Finally, replacing the matched characters with an empty string gives you the desired output.
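To see the pattern described above in action, here is a minimal sketch that runs it over the two sample lines from the question. The wrapper function strip_dts_field is a made-up name for illustration, and the sketch assumes an awk that supports POSIX character classes such as [[:blank:]].

```shell
#!/usr/bin/env bash
# strip_dts_field is a hypothetical helper name used only for illustration.
strip_dts_field() {
  # First alternative: a leading field containing dts, its colon,
  # and any blanks that follow the colon.
  # Second alternative: a later ":"-prefixed field containing dts.
  awk '{ gsub(/^[^:]*dts[^:]*:[[:blank:]]*|:[^:]*dts[^:]*/, "") } 1'
}

printf '%s\n' \
  'dts12931212112 : some random message1 : abc, xyz' \
  'nodts : some random message2' | strip_dts_field
```

Because the function reads standard input, it can be dropped into a pipeline inside a larger script.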

Update (any whitespace may separate the dts token from the message; the colon is optional):

$ awk '{gsub(/^[^[:space:]]*dts[^[:space:]]*[[:space:]:]*|[[:space:]:]*[^[:space:]]*dts[^[:space:]]*/, "")}1' file
some random message1 : abc, xyz
some random message2
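A small sketch of how the updated pattern behaves when the colon after the dts token is missing. The input line without a colon is an assumed example, not one from the question, and strip_dts_token is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# strip_dts_token is a hypothetical helper name used only for illustration.
strip_dts_token() {
  # Remove a whitespace-delimited token containing dts, together with
  # the run of whitespace and colons next to it.
  awk '{ gsub(/^[^[:space:]]*dts[^[:space:]]*[[:space:]:]*|[[:space:]:]*[^[:space:]]*dts[^[:space:]]*/, "") } 1'
}

# The first line has no colon after the dts token (assumed input);
# the second is the nodts sample from the question.
printf '%s\n' \
  'dts12931212112 some random message1 : abc, xyz' \
  'nodts : some random message2' | strip_dts_token
```

Note that this pattern removes any whitespace-delimited token containing dts, so a message word that happens to contain those letters would be removed as well.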

edited Mar 9, 2015 at 5:13

answered Mar 9, 2015 at 4:58


I forgot to mention: treat any whitespace as the separator after the dts number (the colon ':' may or may not be present). In that case this command doesn't work.
– Dharmendra
Commented Mar 9, 2015 at 5:06
Yes. Thanks. Btw, is there any difference in awk between using [:blank:] and [:space:]?
– Dharmendra
Commented Mar 9, 2015 at 5:18
You could also use [[:blank:]] instead of [[:space:]]. [[:blank:]] matches only horizontal whitespace (space and tab), whereas [[:space:]] matches both horizontal and vertical whitespace (space, tab, newline, carriage return, form feed, vertical tab).
– Avinash Raj
Commented Mar 9, 2015 at 5:21
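The difference between the two classes can be checked with a quick sketch that counts matches in a string holding a space, a tab, and a newline. The function name count_blank_vs_space is hypothetical, and the sketch assumes an awk with POSIX character class support.

```shell
#!/usr/bin/env bash
# count_blank_vs_space is a hypothetical helper name used only for illustration.
count_blank_vs_space() {
  awk 'BEGIN {
    s = "a \t\nb"   # a, space, tab, newline, b
    t = s
    # gsub returns the number of substitutions it made.
    blanks = gsub(/[[:blank:]]/, "", s)   # space and tab only
    spaces = gsub(/[[:space:]]/, "", t)   # space, tab, and newline
    print blanks, spaces
  }'
}

count_blank_vs_space   # prints: 2 3
```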


Jotne · Accepted Answer · 2015-03-09 07:49:17Z

1

Here is another awk approach:

awk -F" : " '{$1="";sub(FS,"")}1' OFS=" : " file
some random message1 : abc, xyz
some random message2

This just removes the first field, using " : " as the field separator.

Another version:

awk -F" : " '{print substr($0,index($0,$2))}' file
some random message1 : abc, xyz
some random message2

This prints everything from the second field onward, with " : " as the field separator.
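Since the question asks for use inside a bash script, here is a hedged sketch of wrapping the field-based idea in a function and capturing the result in a variable. It is a variant rather than the exact command above: it computes the start of the second field from length($1) instead of index(), because index() finds the first occurrence of the second field's text, which could fall inside the first field. The name drop_first_field is hypothetical.

```shell
#!/usr/bin/env bash
# drop_first_field is a hypothetical helper name used only for illustration.
drop_first_field() {
  # Split on " : " and print everything after the first separator.
  # The offset is the length of field 1 plus the 3-character separator.
  printf '%s\n' "$1" |
    awk -F' : ' '{ print substr($0, length($1) + length(FS) + 1) }'
}

line='dts12931212112 : some random message1 : abc, xyz'
message=$(drop_first_field "$line")
echo "$message"
```

This sketch assumes every input line contains the " : " separator; a line without it would produce an empty result.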

edited Mar 9, 2015 at 7:49

answered Mar 9, 2015 at 6:26

