
I have a string that can be in one of the following two formats:

dts12931212112 : some random message1 : abc, xyz
nodts : some random message2

I need to extract, from each of these strings, the substring that does not contain the 'dts' part, i.e. it should return:

some random message1 : abc, xyz
some random message2

I need to do this inside a bash script.

Can you help me with an awk command that does this for both kinds of strings?

2 Answers


Through awk's gsub function.

$ awk '{gsub(/^[^:]*dts[^:]*:|:[^:]*dts[^:]*/, "")}1' file
 some random message1 : abc, xyz
 some random message2
$ awk '{gsub(/^[^:]*dts[^:]*:[[:blank:]]*|:[^:]*dts[^:]*/, "")}1' file
some random message1 : abc, xyz
some random message2

You could apply the same regex in sed as well, but you need to enable extended regular expressions with -r (--regexp-extended) in GNU sed; -E is the more portable spelling.
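As a rough sketch of that sed variant (not from the original answer; strip_dts_sed is a hypothetical helper name, and it assumes a sed that accepts -E):

```shell
# strip_dts_sed is a hypothetical helper name used only for illustration.
# -E turns on extended regular expressions so that | works as alternation.
strip_dts_sed() {
  sed -E 's/^[^:]*dts[^:]*:[[:blank:]]*|:[^:]*dts[^:]*//g'
}

# Feed both sample lines from the question through the helper.
printf '%s\n' 'dts12931212112 : some random message1 : abc, xyz' \
              'nodts : some random message2' | strip_dts_sed
```

The g flag plays the same role as gsub in awk: every match on the line is removed, not just the first one.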

^ asserts that we are at the start of the line. [^:]* is a negated character class that matches any character except :, zero or more times. So ^[^:]*dts[^:]*: matches a leading substring that contains dts, up to and including the first colon; it does not match when that substring is in the middle of the line. The second alternative, :[^:]*dts[^:]*, matches a middle or trailing substring that contains dts, together with the colon before it. Replacing the matched characters with an empty string gives the desired output.
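Since the question asks for this inside a bash script, here is a minimal sketch of applying the second command above to a string held in a shell variable. The function name strip_dts and the variable msg are made up for the example.

```shell
# strip_dts is a hypothetical helper name, not part of the original answer.
strip_dts() {
  # Pass the string to awk on stdin; printf avoids echo's escape quirks.
  printf '%s\n' "$1" |
    awk '{gsub(/^[^:]*dts[^:]*:[[:blank:]]*|:[^:]*dts[^:]*/, "")}1'
}

# Capture the cleaned-up message in a variable for later use in the script.
msg=$(strip_dts 'dts12931212112 : some random message1 : abc, xyz')
echo "$msg"
```

Command substitution strips the trailing newline, so msg holds just the message text.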

Update:

$ awk '{gsub(/^[^[:space:]]*dts[^[:space:]]*[[:space:]:]*|[[:space:]:]*[^[:space:]]*dts[^[:space:]]*/, "")}1' file
some random message1 : abc, xyz
some random message2
  • I forgot to mention: treat any whitespace after the dts number as the separator (the colon ':' may or may not be present). In that case this command doesn't work.
    – Dharmendra
    Commented Mar 9, 2015 at 5:06
  • Yes. Thanks. Btw is there any difference in awk in using [:blank:] and [:space:]?
    – Dharmendra
    Commented Mar 9, 2015 at 5:18
  • You could also use [[:blank:]] instead of [[:space:]]. [[:blank:]] matches only horizontal whitespace (space and tab), while [[:space:]] matches both horizontal and vertical whitespace (space, tab, newline, carriage return, vertical tab, form feed).
    Commented Mar 9, 2015 at 5:21
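To see the difference between the two classes, a small sketch like the following may help. It assumes an awk that supports POSIX character classes (as the commands above already do); count_ws is a hypothetical helper name.

```shell
# count_ws is a hypothetical helper: it reports how many characters in each
# input line match [[:blank:]] and how many match [[:space:]].
count_ws() {
  awk '{
    s = $0; nb = gsub(/[[:blank:]]/, "", s)   # space and tab only
    t = $0; ns = gsub(/[[:space:]]/, "", t)   # also carriage return etc.
    print "blank:", nb, "space:", ns
  }'
}

# The input line contains one space, one tab and one carriage return.
printf 'a b\tc\r\n' | count_ws
```

The carriage return is counted only by [[:space:]], which matters for files with Windows line endings.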

Here is another awk approach:

awk -F" : " '{$1="";sub(FS,"")}1' OFS=" : " file
some random message1 : abc, xyz
some random message2

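This just removes the first field, using " : " as the field separator. Emptying $1 rebuilds the record with OFS, which leaves a leading " : "; sub(FS,"") then strips it.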
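To make the two steps visible, here is a small sketch that prints the record after each one. The brackets are only there to show leading whitespace, and show_steps is a hypothetical helper name.

```shell
# show_steps is a hypothetical helper that prints the record after each step.
show_steps() {
  awk -F' : ' -v OFS=' : ' '{
    $1 = "";       print "[" $0 "]"   # record rebuilt with OFS, leading " : "
    sub(FS, "");   print "[" $0 "]"   # first " : " removed
  }'
}

printf '%s\n' 'nodts : some random message2' | show_steps
```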


Another version:

awk -F" : " '{print substr($0,index($0,$2))}' file
some random message1 : abc, xyz
some random message2

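This prints everything from the start of the second field onward, using " : " as the field separator. Note that index() finds the first occurrence of the second field's text, so the output starts too early if that text also appears inside the first field.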
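If the second field's text might also occur inside the first field, one possible variation (a sketch, not part of the original answer) is to skip by length instead of searching. It assumes the separator is the literal three-character " : ", so length(FS) is its real width; after_first_field is a hypothetical helper name.

```shell
# after_first_field is a hypothetical helper name.
after_first_field() {
  # Start right after the first field plus the literal " : " separator.
  awk -F' : ' '{print substr($0, length($1) + length(FS) + 1)}'
}

# Here the second field "b" also occurs inside the first field "abc".
printf '%s\n' 'abc : b : c' | after_first_field     # prints "b : c"
printf '%s\n' 'abc : b : c' |
  awk -F' : ' '{print substr($0, index($0, $2))}'    # prints "bc : b : c"
```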
