Get the character that precedes each occurrence of a given character/pattern in a string

Question

I'm trying to get the character that precedes each occurrence of a given character/pattern in a string, using standard bash tools such as grep, awk/gawk, sed ...

Step I: Get the character that precedes each occurrence of the character :

Example:

String 1 => :hd:fg:kl:

String 2 => :df:lkjh:

String 3 => :glki:l:s:d:

Expected results

Result 1 => dgl

Result 2 => fh

Result 3 => ilsd

I tried many times with awk, but without success.

Step II: Insert a given character between each character of the resulting string

Example with /

Result 1 => d/g/l

Result 2 => f/h

Result 3 => i/l/s/d

I have an awk expression for this step: awk -F '' -v OFS="/" '{$1=$1;print}'

I don't know whether Step I can be done with awk or sed, or, better still, whether Step I and Step II can be done in one go.

Kind Regards

You might find this question helpful stackoverflow.com/questions/2777579/… — Leonard, Commented Jul 4, 2018 at 1:01
You should include a case of back-to-back colons in your sample input/output (e.g. foo::bar) if it can occur as that could be hard to handle depending on your requirements for doing so. Is the output o or o: or something else? If it cannot happen then add a statement to your question saying so. — Ed Morton, Commented Jul 4, 2018 at 11:34
@RavinderSingh13 I apologize for my late answer because I was offline — moocan, Commented Jul 7, 2018 at 1:12
@Ed Morton, that cannot happen in my case ... but it's very good advice — moocan, Commented Jul 7, 2018 at 1:14

Lacobus · Accepted Answer · 2018-07-04 02:10:34Z

1

What about:

awk 'BEGIN{FS=":"}{for(i=1;i<NF;i++){if(i>2)printf"/";printf "%s",substr($i,length($i))}print""}' input.txt

input.txt:

:hd:fg:kl:
:df:lkjh:
:glki:l:s:d:

Output:

d/g/l
f/h
i/l/s/d

edited Jul 4, 2018 at 2:10

answered Jul 4, 2018 at 2:06


RavinderSingh13 · Accepted Answer · 2018-07-04 02:30:34Z

1

Solution 1: Could you please try the following and let me know if it helps.

awk -F":" '
{
  for(i=1;i<=NF;i++){
    if($i){ val=(val?val:"")substr($i,length($i)) }
  }
  print val;
  val=""
}' Input_file

Output will be as follows.

dgl
fh
ilsd

Solution 2: With a / between the output characters.

awk '
BEGIN{
  OFS="/";
  FS=":"
}
{
  for(i=1;i<=NF;i++){
    if($i){
      val=(val?val OFS:"")substr($i,length($i))
    }}
  print val;
  val=""
}' Input_file

Output will be as follows.

d/g/l
f/h
i/l/s/d

Solution 3: With awk's match function.

awk '
{
  while(match($0,/[a-zA-Z]:/)){
    val=(val?val:"")substr($0,RSTART,RLENGTH-1)
    $0=substr($0,RSTART+RLENGTH)
  }
  print val
  val=""
}' Input_file

edited Jul 4, 2018 at 2:30

answered Jul 4, 2018 at 2:07


1

In my question, all my examples end with ":", which was a mistake on my part, because the string can also end with a letter. For a pattern such as ":hfd:l:jh:m", your first solution outputs "dlhm" and your second outputs "d/l/h/m". Your third solution works well: it outputs "dlh".
– moocan
Commented Jul 7, 2018 at 1:48
@moocan, sure, thanks for letting me know. Please keep the question's samples in line with your actual requirements, because solutions are written against the samples you give. Cheers and happy learning.
– RavinderSingh13
Commented Jul 7, 2018 at 2:19
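
For input that may end in a letter rather than a ":", one possible adjustment to the field-based approach is to skip the last field, since nothing after the final ":" is followed by a colon. The following is only a sketch under that assumption; the function name chars_before_colon and its optional separator argument are hypothetical and not part of any answer above.

```shell
# Hypothetical helper: for each input line, print the last character of
# every non-empty field that is followed by a ":" (all fields but the last).
# The optional first argument is the separator placed between characters.
chars_before_colon() {
  awk -F':' -v sep="${1-}" '{
    out = ""
    for (i = 1; i < NF; i++) {          # i < NF: the last field has no ":" after it
      if ($i == "") continue            # skip empty fields (leading ":")
      if (out != "") out = out sep      # separator only between characters
      out = out substr($i, length($i))  # last character of the field
    }
    print out
  }'
}

printf '%s\n' ':hfd:l:jh:m' ':hd:fg:kl:' | chars_before_colon /
```

Called without an argument, the same function prints the characters with no separator, which covers Step I on its own.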


potong · Accepted Answer · 2018-07-04 01:43:42Z

0

This might work for you (GNU sed):

sed -r 's/[^:]*([^:]):+|:+/\1/g;s/\B/\//g' file

Globally throughout the line, replace either zero or more non-: characters followed by a single non-: character and one or more :'s, or a run of :'s on its own, with the single character (or with nothing, for the lone :'s). Then insert a / between each character.

answered Jul 4, 2018 at 1:43


For a pattern such as ":hfd:l:jh:m", the output is "d/l/ h/m". In my question, all my examples end with ":", which was a mistake on my part, because the string can also end with a letter.
– moocan
Commented Jul 7, 2018 at 1:32


James Brown · Accepted Answer · 2018-07-04 03:34:32Z

0

Perl and negative lookahead:

$ perl -p -e 's/.(?!:)//g' file
dgl
fh
ilsd
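
The output above covers Step I only. One way to add Step II is a small sed filter that appends the separator to every character and then drops the trailing one. This is just a sketch; the function name join_chars is hypothetical, and the separator is fixed to / here.

```shell
# Hypothetical helper for Step II: put a "/" between the characters of
# each input line by appending "/" to every character and then removing
# the one left over at the end of the line.
join_chars() {
  sed 's/./&\//g; s/\/$//'
}

# Fed with the Step I results shown above.
printf '%s\n' 'dgl' 'fh' 'ilsd' | join_chars
```

Under the same assumptions, the Perl command could be piped straight into it: perl -p -e 's/.(?!:)//g' file | join_chars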

answered Jul 4, 2018 at 3:34


Sundeep · Accepted Answer · 2018-07-04 03:40:44Z

0

This is easier to do with Perl:

$ cat ip.txt
:hd:fg:kl:
:df:lkjh:
:glki:l:s:d:

$ perl -lne 'print join "/", /.(?=:)/g' ip.txt
d/g/l
f/h
i/l/s/d

/.(?=:)/g gets all characters that precede a :
- (?=:) is a lookahead construct
the resulting matches are then printed using / as the delimiter string

edited Jul 4, 2018 at 3:40

answered Jul 4, 2018 at 3:35


Works very well with all my test patterns, even when the pattern ends with a letter rather than with ":". Thanks
– moocan
Commented Jul 7, 2018 at 2:01


ctac_ · Accepted Answer · 2018-07-04 08:56:24Z

0

Entirely in sed, using ERE:

sed -E 's#[^:]*(.):#\1/#g;s/^.|.$//g' infile
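
To see what each of the two substitutions contributes, it may help to run them separately. The sketch below splits the command into two functions; the names stage1 and stage2 are hypothetical and exist only for illustration.

```shell
# Stage 1: keep the last character before each ":" and put a "/" after it.
stage1() {
  sed -E 's#[^:]*(.):#\1/#g'
}

# Stage 2: strip the first and the last character of the line
# (the leading ":" and the trailing "/").
stage2() {
  sed -E 's/^.|.$//g'
}

printf '%s\n' ':hd:fg:kl:' | stage1            # prints :d/g/l/
printf '%s\n' ':hd:fg:kl:' | stage1 | stage2   # prints d/g/l
```

Because the second stage removes the last character unconditionally, this approach appears to rely on every line ending in ":"; a line ending in a letter would likely need a different cleanup step.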

answered Jul 4, 2018 at 8:56


Chris Noyes · Accepted Answer · 2018-07-04 23:29:39Z

0

Using GNU sed:

sed -E 's/[^:]*([^:]):/\1/g; s/([^:])/\/\1/g; s/^:\///'

The first command, s/[^:]*([^:]):/\1/g, strips out the extra characters and the colons (except the first one), and so yields this:

:dgl
:fh
:ilsd

The second command s/([^:])/\/\1/g inserts a / before each character, yielding:

:/d/g/l
:/f/h
:/i/l/s/d

The last command s/^:\/// simply removes the :/ from the beginning of each line:

d/g/l
f/h
i/l/s/d

answered Jul 4, 2018 at 23:29


dr-who · Accepted Answer · 2019-06-29 04:25:56Z

0

You could iterate over each line with gawk, starting at the second character. Every time the iterator hits a colon, print the previous character.

$ awk <file.txt '{for(i=2;i<=length($0);i++) { \
                    if (substr($0,i,1)==":") printf "%s", substr($0,i-1,1);} printf "\n";}'
dgl
fh
ilsd

answered Jun 29, 2019 at 4:25

