
I am trying to parse out two groups of digits that match certain criteria.

Text Sample

KBOS 052354Z 19012KT 10SM FEW075 BKN110 OVC200 24/14 A2975 RMK AO2 SLP074 T02390144 10289 20239 55002

Needed groups to be parsed

10289 20239

Attempted Code

echo "KBOS 052354Z 19012KT 10SM FEW075 BKN110 OVC200 24/14 A2975 RMK AO2 SLP074 T02390144 10289 20239 55002" | grep -E '^1[0-9][0-9][0-9][0-9]'
echo "KBOS 052354Z 19012KT 10SM FEW075 BKN110 OVC200 24/14 A2975 RMK AO2 SLP074 T02390144 10289 20239 55002" | grep -E '^2[0-9][0-9][0-9][0-9]'

What am I doing wrong?

1 Answer

The ^ anchor at the start of your pattern only lets it match at the beginning of the line, and the line begins with KBOS, so nothing matches. Besides, without -o grep does not extract the matches; it prints every whole line that matches the pattern.
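To see both problems in isolation, here is a minimal sketch using the sample line from the question. The variable names (s, anchored_count, whole_line) are only illustrative, and the unanchored pattern in the second command is an assumption made for the demonstration, not something from the question.

```shell
s="KBOS 052354Z 19012KT 10SM FEW075 BKN110 OVC200 24/14 A2975 RMK AO2 SLP074 T02390144 10289 20239 55002"

# Anchored attempt: the line starts with "KBOS", so '^1...' matches nothing.
# grep -c prints the number of matching lines; '|| true' ignores grep's
# non-zero exit status when there is no match.
anchored_count=$(printf '%s\n' "$s" | grep -cE '^1[0-9]{4}' || true)
echo "lines matched by anchored pattern: $anchored_count"

# Anchor removed, but still no -o: grep prints the entire matching line,
# not just the digits.
whole_line=$(printf '%s\n' "$s" | grep -E '1[0-9]{4}')
echo "$whole_line"
```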

Use

grep -oE '\b[12][0-9]{4}\b'

The -o option makes grep print only the matched substrings instead of whole lines, and the pattern matches:

  • \b - word boundary
  • [12] - 1 or 2
  • [0-9]{4} - any four digits
  • \b - word boundary.
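The word boundaries matter for this input. As a rough illustration (assuming a grep that supports -o and \b, such as GNU grep), the sketch below runs the pattern with and without them; the variable names loose and strict are only illustrative. Without the boundaries, digits embedded in longer tokens such as 19012KT are picked up as well.

```shell
s="KBOS 052354Z 19012KT 10SM FEW075 BKN110 OVC200 24/14 A2975 RMK AO2 SLP074 T02390144 10289 20239 55002"

# No word boundaries: five-digit runs inside longer tokens also match.
loose=$(printf '%s\n' "$s" | grep -oE '[12][0-9]{4}')

# With word boundaries: only standalone five-digit groups match.
strict=$(printf '%s\n' "$s" | grep -oE '\b[12][0-9]{4}\b')

echo "without boundaries:"
echo "$loose"
echo "with boundaries:"
echo "$strict"
```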

See an online grep demo:

s="KBOS 052354Z 19012KT 10SM FEW075 BKN110 OVC200 24/14 A2975 RMK AO2 SLP074 T02390144 10289 20239 55002"
grep -oE '\b[12][0-9]{4}\b' <<< "$s"
# Or grep -oE '\<[12][0-9]{4}\>' <<< "$s"

Output:

10289
20239
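If your grep lacks -o or \b (both are extensions rather than POSIX features), one possible alternative is to split the line into one token per line and match whole lines with -x. This is only a sketch under that assumption, and the variable name groups is illustrative.

```shell
s="KBOS 052354Z 19012KT 10SM FEW075 BKN110 OVC200 24/14 A2975 RMK AO2 SLP074 T02390144 10289 20239 55002"

# tr turns each run of spaces into a single newline (one token per line);
# grep -x keeps only the lines that match the pattern in full.
groups=$(printf '%s\n' "$s" | tr -s ' ' '\n' | grep -xE '[12][0-9]{4}')
echo "$groups"
```

Note that this variant works on whitespace-separated tokens, so a group adjacent to punctuation (for example 10289/20239) would be matched by the \b version but not by this one.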
  • Thank you Wiktor! This is precisely what I was looking for.
    – arnpry
    Commented Jun 6, 2019 at 13:46
