Grep for a line containing only 5 or 6 numbers

Question

How would you grep for a line containing only 5 or 6 numbers? Something like this.

case 1 (has leading space)

           10      2       12      1       13

case 2 (no leading space)

   1       2       3       4       5        6

I thought something like this would work.

grep -E '[0-9]{5}'

@WarrenYoung no. I'm just trying to filter through massive debugging files. — cokedude, Commented Dec 4, 2014 at 7:20
@cokedude Can you post some more example lines, preferably a representative sample as to what should and should not match? — muru, Commented Dec 4, 2014 at 7:27
@muru yours grabs everything I want. It also creates a line separator from my original numbers which makes it even more readable than I expected. — cokedude, Commented Dec 4, 2014 at 8:03
@cokedude what if a line contains more than six numbers, e.g. 1 2 3 4 5 6 7? Should this line be present in the output or not? — αғsнιη, Commented Dec 4, 2014 at 9:59

muru · Accepted Answer · 2014-12-04 07:21:16Z

9

grep -E '[0-9]{5}' looks for a run of 5 consecutive digits, i.e. a single number with at least 5 digits. What you need is 5 numbers, each with at least one digit:

grep -E '[0-9]+([^0-9]+[0-9]+){4}'

- [0-9]+ : a number of at least one digit.
- [^0-9]+[0-9]+ : a number of at least one digit, preceded by at least one non-digit character. Repeating this 4 times gives 5 numbers separated by non-digits.
- If the requirement is exactly 5, surround this regex with [^0-9]* and anchor it with ^ and $, so that the entire line has to match.
- Depending on what you want here (does 1,2,3,4,6 qualify?), you might look at other separators. For example, a proper scientific-notation real number would look like: [+-]?(([0-9]+(\.[0-9]+)?)|([0-9]?\.[0-9]+))([eE][+-][0-9]+)? So separators may not include ., e, etc. They may only be whitespace, as mikeserv notes. Or they may be commas, if it's a CSV record. Or, depending on the locale, a comma could be the decimal separator. Vary [^0-9] as per your need.
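
As a rough illustration of the anchoring suggestion, the sketch below compares the unanchored pattern with an anchored variant. The sample lines and the function names sample, loose and strict are made up for this example, and {4,5} is an assumption about how to express "5 or 6 numbers".

```shell
# Hypothetical sample input: 5 numbers, 6 numbers, 7 numbers, 4 numbers.
sample() {
  printf '%s\n' \
    '           10      2       12      1       13' \
    '1       2       3       4       5        6' \
    '1 2 3 4 5 6 7' \
    '1 2 3 4'
}

# Unanchored: matches any line that contains at least 5 numbers.
loose() { grep -E '[0-9]+([^0-9]+[0-9]+){4}'; }

# Anchored: the whole line must consist of exactly 5 or 6 numbers,
# with optional non-digits before the first and after the last one.
strict() { grep -E '^[^0-9]*[0-9]+([^0-9]+[0-9]+){4,5}[^0-9]*$'; }

sample | loose     # also prints the 7-number line
sample | strict    # prints only the 5- and 6-number lines
```

Because the separator is still [^0-9]+, a line such as 111 222 333 444 A555 passes the anchored variant too; swapping [^0-9] for a whitespace class tightens that.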

edited Dec 4, 2014 at 7:21

answered Dec 4, 2014 at 7:05

muru

2

I don't have any commas between my numbers. I just have a bunch of space between my numbers. I did make a couple of trivial tweaks to also grep for fault and Fault. grep -E '[0-9]+([^0-9]+[0-9]+){4}|fault|Fault'
– cokedude
Commented Dec 4, 2014 at 8:07
yours works the way I want. I just added more description to clarify for cuonglm.
– cokedude
Commented Dec 4, 2014 at 8:29
@muru this returns 1 2 3 4 5 6 7 and 111 222 333 444 A555 as results, but those do not actually contain only 5 or 6 numbers.
– αғsнιη
Commented Dec 4, 2014 at 16:32
@KasiyA, see the 3rd bullet-point in the answer.
– Stéphane Chazelas
Commented Dec 4, 2014 at 16:45
1

@KasiyA if you insist: grep -P '^\s*\d+(\s+\d+){4,5}\s*$'
– muru
Commented Dec 4, 2014 at 16:51

Joseph R. · Accepted Answer · 2014-12-04 06:57:32Z

3

I would go with something a little more powerful than grep. This can do it in perl:

perl -ne 'print if s/\d+/$&/g == 5' your_file

The regex substitution replaces every group of one or more digits (\d+) with itself ($&), so the line is left unchanged. It is used merely for its side effect: the s/// operator returns the number of substitutions it made. Thus, the line is printed only if s/// found exactly 5 groups of digits.

answered Dec 4, 2014 at 6:57

Joseph R.

This will match a line like 1a 1 1 1 1
– cuonglm
Commented Dec 4, 2014 at 7:07
@cuonglm I thankfully don't have any numbers like that. All of my important numbers are on their own line and separated by spaces.
– cokedude
Commented Dec 4, 2014 at 8:10
@josephr Can you please explain how yours works? It doesn't grab this massive list of numbers 1 2 3 4 5 6 7 8 9 10 11, but it does grab this 1 2 3 4 5 6. It does chop off the 1 though.
– cokedude
Commented Dec 4, 2014 at 8:13

cuonglm · Accepted Answer · 2014-12-04 08:35:18Z

2

Another perl:

$ perl -MList::Util=first -Tnle '
  s/^\s+|\s+$//g;
  @e = split /\s+/;
  print if @e == 5 || @e == 6 and !first {/\D/} @e;
' file
10      2       12      1       13

Explanation

s/^\s+|\s+$//g trims leading and trailing whitespace from the line.
@e = split /\s+/ splits the line into the array @e.
We print the line if:
- the array @e contains 5 or 6 elements, and
- none of its elements contains a non-digit character (\D matches a non-digit character).

edited Dec 4, 2014 at 8:35

answered Dec 4, 2014 at 7:26

cuonglm

How does it work?
– muru
Commented Dec 4, 2014 at 7:54
@muru: Added explanation.
– cuonglm
Commented Dec 4, 2014 at 8:12
@cuonglm is \D wrong? I wanted it to match digits. Right now it doesn't print anything.
– cokedude
Commented Dec 4, 2014 at 8:18
@cokedude: Does your input contain leading spaces?
– cuonglm
Commented Dec 4, 2014 at 8:20
@cuonglm sorry I wasn't clear. The 5 word lines have several leading spaces. The 6 word lines have no leading space. And can you add one easy piece to also match fault or Fault?
– cokedude
Commented Dec 4, 2014 at 8:31

Laurentiu Roescu · Accepted Answer · 2014-12-04 15:50:14Z

2

grep -E '^(\s*[0-9]+\s+){4,5}[0-9]+\s*$'
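
The pattern comes without an explanation, so here is one possible reading of it as a sketch. It swaps \s, which is a GNU grep extension, for the POSIX class [[:space:]]; the function name only_5_or_6 and the sample lines are invented for this example.

```shell
# ^ ... $                                anchor the match to the whole line
# ([[:space:]]*[0-9]+[[:space:]]+){4,5}  4 or 5 numbers, each followed by whitespace
# [0-9]+[[:space:]]*                     the last number, then optional trailing whitespace
only_5_or_6() {
  grep -E '^([[:space:]]*[0-9]+[[:space:]]+){4,5}[0-9]+[[:space:]]*$'
}

printf '%s\n' \
  '           10      2       12      1       13' \
  '1       2       3       4       5        6' \
  '1 2 3 4 5 6 7' \
  '111 222 333 444 A555' \
  '1 2 3 4' | only_5_or_6
```

Only the first two sample lines get through: since whitespace is the only separator allowed, the 7-number line and the line containing A555 are both rejected.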

answered Dec 4, 2014 at 15:50

Laurentiu Roescu

Community · Accepted Answer · 2017-04-13 12:37:02Z

2

awk '{l=$0; n = gsub(/[0-9]+/, "", l)}; n == 5 || n == 6'

(same principle as in Joseph's answer)
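
A small sketch of what gsub() is counting here: it returns the number of substitutions, and working on the copy l leaves $0 untouched for printing. The function name count_numbers and the sample lines are made up for this example.

```shell
# Prefix every line with the number of digit groups that gsub() removed
# from the copy l; $0 itself is not modified.
count_numbers() { awk '{l = $0; n = gsub(/[0-9]+/, "", l); print n ": " $0}'; }

printf '%s\n' '10 2 12 1 13' '1 2 3 4 5 6' '1 2 3 4a 5' 'fault' | count_numbers
```

A line such as 1 2 3 4a 5 is reported with a count of 5, because the count looks only at digit groups and not at what separates them.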

edited Apr 13, 2017 at 12:37

answered Dec 4, 2014 at 15:08

Stéphane Chazelas

I take it the principle is the same as in this answer?
– muru
Commented Dec 4, 2014 at 16:56
it matches with 4a too..
– JigarGandhi
Commented Dec 10, 2014 at 7:29
@JigarGandhi, yes, nobody said it shouldn't or how the numbers may or may not be separated.
– Stéphane Chazelas
Commented Dec 10, 2014 at 8:17

Community · Accepted Answer · 2020-06-11 12:04:56Z

0

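A way with awk that is customisable for different numbers of fields.
The amount of whitespace between or around the fields does not matter.

awk 'NF~/^(5|6)$/{x=0;for(i=1;i<=NF;i++)x+=($i~/^[0-9]+$/);if(x==NF)print}' file

This checks that the number of fields is 5 or 6; other field counts can be added to the alternation if your requirements ever change.

Then it sets a counter to 0.

Then it loops over the fields, adding 1 to the counter for each field that is a number.

If the counter equals the number of fields, it prints the line. This test sits inside the action block so that a line with a different field count (including an empty line, where NF is 0) can never be printed because of a stale or unset counter.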

example

input

  1       2       3       4       5        6
        2       3       4       5        6
3       4       5        6
blah  1       2       3       4       5
      4       3324       4       53        6

output

  1       2       3       4       5        6
        2       3       4       5        6
      4       3324       4       53        6
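
To show what "customisable" could look like in practice, the sketch below takes the allowed field counts as parameters. The function name numeric_lines and the awk variables min, max and ok are assumptions made for this example, not part of the command above.

```shell
# Print lines that have between $1 and $2 whitespace-separated fields,
# all of which consist solely of digits.
numeric_lines() {
  awk -v min="$1" -v max="$2" '
    NF >= min && NF <= max {
      ok = 1
      for (i = 1; i <= NF; i++)
        if ($i !~ /^[0-9]+$/) { ok = 0; break }
      if (ok) print
    }'
}

printf '%s\n' \
  '  1       2       3       4       5        6' \
  '        2       3       4       5        6' \
  '3       4       5        6' \
  'blah  1       2       3       4       5' \
  '      4       3324       4       53        6' \
  '' | numeric_lines 5 6
```

Called as numeric_lines 5 6 it should keep the same three lines as the example output; other ranges, such as numeric_lines 3 3, only need different arguments.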

edited Jun 11, 2020 at 12:04

answered Dec 4, 2014 at 8:47

user78605

why does it print out a bunch of x's? The very last line is right but the rest is just x's.
– cokedude
Commented Dec 4, 2014 at 8:54
What do you mean ? It shouldn't print any xs
– user78605
Commented Dec 4, 2014 at 8:57
pixhost.org/show/1864/24202911_awk_test.jpg
– cokedude
Commented Dec 4, 2014 at 9:01
@cokedude, what's the input, and can you show me how you executed the command? It shouldn't make any difference, but also check which version of awk you are using with awk --version
– user78605
Commented Dec 4, 2014 at 9:03
Is GNU awk 3.0.4 too old to do what I want? It's the awk that came with Git Bash.
– cokedude
Commented Dec 4, 2014 at 9:04

JigarGandhi · Accepted Answer · 2014-12-10 08:08:30Z

I know this does not use sed, awk, or any other shell command, but Tcl worked out well.

I have kept the script simple and user friendly. Elements like 4a, abc, etc. are rejected.

command

script.tcl file

Script

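#!/usr/bin/tclsh

# Require the input file name as the first argument.
if {[llength $argv] == 0} {
        puts "please enter the argument"
        exit 1
}

set tar_fl [lindex $argv 0]
if {![file exists $tar_fl]} {
        puts "$tar_fl does not exist"
        exit 1
}

set flptr [open $tar_fl r]

while {[gets $flptr line] >= 0} {
        # Split on whitespace explicitly, so that braces or quotes in a
        # line cannot break Tcl's list parsing.
        set fields [regexp -all -inline {\S+} $line]
        # Keep only lines with exactly 5 elements, none of which
        # contains a non-digit character.
        if {[llength $fields] != 5} {continue}
        if {[lsearch -regexp $fields {[^0-9]}] > -1} {continue}
        puts $line
}

close $flptr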

Output

           10      2       12      1       13
1       2       3       4       5
