Delete lines that are not n consecutive lines starting with the same id

Question

I have a file that looks like this:

194104,41.8,38.3
194104,46.7,39.6
194104,47.4,39.7
194104,49.8,44.3
194104,50.8,47.5
194136,39.9,36.3
194136,45.2,37.8
194170,46.9,42.2
...

I want to keep the first six lines, which start with 194104, and then delete the next two lines, because only two lines start with 194136. And so on for the rest of the file.

Can this be done with sed/awk/grep or other unix tools?

What if there's an ID with 7 lines?
– glenn jackman
Commented Mar 30, 2016 at 9:59
@glennjackman In my case, that will not happen.
– Vegard Stikbakke
Commented Mar 30, 2016 at 10:00

RedGrittyBrick · Accepted Answer · 2016-03-30 09:48:52Z

Can this be done with sed/awk/grep or other unix tools?

Yes.

...

It can be done using tools like awk or perl in about 20 lines of code.

$ cat t.txt
194104,41.8,38.3
194104,46.7,39.6
194104,47.4,39.7
194104,49.8,44.3
194104,50.8,47.5
194136,39.9,36.3
194136,45.2,37.8
194170,46.9,42.2

$ perl t.pl t.txt
194104,41.8,38.3
194104,46.7,39.6
194104,47.4,39.7
194104,49.8,44.3
194104,50.8,47.5

$ wc -l t.pl
19 t.pl

The basic ideas I used were

loop over input a line at a time
append the lines to a buffer
check the value of the first word
keep a count of how many times it had been seen
if it differs from the previous one, decide whether to print the buffer, then flush the buffer and reset the count

Pseudocode

This corresponds line by line with my perl code but perl is a bit terser (and I cuddle my elses even though Larry disapproves).

let my minimum be 5
let my buffer be blank
let my count be zero
let my prior first word be blank

while read a line

   if there is a numeric first word followed by a comma 
   then
      if that first word was the same as my prior first word
      then
         increment my count
      otherwise
         if my count is greater than or equal to my minimum
         then
           print my buffer
         end if
         empty my buffer
         let my count be one
      end if
      let my prior first word be the one I just read
      append the line I just read to my buffer
   end if
end while

if my count is greater than or equal to my minimum
then
   print my buffer
end if

It can probably be done in fewer lines or a longish one-liner.
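The ideas above can be sketched as a short awk program wrapped in a shell function. This is only a sketch under a few assumptions: the function name `keep_runs` is hypothetical, "keep" is read as "at least min consecutive lines", and an END check is included so that a qualifying run at the very end of the input is printed too.

```shell
#!/bin/sh
# keep_runs MIN [FILE...]: print only runs of at least MIN consecutive
# lines that share the same first comma-separated field.
# (keep_runs is a hypothetical name chosen for this sketch.)
keep_runs() {
    min=$1
    shift
    awk -F, -v min="$min" '
        # Only lines with a numeric first field followed by a comma count.
        /^[0-9]+,/ {
            # Appending "" forces a string comparison of the two ids.
            if (($1 "") == (prev "")) {
                count++
            } else {
                # The id changed: print the previous run if it was long enough.
                if (count >= min) printf "%s", buf
                buf = ""
                count = 1
            }
            prev = $1
            buf = buf $0 "\n"
        }
        # Flush the last run; no id change follows it.
        END { if (count >= min) printf "%s", buf }
    ' "$@"
}
```

With the sample file, `keep_runs 5 t.txt` should print only the five lines starting with 194104. Using `count == min` in both tests would give the "exactly n" reading instead.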

Should have said "How can this be done?", I guess, haha. Thanks a lot! — Vegard Stikbakke, Commented Mar 30, 2016 at 9:51

user556625 · Accepted Answer · 2016-03-30 10:02:25Z


The specification is a bit ambiguous: it is not clear whether you want exactly six or at least six lines with the same prefix. Also, your example has only 5 such lines at the top, which caused some confusion (I should count before I shoot) when I tested this:

$ cat 6lines.awk
$1 == prev {
   ++cnt
   block = block $0 RS
   if (cnt == 6) {
      printf "%s", block
      cnt = 0
      block = ""
   }
   next
}

{
   block = $0 RS
   prev = $1
   cnt = 1
}

$ awk -F, -f 6lines.awk input

This relies on awk treating an unassigned variable (prev here) as an empty string.
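Since the result depends on how long each run really is, it may help to count the runs before picking n. A minimal sketch, assuming comma-separated input with the id in the first field; the function name `run_lengths` is hypothetical:

```shell
#!/bin/sh
# run_lengths [FILE...]: for each run of consecutive lines with the same
# first comma-separated field, print "<id> <number of lines in the run>".
# (run_lengths is a hypothetical name chosen for this sketch.)
run_lengths() {
    # cut keeps only the id column, uniq -c counts adjacent duplicates,
    # and awk swaps the columns and drops the padding uniq adds.
    cut -d, -f1 "$@" | uniq -c | awk '{ print $2, $1 }'
}
```

Because `uniq -c` only merges adjacent lines, an id that appears in two separate places is reported as two separate runs, which is the behaviour the question cares about.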


Oh wow, you're exactly right, I should have counted those properly! Thanks a lot!
– Vegard Stikbakke
Commented Mar 31, 2016 at 7:40


glenn jackman · Accepted Answer · 2016-03-31 00:52:32Z


This seems to do the trick:

perl -F, -ane '
    if ($. > 1) {
        if (@q == 6) { print @q; undef @q }
        elsif ($F[0] ne $prev) { undef @q }
    }
    push @q, $_;
    $prev = $F[0];
    END { if (@q == 6) {print @q} }
'
