3

I have a CSV file containing the following records:

Name,Phone,Country
John,N/A,USA
Max,N/A,USA

Name,Color,Size
John,Blue,M
Max,Red,S

How can I read only the records that follow the Name,Color,Size header, using bash?

Additionally, how can I make the output stop at either EOF or the first blank line? For example, this input:

Name,Phone,Country
John,N/A,USA
Max,N/A,USA

Name,Color,Size
John,Blue,M
Max,Red,S

Dummy,Dummy,Dummy
Foo,Foo,Bar

should not produce this:

John,Blue,M
Max,Red,S

Dummy,Dummy,Dummy
Foo,Foo,Bar

But rather only this:

John,Blue,M
Max,Red,S

I have already tried grep and sed, without luck. I also tried tail, but the number of lines is unknown until the file has been read.

2
  • stackoverflow.com/questions/1560393/…
    – STTR
    Commented Apr 10, 2015 at 2:36
  • Although the two cases may look similar, the data in the question you refer to actually includes an identifier. In my case, the data being parsed is separated from the rest only by the "header" of the table; no row after it has a field that could be used to retrieve it with grep, for example.
    – arielnmz
    Commented Apr 10, 2015 at 2:45

2 Answers

2

Using awk

$ awk '/^$/{f=0} f{print} /Name,Color,Size/{f=1}' file
John,Blue,M
Max,Red,S

How it works

The awk script has one variable, f, which serves as a flag to identify when we are within a Name,Color,Size block.

  • /^$/{f=0}

    On a blank line, set f=0 to signal that we are out of the Name,Color,Size block.

  • f{print}

    When we are in the block (f==1), print the line.

  • /Name,Color,Size/{f=1}

    When we reach the Name,Color,Size header, set f=1 to signal that we are in the block.
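
The flag logic described above can be tried end to end with the sample data from the question. This is only a sketch: the names tmpfile and result are illustrative, and the header is compared against the whole line ($0=="Name,Color,Size") instead of using a regex match, which is an assumption made here to avoid matching lines that merely contain the header text.

```shell
# Recreate the sample data from the question in a temporary file.
tmpfile=$(mktemp)
cat > "$tmpfile" <<'EOF'
Name,Phone,Country
John,N/A,USA
Max,N/A,USA

Name,Color,Size
John,Blue,M
Max,Red,S

Dummy,Dummy,Dummy
Foo,Foo,Bar
EOF

# Same three rules as in the answer: a blank line clears the flag,
# lines are printed while the flag is set, and the header sets it.
# Assumption: the header is matched as an exact whole line.
result=$(awk '/^$/{f=0} f{print} $0=="Name,Color,Size"{f=1}' "$tmpfile")
printf '%s\n' "$result"

rm -f "$tmpfile"
```

The order of the rules matters: because the header rule comes last, the header line itself is never printed, and because the blank-line rule comes first, the blank line that ends the block is not printed either.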

Using GNU sed

$ sed -n '/Name,Color,Size/{:a; n; /./{p; ba;}}' file
John,Blue,M
Max,Red,S

How it works

  • -n

    Tell sed not to print anything unless we explicitly ask it to.

  • /Name,Color,Size/{...}

    If the line contains the Name,Color,Size header, then execute the commands in the braces:

    • :a;

      This defines a label a.

    • n;

      This reads in the next line.

    • /./{p; ba;}

      If this next line is not blank, then print it (p) and branch (b) back to label a.

    In this way, all lines within the block will be read and printed and the printing stops with the first empty line.
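
If the loop with a label is hard to read, a range address is one possible alternative. The sketch below is not the command from the answer: it selects every line from the header through the next blank line and then drops the header and the blank line before printing. The anchored patterns (^...$) and the names tmpfile and result are assumptions made for illustration.

```shell
# Recreate the sample data from the question in a temporary file.
tmpfile=$(mktemp)
cat > "$tmpfile" <<'EOF'
Name,Phone,Country
John,N/A,USA
Max,N/A,USA

Name,Color,Size
John,Blue,M
Max,Red,S

Dummy,Dummy,Dummy
Foo,Foo,Bar
EOF

# Range from the header line to the next blank line (or EOF).
# Inside the range, delete the header and the blank line, print the rest.
result=$(sed -n '/^Name,Color,Size$/,/^$/{/^Name,Color,Size$/d;/^$/d;p;}' "$tmpfile")
printf '%s\n' "$result"

rm -f "$tmpfile"
```

If no blank line follows the block, the range simply runs to the end of the file, so the EOF case from the question is covered as well.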

1

You can use GNU sed to show only what comes after a certain line with something like

sed -e '0,/Name,Color,Size/d' file

so you'll see only the lines that come after the first Name,Color,Size line.
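
To also stop at the first blank line after the header, as the question asks, one possible extension is a second expression that deletes everything from the next blank line through the end of the input. This is a sketch that assumes GNU sed, since the 0,/regexp/ address form is a GNU extension; the names tmpfile and result are illustrative.

```shell
# Recreate the sample data from the question in a temporary file.
tmpfile=$(mktemp)
cat > "$tmpfile" <<'EOF'
Name,Phone,Country
John,N/A,USA
Max,N/A,USA

Name,Color,Size
John,Blue,M
Max,Red,S

Dummy,Dummy,Dummy
Foo,Foo,Bar
EOF

# First expression: delete everything up to and including the header.
# Second expression: of the lines that remain, delete from the first
# blank line through the last line ($). Requires GNU sed for "0,".
result=$(sed -e '0,/Name,Color,Size/d' -e '/^$/,$d' "$tmpfile")
printf '%s\n' "$result"

rm -f "$tmpfile"
```

Lines deleted by the first expression never reach the second one, so the blank line that precedes the header does not end the output early.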

3
  • Awesome, spot-on answer! Could you please elaborate a bit more on the syntax of that sed command? What's the meaning of 0,? Additionally, how could I limit the output to the first blank line, in case there are other "tables" in the same file?
    – arielnmz
    Commented Apr 10, 2015 at 2:51
  • 1
    @EricRenouf After you wrote your answer, the OP clarified his requirements: as shown in the example in the question, he wants the output to stop with the first blank line.
    – John1024
    Commented Apr 10, 2015 at 4:50
  • @arielnmz it looks like you've got some good answers already, but to answer your question about what my sed command is doing, here goes. By default sed prints every line. What I do is delete everything from the 0th line (the 0 in the command) up to and including the first line that matches the pattern we want. So those lines are dropped from the output, and all the rest are printed by default.
    Commented Apr 10, 2015 at 12:32
