110

I need some help to figure out how to use the sed command to only show the first column and last column in a text file. Here is what I have so far for column 1:

cat logfile | sed 's/\|/ /'|awk '{print $1}'

My feeble attempt at getting the last column to show as well was:

cat logfile | sed 's/\|/ /'|awk '{print $1}{print $8}'

However, this takes the first column and last column and merges them together in one list. Is there a way to print the first and last columns clearly with sed and awk commands?

Sample input:

foo|dog|cat|mouse|lion|ox|tiger|bar
1
  • 9
    Please provide some sample input.
    – jasonwryan
    Commented Jun 13, 2014 at 4:52

6 Answers

147

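Almost there. Put both column references in the same print statement, separated by a comma, and add the g flag to sed so that every | is replaced rather than only the first one.

cat logfile | sed 's/|/ /g' | awk '{print $1, $8}'

Also note that you don't need cat here.

sed 's/|/ /g' logfile | awk '{print $1, $8}'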

Also note you can tell awk that the column separator is |, instead of blanks, so you don't need sed either.

awk -F '|' '{print $1, $8}' logfile

As per suggestions by Caleb, if you want a solution that still outputs the last field, even if there are not exactly eight, you can use $NF.

awk -F '|' '{print $1, $NF}' logfile
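As a quick check of how $NF behaves when the field count varies, here is a minimal sketch. The first row is the sample from the question; the other two rows and the variable names are made up for illustration. Note that on a line with a single field, $1 and $NF are the same field, so a guard on NF may be worth adding.

```shell
# Rows with 8, 3 and 1 field(s); only the first row comes from the question.
sample='foo|dog|cat|mouse|lion|ox|tiger|bar
a|b|c
solo'

# $NF tracks the actual number of fields on each line.
first_last=$(printf '%s\n' "$sample" | awk -F '|' '{print $1, $NF}')
printf '%s\n' "$first_last"
# foo bar
# a c
# solo solo

# Optional guard: only print lines that have at least two fields.
guarded=$(printf '%s\n' "$sample" | awk -F '|' 'NF >= 2 {print $1, $NF}')
printf '%s\n' "$guarded"
# foo bar
# a c
```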

Also, if you want the output to retain the | separators, instead of using a space, you can specify the output field separator. Unfortunately, it's a bit more clumsy than just using the -F flag, but here are three approaches.

  • You can assign the input and output field separators in awk itself, in the BEGIN block.

    awk 'BEGIN {FS = OFS = "|"} {print $1, $8}' logfile
    
  • You can assign these variables when calling awk from the command line, via the -v flag.

    awk -v 'FS=|' -v 'OFS=|' '{print $1, $8}' logfile
    
  • or simply:

    awk -F '|' '{print $1 "|" $8}' logfile
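The three approaches should produce identical output. A small sketch to confirm that, using the sample line from the question and $NF in place of $8 (the variable names are only for illustration):

```shell
line='foo|dog|cat|mouse|lion|ox|tiger|bar'

# 1. Set FS and OFS in a BEGIN block.
with_begin=$(printf '%s\n' "$line" | awk 'BEGIN {FS = OFS = "|"} {print $1, $NF}')

# 2. Set FS and OFS with -v on the command line.
with_v=$(printf '%s\n' "$line" | awk -v 'FS=|' -v 'OFS=|' '{print $1, $NF}')

# 3. Concatenate a literal "|" between the two fields.
with_concat=$(printf '%s\n' "$line" | awk -F '|' '{print $1 "|" $NF}')

printf '%s\n' "$with_begin" "$with_v" "$with_concat"
# foo|bar
# foo|bar
# foo|bar
```

The comma in print is what inserts OFS; writing the fields next to each other without a comma concatenates them with nothing in between.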
    
5
  • 6
    Good job breaking down how this problem can be simplified. You might add a note about how to use | as an output separator instead of the default space for string concatenation. Also you could explain to use $NF instead of hard coding $8 to get the last column.
    – Caleb
    Commented Jun 13, 2014 at 7:29
  • After that, how do I update the file? Commented Aug 26, 2020 at 6:22
  • @pankajprasad Write to a new file with > then overwrite the old one, or use sponge. This is really a new question though.
    – Sparhawk
    Commented Aug 26, 2020 at 6:26
  • @Sparhawk It works, but the remaining content is erased. How do I deal with that? Commented Aug 26, 2020 at 7:21
  • @pankajprasad You need to ask a new question. Click the big blue button up the top that says "Ask Question".
    – Sparhawk
    Commented Aug 26, 2020 at 9:43
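
For readers who land on the same follow-up, the "write to a new file, then overwrite the old one" suggestion can be sketched as below. This is only an illustration: the scratch file created with mktemp stands in for the real logfile, and it assumes mktemp is available. Redirecting straight back to the input (awk ... logfile > logfile) would truncate the file before awk reads it.

```shell
# Hypothetical scratch file standing in for the real log.
logfile=$(mktemp)
printf '%s\n' 'foo|dog|cat|mouse|lion|ox|tiger|bar' \
              'bar|dog|cat|mouse|lion|ox|tiger|foo' > "$logfile"

# Write the result to a temporary file, and replace the original
# only if awk succeeded.
tmp=$(mktemp)
awk -F '|' '{print $1, $NF}' "$logfile" > "$tmp" && mv "$tmp" "$logfile"

cat "$logfile"
# foo bar
# bar foo
```

Note that mv replaces the file, so the original's permissions and ownership are not carried over.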
31

You are using awk anyway:

awk '{ print $1, $NF }' file
6
  • 4
    Wouldn't you need to specify the input field separator (since in this case it seems to be | rather than space) with -F\| or similar? Also what if he wanted to use the same delimiter for output?
    – Caleb
    Commented Jun 13, 2014 at 7:22
  • 1
    @Caleb Probably: I was waiting for the OP to confirm what exactly the input looked like, rather than trying to guess based on the non-working examples...
    – jasonwryan
    Commented Jun 13, 2014 at 7:28
  • 1
    Note that that assumes the input contains at least 2 fields. Commented Jun 13, 2014 at 7:56
  • @StéphaneChazelas OP clearly stated in code that it has eight fields, always. Commented Jun 13, 2014 at 7:58
  • 3
    @michaelb958 I think "clearly" is overstating the case, just a little :)
    – jasonwryan
    Commented Jun 13, 2014 at 7:59
19

Just replace from the first to last | with a | (or space if you prefer):

sed 's/|.*|/|/'

Note that though there's no sed implementation where | is special (as long as extended regular expressions are not enabled via -E or -r in some implementations), \| itself is special in some like GNU sed. So you should not escape | if you intend it to match the | character.

If replacing with space and if the input may already contain lines with only one |, then, you'll have to treat that specially as |.*| won't match on those. That could be:

sed 's/|\(.*|\)\{0,1\}/ /'

(that is, make the .*| part optional). Or:

sed 's/|.*|/ /;s/|/ /'

or:

sed 's/\([^|]*\).*|/\1 /'
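
To see the difference on lines with a single |, here is a small sketch comparing the plain substitution with the three variants above. The two-line sample and the variable names are made up for illustration.

```shell
# One line with a single |, one with several.
sample='foo|bar
foo|dog|cat|bar'

# Plain version: leaves the single-| line untouched.
naive=$(printf '%s\n' "$sample" | sed 's/|.*|/ /')

# The three variants that also handle a single |.
optional=$(printf '%s\n' "$sample" | sed 's/|\(.*|\)\{0,1\}/ /')
two_step=$(printf '%s\n' "$sample" | sed 's/|.*|/ /;s/|/ /')
captured=$(printf '%s\n' "$sample" | sed 's/\([^|]*\).*|/\1 /')

printf '%s\n' "$naive"
# foo|bar
# foo bar

printf '%s\n' "$optional"
# foo bar
# foo bar
```

The two_step and captured results should match the optional one on this input.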

If you want the first and eighth fields regardless of the number of fields in the input, then it's just:

cut -d'|' -f1,8


(all those would work with any POSIX compliant utility assuming the input forms valid text (in particular, the sed ones will generally not work if the input has bytes or sequences of bytes that don't form valid characters in the current locale like for instance printf 'unix|St\351phane|Chazelas\n' | sed 's/|.*|/|/' in a UTF-8 locale)).
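
A short sketch of how the cut command above behaves when a line has fewer than eight fields. The second sample row and the variable names are made up for illustration.

```shell
# First row is the sample from the question; the second has only 3 fields.
sample='foo|dog|cat|mouse|lion|ox|tiger|bar
a|b|c'

# cut selects fields by position and keeps the input delimiter in the output.
cut_out=$(printf '%s\n' "$sample" | cut -d'|' -f1,8)
printf '%s\n' "$cut_out"
# foo|bar
# a
```

Because field 8 does not exist on the short line, only field 1 is printed there; cut has no notion of a "last field".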

11

If you find yourself awk- and sed-less, you can achieve the same thing with cut, paste and rev, in a shell that supports process substitution (bash, ksh, zsh):

paste <(           cut -d'|' -f1  file) \
      <(rev file | cut -d'|' -f1 | rev)
2
  • 1
    cut is cleaner and more compact than awk/sed when you are just interested in the first column, or if the delimiters are fixed (i.e. not a variable number of spaces). Commented Aug 28, 2019 at 23:03
  • Pretty elegant!
    – Rolf
    Commented Jul 20, 2020 at 10:04
3

It seems like you are trying to get the first and last fields of text that is delimited by |.

I assume your log file contains text like this:

foo|dog|cat|mouse|lion|ox|tiger|bar
bar|dog|cat|mouse|lion|ox|tiger|foo

And you want output like this:

foo bar
bar foo

If so, here is a command for you, using GNU sed:

sed -r 's~^([^|]*).*\|(.*)$~\1 \2~' file

Example:

$ echo 'foo|dog|cat|mouse|lion|ox|tiger|bar' | sed -r 's~^([^|]*).*\|(.*)$~\1 \2~'
foo bar
3
  • The columns are not delimited by a pipe | but they are in columns, I am interested in using sed but not using the awk command like you did in your command: sed -r 's~^([^|]*).*\|(.*)$~\1 \2~' file
    – user70573
    Commented Jun 16, 2014 at 0:46
  • "The columns are not delimited by a pipe | but they are in columns", you mean columns are separated by spaces? Commented Jun 16, 2014 at 0:50
  • A sample input and an output would be better. Commented Jun 16, 2014 at 0:51
1

You should probably do it with sed - I would anyway - but, just cause no one has written this one yet:

while IFS=\| read col1 cols
do  printf %10s%-s\\n "$col1 |" " ${cols##*|}"
done <<\INPUT
foo|dog|cat|mouse|lion|ox|tiger|bar
INPUT

OUTPUT

     foo | bar
