14

I have a file as below.

"ID" "1" "2"
"00000687" 0 1
"00000421" 1 0

I want to make it as below.

00000687 0 1
00000421 1 0

I want to

  1. remove the first line and
  2. remove double quotes from fields on any other lines.  FWIW, double quotes appear only in the first column.

I think cut -c would work, but I cannot get it to work.  What should I do?

2
  • 1
    Just to make sure: are you trying to 1) remove the first line and 2) remove double quotes from fields on any other lines? Can double quotes appear anywhere else than the first field? And should they only be removed from the first field, as your question's title suggests, if they do?
    – fra-san
    Commented May 21, 2020 at 11:21
  • 1
    Thank you fra-san. I want to 1) remove the first line and 2) remove double quotes from fields on any other lines. Yes, double quotes appear only in the first column. Commented May 21, 2020 at 11:24

6 Answers

24

tail + tr:

tail -n +2 file | tr -d \"

tail -n +2 prints the file starting from line two to the end. tr -d \" deletes all double quotes.
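As a quick check, here is a minimal sketch that runs the pipeline against a copy of the sample data from the question. The temporary file and the variable names (sample, result) are only illustrative.

```shell
# Recreate the sample file from the question in a temporary location
# (the file name is arbitrary; only the content matters).
sample=$(mktemp)
printf '%s\n' '"ID" "1" "2"' '"00000687" 0 1' '"00000421" 1 0' > "$sample"

# Skip the header line, then delete every double quote that remains.
result=$(tail -n +2 "$sample" | tr -d '"')
printf '%s\n' "$result"

rm -f "$sample"
```

Note that tr removes double quotes from every column, which is fine here because they only occur in the first one.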

1
  • Great - this also worked. Thanks a lot! Commented May 21, 2020 at 12:22
17

This should work:

sed -i '1d;s/"//g' filename

Explanation:

  • -i will modify the file in place
  • 1d will remove the first line
  • s/"//g will remove every " in the file

You can first try without -i and the output will be printed to stdout.
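The preview step can be sketched as follows, again on a throwaway copy of the sample data. The second half is an assumption on my part rather than part of the answer: since the argument handling of -i differs between GNU sed and BSD/macOS sed, writing to a new file and moving it over the original is one portable way to get the same effect.

```shell
# Throwaway copy of the sample data (the file name is arbitrary).
sample=$(mktemp)
printf '%s\n' '"ID" "1" "2"' '"00000687" 0 1' '"00000421" 1 0' > "$sample"

# Preview: without -i the result goes to stdout and the file is untouched.
preview=$(sed '1d;s/"//g' "$sample")
printf '%s\n' "$preview"

# Portable stand-in for -i: write a new file, then replace the original.
sed '1d;s/"//g' "$sample" > "$sample.new" && mv "$sample.new" "$sample"
edited=$(cat "$sample")

rm -f "$sample"
```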

2
  • 1
    Thank you. sed 's/"//g' really worked! Commented May 21, 2020 at 11:41
  • 1
    +1 I find the sed answer easiest to read.
    – jorfus
    Commented Feb 3, 2021 at 21:38
7

Solving the issue as presented in the title, i.e. removing double quotes from the first space-delimited column only:

awk -F ' ' '{ gsub("\"", "", $1) }; NR > 1' file

This uses the gsub() function to remove all double quotes from the first field on each line. The NR > 1 condition at the end makes sure that the first line is not printed.
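To illustrate that only the first column is touched, here is a small sketch using made-up data in which the second column is quoted as well. That input is hypothetical; in the question, quotes appear only in the first column.

```shell
# Hypothetical input: unlike the question's file, column two is quoted too.
sample=$(mktemp)
printf '%s\n' '"ID" "1" "2"' '"00000687" "x" 1' '"00000421" "y" 0' > "$sample"

# Strip quotes from field 1 only; NR > 1 prints every line but the first.
result=$(awk -F ' ' '{ gsub("\"", "", $1) }; NR > 1' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

The quotes around x and y survive. Because assigning to $1 makes awk rebuild the line with single spaces between fields, runs of several spaces in the input would be collapsed.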

To remove the double quotes from the first field, but only if they appear as the first and last character of the field:

awk -F ' ' '$1 ~ /^".*"$/ { $1 = substr($1, 2, length($1) - 2) }; NR > 1' file

This uses a regular expression, ^".*"$ to detect whether there are double quotes at the start and end of the first field, and if there are, a block that extracts the internal part of the string with substr() is triggered. Any internal double quotes in the field are retained.
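A short sketch of that behaviour, using hypothetical data: one ID has a double quote in the middle and another ID is not quoted at all.

```shell
# Hypothetical input: line 2 has an internal quote, line 3 has no quotes.
sample=$(mktemp)
printf '%s\n' '"ID" "1" "2"' '"00"0687" 0 1' '00000421 1 0' > "$sample"

# Drop only a leading and trailing quote pair from field 1.
result=$(awk -F ' ' '$1 ~ /^".*"$/ { $1 = substr($1, 2, length($1) - 2) }; NR > 1' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

The internal quote in the first ID is kept, and the unquoted ID passes through unchanged because the pattern does not match it.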

4

Using Perl:

perl -ne ' { s/"//g; print if $. > 1 }' file

OR

perl -ne ' { if ($.>1) {s/"//g;print}  }' file

s/"//g; => Removes all the double quotes in the current line of the file (stored in $_ by default)

if $. > 1 => If the current line number is greater than 1

2

There are many ways to get the desired output; one of them uses cut -c. You just need to define the range of characters to extract and pipe the output to tail --lines=+2 to remove the header (the first line). For example:

cut -c2-9,11-14 <your_file_name> | tail --lines=+2

The -c2-9,11-14 option selects character positions 2 to 9 (the ID column without its surrounding quotes) and 11 to 14 (the rest of the line after the closing quote). This relies on every ID being exactly eight characters long, as in the sample.

The tail --lines=+2 command prints all lines from your file, but starting from line two.
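Here is a minimal sketch of the same pipeline on a copy of the sample data. It uses tail -n +2, the short form of --lines=+2, since the long option is a GNU extension and may not exist on other systems.

```shell
# Throwaway copy of the sample data from the question.
sample=$(mktemp)
printf '%s\n' '"ID" "1" "2"' '"00000687" 0 1' '"00000421" 1 0' > "$sample"

# Keep characters 2-9 (the ID) and 11-14 (the remaining columns),
# then drop the mangled header line.
result=$(cut -c2-9,11-14 "$sample" | tail -n +2)
printf '%s\n' "$result"

rm -f "$sample"
```

Because the character positions are fixed, this only works while the IDs stay eight characters wide and the remaining columns fit in positions 11 to 14.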

For more information on the cut command, you can visit this site.

0

One can also do it manually in gedit: remove the first line and then replace " with nothing.
