I have a csv file that has 7 columns. It has empty cells and some spaces between cells. How can I replace the empty cells with NA and remove extra spaces? Thank you very much!

Here is what my file looks like, although it seems to wrap around when I copy and paste it here.

130070078,PPW0001,1,4,4HW             ,2,15.61943874
120040039,PPW0002,0,0,                ,0,0
120040043,PPW0003,1,3,3WE             ,1,14.43394935
  • Please be careful: if your CSV file contains quoted fields with spaces and commas (for example foo,"bar, baz",bar, which has three cells: foo, "bar, baz" and bar), it isn't easy to parse (and change) with sed or awk.
    – uzsolt
    Commented Mar 12, 2017 at 17:49
  • Thank you. Based on @Cyrus's previous comment (which now seems to have been removed), I stripped the spaces before the commas and then replaced the empty cells with NA, and it worked: sed 's/ *,/,/g' file1 | sed 's/,,/,NA,/g' > file2
    – Elham
    Commented Mar 12, 2017 at 20:24
  • @uzsolt2, how can I know if my file has this problem, and how do I resolve it? I think one of my other files has this problem: when I use awk to print one column (the last one in the file), it returns an empty column.
    – Elham
    Commented Mar 15, 2017 at 14:02
  • One sign is a line with more commas than your number of columns minus one, though there are many other cases. As for the other question (how to resolve it): I use "psv", pipe-separated values, where the separator character is "|". It's a character rarely used in text or numbers :)
    – uzsolt
    Commented Mar 15, 2017 at 17:19
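
The check described in the last comment can be sketched roughly as follows. This is only a first-pass test, assuming one record per line; check_fields is a hypothetical helper name, and the expected column count is passed as its argument. It reports every line whose comma-separated field count differs from the expected one, which is what happens when a quoted field contains a comma.

```shell
# Hypothetical helper: read CSV on stdin and report lines whose
# comma-separated field count differs from the expected count ($1).
check_fields() {
  awk -F, -v n="$1" 'NF != n { printf "line %d: %d fields\n", NR, NF }'
}

# Example: the second line has a quoted field containing a comma.
printf 'a,b,c\nfoo,"bar, baz",bar\n' | check_fields 3
```

For the 7-column file in the question this would be run as check_fields 7 < file1. No output means every line split into the expected number of fields.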

2 Answers


Your answer:

sed 's/ *,/,/g' file1 | sed 's/,,/,NA,/g' > file2

To get 'NA' in the last field if blank:

sed 's/ *,/,/g' file1 | sed 's/,,/,NA,/g' | awk -F, -v OFS=, '$NF == "" { $NF = "NA" } 1' > file2

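You could also delete the spaces with tr. Note that this removes every space, including any inside a field, and that it has to run before the substitution so that cells containing only spaces become empty first:

tr -d ' ' < file1 | sed 's/,,/,NA,/g' | awk -F, -v OFS=, '$NF == "" { $NF = "NA" } 1'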
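
These pipelines can still miss some cases, such as two or more empty cells in a row or an empty first cell. A minimal sketch that handles them in a single awk pass is shown below. clean_csv is a hypothetical helper name, and the sketch assumes no field contains a quoted comma.

```shell
# Hypothetical helper: read CSV on stdin, trim blanks around every
# field, and replace fields that end up empty with NA.
clean_csv() {
  awk 'BEGIN { FS = OFS = "," }
  {
    for (i = 1; i <= NF; i++) {
      gsub(/^[ \t]+|[ \t]+$/, "", $i)   # strip leading/trailing blanks
      if ($i == "") $i = "NA"           # mark empty cells
    }
    print
  }'
}

# Example with the second row from the question:
printf '%s\n' '120040039,PPW0002,0,0,                ,0,0' | clean_csv
# prints 120040039,PPW0002,0,0,NA,0,0
```

Because every field is visited individually, the position of the empty cell (first, last, or several in a row) does not matter. Usage would be clean_csv < file1 > file2.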

αғsнιη's answer worked for me, but I'd just like to explain it a bit.

I was trying something like this:

echo "1,,2,,,3,,,4,,,,5,,,,,,,,,,6" | sed 's/,,/,-,/g'

The output still contains empty fields. With repeated empty fields, the second comma of a match is consumed by that replacement, so it cannot also be the start of the next match you want. sed resumes scanning after it, and you end up with only every second empty field replaced.
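
A shorter input makes the effect easier to see. The sketch below uses a hypothetical wrapper name, replace_once, around the same substitution:

```shell
# Hypothetical wrapper around the single-pass substitution.
replace_once() {
  sed 's/,,/,-,/g'
}

# Three commas mean two empty fields between a and b.
printf '%s\n' 'a,,,b' | replace_once
# prints a,-,,b  (the second empty field is left untouched)
```

The first match covers commas one and two. The third comma is all that is left, so there is no second ",," to match on this pass.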

Now you could do something like:

echo "1,,2,,,3,,,4,,,,5,,,,,,,,,,6" | sed -e 's/,,/,-,/g' -e 's/,,/,-,/g'

or, equivalently, with both commands in one script:

sed 's/,,/,-,/g;s/,,/,-,/g'

Which will replace all the cells, as the second command will get the ones that are missed, but it's a bit messy.

αғsнιη's command does essentially the same thing, using a label and a jump, which I was not aware you could do.

sed ':MYLABEL; s/,,/,-,/g; t MYLABEL;'



So the first part of the command creates a label.

Then we have the same substitution.

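Then we have the t command, which jumps to the label if a substitution has succeeded since the last input line was read (or since the last t jump). Once the substitution finds nothing more to replace, the loop ends and the line is printed.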
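
The one-liner with semicolons after the label relies on GNU sed behaviour. As far as I know, some other sed implementations read everything up to the end of the line as the label name, so a more portable sketch puts each command in its own -e. fill_empty is a hypothetical helper name and a is an arbitrary label:

```shell
# Hypothetical helper: repeat the substitution until no ",," is left.
fill_empty() {
  sed -e ':a' -e 's/,,/,-,/g' -e 'ta'
}

printf '%s\n' '1,,2,,,3' | fill_empty
# prints 1,-,2,-,-,3
```

On this input the first pass gives 1,-,2,-,,3, the second pass fills the remaining gap, and the third pass matches nothing, so t does not jump and the line is printed.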

More information: http://www.grymoire.com/Unix/Sed.html#uh-59
