282

Is there any official way for a CSV formatted file to allow comments, either on their own line or at the end of a line?

I tried checking Wikipedia and RFC 4180, but neither mentions comments, which leads me to believe they are not part of the file format. So I'm out of luck and should instead use a separate ReadMe.txt file to explain the file.

Lastly, I know it's easy for me to add my own comments, but I was hoping that something like Excel could import the file straight away, with no need for a consumer to customize the import process.

So, thoughts?

5
  • 1
    What would you comment on? The values in each line or the file itself? Is an XML file an alternative for you? Commented Dec 25, 2009 at 12:20
  • 3
    The proposal was shot down for Python.
    – new123456
    Commented Jul 21, 2011 at 17:24
  • 3
    Maybe a version string for the data @SquareRigMaster . Just like I am trying to do now?
    – Rob Wells
    Commented Nov 19, 2013 at 14:44
  • 2
    @SquareRigMaster – Or a copyright statement. Commented Mar 25, 2020 at 3:18
  • @SquareRigMaster it would be useful in embedded applications like Arduino, where you want to log data to the terminal, but also print status messages. The status messages could be "comments" which are ignored when the terminal session is saved to a CSV file and read by something like Excel. In any case, I guess you could prefix comment lines with whichever character you like and filter these out in Excel, so the answer to "can you add comments" needs to be interpreted according to who is going to read the CSV file.
    – pdr0663
    Commented Jun 29, 2023 at 0:10

8 Answers

168

No. The CSV "standard" (such as it is) does not dictate how comments should be handled; it's up to the application to establish a convention and stick with it.

7
  • 26
    RFC 4180 is the standard now.
    – vipw
    Commented Aug 16, 2011 at 6:34
  • 47
    RFC 4180 is not a standard. RFC 4180 states: "This memo provides information for the Internet community. It does not specify an Internet standard of any kind. Distribution of this memo is unlimited." Commented Nov 17, 2014 at 12:16
  • 22
    OK, can we say it is a de facto standard? Commented Mar 31, 2015 at 8:37
  • 7
    Yah ... that's not true. There are standards track documents and non-standard track (informational) documents. The entire process, including descriptions, processes and rules for IETF issued documents is defined by RFC2026 with some follow on amendments. Every RFC will specify at the beginning which track it is on.
    – Steve Hole
    Commented Jun 24, 2015 at 19:13
  • 7
    RFC is an acronym that stands for "Request For Comments," meaning it is intended to gather feedback from the community. That being said, almost the entire internet runs on unratified RFCs, or less. The CSV "standard" itself is essentially undefined without RFC 4180. It is the most definitive model we have, although it might change someday. As it stands, RFC 4180 has no provisions for inserting comments. If you add your own commenting mechanism to the format, don't expect interoperability with other readers/writers that follow RFC 4180.
    – IAmNaN
    Commented Jan 7, 2018 at 20:53
57

In engineering data, it is common to see the # symbol in the first column used to signal a comment.

I use the ostermiller CSV parsing library for Java to read and process such files. That library allows you to set the comment character. After the parse operation you get an array just containing the real data, no comments.
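
For readers who are not on Java, the same convention can be sketched in Python. This is a minimal sketch and not the ostermiller library's API: the helper name `read_csv_skipping_comments` and its `comment_char` parameter are made up for illustration, the sample data is invented, and it assumes a comment always occupies a whole line that starts with the comment character.

```python
import csv
import io


def read_csv_skipping_comments(lines, comment_char="#"):
    """Parse CSV rows, dropping lines that start with comment_char.

    Hypothetical helper: assumes a comment always occupies a whole line
    and that no quoted field spans multiple lines.
    """
    data_lines = (line for line in lines if not line.startswith(comment_char))
    return list(csv.reader(data_lines))


# Illustrative input: two comment lines mixed into ordinary CSV data.
sample = io.StringIO(
    "# sensor log, units: seconds and volts\n"
    "Time,ValueA,ValueB\n"
    "0.0,123,456\n"
    "# recalibrated here\n"
    "0.1,123,349\n"
)

rows = read_csv_skipping_comments(sample)
print(rows)
```

Only the header and the two data rows come back; the comment lines are filtered out before the CSV parser ever sees them, which is why a comma inside a comment causes no trouble here.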

6
  • 2
    Some parsers (Matlab included) support detecting lines starting in a prefix character and handling this differently as comments etc. For example adding some form of 'meta' versioning for optimising/guiding the code interpreting the data can be achieved via comment and '#' is what I have more often seen and used: #Csv/Version 1.9 Time,ValueA,ValueB 0.0, 123, 456 0.1, 123, 349
    – Crog
    Commented Aug 25, 2020 at 8:31
  • 3
    With emacs, csv-comment-start defaults to #
    – dat
    Commented Dec 14, 2020 at 5:18
  • 3
    The use of # is also a de facto standard in TSV formats ("CoNLL formats") in language technology. These formats pre-date the current CSV spec by more than a decade. Main difference to CSV is that they require the separator to be TAB (or, earlier, SPACE) rather than comma, but technically, that's still regarded a CSV format.
    – Chiarcos
    Commented Jul 20, 2021 at 11:28
  • 2
    Microsoft IIS log files use the # for comments. Commented Feb 23, 2022 at 19:24
  • 1
    Upvoted in order to raise the probability that this becomes the de facto standard Commented Aug 7, 2022 at 22:51
35

No, CSV doesn't specify any way of tagging comments - they will just be loaded by programs like Excel as additional cells containing text.

The closest you can manage (with CSV being imported into a specific application such as Excel) is to define a special way of tagging comments that Excel will ignore. For Excel, you can "hide" the comment (to a limited degree) by embedding it into a formula. For example, try importing the following csv file into Excel:

=N("This is a comment and will appear as a simple zero value in excel")
John, Doe, 24

You still end up with a cell in the spreadsheet that displays the number 0, but the comment is hidden.

Alternatively, you can hide the text by simply padding it out with spaces so that it isn't displayed in the visible part of the cell:

                              This is a sort-of hidden comment!,
John, Doe, 24

Note that you need to follow the comment text with a comma so that Excel fills the following cell and thus hides any part of the text that doesn't fit in the cell.

Nasty hacks, which will only work with Excel, but they may suffice to make your output look a little bit tidier after importing.

10

I think the best way to add comments to a CSV file would be to add a "Comments" field or record right into the data.

Most CSV-parsing applications that I've used implement both field-mapping and record-choosing. So, to comment on the properties of a field, add a record just for field descriptions. To comment on a record, add a field at the end of it (well, all records, really) just for comments.

These are the only two reasons I can think of to comment a CSV file. But the only problem I can foresee would be programs that refuse to accept the file at all if any single record doesn't pass some validation rules. In that case, you'd have trouble writing a string-type field description record for any numeric fields.

I am by no means an expert, though, so feel free to point out any mistakes in my theory.
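
As a rough illustration of that idea, here is a minimal Python sketch. The column names, the sample records, and the choice of putting the description record directly after the header are all assumptions made for the example, not part of any CSV convention.

```python
import csv
import io

# Illustrative data: the first record after the header describes each field,
# and the last field of every record is reserved for a comment.
sample = io.StringIO(
    "name,age,comment\n"
    "first name,age in years,free-text note\n"
    "John,24,\n"
    "Jane,31,age unverified\n"
)

reader = csv.DictReader(sample)
descriptions = next(reader)  # the field-description record

records = []
notes = {}
for row in reader:
    note = row.pop("comment")      # strip the comment field from the data
    row["age"] = int(row["age"])   # safe: the description record was consumed above
    records.append(row)
    if note:
        notes[row["name"]] = note

print(descriptions)
print(records)
print(notes)
```

Consuming the description record before converting `age` to an integer sidesteps the validation problem mentioned above, but only because this reader knows about the convention; a stricter importer would still reject the text in a numeric column.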

2
  • 2
    Aaand, I just read that you didn't want to customize the import process. Sorry 'bout that. Hopefully somebody finds this useful, then. Commented Jun 23, 2012 at 14:20
  • 1
    Good post. Another reason I can think of for why you might want comments is to add some meta-data about the file as a whole. Adding a whole column or row just for one cell with this info feels a bit awkward. Commented Jul 17, 2019 at 17:20
6

A Comma Separated File is really just a text file where the lines consist of values separated by commas.

There is no standard which defines the contents of a CSV file, so there is no defined way of indicating a comment. It depends on the program which will be importing the CSV file.

Of course, this is usually Excel. You should ask yourself how does Excel define a comment? In other words, what would make Excel ignore a line (or part of a line) in the CSV file? I'm not aware of anything which would do this.

2
  • 2
    "There is no standard which defines the contents of a CSV file" False. Commented Oct 10, 2014 at 21:04
  • 6
    @Qix - from section 2 of the referenced document: "While there are various specifications and implementations for the CSV format (for ex. [4], [5], [6] and [7]), there is no formal specification in existence" Commented Dec 3, 2014 at 17:09
5

If you need something like:

  │ A                              │ B
──┼────────────────────────────────┼───
1 │ #My comment, something else    │
2 │ 1                              │ 2

Your CSV may contain the following lines:

"#My comment, something else"
1,2

Pay close attention to the quotes in the first line.

When converting your text to columns using the Excel wizard, remember to check 'Treat consecutive delimiters as one' and to set the double quote as the text qualifier.

Thus, Excel will split the text at the commas, keeping the 'comment' line as a single column value (and it will remove the quotes).
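
The quoting behaviour is not specific to Excel. As a hedged illustration, the sketch below feeds the same two lines to Python's `csv` module (chosen only as an example of a quote-aware parser) and then drops rows whose first field starts with `#`; that filter is an assumed convention, not something the parser does on its own.

```python
import csv
import io

# The two lines from the example above.
text = io.StringIO('"#My comment, something else"\n1,2\n')

rows = list(csv.reader(text))
print(rows)  # the quoted line stays a single field despite its comma

# Assumed convention: treat any row whose first field starts with "#" as a comment.
data_rows = [row for row in rows if not (row and row[0].startswith("#"))]
print(data_rows)
```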

5

CSV is not designed to have comments. I often put comments in a separate column in Excel. When dumping data from my embedded program, where I (for example) really need two data columns, I add an extra comma to create one extra (third) column just for the comments, like this:

27,120,,
28,112,,
29,208,This is my comment,
30,85,,
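
For a consumer other than Excel, splitting such a dump back into data and comments could look like the following Python sketch. It is only an illustration under the assumption that the first two fields are always integers and the third holds the optional comment; the variable names are made up.

```python
import csv
import io

# The four lines from the example above. Each ends with a trailing comma,
# so a standard CSV parser also sees an empty fourth field, ignored here.
dump = io.StringIO(
    "27,120,,\n"
    "28,112,,\n"
    "29,208,This is my comment,\n"
    "30,85,,\n"
)

data = []
comments = {}
for line_no, row in enumerate(csv.reader(dump), start=1):
    first, second, comment = row[0], row[1], row[2]
    data.append((int(first), int(second)))
    if comment:
        comments[line_no] = comment  # keyed by 1-based line number

print(data)
print(comments)
```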
4

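If you're parsing the file with a FOR /F command in a batch file, a semicolon (;) works as a comment marker: by default, FOR /F skips lines that begin with a semicolon (its default eol character).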

REM test.bat contents

for /F "tokens=1-3 delims=," %%a in (test.csv) do @Echo %%a, %%b, %%c

;test.csv contents (this line is a comment)

;1,ignore this line,no it shouldn't

2,parse this line,yes it should!

;3,ignore this line,no it shouldn't

4,parse this line,yes it should!

OUTPUT:

2, parse this line, yes it should!

4, parse this line, yes it should!
0
