I have a tab-delimited CSV file (test.txt) with content similar to the following (my CSV has no header):

12.33   Apple   Orange  "this is" great "to eat"
10.99   Pear    Lemon   "this" is an - "aquired taste"

I've tried both of the following to import the data into an array:

$Headers = "price","item1","item2","desc"
[array]$arrCSVobjects = import-csv "test.txt" -Delimiter "`t" -Header $Headers

(or)

$Headers = "price","item1","item2","desc"
[array]$arrCSVobjects = Get-Content -Path "test.txt" | Out-String | ConvertFrom-CSV -Delimiter "`t" -Header $Headers

No matter how I approach this, PS keeps wanting to remove the starting (leading) quotation marks from the DESC elements in the array (and I end up with results looking like this – which are not the same as the original data):

price    item1    item2    desc
-----    -----    -----    ----
12.33    Apple    Orange   this is great "to eat"
10.99    Pear     Lemon    this is an - "aquired taste"

When compared with the original data, you can see that some of the quotation marks are missing. How can I stop (prevent) PS from removing quotation marks from the elements like this? I need to import the CSV, manipulate the data and then export it back out to another CSV. Every time I search the internet for an answer, I keep getting results on how to remove quotation marks but I can’t seem to find how to keep them. I don’t want the quotation marks removed on either the import or export process.

Thanks in advance,

STGdb

  • Are those quotes actually necessary? After all they're probably just there to mark that it's an argument as a whole and desc seems to contain just that, the argument without the marking.
    – Seth
    Commented Mar 10, 2017 at 7:13
  • I understand but yes, the quotes are needed. What I had posted originally was just an example using a similar CSV format. The scenario is that the "desc" element contains words that are quoted in addition to editorial comments, i.e., (Today, the mayor said "We can't increase the budget", but the Governor replied "blah, blah, etc."). So in this example, you need the quotes to distinguish between spoken comments and editorial comments. But good point though - thanks
    – SOSidb
    Commented Mar 10, 2017 at 19:11

1 Answer

Your file isn't valid CSV. (Well, there is no official standard, but there is a de facto one.) Double quotes have a special meaning in CSV: they surround fields. Quotes that aren't "surrounders" have to be escaped in some way, usually by doubling them, like this:

12.33   Apple   Orange  """this is"" great ""to eat"""

You also have to surround the whole field with double quotes, otherwise the parser gets confused. That's why the field starts and ends with three double quotes.
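
If you would rather keep using the built-in CSV cmdlets, one option is to apply that escaping yourself before parsing. The following is only a sketch: the sample lines and the variable names ($lines, $escaped, $rows) are made up for illustration, and it assumes that no field contains a tab. In practice $lines would come from Get-Content .\test.txt.

```powershell
# Hypothetical sample data mirroring the question's test.txt (tab-separated, no header).
$Headers = 'price', 'item1', 'item2', 'desc'
$lines = @(
    "12.33`tApple`tOrange`t`"this is`" great `"to eat`""
    "10.99`tPear`tLemon`t`"this`" is an - `"aquired taste`""
)

# Wrap every field in double quotes and double any embedded quotes,
# so the CSV parser treats the embedded ones as literal characters.
$escaped = foreach ($line in $lines) {
    $fields = $line -split "`t" | ForEach-Object { '"' + ($_ -replace '"', '""') + '"' }
    $fields -join "`t"
}

# Parse the escaped text with the regular CSV cmdlet.
$rows = @($escaped | ConvertFrom-Csv -Delimiter "`t" -Header $Headers)
$rows | Format-Table -AutoSize
```

With this approach the desc property should keep every quotation mark from the original line, at the cost of an extra pass over the text.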

If you can't change the format of the file, you could just parse it yourself by splitting on the tabs. This will of course only work if you can guarantee that there won't be any tabs inside the actual fields. The tabs must solely be used as field separators.

Get-Content .\test.txt | ForEach-Object {
    $f = $_ -split "`t"   # split each line on tab characters only
    [pscustomobject]@{ price = $f[0]; item1 = $f[1]; item2 = $f[2]; desc = $f[3] }
}

Output:

price item1 item2  desc
----- ----- -----  ----
12.33 Apple Orange "this is" great "to eat"
10.99 Pear  Lemon  "this" is an - "aquired taste"
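
Since the question also mentions exporting the data again, here is a rough sketch of writing such objects back out without touching the quotes. Export-Csv would produce standard CSV (quoting fields and doubling embedded quotes), so this joins the fields by hand instead. The sample objects, the upper-casing step, and the output path are all hypothetical, and it still assumes that no field contains a tab.

```powershell
# Hypothetical objects shaped like the ones produced by a split-based import.
$rows = @(
    [pscustomobject]@{ price = '12.33'; item1 = 'Apple'; item2 = 'Orange'; desc = '"this is" great "to eat"' }
    [pscustomobject]@{ price = '10.99'; item1 = 'Pear';  item2 = 'Lemon';  desc = '"this" is an - "aquired taste"' }
)

# Example manipulation (illustrative only): upper-case item1.
foreach ($row in $rows) { $row.item1 = $row.item1.ToUpper() }

# Join the fields with tabs by hand so the quotes are written exactly as stored.
$outLines = foreach ($row in $rows) {
    ($row.price, $row.item1, $row.item2, $row.desc) -join "`t"
}

# Hypothetical output location; replace with the real target file.
$outPath = Join-Path ([System.IO.Path]::GetTempPath()) 'test-out.txt'
Set-Content -Path $outPath -Value $outLines
```

The resulting file has the same tab-separated, header-less layout as the input, so it can be read back with the same split-based approach.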

  • Works perfectly - thank you. Exactly what I needed but for some reason when I tried it originally, I couldn't figure out the proper syntax to get split to work. Awesome!!
    – SOSidb
    Commented Mar 10, 2017 at 19:06
