2

I have this large csv (semicolon separated) file that I need to split into about 300 files based on the value in second column (file has header names).The file has about + 3 million rows and headers for 54 columns I have tried using this script with Powershell but it seems not to run. I can see in task manager that memory is building for a while but there is no output at all or error messages. I find this script here Splitting a xlsx/csv file based on column value and change the script to fit my needs. Have I missed something or what is wrong with this script as no output is executed...

What is missing?

$csv = Import-Csv C:\Data\Points\Se\Test\Data191017.csv -Delimiter ';'

$names = $csv | select -ExpandProperty OFFICENAME -Unique

foreach ($name in $names) { $csv | where {$_.name -like $name} | export-csv C:\Data\Points\Se\Test\20191028_$name.csv }

2 Answers 2

2

I finally got it to work, I added -Encoding to both import and export line (not sure that is needed). Now it runs correctly with characters and all so I'm happy :)

Thanks for helping out and getting me in right directions, below is the script that works for my purposes ...

$Source = 'Data_191107.csv'
$Dest = 'C:\temp\test\'
Import-Csv $Source -Delimiter ';' -Encoding 'ISO-8859-1' | %{
$_ | Export-csv -Path "$Dest$($_.OFFICENAME).csv" -Encoding 'ISO-8859-1' -NoTypeInformation -Delimiter ';' -Append
}
1
  • Glad you figured that part out. While aware of encoding issues from following forums, not much kpractical experience troubleshooting. Commented Nov 8, 2019 at 18:53
0

Not an expert, but from what I've read, you want to avoid intermeidate variables, especially with large objecgts, and leverage the pipeline. The following worked with a small sample:

$Source = 'C:\Data\Points\Se\Test\Data191017.csv'
$Dest = 'C:\Data\Points\Se\Test\'

Import-Csv $Source -Delimiter ';' | %{
   $_ | Export-csv -Path "$Dest$($_.OFFICENAME).csv" -NoTypeInformation -Delimiter ';' -Append
}

Splatting the parameters makes it easier to read, but not sure of the effect on execution time for a large data set:

$Source = 'C:\Data\Points\Se\Test\Data191017.csv'
$Dest = 'C:\Data\Points\Se\Test\'

Import-Csv $Source -Delimiter ';' | %{
   $splat = @{
      'Path'              = "$Dest$($_.Country).csv"
      'Delimiter'         = ';'
      'InputObject'       = $_
      'Append'            = $True
      'NoTypeInformation' = $True
   }
   Export-csv @splat
}
3
  • I have tried both of your suggestions but still the same problem = no output at all. I even let it "run" overnight but no result. I will try to make a smaller sample as the file is about 450Mb in size if that is the case. I thought that Powershell is good with large files. I need the header to be with the split as well not only the data....
    – QGIS-user
    Commented Nov 4, 2019 at 17:48
  • Have you tried just running the first step, Import-CSV, to see if that is running to completion? Commented Nov 4, 2019 at 20:56
  • I did change the path to a simpler one "C:\temp\" and now its working :) But I have a slight problem as the characters get distorted as the original file has local Å Ä and Ö and they get distorted as it saves in UTF8, the large file is in ANSI format any solution to this?... Thanks
    – QGIS-user
    Commented Nov 8, 2019 at 7:23

You must log in to answer this question.

Not the answer you're looking for? Browse other questions tagged .