0

So I've created a batch script that outputs some text to a CSV. When I open the CSV in Notepad, it looks like this, which is how I want it.

173030,8,2793,2821244
173032,1,2793,2820692
173033,1,2793,2821120
173035,2,2793,2820788

However, in Excel it appears as:

173030   8
2793
2821244

173032   1
2793
2820692

173033   1
2793
2821120

173035   2
2793
2820788

The point being that there are some unwanted line breaks. It appears that Notepad only recognize a CRLF line break, but there are some other line breaks that are either only CR or LF that Excel sees that Notepad doesn't. What's the best way to take care of this? Since I'm already creating it via a Batch script, it'd be best if there's a Batch command that could remove the unwanted line breaks. But if there's maybe an excel setting or something I can change, that could work too.

2
  • If you can share an example text file somewhere, I will test it. Commented Jun 11, 2015 at 20:54
  • @JulianKnight here's the .csv db.tt/A7pkFnNi It's got some extra spaces I need to take care of, but the behavior of the line endings is the same even if I manually remove the spaces
    – Michael
    Commented Jun 11, 2015 at 21:05

2 Answers 2

1

If your example is accurate, you may not be creating something that Excel really can make sense of. I would recommend trying to put a row containing headings as the first row to see if that helps.

Also check your line endings to make sure you are adding \r\n (e.g. Windows line endings) rather than just \n - Excel should cope with the later but it can be hard to know.

Finally, make certain that your batch output is not including spurious characters. For example, are you sure that the numbers don't have leading or trailing characters such as a binary code <32?

UPDATE:

OK, that file you shared is not a valid CSV. You must take the odd characters out. I've just opened it in the Brackets text editor and the lines are split there. That is why Excel is doing the same.

On the first line for example, after the number 8, you have 13 spaces followed by a carriage return (\n)

UPDATE 2: Options for cleaning up the output. Here are a few possibilities that you could choose from depending on your knowledge, background and the allowed capabilities on your desktop:

  • PowerShell - you can use the features of PS to fairly easily cleanse the data and output to a valid CSV (PS has native CSV capabilities). That may mean learning PS of course.
  • Node.JS - my favourite tool for all platforms! A cross-platform JavaScript host that lets you run both command line and "server" processes. It has good native file system access and has vast arrays of libraries for doing pretty much everything. Great if you already know JavaScript from web development or want to minimise the number of languages you have to deal with. Also you can use node-webkit for creating cross-platform UI's with the power of Node behind them.
  • Windows Scripting Host - also capable of running JavaScript (well JScript anyway which is close). You can also use VBScript if you must. Can be run from the command line as long as it hasn't been disabled by zealous admins. You can use the same code in a HTA if you need a UI.
  • Google Refine - a very useful ETL tool for tidying up and transforming data. A bit long in the tooth now and rarely updated but it works. Free.
  • Excel PowerQuery - an addin from Microsoft that is hugely useful for ETL as long as the L is Excel! I use this all the time now. Free, part of Microsoft's BI tools.

There are many others of course, these are just a few as examples.

3
  • Thanks for your help. The problem is that it's being created with a batch file (the numbers are system usage stats), and I'm really hoping the batch file can clean up the csv itself. I may have to look at other solutions if not.
    – Michael
    Commented Jun 11, 2015 at 21:23
  • You might try using using PowerShell rather than batch commands, you can use that to tidy up the text. Commented Jun 11, 2015 at 21:28
  • 1
    @Michael Why don't you edit your question to include the batch file? We can have a look at fixing that (which is the real cause of your problem).
    – DavidPostill
    Commented Jun 12, 2015 at 12:30
0

Edit control behaviour is to break on LF (unlike RTF and programming languages). If the LF is preceded by a CR both are removed. Though CMD will be writing correct line endings unless your code is unusual.

Set Arg = WScript.Arguments
set WshShell = createObject("Wscript.Shell")
Set Inp = WScript.Stdin
Set Outp = Wscript.Stdout
If LCase(Arg(1)) = "cr" then 
    Do Until Inp.AtEndOfStream
        Line=Inp.readline
        Line=Replace(Line, vbcr, "")
        Line=Replace(Line, vblf, "")
        outp.writeline Line
    Loop
End If
If LCase(Arg(1)) = "lf" then 
    Do Until Inp.AtEndOfStream
        Line=Inp.readline
        Line=Replace(Line, vbcr, vbcrlf)
        outp.writeline Line
    Loop
End If

To use

LineBreaks

filter fix {lf|cr}

Fixes problems with line endings.

Different programs regard line endings differently. Also Notepad inserts the odd extra CR that it ignores but other programs don't.

Lines are broken at the line feed character. If preceded by a carriage return both are removed. However a single carriage return without a line feed does not break the line.

cr - removes all stray CR and LF left on each line.
lf - add a LF to any CR embeded in the middle of the line.

Example

Fixes win.ini, not that it needs fixing, and sends it to the screen

cscript //nologo filter.vbs fix cr < "%systemroot%\win.ini"

You must log in to answer this question.

Not the answer you're looking for? Browse other questions tagged .