A very odd situation is occurring:

(Running Windows 7 Enterprise, SP1, 64bit)

A GUI I use at work (built specifically for this purpose and intended to run in a Windows environment) generates .dat files. I need to edit one slightly before I use it. I open it in plain old Notepad, make my edit and save it (everything looks fine). When I REOPEN it with any program (Notepad, Notepad++, etc) all "new lines"/"enters"/line-breaks have been removed - everything appears to be jumbled together as one long line.

If I simply open it and close it in Notepad without saving, nothing changes and the line-breaks are where they should be. Opening the doc in Notepad++ or another program and saving it does not affect the line-breaks. Copying the contents to Notepad++ and back to Notepad also fixes this problem: subsequent saves in Notepad do not break the line-breaks.

What makes this an even more awkward problem is that this behavior does not apply to ALL the .dat files my GUI produces. Just some of them.

Any good ideas on what is happening and how to fix it?

If the solution to this is to modify the way my GUI produces the files, that is an acceptable answer as the GUI is something I can submit a bug-report for.

However... it seems unlikely, as I don't think my boss has ever had this problem before and he uses the same version of the GUI and Notepad to make slight modifications to the files. I also have only had this happen recently and inconsistently: a file that had this problem previously does NOT lose its line-breaks when saved with Notepad in this current iteration of files.

Edit: more info: I sent the file to my boss and had him open and save with Notepad on his computer and nothing funny happened - all linebreaks remained after saving, closing and reopening. Either the process of sending it fixed something in the file, or it is something funny with my computer.

Looking at the hex of the saved and unsaved files: As far as I can tell, the unsaved version has 0D0A between lines and the saved version is missing all instances of 0D0A except for a single instance at the very end (I wonder if this was added by Notepad++ as the file was opened/converted).

Edit again: the hex version of the "unsaved" file after editing out sensitive info:

2320504C4541534520434845434B3A20
544845524D5F43617020616E64204465
70436170206265666F72652072756E6E
696E67202121210D0D0A0D0D0A706172
616D20696E697469616C203A3D20313B
0D0D0A706172616D2054203A3D313735
32303B0D0D0A706172616D206474203A
3D20333630303B0D0D0A0D0D0A706172
616D204950505F4F524F203A3D20302E
313233343B0D0D0A706172616D205448
45524D5F4F524F203A3D20302E313233
343B0D0D0A706172616D20544845524D
5F436170203A3D2031323334353B0D0D
0A706172616D20446570436170203A3D
31323334353B0D0D0A234D572C204465
70656E6461626C652043617061636974
79206F662073747566660D0D0A090909
234E756D6265727320666F7220726566
6572656E63653A207468696E67732E0D
0D0A09090923446570656E6461626C65
204361703A2073747566660D0D0A7061
72616D20425546464552203A3D20303B
0D0D0A706172616D20636F6E76657274
203A3D20312E303B0D0D0A0D0D0A7061
72616D09525245534E504F494E545309
3A3D20353B0D0D0A706172616D095252
4553424B50093A3D0D0D0A31092D3132
33343530200D0D0A32092D3132333435
3030200D0D0A330930200D0D0A340931
32333435200D0D0A3509313233343520
0D0D0A3B0D0D0A0D0D0A706172616D09
525245534C4F5045093A3D0D0D0A3109
2D3132333435452D30350D0D0A32092D
3132333435452D30350D0D0A33092D31
32333435452D30350D0D0A3409313233
34350D0D0A3B0D0D0A0D0D0A2357696E
64792073747566660D0D0A706172616D
2057494E445F49433A3D0D0D0A706C61
63650931323334350D0D0A3B0D0D0A0D
0D0A706172616D207468696E67793A3D
0D0D0A706C6163650931323334350D0D
0A3B0D0D0A0D0D0A706172616D204F50
545F5265733A3D0D0D0A7468696E6709
300D0D0A3B0D0D0A0D0A

This is the hex code after saving in notepad and reopening:

2320504C4541534520434845434B3A20
544845524D5F43617020616E64204465
70436170206265666F72652072756E6E
696E6720212121706172616D20696E69
7469616C203A3D20313B706172616D20
54203A3D31373532303B706172616D20
6474203A3D20333630303B706172616D
204950505F4F524F203A3D20302E3132
33343B706172616D20544845524D5F4F
524F203A3D20302E313233343B706172
616D20544845524D5F436170203A3D20
31323334353B09706172616D20446570
436170203A3D383632362E313B234D57
2C20446570656E6461626C6520436170
6163697479206F662073747566660909
09234E756D6265727320666F72207265
666572656E63653A207468696E677309
090923446570656E6461626C65204361
703A207374756666706172616D204255
46464552203A3D20303B706172616D20
636F6E76657274203A3D20312E303B70
6172616D09525245534E504F494E5453
093A3D20353B706172616D0952524553
424B50093A3D31092D31323334353020
32092D31323334353030203309302034
0931323334352035093132333435203B
706172616D09525245534C4F5045093A
3D31092D3132333435452D303532092D
3132333435452D303533092D31323334
35452D303534092D3132333435452D30
353B2357696E64792073747566667061
72616D2057494E445F49433A3D706C61
63650931323334353B706172616D2074
68696E67793A3D706C61636509313233
34353B706172616D204F50545F526573
3A3D7468696E6709303B0D0A

1 Answer

Some background might help. Notepad requires both characters of the <CR><LF> pair to recognize a line ending and break the line. If either character is missing, it will skip them and display everything on a single line. <CR><LF> is the standard line-break sequence on DOS/Windows machines, whereas Unix/Linux and derivatives use <LF> alone. The file you are opening most likely has only <LF> in it, so Notepad can't display it correctly.

Most likely these files were created by a program that isn't following Windows/DOS conventions for text formatting. If it is only intended to run on Windows, I think a bug report is in order, especially if the files are supposed to be edited in Notepad.

In the meantime, I recommend only opening and editing the files in Notepad++.

If you want to debug the problem, download a hex editor to see what is contained at the end of a line. For proper Windows formatting, each line should end with the hex codes 0d 0a. If either is missing, or they are in a different order, that will create a problem. Try this on a new file that has never been saved from Notepad and on one that has.
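If installing a hex editor is not an option, a short script can do the same check. The following is only a minimal sketch, assuming Python 3.8 or later is available on the machine; the name count_line_endings is illustrative and not part of any tool mentioned here. It tallies every run of CR/LF bytes, so anything other than a plain 0d 0a stands out. The sample bytes are a short fragment taken from the "unsaved" hex dump in the question.

```python
import re
from collections import Counter

# Try the longest form first so CR CR LF is counted as one sequence,
# not as a lone CR followed by CR LF.
_EOL = re.compile(rb"\r+\n|\r+|\n")

def count_line_endings(data: bytes) -> Counter:
    """Tally each distinct line-ending byte sequence, keyed by its hex form."""
    return Counter(m.group().hex(" ") for m in _EOL.finditer(data))

# Fragment of the "unsaved" dump: ';' CR CR LF 'param T :=17520;' CR CR LF
sample = bytes.fromhex("3B0D0D0A706172616D2054203A3D31373532303B0D0D0A")
print(dict(count_line_endings(sample)))  # {'0d 0d 0a': 2}
```

To check a real file, read it in binary mode (open(path, "rb").read()) and pass the bytes to the function; text mode would translate the line endings before they can be counted.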

  • Windows is the only platform it has ever run on. I considered bad encoding but the inconsistency is bothering me. I generated the same set of files several weeks ago and "File1.dat" had this problem. Now I generated more files today, and "File2.dat" has this problem, but "File1.dat" does not. No update to my GUI has occurred in this time. I am using the same computer. Commented Mar 22, 2017 at 18:24
  • Also, if there is a problem with the encoding, wouldn't it show the lack of linebreaks from the beginning? Opening the file initially works fine, it is the act of saving it in Notepad that seems to mess up the linebreaks. Commented Mar 22, 2017 at 18:26
  • Added some debug info to the original post
    – RayG
    Commented Mar 22, 2017 at 18:38
  • as far as I can tell, the unsaved version has 0d 0a between lines and the saved version is missing both 0d and 0a. (using some online hex editor and pasting my text in - can't install random programs on my work comp.) Commented Mar 22, 2017 at 18:43
  • I believe that 0d0d0a pattern is the problem and Notepad is stripping them out. Two <CR> in a row doesn't make sense: the second is redundant and does not conform to the Windows standard. Notice that there is a proper <CR><LF> at the end after the file is saved, which suggests that Notepad sees all of the original text as one long line, skipping and removing the control characters just as you described in your original post. You need to fix the program that is generating text with two CR in a row.
    – RayG
    Commented Mar 22, 2017 at 22:02
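Until the generating program is fixed, the doubled CR can be collapsed before the file is opened in Notepad. This is a minimal sketch in Python, under the assumption that every CR CR LF in these files is meant to be a single line break; normalize_to_crlf and fix_file are hypothetical names chosen for illustration.

```python
import re

# One line ending = a run of CRs followed by LF, or a lone CR, or a lone LF.
_ANY_EOL = re.compile(rb"\r+\n|\r|\n")

def normalize_to_crlf(data: bytes) -> bytes:
    """Rewrite every line ending as the Windows CR LF pair."""
    return _ANY_EOL.sub(b"\r\n", data)

def fix_file(src: str, dst: str) -> None:
    """Copy src to dst with normalized line endings."""
    # Binary mode on both sides so Python does not translate newlines itself.
    with open(src, "rb") as f:
        data = f.read()
    with open(dst, "wb") as f:
        f.write(normalize_to_crlf(data))
```

Writing to a separate destination file keeps the original untouched, which is useful while it is still unclear why only some of the generated files are affected.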
