
A comma-delimited file is created when exporting Google Contacts in what Google calls "Google CSV format (for importing into a Google account)". The problem is that this format stores multi-line notes by wrapping the text in double quotes and allowing CRLF line breaks inside those quotes.

In other words, a record with the fields Name,Note,Email that has a multi-line note appears as follows in the .csv file:

Name,"Note FirstLine\r\n

SecondNoteLine\r\n

Lastnoteline",[email protected]\r\n

The same record with an empty note field appears on a single line (the more standard form):

Name,,[email protected]\r\n

I'm trying to write the correct regular expression, and have tried to glean it from "How to use regular expressions in Notepad++ (tutorial)", to no avail.

The closest I've gotten (not very close) is
,\".*,\"

with the ". matches newline" option enabled.

The expression I'm trying to match is:

"Select the text between ," and ", only when it contains one or more \r\n, and replace it with NUL"

That way, in the examples above both records would be identical, each contact record would appear on a single line, and I could import the file into Excel.

At this point, my eyes are bleeding, and any help would be appreciated.

1 Answer


The steps below worked for me in Notepad++, doing what you describe, with the example data from your question.

Lights . . .


Camera . . .

  1. Find What: ((?:^|\r\n)[^"]*+"[^\r\n"]*+)\r\n([^"]*+")
  2. Replace with: $1 $2
  3. Be sure the Regular expression option is checked
  4. Be sure the Wrap Around option is checked
  5. Press Replace All as many times as needed until every record is on a single line (each pass removes one line break per multi-line note)
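If you would rather script this than press Replace All repeatedly, here is a minimal Python sketch of the same loop. It is only an illustration under a few assumptions: the function name `join_note_lines` and the sample address `ed@example.com` are made up, and the possessive `*+` is written as a plain `*` because Python's `re` module only accepts possessive quantifiers from version 3.11 on. For this pattern the matches should be the same, since each quantified class is followed by a character the class excludes.

```python
import re

# Same pattern as in "Find What", with the possessive *+ written as plain *
# so that it also compiles on Python versions older than 3.11.
PATTERN = re.compile(r'((?:^|\r\n)[^"]*"[^\r\n"]*)\r\n([^"]*")')


def join_note_lines(text: str) -> str:
    """Apply the replacement repeatedly until a pass changes nothing."""
    while True:
        # \1 \2 is Python's spelling of Notepad++'s "$1 $2"
        text, count = PATTERN.subn(r"\1 \2", text)
        if count == 0:
            return text


# Hypothetical sample data; the e-mail address is a placeholder.
sample = (
    'Name,"Note FirstLine\r\nSecondNoteLine\r\nLastnoteline",ed@example.com\r\n'
    "Name,,ed@example.com\r\n"
)

print(repr(join_note_lines(sample)))
```

When reading the real export, open the file with `newline=""` so that Python does not translate the CRLF sequences before the pattern sees them.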


Action . . .



Explanation:

(
  (?:^|\r\n)     Begin at start of file or before the CRLF before the start of a record
  [^"]*+         Consume all chars up to the opening "
  "              Consume the opening "
  [^\r\n"]*+     Consume all chars up to either the first CRLF or the closing "
)                Save as capturing group 1 (= everything in record before the target CRLF)
\r\n             Consume the target CRLF without capturing it
(
  [^"]*+         Consume all chars up to the closing "
  "              Consume the closing "
)                Save as capturing group 2 (= the rest of the string after the target CRLF)

Note: *+ is a possessive quantifier. It does not give back characters once it has matched them, which avoids needless backtracking and speeds up execution.

Update:

This more general version of the regex will work with any line break sequence (\r\n, \r or \n):

((?:^|[\r\n]+)[^"]*+"[^\r\n"]*+)[\r\n]+([^"]*+")
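To see the general version handle other line endings, here is a small Python sketch. The function name `join_any_breaks` and the sample data are hypothetical, and as an assumption for portability the possessive `*+` is again written as plain `*` (Python's `re` only supports possessive quantifiers from 3.11). Note that a run of consecutive line breaks inside a note is collapsed into a single space, because `[\r\n]+` consumes the whole run.

```python
import re

# General pattern from above, possessive *+ written as plain *.
GENERAL = re.compile(r'((?:^|[\r\n]+)[^"]*"[^\r\n"]*)[\r\n]+([^"]*")')


def join_any_breaks(text: str) -> str:
    """Repeat the replacement until no quoted note contains a line break."""
    while True:
        text, count = GENERAL.subn(r"\1 \2", text)
        if count == 0:
            return text


# Hypothetical LF-only sample with a blank line inside the note.
unix_sample = (
    'Name,"Note FirstLine\nSecondNoteLine\n\nLastnoteline",ed@example.com\n'
    "Name,,ed@example.com\n"
)

print(repr(join_any_breaks(unix_sample)))
```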

Source

  • Wow! Thanks so much for both the solution and the explanation, which will let me run amok! Thanks!!!
    – EdinTexas
    Commented Oct 10, 2016 at 13:56
  • @EdinTexas - I'm glad to hear you solved your problem, when you get a chance, please feel free to press the little check mark to make it green in the upper left hand side of my answer to accept it as the accepted answer if it helped you resolve your inquiry, and to close the loop on your inquiry... Check out [Accepting An Answer] for a visual of what to check, etc. in case you're not already familiar. Thanks!! Commented Oct 11, 2016 at 2:05
