clifton_h


Unless your environment shows the offending characters, you will be forced to manually copy and paste them from the result sets and replace the strings from there. The issue is how the data is stored inside the data source to begin with.

Therefore, if it is possible, cleanse the data of the special characters beforehand. But if that is not possible, you can use CASE expressions in your SELECT statement and/or use the REPLACE function in your query.

I'm sure you know, but do not apply functions to predicates in ON/WHERE clauses unless you have no choice (such as with inconsistent database tables), since that makes the predicate non-SARGable and blocks index seeks.
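To illustrate: wrapping the column side of a predicate in a function generally prevents SQL Server from seeking on an index over that column. A sketch against the table from this answer (the literal value is illustrative):

```sql
-- Non-SARGable: the function on the column forces a scan
SELECT [LegacyContactID]
FROM [LTSS].[ConsumerFile_02_ContactName]
WHERE REPLACE([LastName], ' ', '') = 'Smith';

-- SARGable alternative: cleanse the data first, then compare the bare column
SELECT [LegacyContactID]
FROM [LTSS].[ConsumerFile_02_ContactName]
WHERE [LastName] = 'Smith';
```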

Because this does happen even in CSV files (I once had the characters embedded inside full names!), you probably should keep your solution somewhere separate for easy access.

JSON_QUERY((
SELECT [LegacyContactID]
      ,[NameType]
      ,[LastName]
      ,[FirstName]
      ,[Active]
      ,[Primary]
      --the second REPLACE swaps the pasted offending character for a space
      ,CONVERT(VARCHAR(10), REPLACE(REPLACE([StartDate], '/', ''), ' ', ' '), 101) AS [StartDate]
      ,CONVERT(VARCHAR(10), REPLACE(REPLACE([EndDate], '/', ''), ' ', ' '), 101) AS [EndDate]
FROM [LTSS].[ConsumerFile_02_ContactName]
WHERE [LegacyContactID] = ContactList.[LegacyContactID]
FOR JSON AUTO, WITHOUT_ARRAY_WRAPPER
)) AS ContactName

I actually included the CRLF text in the code, which may be overkill.

Now, what you mention in your recent update got me thinking about why you feel the need to transform your data in the first place. DATE values carry no formatting by default, so unless JSON is incompatible with handling SQL dates, there is really no need to transform this data inside the JSON if your target tables enforce the correct format.

So unless there is still a concern about truncation of the data, from an ETL perspective there are two ways you can accomplish this:

1 - USE STAGING TABLES

  • Staging tables can be temporary tables, CTEs, or actual empty tables you use to extract, cleanse, and transform your data.
  • Advantages: You are dealing only with the rows being inserted, do not have to be concerned with constraints, and can easily fix any corrupted or unstructured aspects of your data outside of JSON.
  • Disadvantages: Staging tables may mean more objects in your database, depending on how often you need them. For that reason, obtaining better, consistently structured source data is preferable.
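A minimal sketch of the staging-table pattern, using a temp table and a hypothetical cleansed target table (the [..._Clean] name and the cleansing rules are illustrative, not from the original schema):

```sql
-- 1. Extract the raw rows into a staging (temp) table
SELECT [LegacyContactID], [NameType], [LastName], [FirstName],
       [Active], [Primary], [StartDate], [EndDate]
INTO #Staging_ContactName
FROM [LTSS].[ConsumerFile_02_ContactName];

-- 2. Cleanse in place, outside of any JSON handling
UPDATE #Staging_ContactName
SET [StartDate] = REPLACE([StartDate], '/', ''),
    [EndDate]   = REPLACE([EndDate], '/', '');

-- 3. Load the cleansed rows into the persistent target table (hypothetical name)
INSERT INTO [LTSS].[ConsumerFile_02_ContactName_Clean]
SELECT * FROM #Staging_ContactName;
```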

2 - ALTER YOUR TABLE TO USE STRINGS

  • Here you enforce the business rules by cleansing the data AFTER insertion into the persistent table.
  • Advantages: You save on space, simplify the cleansing process, and can still use indexes. SQL Server is fairly efficient at parsing DATE strings, and you can still take advantage of EXISTS() and possibly SARGable predicates to check for non-dates when running your insert.
  • Disadvantages: You lose an integrity check on your table while the dates are stored as strings, opening up the possibility of dirty data being exposed. Your UPDATE statements will be forced to scan the entire table, which can drag down performance.
JSON_QUERY((
SELECT [LegacyContactID]
      ,[NameType]
      ,[LastName]
      ,[FirstName]
      ,[Active]
      ,[Primary]
      ,[StartDate] --already in a date format
      ,[EndDate]
FROM [LTSS].[ConsumerFile_02_ContactName]
WHERE [LegacyContactID] = ContactList.[LegacyContactID]
FOR JSON AUTO, WITHOUT_ARRAY_WRAPPER
)) AS ContactName
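If you do store the dates as strings, you can still flag values that will not convert cleanly to DATE. TRY_CONVERT (available from SQL Server 2012 onward) returns NULL when the conversion fails, so a sketch of a non-date check looks like:

```sql
-- Find string values that will not convert cleanly to DATE
SELECT [LegacyContactID], [StartDate]
FROM [LTSS].[ConsumerFile_02_ContactName]
WHERE [StartDate] IS NOT NULL
  AND TRY_CONVERT(DATE, [StartDate]) IS NULL;
```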

Ah, yes, escape and CRLF characters. Unless you know specifically which character it is (assuming your environment shows it), you will be forced to manually copy and paste it from the result sets and replace the string from there.

Note that the escape characters are not actually spaces, so searching for a space with functions like LTRIM or CHARINDEX will not find them.
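Since the usual culprits are carriage-return and line-feed characters, one alternative to pasting the invisible character is to target them by their CHAR() codes (table name taken from the queries in this answer):

```sql
-- CHAR(13) = carriage return, CHAR(10) = line feed
SELECT REPLACE(REPLACE([LastName], CHAR(13), ''), CHAR(10), '') AS [LastName]
FROM [LTSS].[ConsumerFile_02_ContactName];
```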

It may also be appropriate to transform the data beforehand (if you can, do so). But if that is not possible, you can use CASE expressions with CHARINDEX() to identify the characters and then REPLACE them. Or just use the REPLACE function in your query.

Because this does happen even in CSV files (I once had the characters embedded between full names), you probably should keep this solution somewhere separate for easy access.

IF CHARINDEX('-PASTED_STRING-', @expression) <> 0
    SET @expression = REPLACE(@expression, '-PASTED_STRING-', '');

