Unless your environment shows the offending characters, you will be forced to manually copy and paste from the result sets and replace the string from there. The issue is how the data is stored inside the data source to begin with.

Therefore, if it is possible, cleanse your data of the special characters beforehand. But if that is not possible, you can use CASE statements in your SELECT statement and/or the REPLACE function in your query.
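For example, a minimal sketch of that approach (targeting embedded CR/LF in the [LastName] column is just an assumption about where the offending characters live):

SELECT
    CASE
        WHEN CHARINDEX(CHAR(13), [LastName]) > 0 OR CHARINDEX(CHAR(10), [LastName]) > 0
            THEN REPLACE(REPLACE([LastName], CHAR(13), ''), CHAR(10), '') -- strip embedded CR/LF
        ELSE [LastName]
    END AS [LastName]
FROM [LTSS].[ConsumerFile_02_ContactName];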
I'm sure you know, but do not use any function on any predicate (ON/WHERE) unless you have no choice (such as with inconsistent database tables). Because this does happen even in CSV files (I once had the characters embedded inside full names!), you probably should keep your cleansing solution separate for easy access.
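To illustrate the predicate warning (a hypothetical lookup; 'Smith' is just a placeholder value):

-- Non-sargable: the function must run against every row, so no index seek
SELECT [LegacyContactID]
FROM [LTSS].[ConsumerFile_02_ContactName]
WHERE REPLACE(REPLACE([LastName], CHAR(13), ''), CHAR(10), '') = 'Smith';

-- Sargable: with the data cleansed beforehand, the predicate can use an index
SELECT [LegacyContactID]
FROM [LTSS].[ConsumerFile_02_ContactName]
WHERE [LastName] = 'Smith';

Applied to your query, the REPLACE approach looks like this: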
JSON_QUERY((
SELECT [LegacyContactID]
,[NameType]
,[LastName]
,[FirstName]
,[Active]
,[Primary]
,CONVERT(VARCHAR, REPLACE(REPLACE([StartDate], '/', ''), CHAR(13) + CHAR(10), ''), 101) AS [StartDate] -- strip slashes and any embedded CR/LF
,CONVERT(VARCHAR, REPLACE(REPLACE([EndDate], '/', ''), CHAR(13) + CHAR(10), ''), 101) AS [EndDate]
FROM [LTSS].[ConsumerFile_02_ContactName]
WHERE [LegacyContactID] = ContactList.[LegacyContactID]
FOR JSON AUTO, WITHOUT_ARRAY_WRAPPER
)) AS ContactName
I actually included the CRLF text in the code, which may be overkill.

Now, what you mention in your recent update got me considering why you feel the need to transform your data in the first place. DATES do not have formatting by default, so unless JSON is incompatible with handling SQL dates, there is really no need to transform this data inside JSON if your target tables enforce the correct format.

So unless there is still a concern for truncation of data, from an ETL perspective there are two ways you can accomplish this:

1 - USE STAGING TABLES
- Staging tables can either be temporary tables, CTEs, or actual empty tables you use to extract, cleanse, and transform your data (a sketch follows below).
- Advantages: You are only dealing with the rows being inserted, do not have to be concerned with constraints, and can easily modify any corruption or non-structured aspect of your data OUTSIDE of JSON.
- Disadvantages: Staging tables may mean more objects in your database, depending on how repetitive the need for them is. Thus, finding better, consistently structured data upstream is preferable.
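A minimal sketch of the staging-table route (the temp table name and the CR/LF cleanup are assumptions for illustration):

-- 1. Extract the raw rows into a staging table
SELECT [LegacyContactID], [NameType], [LastName], [FirstName],
       [Active], [Primary], [StartDate], [EndDate]
INTO #ContactName_Staging
FROM [LTSS].[ConsumerFile_02_ContactName];

-- 2. Cleanse outside of JSON, where corruption is easy to fix
UPDATE #ContactName_Staging
SET [LastName] = REPLACE(REPLACE([LastName], CHAR(13), ''), CHAR(10), '');

-- 3. Build the JSON from the cleansed rows
SELECT *
FROM #ContactName_Staging
FOR JSON AUTO;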
2 - ALTER YOUR TABLE TO USE STRINGS
- Here you enforce the business rules by cleansing the data AFTER insertion into the persistent table.
- Advantages: You save on space, simplify the cleansing process, and can still use indexes. SQL Server is pretty efficient at parsing through DATE strings, and you can still take advantage of EXISTS() and possible SARGs to check for non-dates when running your insert (a sketch follows below).
- Disadvantages: You lose a primary integrity check on your table while the dates are stored as strings, opening up the possibility of dirty data being exposed. Your UPDATE statements will be forced to scan the entire table, which can drag on performance.
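A minimal sketch of that insert-time check (TRY_CONVERT requires SQL Server 2012+; the error handling here is just a placeholder):

IF EXISTS (
    SELECT 1
    FROM [LTSS].[ConsumerFile_02_ContactName]
    WHERE [StartDate] IS NOT NULL
      AND TRY_CONVERT(DATE, [StartDate], 101) IS NULL
)
BEGIN
    -- handle rows whose strings are not valid dates before they spread
    RAISERROR('Non-date values found in [StartDate].', 16, 1);
END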
Either way, the JSON query itself no longer needs any date transformation:

JSON_QUERY((
SELECT [LegacyContactID]
,[NameType]
,[LastName]
,[FirstName]
,[Active]
,[Primary]
,[StartDate] -- it is already in a date format
,[EndDate]
FROM [LTSS].[ConsumerFile_02_ContactName]
WHERE [LegacyContactID] = ContactList.[LegacyContactID]
FOR JSON AUTO, WITHOUT_ARRAY_WRAPPER
)) AS ContactName