2

Looking for a way to read any JSON without knowing the internals with SQL Server's JSON methods, I came up with an approach I'd like to share.

Here is the question bringing up this issue.

The question is: How can I transform an unknown JSON into a structured EAV format while keeping all the information about sort order and nest levels.

The ideal output should carry the original row's id as the entity, the Json's key and value as the attribute and value together with the JsonPath for the specific object in a sorted list.

Find the MCVE (sample data from the linked question) embedded into my self-answer.

1 Answer 1

2

First we create a declared table variable and fill it with some sample JSONs to simulate the issue (I added some arrays to the samples to reflect JSON paths for arrays):

DECLARE @table TABLE(ID INT IDENTITY, AnyJSON NVARCHAR(MAX));
INSERT INTO @table VALUES
(N' {
    "correlationId": "c3xOeEEQQCCA9sEx7-u6FA",
    "eventCreateTime": "2020-05-12T15:38:23.717Z",
    "time": 1589297903717,
    "owner": {
        "ownergeography": {
            "city": "abc",
            "country": "abc"
        },
        "ownername": {
            "firstname": "abc",
            "lastname": "def"
        },
        "clientApiKey": "xxxxx",
        "businessProfileApiKey": null,
        "userId": null
    },
    "campaignType": "Mobile push"
}')
,(N'[{
    "correlationIds": [
        {
            "campaignId": [1,2,3],
            "correlationId": [{"a":"b"},{"c":"d"},{"e":"f"}]
        }
    ],
    "variantId": 1278915,
    "utmCampaign": "",
    "ua.os.major": "8"
    }
    ,{
    "correlationIds": [
        {
            "campaignId": [1,2,3],
            "correlationId": [{"a":"b"},{"c":"d"},{"e":"f"}]
        }
    ],
    "variantId": 1278915,
    "utmCampaign": "",
    "ua.os.major": "8"
    }]')
,(N'{
    "correlationId": "ls7XmuuiThWzktUeewqgWg",
    "eventCreateTime": "2020-05-12T12:40:20.786Z",
    "time": 1589287220786,
    "modifiedBy": {
        "clientId": null,
        "clientApiKey": "xxx",
        "businessProfileApiKey": null,
        "userId": null
    },
    "campaignType": "Mobile push"
}');

--The query

WITH recCTE AS
(
    SELECT ID
          ,NestLevel   = 0 
          ,ObjectIndex = CAST(1 AS bigint)                                                          
          ,SortString  = CAST(N'sort'                       COLLATE DATABASE_DEFAULT AS NVARCHAR(MAX)) 

          ,JsonPath    = CAST(N'$'                          COLLATE DATABASE_DEFAULT AS NVARCHAR(MAX))
          ,JsonKey     = CAST(N'$'                          COLLATE DATABASE_DEFAULT AS NVARCHAR(MAX)) 
          ,JsonValue   = CAST(AnyJSON                       COLLATE DATABASE_DEFAULT AS NVARCHAR(MAX)) 
          ,JsonType    = CAST(CASE WHEN LEFT(TRIM(AnyJSON),1)=N'[' THEN 4 ELSE 0 END AS TINYINT)
          ,NestedJSON  = CAST(CASE WHEN ISJSON(AnyJSON)=1 
                                   THEN AnyJSON 
                                   ELSE NULL END            COLLATE DATABASE_DEFAULT AS NVARCHAR(MAX)) 
    FROM @table t

    UNION ALL

    SELECT r.ID
          ,r.NestLevel+1
          ,ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) 
          ,CAST(CONCAT(r.SortString,REPLACE(STR(ROW_NUMBER() OVER(ORDER BY (SELECT NULL)),5),' ','0')) COLLATE DATABASE_DEFAULT AS NVARCHAR(MAX))

          ,CAST(CONCAT(r.JsonPath, CASE WHEN r.JsonType=4 --<-- see the docs for OPENJSON()
                                        THEN CONCAT('[',A.[key],']') 
                                        ELSE '.' + A.[key] END)                       COLLATE DATABASE_DEFAULT AS NVARCHAR(MAX))
          ,CAST(A.[key]                                                               COLLATE DATABASE_DEFAULT AS NVARCHAR(MAX))
          ,CAST(r.JsonValue                                                           COLLATE DATABASE_DEFAULT AS NVARCHAR(MAX))
          ,A.[type] 
          ,CAST(A.[value]                                                             COLLATE DATABASE_DEFAULT AS NVARCHAR(MAX))
    FROM recCTE r
    CROSS APPLY OPENJSON(r.NestedJSON) A
    WHERE ISJSON(r.NestedJSON)=1
)
SELECT ID
      ,NestLevel
      ,ObjectIndex
      ,JsonPath
      ,JsonKey
      ,NestedJSON AS JsonValue
      ,SortString --<-- just to illustrate the sorting, not needed in the output
FROM recCTE 
WHERE ISJSON(NestedJSON)=0
ORDER BY ID,SortString;

The result

+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| ID | JsonPath                                  | JsonKey         | JsonValue                | SortString                      |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 1  | $.correlationId                           | correlationId   | c3xOeEEQQCCA9sEx7-u6FA   | 0    1                          |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 1  | $.eventCreateTime                         | eventCreateTime | 2020-05-12T15:38:23.717Z | 0    2                          |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 1  | $.time                                    | time            | 1589297903717            | 0    3                          |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 1  | $.owner.ownergeography.city               | city            | abc                      | 0    4    1    1                |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 1  | $.owner.ownergeography.country            | country         | abc                      | 0    4    1    2                |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 1  | $.owner.ownername.firstname               | firstname       | abc                      | 0    4    2    1                |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 1  | $.owner.ownername.lastname                | lastname        | def                      | 0    4    2    2                |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 1  | $.owner.clientApiKey                      | clientApiKey    | xxxxx                    | 0    4    3                     |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 1  | $.campaignType                            | campaignType    | Mobile push              | 0    5                          |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[0].correlationIds[0].campaignId[0]      | 0               | 1                        | 0    1    1    1    1    1      |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[0].correlationIds[0].campaignId[1]      | 1               | 2                        | 0    1    1    1    1    2      |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[0].correlationIds[0].campaignId[2]      | 2               | 3                        | 0    1    1    1    1    3      |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[0].correlationIds[0].correlationId[0].a | a               | b                        | 0    1    1    1    2    1    1 |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[0].correlationIds[0].correlationId[1].c | c               | d                        | 0    1    1    1    2    2    1 |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[0].correlationIds[0].correlationId[2].e | e               | f                        | 0    1    1    1    2    3    1 |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[0].variantId                            | variantId       | 1278915                  | 0    1    2                     |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[0].utmCampaign                          | utmCampaign     |                          | 0    1    3                     |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[0].ua.os.major                          | ua.os.major     | 8                        | 0    1    4                     |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[1].correlationIds[0].campaignId[0]      | 0               | 1                        | 0    2    1    1    1    1      |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[1].correlationIds[0].campaignId[1]      | 1               | 2                        | 0    2    1    1    1    2      |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[1].correlationIds[0].campaignId[2]      | 2               | 3                        | 0    2    1    1    1    3      |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[1].correlationIds[0].correlationId[0].a | a               | b                        | 0    2    1    1    2    1    1 |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[1].correlationIds[0].correlationId[1].c | c               | d                        | 0    2    1    1    2    2    1 |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[1].correlationIds[0].correlationId[2].e | e               | f                        | 0    2    1    1    2    3    1 |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[1].variantId                            | variantId       | 1278915                  | 0    2    2                     |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[1].utmCampaign                          | utmCampaign     |                          | 0    2    3                     |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 2  | $[1].ua.os.major                          | ua.os.major     | 8                        | 0    2    4                     |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 3  | $.correlationId                           | correlationId   | ls7XmuuiThWzktUeewqgWg   | 0    1                          |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 3  | $.eventCreateTime                         | eventCreateTime | 2020-05-12T12:40:20.786Z | 0    2                          |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 3  | $.time                                    | time            | 1589287220786            | 0    3                          |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 3  | $.modifiedBy.clientApiKey                 | clientApiKey    | xxx                      | 0    4    2                     |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+
| 3  | $.campaignType                            | campaignType    | Mobile push              | 0    5                          |
+----+-------------------------------------------+-----------------+--------------------------+---------------------------------+

The idea in short:

  • we use a recursive CTE to walk this down.
  • The query will test any fragment ([value] coming from OPENJSON) for being valid JSON.
  • If the fragment is valid, this walks deeper and deeper.
  • The column SortString is needed to get a final sort order.
  • The CAST() and COLLATE helps to avoid data type mismatch. Recursive CTEs are very picky with this...

Hint: If you deal with bigger JSONs you might need to set OPTION (MAXRECURSION 0) at the end of your query.

Enjoy :-)

Something similar for XML

Here is a similar answer about how to read an unknown XML.

4
  • @JohnCappelletti, Well, time is the shortage :-) This approach is quite close to the recursive XML function we both know quite well :-) Commented May 14, 2020 at 16:33
  • Just for fun, here is how I convert JSON to a hierarchy. Any inputs/tweaks welcome dbfiddle.uk/… Commented May 14, 2020 at 16:42
  • @JohnCappelletti, great! Did you poste this here anywhere? Before I opened this self-answered question I did some research without any success... Commented May 14, 2020 at 16:44
  • @Shnugo No, I did not post that (I believe) Commented May 14, 2020 at 19:25

Not the answer you're looking for? Browse other questions tagged or ask your own question.