I'm currently working on a project that requires the generation of complex JSON documents using Large Language Models (LLMs). These documents, representing rich text content, can contain various nested structures such as bullet points, multiple columns, images, videos, and embeds.
To illustrate, below is a simplified version of the JSON document that I am trying to generate:
{
  "type": "doc",
  "content": [
    {
      "type": "heading",
      "content": [
        {
          "text": "Lorem ipsum"
        }
      ]
    },
    {
      "type": "columns",
      "content": [
        {
          "type": "bulletList",
          "content": [
            {
              "type": "paragraph",
              "content": [
                {
                  "text": "Lorem ipsum dolor sit amet, consectetur adipiscing elit. "
                }
              ]
            }
          ]
        }
      ]
    }
  ]
}
Notably, the schema of the JSON document can be complex because nesting is unbounded (e.g., bullet lists within bullet lists). Although writing a formal schema definition is difficult, I do have a Node.js function that validates whether a document is correct.
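To make the recursion concrete, here is a rough TypeScript sketch of the shape I mean. It is only an illustration, not my actual validator: the names `DocNode`, `KNOWN_TYPES`, and `isValidNode` are hypothetical, and the list of node types is limited to the ones in the simplified example above (the real documents have more, with stricter rules about which node may contain which).

```typescript
// Hypothetical sketch of the recursive document shape (not the real validator).
type TextNode = { text: string };

type ElementNode = {
  type: string;
  content: Array<ElementNode | TextNode>;
};

type DocNode = ElementNode | TextNode;

// Node types taken from the simplified example only; the real schema has more.
const KNOWN_TYPES = new Set(["doc", "heading", "columns", "bulletList", "paragraph"]);

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Accepts a text leaf, or a known element whose children are all valid nodes.
function isValidNode(value: unknown): value is DocNode {
  if (!isRecord(value)) return false;
  // Text leaf: has a string `text` and no `type`.
  if (typeof value.text === "string" && !("type" in value)) return true;
  // Element: known `type` plus a `content` array, checked recursively.
  if (typeof value.type !== "string" || !KNOWN_TYPES.has(value.type)) return false;
  return Array.isArray(value.content) && value.content.every(isValidNode);
}
```

Because `ElementNode` refers to itself through `content`, the nesting depth is not bounded by the type, which is the part that seems to trip up the tools I have tried.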
I've tried straightforward approaches, such as asking GPT-4 to generate the JSON directly and few-shot prompting, but neither produced output that conforms to my schema. I also explored libraries like zod-gpt (https://github.com/dzhng/zod-gpt), but it struggled with nested structures and failed to produce a valid JSON document.
How can I effectively use LLMs to generate these complex and nested JSON documents? Are there any strategies, libraries, or methodologies that are particularly effective for this kind of task?