JSON Mode
In JSON mode, LlamaParse will return a data structure representing the parsed object.
Installation
Usage
For JSON mode, use loadJson. This method sets resultType automatically.
More information about indexing the results is available on the next page.
Output
The response, written to jsonObjs in the example, follows this structure:
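As a minimal sketch, assuming loadJson yields an array with one result object per parsed file and that each result holds its page objects under a pages key, the TypeScript below models only that much. ParseResult and collectPages are hypothetical names used for illustration, the sample data is made up, and any other top-level fields are left untyped.

```typescript
// Assumed shape: one result object per parsed file, each with a `pages` array.
// Other top-level fields may exist; they are deliberately left untyped.
interface ParseResult {
  pages: Array<Record<string, unknown>>;
  [key: string]: unknown;
}

// Gather the page objects of every result into a single flat list.
function collectPages(jsonObjs: ParseResult[]): Array<Record<string, unknown>> {
  const all: Array<Record<string, unknown>> = [];
  for (const result of jsonObjs) {
    for (const page of result.pages ?? []) {
      all.push(page);
    }
  }
  return all;
}

// Stand-in data used in place of a real loadJson response.
const sampleResults: ParseResult[] = [
  { pages: [{ page: 1, text: "First page" }, { page: 2, text: "Second page" }] },
  { pages: [{ page: 1, text: "Another file" }] },
];

const allPages = collectPages(sampleResults);
console.log(allPages.length); // prints 3
```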
Page objects
Within page objects, the following keys may be present depending on your document.
page: The page number of the document.
text: The text extracted from the page.
md: The markdown version of the extracted text.
images: Any images extracted from the page.
items: An array of heading, text and table objects in the order they appear on the page.
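The keys above could be modelled in TypeScript roughly as follows. This is a sketch under stated assumptions, not the library's own type definitions: PageObject, PageItem, pageContent and countItemTypes are hypothetical names, every key except page is treated as optional, and the type field on items is an assumption about how heading, text and table objects are distinguished.

```typescript
// Hypothetical item shape: the `type` discriminator is an assumption.
interface PageItem {
  type: string;
  [key: string]: unknown;
}

// Hypothetical page shape built from the documented keys.
interface PageObject {
  page: number;
  text?: string;
  md?: string;
  images?: unknown[];
  items?: PageItem[];
}

// Prefer the markdown version, fall back to plain text, else an empty string.
function pageContent(p: PageObject): string {
  return p.md ?? p.text ?? "";
}

// Count how many items of each type appear on a page.
function countItemTypes(p: PageObject): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const item of p.items ?? []) {
    counts[item.type] = (counts[item.type] ?? 0) + 1;
  }
  return counts;
}

// Made-up sample page for illustration.
const samplePage: PageObject = {
  page: 1,
  text: "Title\nSome text",
  md: "# Title\n\nSome text",
  items: [{ type: "heading" }, { type: "text" }, { type: "text" }],
};

console.log(pageContent(samplePage));
console.log(countItemTypes(samplePage));
```

Treating the keys as optional mirrors the statement that they "may be present depending on your document", so callers have to handle missing values explicitly.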
JSON Mode with SimpleDirectoryReader
All readers share a loadData method with SimpleDirectoryReader that promises to return a uniform Document with metadata. JSON mode returns raw JSON objects instead, which makes it incompatible with SimpleDirectoryReader.
However, a simple workaround is to create a new reader class that extends LlamaParseReader and either adds a new method or overrides loadData. The method wraps JSON mode, extracts the required values, and returns Document objects.
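The pattern can be sketched without the library as follows. Everything here is a stand-in: StubParseReader plays the role of LlamaParseReader and returns canned data, SketchDocument plays the role of Document, and pagesToDocuments and PageDocumentReader are hypothetical names. In a real project the subclass would extend LlamaParseReader and build the library's Document objects; the exact signatures should be checked against the installed version.

```typescript
// Stand-in for the library's Document: text plus free-form metadata.
interface SketchDocument {
  text: string;
  metadata: Record<string, unknown>;
}

// Assumed JSON-mode result shape: page objects under a `pages` key.
interface JsonResult {
  pages: Array<{ page: number; text?: string }>;
}

// Turn every page into one document, keeping the page number as metadata.
function pagesToDocuments(jsonObjs: JsonResult[]): SketchDocument[] {
  const docs: SketchDocument[] = [];
  for (const result of jsonObjs) {
    for (const p of result.pages) {
      docs.push({ text: p.text ?? "", metadata: { pageNumber: p.page } });
    }
  }
  return docs;
}

// Stub base class standing in for LlamaParseReader; returns canned data.
class StubParseReader {
  async loadJson(_filePath: string): Promise<JsonResult[]> {
    return [{ pages: [{ page: 1, text: "Intro" }, { page: 2, text: "Details" }] }];
  }

  async loadData(_filePath: string): Promise<SketchDocument[]> {
    return [];
  }
}

// The workaround: override loadData so it wraps JSON mode.
class PageDocumentReader extends StubParseReader {
  async loadData(filePath: string): Promise<SketchDocument[]> {
    const jsonObjs = await this.loadJson(filePath);
    return pagesToDocuments(jsonObjs);
  }
}

// Usage: the file path is a made-up placeholder.
new PageDocumentReader()
  .loadData("example.pdf")
  .then((docs) => console.log(docs.map((d) => d.metadata)));
```

Keeping the mapping in a plain function makes it easy to test and to extend with other values from the JSON response.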
Now we have documents with the page number as metadata. This new reader can be used like any other and can be integrated with SimpleDirectoryReader. Since it extends LlamaParseReader, you can use the same params.
You can assign any other values of the JSON response to the Document as needed.