Image Retrieval
LlamaParse json mode supports extracting any images found in a page object by using the getImages function. The images are downloaded to a local folder and can then be sent to a multimodal LLM for further processing.
Installation
Usage
We pass our array of JSON objects to the getImages method, which downloads the images to a specified folder and returns a list of ImageNodes.
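The flow described above can be sketched as follows. This is a minimal illustration, not the library's own code: JsonImageReader is a hypothetical structural interface standing in for the reader, the loadJson method name and both signatures are assumptions, and parseAndDownloadImages is a helper introduced here only for illustration.

```typescript
// Hypothetical stand-in for the reader: only the two methods used here.
// Method names and signatures are assumptions, not copied from library typings.
type JsonObject = Record<string, unknown>;

interface JsonImageReader {
  loadJson(filePath: string): Promise<JsonObject[]>;
  getImages(jsonResult: JsonObject[], downloadPath: string): Promise<JsonObject[]>;
}

// Illustrative helper: parse a file in JSON mode, then download its images.
async function parseAndDownloadImages(
  reader: JsonImageReader,
  filePath: string,
  imageDir: string,
): Promise<{ jsonObjs: JsonObject[]; images: JsonObject[] }> {
  // Step 1: get the array of JSON objects for the parsed file.
  const jsonObjs = await reader.loadJson(filePath);
  // Step 2: hand that array to getImages together with the target folder.
  const images = await reader.getImages(jsonObjs, imageDir);
  return { jsonObjs, images };
}
```

Any reader instance whose methods match this shape can be passed in; the target folder name is the caller's choice.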
Multimodal Indexing
You can create an index across both text and image nodes by requesting alternative text for the image from a multimodal LLM.
We use two helper functions to create documents from the text and image nodes provided.
Text Documents
To create documents from the text nodes of the JSON object, we map the needed values to a new Document object. In this case we assign the page text as text and the page number as metadata.
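A minimal sketch of that mapping, under stated assumptions: ParsedPage assumes each page entry exposes text and page fields, and TextDoc is a local stand-in for the library's Document class so the snippet runs on its own. With the real library you would construct Document objects instead of plain objects.

```typescript
// Assumed shape of one parsed page; only these two fields are used.
interface ParsedPage {
  text: string;
  page: number;
}

// Local stand-in for the library's Document (hypothetical, for illustration).
interface TextDoc {
  text: string;
  metadata: { page: number };
}

// Map each page to a document: page text as text, page number as metadata.
function getTextDocs(pages: ParsedPage[]): TextDoc[] {
  return pages.map((p) => ({
    text: p.text,
    metadata: { page: p.page },
  }));
}

// Usage with two made-up pages.
const exampleTextDocs = getTextDocs([
  { text: "Revenue grew in Q1.", page: 1 },
  { text: "See the chart below.", page: 2 },
]);
console.log(exampleTextDocs.length);
```

Keeping the page number in metadata lets a retrieved text chunk be traced back to its page later.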
Image Documents
To create documents from the images, we need to use a multimodal LLM to generate alt text.
For this we create ImageNodes and add them as part of our message. We can use the createMessageContent function to simplify this.
The returned imageDocs have the alt text assigned as text and the image path as metadata.
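The alt-text step can be sketched as below. This is an illustration under assumptions, not the documented implementation: the multimodal LLM call (building the ImageNode, composing the message, and requesting a completion) is hidden behind a hypothetical describeImage callback, DownloadedImage assumes each downloaded image entry carries a local path, and AltTextDoc is a local stand-in for Document.

```typescript
// Assumed shape of one downloaded image entry.
interface DownloadedImage {
  path: string;
}

// Local stand-in for the library's Document (hypothetical).
interface AltTextDoc {
  text: string;
  metadata: { path: string };
}

// Hypothetical callback wrapping the multimodal LLM request.
type DescribeImage = (imagePath: string, prompt: string) => Promise<string>;

async function getImageTextDocs(
  images: DownloadedImage[],
  describeImage: DescribeImage,
): Promise<AltTextDoc[]> {
  const prompt = "Describe the image as alt text.";
  const imageDocs: AltTextDoc[] = [];
  // Process images one at a time to keep the order stable and the load low.
  for (const image of images) {
    const altText = await describeImage(image.path, prompt);
    // Alt text becomes the document text; the image path goes into metadata.
    imageDocs.push({ text: altText, metadata: { path: image.path } });
  }
  return imageDocs;
}
```

Injecting the LLM call as a callback keeps the mapping logic testable without network access; in real use the callback would wrap the multimodal LLM request.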
You can see the full example file here.