LlamaCloud is a new generation of managed parsing, ingestion, and retrieval services, designed to bring production-grade context-augmentation to your LLM and RAG applications.

Currently, LlamaCloud supports:

  • a Managed Ingestion API, which handles parsing and document management
  • a Managed Retrieval API, which configures optimal retrieval for your RAG system


We are opening up a private beta to a limited set of enterprise partners for the managed ingestion and retrieval API. If you’re interested in centralizing your data pipelines and spending more time working on your actual RAG use cases, come talk to us.

If you have access, you can sign in to LlamaCloud to get an API key.

Create a Managed Index

Currently, you can't create a managed index on LlamaCloud using LlamaIndexTS, but you can use an existing managed index for retrieval that was created by the Python version of LlamaIndex. See the LlamaCloudIndex documentation for more information on how to create a managed index.
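As a rough sketch of that Python-side step (assuming the `llama-index-indices-managed-llama-cloud` package is installed, a local `data/` directory of documents exists, and `LLAMA_CLOUD_API_KEY` is set; the index name "test" and project "Default" mirror the retrieval example below):

```python
# Sketch: create a managed index with the Python version of LlamaIndex.
from llama_index.core import SimpleDirectoryReader
from llama_index.indices.managed.llama_cloud import LlamaCloudIndex

# Load documents from a local "data" directory (hypothetical path).
documents = SimpleDirectoryReader("data").load_data()

# Parsing and ingestion run on LlamaCloud; "test" and "Default" match
# the name and projectName used for retrieval from LlamaIndexTS.
index = LlamaCloudIndex.from_documents(
    documents,
    "test",
    project_name="Default",
)
```

Once ingestion completes, the index can be used for retrieval from LlamaIndexTS under the same name and project name.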

Use a Managed Index

Here's an example of how to use a managed index together with a chat engine:

import { stdin as input, stdout as output } from "node:process";
import readline from "node:readline/promises";

import { ContextChatEngine, LlamaCloudIndex } from "llamaindex";

async function main() {
  // Connect to an existing managed index on LlamaCloud.
  const index = new LlamaCloudIndex({
    name: "test",
    projectName: "Default",
    baseUrl: process.env.LLAMA_CLOUD_BASE_URL,
    apiKey: process.env.LLAMA_CLOUD_API_KEY,
  });

  // Retrieve the 5 most similar nodes for each query.
  const retriever = index.asRetriever({
    similarityTopK: 5,
  });

  const chatEngine = new ContextChatEngine({ retriever });
  const rl = readline.createInterface({ input, output });

  // Simple REPL: read a question, stream the answer to stdout.
  while (true) {
    const query = await rl.question("User: ");
    const stream = await chatEngine.chat({ message: query, stream: true });
    for await (const chunk of stream) {
      process.stdout.write(chunk.response);
    }
    process.stdout.write("\n");
  }
}

main().catch(console.error);


API Reference