Using other LLM APIs

By default, LlamaIndex.TS uses OpenAI's LLMs and embedding models, but we support many other LLMs, including models from Mistral (Mistral, Mixtral), Anthropic (Claude), and Google (Gemini).

If you don't want to use an API at all, you can run a local model.

Using another LLM

You can specify which LLM LlamaIndex.TS uses by setting it on the Settings object, like this:

import { MistralAI, Settings } from "llamaindex";

Settings.llm = new MistralAI({
  model: "mistral-tiny",
  apiKey: "<YOUR_API_KEY>",
});
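Hardcoding the key as in the snippet above is fine for a quick test, but you will usually want to read it from the environment instead. The following is a minimal sketch of one way to do that. The `requireEnv` helper is hypothetical (it is not part of LlamaIndex.TS), and the variable name `MISTRAL_API_KEY` is an assumption chosen for illustration.

```typescript
// Hypothetical helper, not part of LlamaIndex.TS: look up a required key in an
// environment-like map and fail early with a clear message if it is missing.
function requireEnv(
  name: string,
  env: Record<string, string | undefined>,
): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// In Node you would typically pass process.env, for example:
//   Settings.llm = new MistralAI({
//     model: "mistral-tiny",
//     // The variable name MISTRAL_API_KEY is an assumption for this sketch.
//     apiKey: requireEnv("MISTRAL_API_KEY", process.env),
//   });
```

Failing at startup with a named variable tends to be easier to debug than an authentication error returned later by the API.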

You can see examples of other APIs we support by checking out "Available LLMs" in the sidebar of our LLMs section.

Using another embedding model

A frequent gotcha when switching to a different LLM API is that LlamaIndex.TS will still, by default, index and embed your data using OpenAI's embeddings. To switch away from OpenAI completely, you need to set your embedding model as well, for example:

import { MistralAIEmbedding, Settings } from "llamaindex";

Settings.embedModel = new MistralAIEmbedding();

We support many different embedding models.
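To make the gotcha concrete, here is a toy model of a settings object with two independent slots that both start on the same default provider. It is only a sketch: `ToySettings`, `configure`, and `slotsStillOnDefault` are hypothetical names and do not reflect how LlamaIndex.TS is implemented internally.

```typescript
// Toy model only: these types and names are hypothetical, not LlamaIndex.TS internals.
type Provider = "openai" | "mistral";

interface ToySettings {
  llm: Provider;
  embedModel: Provider;
}

// Both slots start on the default provider, mirroring the behaviour described above.
const DEFAULTS: ToySettings = { llm: "openai", embedModel: "openai" };

// Apply overrides on top of the defaults, leaving untouched slots as they were.
function configure(overrides: Partial<ToySettings>): ToySettings {
  return { ...DEFAULTS, ...overrides };
}

// List the slots that would still call the default provider.
function slotsStillOnDefault(settings: ToySettings): (keyof ToySettings)[] {
  return (Object.keys(settings) as (keyof ToySettings)[]).filter(
    (slot) => settings[slot] === DEFAULTS[slot],
  );
}

// Overriding only the LLM leaves the embedding slot on the default.
const llmOnly = configure({ llm: "mistral" });
// Overriding both slots moves everything off the default.
const both = configure({ llm: "mistral", embedModel: "mistral" });
```

In this toy model, `slotsStillOnDefault(llmOnly)` still reports the embedding slot, which is the situation the paragraph above warns about.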

Full example

This example uses Mistral's mistral-tiny model as the LLM and Mistral's embedding model for embeddings.

import * as fs from "fs/promises";
import {
  Document,
  MistralAI,
  MistralAIEmbedding,
  Settings,
  VectorStoreIndex,
} from "llamaindex";

// Update embed model
Settings.embedModel = new MistralAIEmbedding();
// Update llm to use MistralAI
Settings.llm = new MistralAI({ model: "mistral-tiny" });

async function rag(query: string) {
  // Load essay from abramov.txt in Node
  const path = "node_modules/llamaindex/examples/abramov.txt";

  const essay = await fs.readFile(path, "utf-8");

  // Create Document object with essay
  const document = new Document({ text: essay, id_: path });

  const index = await VectorStoreIndex.fromDocuments([document]);

  // Query the index
  const queryEngine = index.asQueryEngine();
  const response = await queryEngine.query({ query });
  return response.response;
}

(async () => {
  // embeddings
  const embedding = new MistralAIEmbedding();
  const embeddingsResponse = await embedding.getTextEmbedding(
    "What is the best French cheese?",
  );
  console.log(
    `MistralAI embeddings are ${embeddingsResponse.length} numbers long\n`,
  );

  // chat api (non-streaming)
  const llm = new MistralAI({ model: "mistral-tiny" });
  const response = await llm.chat({
    messages: [{ content: "What is the best French cheese?", role: "user" }],
  });
  console.log(response.message.content);

  // chat api (streaming)
  const stream = await llm.chat({
    messages: [
      { content: "Who is the most renowned French painter?", role: "user" },
    ],
    stream: true,
  });
  for await (const chunk of stream) {
    process.stdout.write(chunk.delta);
  }

  // rag
  const ragResponse = await rag("What did the author do in college?");
  console.log(ragResponse);
})();
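The streaming part of the example writes each chunk straight to stdout. If you would rather end up with the complete reply as a single string, you can accumulate the deltas instead. The sketch below assumes only that each chunk exposes a string `delta` field, as in the loop above. `collectDeltas` and `fakeStream` are hypothetical helpers, and `fakeStream` exists only so the helper can be tried without calling an API.

```typescript
// Hypothetical helper: concatenate the `delta` text of every chunk in a stream.
async function collectDeltas(
  stream: AsyncIterable<{ delta: string }>,
): Promise<string> {
  let text = "";
  for await (const chunk of stream) {
    text += chunk.delta;
  }
  return text;
}

// Stand-in stream for trying the helper without calling an API.
async function* fakeStream(
  parts: string[],
): AsyncGenerator<{ delta: string }> {
  for (const delta of parts) {
    yield { delta };
  }
}
```

With a real model, the stream returned by `llm.chat({ ..., stream: true })` in the example above could be passed in place of `fakeStream(...)`, assuming its chunks have the same shape.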
