Using a local model via Ollama
If you're happy using OpenAI, you can skip this section, but many people are interested in using models they run themselves. The easiest way to do this is via the great work of our friends at Ollama, who provide a simple-to-use client that will download, install, and run a growing range of models for you.
Install Ollama
They provide a one-click installer for Mac, Linux and Windows on their home page.
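For example, on Linux the installer can also be run from a terminal; the command below is the install script documented on Ollama's site (check their home page for the current version, as it may change):

```bash
# Download and run Ollama's official Linux install script
curl -fsSL https://ollama.com/install.sh | sh
```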
Pick and run a model
Since we're going to be doing agentic work, we'll need a very capable model, but the largest models are hard to run on a laptop. We think Mixtral 8x7B is a good balance between power and resource requirements, but Llama 3 is another great option. You can run Mixtral simply by running:
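```bash
# Start an interactive session with Mixtral (mixtral:8x7b is the tag in Ollama's model library)
ollama run mixtral:8x7b
```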
The first time you run this command, it will also automatically download and install the model for you.
Switch the LLM in your code
There are two changes you need to make to the code we already wrote in 1_agent to get Mixtral 8x7B to work. First, you need to switch to that model by replacing the call to Settings.llm, as shown below.
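Here's a minimal sketch of that change, assuming the Ollama integration package is installed (pip install llama-index-llms-ollama):

```python
from llama_index.core import Settings
from llama_index.llms.ollama import Ollama

# Route all LlamaIndex LLM calls to the local Mixtral model served by Ollama.
# A generous request_timeout allows for slower local inference.
Settings.llm = Ollama(model="mixtral:8x7b", request_timeout=360.0)
```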
Run the local agent
You can then create the local agent the same way as before, importing the agent class from LlamaIndex; by default it will use the local model you set on Settings.llm.
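For illustration, here's a sketch of the complete local agent, reusing the multiply tool from the 1_agent example and the ReActAgent class (the exact agent class and its import path may vary between LlamaIndex versions):

```python
from llama_index.core import Settings
from llama_index.core.agent import ReActAgent
from llama_index.core.tools import FunctionTool
from llama_index.llms.ollama import Ollama

# Use the local Mixtral model served by Ollama
Settings.llm = Ollama(model="mixtral:8x7b", request_timeout=360.0)

def multiply(a: float, b: float) -> float:
    """Multiply two numbers and return the product."""
    return a * b

# Wrap the plain function as a tool the agent can call
multiply_tool = FunctionTool.from_defaults(fn=multiply)

# With no explicit llm argument, the agent falls back to Settings.llm
agent = ReActAgent.from_tools([multiply_tool], verbose=True)

response = agent.chat("What is 123 * 456?")
print(response)
```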
Next steps
Now that you've got a local agent, you can add Retrieval-Augmented Generation (RAG) to it.