Using a local model via Ollama
If you're happy using OpenAI, you can skip this section, but many people are interested in using models they run themselves. The easiest way to do this is via the great work of our friends at Ollama, who provide a simple-to-use client that will download, install, and run a growing range of models for you.
Install Ollama
They provide a one-click installer for Mac, Linux and Windows on their home page.
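For example, on Linux the installer can also be run from a terminal; the command below is the install script documented on Ollama's site (check their home page for the current version, as it may change):

```bash
# Download and run Ollama's official Linux install script
curl -fsSL https://ollama.com/install.sh | sh
```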
Pick and run a model
Since we're going to be doing agentic work, we'll need a very capable model, but the largest models are hard to run on a laptop. We think Mixtral 8x7B is a good balance between power and resource requirements, but Llama 3 is another great option. You can run Mixtral simply by running:
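```bash
# Start an interactive session with Mixtral (mixtral:8x7b is the tag in Ollama's model library)
ollama run mixtral:8x7b
```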
The first time you run this command, it will also automatically download and install the model for you.
Switch the LLM in your code
There are two changes you need to make to the code we already wrote in 1_agent to get Mixtral 8x7B to work. First, you need to switch to that model by replacing the call to Settings.llm, as shown below.
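Here's a minimal sketch of that change, assuming the Ollama integration package is installed (pip install llama-index-llms-ollama):

```python
from llama_index.core import Settings
from llama_index.llms.ollama import Ollama

# Route all LlamaIndex LLM calls to the local Mixtral model served by Ollama.
# A generous request_timeout allows for slower local inference.
Settings.llm = Ollama(model="mixtral:8x7b", request_timeout=360.0)
```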
Run the local agent
You can then create the local agent the same way as before, importing the agent class from LlamaIndex; by default it will use the local model you set on Settings.llm.
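For illustration, here's a sketch of the complete local agent, reusing the multiply tool from the 1_agent example and the ReActAgent class (the exact agent class and its import path may vary between LlamaIndex versions):

```python
from llama_index.core import Settings
from llama_index.core.agent import ReActAgent
from llama_index.core.tools import FunctionTool
from llama_index.llms.ollama import Ollama

# Use the local Mixtral model served by Ollama
Settings.llm = Ollama(model="mixtral:8x7b", request_timeout=360.0)

def multiply(a: float, b: float) -> float:
    """Multiply two numbers and return the product."""
    return a * b

# Wrap the plain function as a tool the agent can call
multiply_tool = FunctionTool.from_defaults(fn=multiply)

# With no explicit llm argument, the agent falls back to Settings.llm
agent = ReActAgent.from_tools([multiply_tool], verbose=True)

response = agent.chat("What is 123 * 456?")
print(response)
```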
Next steps
Now that you've got a local agent, you can add Retrieval-Augmented Generation (RAG) to it.