# Low-Level LLM Execution
Sometimes you need more control over LLM interactions than what high-level agents provide. The `llm.exec` method makes it simple to make a single LLM call with tools while hiding the complexity of executing the tools and generating the tool messages.
## When to Use `llm.exec`
Use `llm.exec` when you need to:

- Build custom agent logic in workflow steps
- Have precise control over message handling and tool execution
## Basic Usage
The `llm.exec` method takes messages and tools as parameters and executes one LLM call. The LLM may either request to call one or more of the tools or generate an assistant message as the result. For each requested tool call, `llm.exec` executes it and generates the two corresponding tool call messages (call and result). If no tool call is requested, just the assistant message is returned.
```ts
import { openai } from "@llamaindex/openai";
import { ChatMessage, tool } from "llamaindex";
import z from "zod";

const llm = openai({ model: "gpt-4.1-mini" });

const messages = [
  {
    content: "What's the weather like in San Francisco?",
    role: "user",
  } as ChatMessage,
];

const { newMessages, toolCalls } = await llm.exec({
  messages,
  tools: [
    tool({
      name: "get_weather",
      description: "Get the current weather for a location",
      parameters: z.object({
        address: z.string().describe("The address"),
      }),
      execute: ({ address }) => {
        return `It's sunny in ${address}!`;
      },
    }),
  ],
});

// Add the new messages (including tool calls and responses) to your conversation
messages.push(...newMessages);
```
`newMessages` is an array, as each tool call generates two messages: a tool call message and a tool call result message.
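For instance, you can inspect what came back from the single call above (a minimal sketch using the `newMessages` and `toolCalls` values returned by the example):

```ts
// After the call above, newMessages holds the tool call message and
// its result (two messages per requested tool call).
for (const message of newMessages) {
  console.log(message.role, message.content);
}
console.log(`The LLM requested ${toolCalls.length} tool call(s)`);
```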
## Agent Loop Pattern
A common pattern is to use `llm.exec` in a loop until the LLM stops making tool calls:
```ts
import { openai } from "@llamaindex/openai";
import { ChatMessage, tool } from "llamaindex";
import z from "zod";

async function runAgentLoop() {
  const llm = openai({ model: "gpt-4.1-mini" });
  const messages = [
    {
      content: "What's the weather like in San Francisco?",
      role: "user",
    } as ChatMessage,
  ];

  let exit = false;
  do {
    const { newMessages, toolCalls } = await llm.exec({
      messages,
      tools: [
        tool({
          name: "get_weather",
          description: "Get the current weather for a location",
          parameters: z.object({
            address: z.string().describe("The address"),
          }),
          execute: ({ address }) => {
            return `It's sunny in ${address}!`;
          },
        }),
      ],
    });

    console.log(newMessages);
    messages.push(...newMessages);

    // Exit when no more tool calls are made
    exit = toolCalls.length === 0;
  } while (!exit);
}
```
## Streaming Support
For real-time responses, use the `stream` option to get the assistant's response as streamed tokens:
```ts
import { openai } from "@llamaindex/openai";
import { ChatMessage, tool } from "llamaindex";
import z from "zod";

async function streamingAgentLoop() {
  const llm = openai({ model: "gpt-4o-mini" });
  const messages = [
    {
      content: "What's the weather like in San Francisco?",
      role: "user",
    } as ChatMessage,
  ];

  let exit = false;
  do {
    const { stream, newMessages, toolCalls } = await llm.exec({
      messages,
      tools: [
        tool({
          name: "get_weather",
          description: "Get the current weather for a location",
          parameters: z.object({
            address: z.string().describe("The address"),
          }),
          execute: ({ address }) => {
            return `It's sunny in ${address}!`;
          },
        }),
      ],
      stream: true,
    });

    // Stream the response token by token
    for await (const chunk of stream) {
      process.stdout.write(chunk.delta);
    }

    messages.push(...newMessages());
    exit = toolCalls.length === 0;
  } while (!exit);
}
```
When streaming, `newMessages` is a function because the collected messages are only available after the stream has been fully consumed. Calling it before the stream finishes will throw an error.
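In practice this means the order of operations inside the loop matters. A short sketch reusing the `stream` and `newMessages` values returned above:

```ts
// Calling newMessages() here would throw: the stream is not consumed yet.

// Drain the stream first
for await (const chunk of stream) {
  process.stdout.write(chunk.delta);
}

// Now the collected messages are available
messages.push(...newMessages());
```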
## Return Values
`llm.exec` returns an object with:

- `newMessages`: Array of new chat messages, including the LLM response and any tool call messages (call and result). When streaming, this is a function that returns the array.
- `toolCalls`: Array of tool calls made by the LLM
- `stream`: Async iterable for streaming responses (only when `stream: true`)
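Put together, the return shape looks roughly like this (an illustrative sketch only; the library's actual type definitions are authoritative):

```ts
import { ChatMessage } from "llamaindex";

// Illustrative shape, not the library's exact types.
type ExecResult =
  | {
      // Non-streaming: the messages are available immediately.
      newMessages: ChatMessage[];
      toolCalls: unknown[];
    }
  | {
      // Streaming: newMessages becomes a function that is only
      // safe to call after the stream has been consumed.
      stream: AsyncIterable<{ delta: string }>;
      newMessages: () => ChatMessage[];
      toolCalls: unknown[];
    };
```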
## Best Practices
When using `llm.exec` in an agent loop, take care to:

- Maintain message history: Always add `newMessages` to your conversation history
- Set exit conditions: Implement proper logic to avoid infinite loops, as shown in the sketch below
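For example, you can combine the natural exit condition (no more tool calls) with a hard iteration cap. A minimal sketch, where `MAX_ITERATIONS` is a hypothetical safeguard and not part of `llm.exec`:

```ts
import { openai } from "@llamaindex/openai";
import { ChatMessage, tool } from "llamaindex";
import z from "zod";

// Hypothetical cap to bound the loop; llm.exec itself imposes no limit.
const MAX_ITERATIONS = 10;

async function boundedAgentLoop() {
  const llm = openai({ model: "gpt-4.1-mini" });
  const tools = [
    tool({
      name: "get_weather",
      description: "Get the current weather for a location",
      parameters: z.object({
        address: z.string().describe("The address"),
      }),
      execute: ({ address }) => {
        return `It's sunny in ${address}!`;
      },
    }),
  ];
  const messages = [
    {
      content: "What's the weather like in San Francisco?",
      role: "user",
    } as ChatMessage,
  ];

  for (let i = 0; i < MAX_ITERATIONS; i++) {
    const { newMessages, toolCalls } = await llm.exec({ messages, tools });

    // Maintain message history on every iteration
    messages.push(...newMessages);

    // Natural exit condition: the LLM made no further tool calls
    if (toolCalls.length === 0) break;
  }
}
```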