Response Synthesizer
The ResponseSynthesizer is responsible for sending the query, nodes, and prompt templates to the LLM to generate a response. There are a few key modes for generating a response:
Refine
: "create and refine" an answer by sequentially going through each retrieved text chunk. This makes a separate LLM call per Node. Good for more detailed answers.

CompactAndRefine (default)
: "compact" the prompt during each LLM call by stuffing as many text chunks as will fit within the maximum prompt size. If there are too many chunks to stuff in one prompt, "create and refine" an answer by going through multiple compact prompts. The same as refine, but should result in fewer LLM calls.

TreeSummarize
: Given a set of text chunks and the query, recursively construct a tree and return the root node as the response. Good for summarization purposes.

MultiModal
: Combines textual inputs with additional modality-specific metadata to generate an integrated response. It leverages a text QA template to build a prompt that incorporates various input types and produces either streaming or complete responses. This approach is ideal for use cases where enriching the answer with multi-modal context (such as images, audio, or other data) can enhance the output quality.
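As a rough sketch of how a synthesizer is typically used (exact exports and constructor options vary between llamaindex versions; the ResponseSynthesizer, TextNode, and NodeWithScore names and the object-style synthesize signature are assumed here), you pass the query and the retrieved, scored nodes to synthesize:

```typescript
import { ResponseSynthesizer, TextNode, type NodeWithScore } from "llamaindex";

// Construct a synthesizer; with no arguments it is assumed to use the
// default CompactAndRefine mode described above.
const responseSynthesizer = new ResponseSynthesizer();

// Retrieved text chunks, each paired with a relevance score.
const nodesWithScore: NodeWithScore[] = [
  {
    node: new TextNode({ text: "The sky is blue because of Rayleigh scattering." }),
    score: 1,
  },
  {
    node: new TextNode({ text: "Sunsets appear red for the same reason." }),
    score: 0.8,
  },
];

// Send the query, nodes, and prompt templates to the LLM and get back a response.
const response = await responseSynthesizer.synthesize({
  query: "Why is the sky blue?",
  nodesWithScore,
});

console.log(response.response);
```

To use a different mode, the synthesizer can generally be constructed with the corresponding response builder (for example TreeSummarize) instead of the default; check the constructor options for the library version you are using.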
The synthesize function also supports streaming; just add stream: true as an option:
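A minimal streaming sketch, reusing responseSynthesizer and nodesWithScore from the example above and assuming the streamed result is an async iterable of partial responses:

```typescript
const stream = await responseSynthesizer.synthesize({
  query: "Why is the sky blue?",
  nodesWithScore,
  stream: true,
});

// Consume the response incrementally as the LLM produces it.
// Each chunk is assumed to expose its partial text on `.response`.
for await (const chunk of stream) {
  process.stdout.write(chunk.response);
}
```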