Version: 1.12.x (Next)

Knowledge Base

A Langflow knowledge base is a vector database that stores embeddings for use in your flows. By default, knowledge bases use Chroma as a local vector store, but you can configure an external vector database provider such as OpenSearch. For more information, see Configure vector database providers.

Because knowledge bases don't re-ingest data with every flow run, they can be more efficient than using a remote vector database. They are a good choice for flows that use custom, domain-specific datasets, like slices of customer and product data.

You can use knowledge base components in much the same way that you use vector store components. However, there are several key differences:

Local storage by default: Langflow knowledge bases use Chroma local storage by default. In contrast, only some vector store components support local databases.
Built-in embedding models: Langflow knowledge bases include built-in support for several embedding models. Other models aren't supported for use with knowledge bases. To use a different provider or model, you must use a vector store component along with your preferred embedding model component.
Basic similarity search: When querying Langflow knowledge bases, only standard similarity search is supported. For more advanced searches, you must use a vector store component for a vector database provider that supports your desired functionality.
Structured data: Langflow knowledge bases only support structured data. For unstructured data, you must use a compatible vector store component.

The Knowledge Base component reads from and writes to knowledge bases using a mode selector.

Select Ingest mode to embed and index data into a knowledge base, or Retrieve mode to search an existing knowledge base using semantic search.

The output for both modes is a Table containing the results.

Knowledge Base parameters

Some parameters are hidden by default in the visual editor. You can modify all component parameters through the component inspection panel that appears when you select a component.

The following parameters are shared across both modes.

Name	Display Name	Info
mode	Mode	Input parameter. Tab selector that switches the component between Ingest and Retrieve modes.
knowledge_base	Knowledge	Input parameter. Select the knowledge base to ingest data into or retrieve data from.

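The first table below lists the parameters that are specific to Ingest mode. The second table lists the parameters that are specific to Retrieve mode.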

Name	Display Name	Info
input_df	Input	Input parameter. Table with all original columns (already chunked or processed). Accepts Message, Data, or DataFrame.
column_config	Column Configuration	Input parameter. Configure column behavior. Use the Vectorize flag to create embeddings for a column, and the Identifier flag to use a column as a unique identifier.
api_key	Embedding Provider API Key	Input parameter. Optional. Overrides the globally configured API key for the embedding provider. Leave blank to use the pre-configured key.
chunk_size	Chunk Size	Input parameter. Batch size for processing embeddings. Default: `1000`.
allow_duplicates	Allow Duplicates	Input parameter. If enabled, allows duplicate rows in the knowledge base. Default: Disabled (false).
metadata_json	Metadata	Input parameter. Optional JSON object of user metadata applied to every chunk in this run (for example, `{"tag": "invoice", "year": "2026"}`). This metadata is compatible with the Metadata Filter parameter in Retrieve mode. Malformed JSON is ignored with a warning.
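As a rough illustration of the Metadata behavior described for Ingest mode, the following sketch parses a JSON string, ignores malformed input with a warning, and attaches the resulting key/value pairs to every chunk. It is not Langflow's implementation. The function names `parse_user_metadata` and `apply_metadata`, the logger name, and the `metadata` key on each chunk are all hypothetical and exist only for this example.

```python
import json
import logging

# Hypothetical logger name, used only in this sketch.
logger = logging.getLogger("kb_metadata_sketch")


def parse_user_metadata(raw: str) -> dict:
    """Return the parsed metadata object, or {} if the input is blank or malformed."""
    if not raw or not raw.strip():
        return {}
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError as exc:
        # Mirrors "Malformed JSON is ignored with a warning".
        logger.warning("Ignoring malformed metadata JSON: %s", exc)
        return {}
    if not isinstance(parsed, dict):
        # Assumption: only a JSON object (not a list or scalar) is usable as metadata.
        logger.warning("Ignoring metadata that is not a JSON object")
        return {}
    return parsed


def apply_metadata(chunks: list[dict], metadata: dict) -> list[dict]:
    """Attach the same user metadata to every chunk in the run."""
    return [
        {**chunk, "metadata": {**chunk.get("metadata", {}), **metadata}}
        for chunk in chunks
    ]


good = parse_user_metadata('{"tag": "invoice", "year": "2026"}')
bad = parse_user_metadata('{"tag": "invoice",')  # truncated JSON, so it is ignored
tagged = apply_metadata([{"text": "Invoice 1"}, {"text": "Invoice 2"}], good)
```

In this sketch, a malformed value leaves the run without user metadata instead of stopping it, which matches the documented "ignored with a warning" behavior.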

Name	Display Name	Info
search_query	Search Query	Input parameter. Optional search query to filter knowledge base data using semantic similarity. If omitted, the top results are returned.
api_key	Embedding Provider API Key	Input parameter. Optional API key for the embedding provider that overrides a previously provided key. The embedding provider and model are chosen when you create a knowledge base.
top_k	Top K Results	Input parameter. Number of search results to return. Default: `5`.
include_metadata	Include Metadata	Input parameter. Whether to include all metadata in the output. If enabled, each output row includes all metadata and content. If disabled, only the content is returned. Default: Enabled (true).
include_embeddings	Include Embeddings	Input parameter. Whether to include raw embedding vectors in the output. Only applicable when Include Metadata is enabled. Default: Disabled (false).
metadata_filter	Metadata Filter	Input parameter. Optional JSON object of key/value pairs to filter results by user metadata (for example, `{"tag": "invoice"}` or `{"tag": ["invoice", "audit"]}` for OR-of-values matching). Backends without native filtering apply the match client-side after retrieval.
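The Metadata Filter semantics can be sketched as a client-side match, similar in spirit to what a backend without native filtering might do after retrieval. This is an illustration under stated assumptions, not Langflow's code. The function names and the `metadata` key on each row are hypothetical. The sketch also assumes that multiple keys combine with AND and that a row missing a filtered key doesn't match, neither of which is stated in the parameter description.

```python
def matches_filter(metadata: dict, metadata_filter: dict) -> bool:
    """Return True if the row's metadata satisfies every key in the filter."""
    for key, wanted in metadata_filter.items():
        actual = metadata.get(key)
        if isinstance(wanted, list):
            # A list value means OR-of-values: any listed value is acceptable.
            if actual not in wanted:
                return False
        elif actual != wanted:
            return False
    # Assumption: all keys must match (AND across keys).
    return True


def filter_rows(rows: list[dict], metadata_filter: dict) -> list[dict]:
    """Keep only the rows whose metadata matches the filter."""
    return [
        row for row in rows
        if matches_filter(row.get("metadata", {}), metadata_filter)
    ]


rows = [
    {"content": "a", "metadata": {"tag": "invoice", "year": "2026"}},
    {"content": "b", "metadata": {"tag": "audit", "year": "2025"}},
    {"content": "c", "metadata": {"tag": "memo"}},
]

single = filter_rows(rows, {"tag": "invoice"})
either = filter_rows(rows, {"tag": ["invoice", "audit"]})
combined = filter_rows(rows, {"tag": ["invoice", "audit"], "year": "2025"})
```

Because the values written by the Metadata parameter in Ingest mode are plain key/value pairs, the same keys can be reused here as filter keys.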

Use the Knowledge Base component in a flow

After you create a knowledge base, you can use the Knowledge Base component in Ingest mode to populate it from a DataFrame in your flow.

Add a Knowledge Base component to your flow.
In the Mode tab, select Ingest.
In the Knowledge field, select the knowledge base you want to ingest into, or create a new one.
Connect a source component, such as a Read File component or Data Operations component, to the Input handle to provide the DataFrame to embed.
In the Column Configuration table, configure each column:
- Enable Vectorize for columns whose text should be embedded for semantic search.
- Enable Identifier for columns that uniquely identify each row (used for duplicate detection).
Click Run component to embed and index the data into your knowledge base.
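To make the roles of the Vectorize flag, the Identifier flag, Allow Duplicates, and Chunk Size more concrete, here is a minimal sketch of how an ingest step could prepare rows before embedding. It doesn't call Langflow and isn't its implementation. The helper names `build_records` and `batches` are hypothetical, and joining vectorized columns with a space and keying duplicates on the identifier columns are assumptions made for illustration.

```python
from typing import Iterator


def build_records(
    rows: list[dict],
    vectorize_cols: list[str],
    identifier_cols: list[str],
    allow_duplicates: bool = False,
) -> list[dict]:
    """Turn table rows into records that are ready to embed."""
    seen: set[tuple] = set()
    records = []
    for row in rows:
        # Identifier columns form the key used for duplicate detection.
        row_id = tuple(str(row[col]) for col in identifier_cols)
        if not allow_duplicates and row_id in seen:
            continue
        seen.add(row_id)
        # Assumption: text from all Vectorize columns is joined into one string.
        text = " ".join(str(row[col]) for col in vectorize_cols)
        records.append({"id": row_id, "text": text, "row": row})
    return records


def batches(items: list[dict], size: int = 1000) -> Iterator[list[dict]]:
    """Yield the records in batches, in the way Chunk Size batches embedding work."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


rows = [
    {"sku": "A1", "name": "Laptop", "notes": "15-inch"},
    {"sku": "B2", "name": "Mouse", "notes": "wireless"},
    {"sku": "A1", "name": "Laptop", "notes": "15-inch"},  # same identifier as row 1
]

records = build_records(rows, vectorize_cols=["name", "notes"], identifier_cols=["sku"])
with_dupes = build_records(rows, ["name", "notes"], ["sku"], allow_duplicates=True)
record_batches = list(batches(records, size=1))
```

With Allow Duplicates disabled, the third row is skipped because its identifier matches the first row.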

After you create a knowledge base and load data into it, you can use the Knowledge Base component in Retrieve mode to search it using semantic similarity.

Add a Knowledge Base component to your flow.
In the Mode tab, select Retrieve.
In the Knowledge field, select the knowledge base you want to search, such as the customer sales data knowledge base created in the previous steps.
To view the search results as chat messages, connect the Results output to a Chat Output component.
In Search Query, enter a query that relates to your embedded data.

For the customer sales data example, enter a product name like laptop or wireless devices.
Click Run component on the Knowledge Base component, and then open the Playground to view the output.
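Conceptually, a Retrieve run ranks stored rows by how similar their embeddings are to the query embedding and returns the top results. The following sketch shows that idea with cosine similarity and hand-written two-dimensional vectors. It is an illustration only: real embeddings come from the embedding model chosen for the knowledge base, the actual similarity metric depends on the vector store, and the names `cosine_similarity` and `retrieve` and the row layout are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(
    query_vec: list[float],
    rows: list[dict],
    top_k: int = 5,
    include_metadata: bool = True,
) -> list[dict]:
    """Return the top_k rows most similar to the query vector."""
    ranked = sorted(
        rows,
        key=lambda row: cosine_similarity(query_vec, row["embedding"]),
        reverse=True,
    )
    results = []
    for row in ranked[:top_k]:
        result = {"content": row["content"]}
        if include_metadata:
            result["metadata"] = row.get("metadata", {})
        results.append(result)
    return results


# Toy vectors standing in for real embeddings.
rows = [
    {"content": "Laptop 15-inch", "embedding": [1.0, 0.0], "metadata": {"tag": "hardware"}},
    {"content": "Wireless mouse", "embedding": [0.6, 0.8], "metadata": {"tag": "hardware"}},
    {"content": "Invoice 2026", "embedding": [0.0, 1.0], "metadata": {"tag": "invoice"}},
]

top_two = retrieve([1.0, 0.1], rows, top_k=2)
content_only = retrieve([1.0, 0.1], rows, top_k=1, include_metadata=False)
```

Here `top_k` plays the role of Top K Results, and `include_metadata` mirrors the Include Metadata toggle by returning only the content when it is disabled.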
