Vector Stores

Langflow's Vector Store components are used to read and write vector data, including embedding storage, vector search, Graph RAG traversals, and specialized provider-specific search, such as OpenSearch, Elasticsearch, and Vectara.

These components are critical for vector search applications, such as Retrieval Augmented Generation (RAG) chatbots that need to retrieve relevant context from large datasets.

Most of these components connect to a specific vector database provider, but some components support multiple providers or platforms. For example, the Cassandra vector store component can connect to self-managed Apache Cassandra-based clusters as well as Astra DB, which is a managed Cassandra DBaaS.

Other types of storage, like traditional structured databases and chat memory, are handled through other components like the SQL Database component or the Message History component.

Use Vector Store components in a flow

tip

For a tutorial using Vector Store components in a flow, see Create a vector RAG chatbot.

The following steps introduce the use of Vector Store components in a flow, including configuration details, how the components work when you run a flow, why you might need multiple Vector Store components in one flow, and useful supporting components, such as Embedding Model and Parser components.

  1. Create a flow with the Vector Store RAG template.

    This template has two subflows. The Load Data subflow loads embeddings and content into a vector database, and the Retriever subflow runs a vector search to retrieve relevant context based on a user's query.

  2. Configure the database connection for both Astra DB components, or replace them with another pair of Vector Store components of your choice. Make sure the components connect to the same vector store, and that the component in the Retriever subflow is able to run a similarity search.

    The parameters you set in each Vector Store component depend on the component's role in your flow. In this example, the Load Data subflow writes to the vector store, whereas the Retriever subflow reads from the vector store. Therefore, search-related parameters are only relevant to the Vector Store component in the Retriever subflow.

    For information about specific configuration parameters, see the section of this page for your chosen Vector Store component and Hidden parameters.

  3. To configure the embedding model, do one of the following:

    • Use an OpenAI model: In both OpenAI Embeddings components, enter your OpenAI API key. You can use the default model or select a different OpenAI embedding model.

    • Use another provider: Replace the OpenAI Embeddings components with another pair of Embedding Model components of your choice, and then configure the parameters and credentials accordingly.

    • Use Astra DB vectorize: If you are using an Astra DB vector store that has a vectorize integration, you can remove both OpenAI Embeddings components. If you do this, the vectorize integration automatically generates embeddings from the Ingest Data (in the Load Data subflow) and Search Query (in the Retriever subflow).

    tip

    If your vector store already contains embeddings, make sure your Embedding Model components use the same model as your previous embeddings. Mixing embedding models in the same vector store can produce inaccurate search results.

  4. Recommended: In the Split Text component, optimize the chunking settings for your embedding model. For example, if your embedding model has a token limit of 512, then the Chunk Size parameter must not exceed that limit.

    Additionally, because the Retriever subflow passes the chat input directly to the Vector Store component for vector search, make sure that your chat input string doesn't exceed your embedding model's limits. For this example, you can enter a query that is within the limits; however, in a production environment, you might need to implement additional checks or preprocessing steps to ensure compliance. For example, use additional components to prepare the chat input before running the vector search, or enforce chat input limits in your application code.

  5. In the Language Model component, enter your OpenAI API key, or select a different provider and model to use for the chat portion of the flow.

  6. Run the Load Data subflow to populate your vector store. In the File component, select one or more files, and then click Run component on the Vector Store component in the Load Data subflow.

    The Load Data subflow loads files from your local machine, chunks them, generates embeddings for the chunks, and then stores the chunks and their embeddings in the vector database.

    [Figure: Embedding data into a vector store]

    The Load Data subflow is separate from the Retriever subflow because you probably won't run it every time you use the chat. You can run the Load Data subflow as needed to preload or update the data in your vector store. Then, your chat interactions only use the components that are necessary for chat.

    If your vector store already contains data that you want to use for vector search, then you don't need to run the Load Data subflow.

  7. Open the Playground and start chatting to run the Retriever subflow.

    The Retriever subflow generates an embedding from chat input, runs a vector search to retrieve similar content from your vector store, parses the search results into supplemental context for the LLM, and then uses the LLM to generate a natural language response to your query. The LLM uses the vector search results along with its internal training data and tools, such as basic web search and datetime information, to produce the response.

    [Figure: Retrieval from a vector store]

    To avoid passing the entire block of raw search results to the LLM, the Parser component extracts text strings from the search results Data object, and then passes them to the Prompt Template component in Message format. From there, the strings and other template content are compiled into natural language instructions for the LLM.

    You can use other components for this transformation, such as the Data Operations component, depending on how you want to use the search results.

    To view the raw search results, click Inspect output on the Vector Store component after running the Retriever subflow.
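The two subflows described in the preceding steps can be sketched in plain Python, without Langflow or a real vector database. Everything in this sketch is an assumption made for illustration: the bag-of-words `embed` function stands in for an Embedding Model component, the `store` list stands in for the vector database, and the function names (`split_text`, `load_data`, `retrieve`, `build_prompt`) are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: a sparse bag-of-words vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def split_text(text: str, chunk_size: int) -> list:
    """Split text into fixed-size character chunks (no overlap, for brevity)."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def load_data(store: list, documents: list, chunk_size: int = 200) -> None:
    """Load Data subflow: chunk, embed, and write each chunk to the store."""
    for document in documents:
        for chunk in split_text(document, chunk_size):
            store.append({"text": chunk, "embedding": embed(chunk)})


def retrieve(store: list, query: str, number_of_results: int = 4) -> list:
    """Retriever subflow: embed the query and rank stored chunks by similarity."""
    query_embedding = embed(query)
    ranked = sorted(
        store,
        key=lambda record: cosine(query_embedding, record["embedding"]),
        reverse=True,
    )
    return ranked[:number_of_results]


def build_prompt(results: list, question: str) -> str:
    """Parser and Prompt Template steps: keep only the text, then add instructions."""
    context = "\n".join(result["text"] for result in results)
    return f"Answer using this context:\n{context}\n\nQuestion: {question}"


store = []
load_data(store, ["Cassandra is a distributed database.", "Chroma stores embeddings locally."])
question = "Where does Chroma keep embeddings?"
results = retrieve(store, question, number_of_results=1)
prompt = build_prompt(results, question)
print(prompt)
```

As in the flow, `load_data` runs only when the stored content changes, while `retrieve` and `build_prompt` run on every chat message. Both paths must use the same `embed` function, which is the code-level version of the tip about not mixing embedding models.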

Hidden parameters

You can inspect a Vector Store component's parameters to learn more about the inputs it accepts, the features it supports, and how to configure it.

Many input parameters for Vector Store components are hidden by default in the visual editor. You can toggle parameters through the Controls in each component's header menu.

Some parameters are conditional, and they are only available after you set other parameters or select specific options for other parameters. Conditional parameters may not be visible on the Controls pane until you set the required dependencies. However, all parameters are always listed in a component's code.

For information about a specific component's parameters, see the provider's documentation and the component details.

Search results output

If you use a Vector Store component to query your vector store, it produces search results that you can pass to downstream components in your flow as a list of Data objects or a tabular DataFrame. If both types are supported, you can set the format near the component's output port in the visual editor.

The exception to this pattern is the Vectara RAG component, which outputs only an answer string in Message format.
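As a rough illustration of the difference between the two output shapes, the following sketch pivots a list of per-result records into a column-oriented table. Plain dictionaries stand in for Data objects, and a dictionary of columns stands in for a DataFrame. Neither is Langflow's actual class, and the field names (`text`, `source`, `score`) are hypothetical.

```python
# Hypothetical search results: one record per matching chunk.
results = [
    {"text": "Chroma stores embeddings.", "source": "a.txt", "score": 0.91},
    {"text": "Cassandra is distributed.", "source": "b.txt", "score": 0.42},
]


def to_table(records: list) -> dict:
    """Pivot a list of records into a dictionary of equal-length columns."""
    columns = sorted({key for record in records for key in record})
    return {column: [record.get(column) for record in records] for column in columns}


table = to_table(results)
print(table["score"])
```

The list form is convenient when a downstream component handles one result at a time, and the tabular form is convenient when a downstream component filters or sorts by column.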

Vector store instances

Because Langflow is based on LangChain, Vector Store components use an instance of LangChain vector store to drive the underlying vector search functions. In the component code, this is often instantiated as vector_store, but some components use a different name, such as the provider name.

For the Cassandra Graph and Astra DB Graph components, vector_store is an instance of LangChain graph vector store.

These instances are provider-specific and configured according to the component's parameters. For example, the Redis component creates an instance of RedisVectorStore based on the component's parameters, such as the connection string, index name, and schema.

Some LangChain classes don't expose all possible options as component parameters. Depending on the provider, these options might use default values or allow modification through environment variables, if they are supported in Langflow. For information about specific options, see the LangChain API reference and provider documentation.
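The general pattern of turning component parameters into a single `vector_store` instance can be sketched as follows. This is not Langflow's component code: `ToyVectorStore`, `ToyVectorStoreComponent`, and `build_vector_store` are hypothetical stand-ins, and the caching logic is only an assumption about what a parameter like `cache_vector_store` might do.

```python
from dataclasses import dataclass, field
from typing import Optional


class ToyVectorStore:
    """Stand-in for a provider-specific LangChain vector store class."""

    def __init__(self, connection_string: str, index_name: str):
        self.connection_string = connection_string
        self.index_name = index_name
        self.texts = []

    def add_texts(self, texts: list) -> None:
        self.texts.extend(texts)


@dataclass
class ToyVectorStoreComponent:
    """Hypothetical component: parameters in, one vector_store instance out."""

    connection_string: str
    index_name: str
    cache_vector_store: bool = True
    _cached: Optional[ToyVectorStore] = field(default=None, init=False, repr=False)

    def build_vector_store(self) -> ToyVectorStore:
        # Reuse the cached instance when caching is enabled.
        if self.cache_vector_store and self._cached is not None:
            return self._cached
        vector_store = ToyVectorStore(self.connection_string, self.index_name)
        if self.cache_vector_store:
            self._cached = vector_store
        return vector_store


component = ToyVectorStoreComponent("redis://localhost:6379", "docs")
first = component.build_vector_store()
second = component.build_vector_store()
print(first is second)
```

Options that the component doesn't expose as parameters would have to be set inside `build_vector_store` or left at the class defaults, which is why some provider options aren't configurable from the visual editor.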

Vector Store Connection ports

The Astra DB and OpenSearch components have an additional Vector Store Connection output. This output can only connect to a VectorStore input port, and it was intended for use with dedicated Graph RAG components.

The only non-legacy component that supports this input is the Graph RAG component, which was meant as a Graph RAG extension to the Astra DB component. Instead, you can use the Astra DB Graph component that includes both the vector store connection and Graph RAG functionality. OpenSearch instances support Graph traversal through built-in RAG functionality and plugins.

Apache Cassandra

The Cassandra and Cassandra Graph components can be used with Cassandra clusters that support vector search, including Astra DB.

Cassandra

Use the Cassandra component to read or write to a Cassandra vector store using a CassandraVectorStore instance.

Cassandra parameters
Name | Type | Description
database_ref | String | Input parameter. Contact points for the database or an Astra database ID.
username | String | Input parameter. Username for the database. Leave empty for Astra DB.
token | SecretString | Input parameter. User password for the database or an Astra application token.
keyspace | String | Input parameter. The name of the keyspace containing the vector store specified in Table Name (table_name).
table_name | String | Input parameter. The name of the table or collection that is the vector store.
ttl_seconds | Integer | Input parameter. Time-to-live for added texts, if supported by the cluster. Only relevant for writes.
batch_size | Integer | Input parameter. Number of records to process in a single batch.
setup_mode | String | Input parameter. Configuration mode for setting up a Cassandra table.
cluster_kwargs | Dict | Input parameter. Additional keyword arguments for a Cassandra cluster.
search_query | String | Input parameter. Query string for similarity search. Only relevant for reads.
ingest_data | Data | Input parameter. Data to be loaded into the vector store as raw chunks and embeddings. Only relevant for writes.
embedding | Embeddings | Input parameter. Embedding function to use.
number_of_results | Integer | Input parameter. Number of results to return in search. Only relevant for reads.
search_type | String | Input parameter. Type of search to perform. Only relevant for reads.
search_score_threshold | Float | Input parameter. Minimum similarity score for search results. Only relevant for reads.
search_filter | Dict | Input parameter. An optional dictionary of metadata search filters to apply in addition to vector search. Only relevant for reads.
body_search | String | Input parameter. Document textual search terms. Only relevant for reads.
enable_body_search | Boolean | Input parameter. Flag to enable body search. Only relevant for reads.
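To show how the read-related parameters interact, the following sketch applies a metadata filter, a score threshold, and a result limit to an in-memory list of records. It is a simplified illustration, not the CassandraVectorStore implementation: the `search` function and the record layout are hypothetical, and the filter is assumed to be an exact match on every key.

```python
import math


def cosine(a: list, b: list) -> float:
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(records, query_vector, number_of_results=4,
           search_score_threshold=None, search_filter=None):
    """Return up to number_of_results (score, text) pairs, best first."""
    hits = []
    for record in records:
        metadata = record["metadata"]
        # Assumed filter semantics: every key in search_filter must match exactly.
        if search_filter and any(metadata.get(k) != v for k, v in search_filter.items()):
            continue
        score = cosine(query_vector, record["vector"])
        if search_score_threshold is not None and score < search_score_threshold:
            continue
        hits.append((score, record["text"]))
    hits.sort(key=lambda hit: hit[0], reverse=True)
    return hits[:number_of_results]


records = [
    {"text": "a", "vector": [1.0, 0.0], "metadata": {"lang": "en"}},
    {"text": "b", "vector": [0.8, 0.6], "metadata": {"lang": "en"}},
    {"text": "c", "vector": [0.0, 1.0], "metadata": {"lang": "fr"}},
    {"text": "d", "vector": [0.6, 0.8], "metadata": {"lang": "fr"}},
]
top_two = search(records, [1.0, 0.0], number_of_results=2)
french_only = search(records, [1.0, 0.0], search_score_threshold=0.5,
                     search_filter={"lang": "fr"})
print([text for _, text in top_two], [text for _, text in french_only])
```

In this sketch the filter and threshold remove candidates before the result limit is applied, so a strict threshold can return fewer results than `number_of_results`.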

Cassandra Graph

The Cassandra Graph component uses a CassandraGraphVectorStore instance for graph traversal and graph-based document retrieval in a compatible Cassandra cluster. It also supports writing to the vector store.

Cassandra Graph parameters
Name | Display Name | Info
database_ref | Contact Points / Astra Database ID | Input parameter. The contact points for the database or an Astra database ID. Required.
username | Username | Input parameter. The username for the database. Leave empty for Astra DB.
token | Password / Astra DB Token | Input parameter. The user password for the database or an Astra application token. Required.
keyspace | Keyspace | Input parameter. The name of the keyspace containing the vector store specified in Table Name (table_name). Required.
table_name | Table Name | Input parameter. The name of the table or collection that is the vector store. Required.
setup_mode | Setup Mode | Input parameter. The configuration mode for setting up the Cassandra table. The options are Sync (default) or Off.
cluster_kwargs | Cluster arguments | Input parameter. An optional dictionary of additional keyword arguments for the Cassandra cluster.
search_query | Search Query | Input parameter. The query string for similarity search. Only relevant for reads.
ingest_data | Ingest Data | Input parameter. Data to be loaded into the vector store as raw chunks and embeddings. Only relevant for writes.
embedding | Embedding | Input parameter. The embedding model to use.
number_of_results | Number of Results | Input parameter. The number of results to return in similarity search. Only relevant for reads. Default: 4.
search_type | Search Type | Input parameter. The search type to use. The options are Traversal (default), MMR Traversal, Similarity, Similarity with score threshold, or MMR (Max Marginal Relevance).
depth | Depth of traversal | Input parameter. The maximum depth of edges to traverse. Only relevant if Search Type (search_type) is Traversal or MMR Traversal. Default: 1.
search_score_threshold | Search Score Threshold | Input parameter. The minimum similarity score threshold for search results. Only relevant for reads using the Similarity with score threshold search type.
search_filter | Search Metadata Filter | Input parameter. An optional dictionary of metadata search filters to apply in addition to graph traversal and similarity search.
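The effect of the `depth` parameter can be illustrated with a breadth-first expansion over document links. This is a generic sketch, not the CassandraGraphVectorStore algorithm: the `traverse` function, the `edges` mapping, and the idea that traversal starts from a set of similarity-search hits are assumptions made for illustration.

```python
from collections import deque


def traverse(seed_ids: list, edges: dict, depth: int = 1) -> dict:
    """Expand from seed documents, following at most `depth` edges.

    Returns a mapping of document ID to the number of edges from the nearest seed.
    """
    visited = {doc_id: 0 for doc_id in seed_ids}
    queue = deque(seed_ids)
    while queue:
        current = queue.popleft()
        if visited[current] == depth:
            continue  # Don't follow edges beyond the maximum depth.
        for neighbor in edges.get(current, []):
            if neighbor not in visited:
                visited[neighbor] = visited[current] + 1
                queue.append(neighbor)
    return visited


# Hypothetical links between documents.
edges = {"intro": ["setup"], "setup": ["config"], "config": ["tuning"]}
print(traverse(["intro"], edges, depth=1))
print(traverse(["intro"], edges, depth=2))
```

With `depth=1`, only documents directly linked to the initial hits are added. Each increment can widen the result set considerably in a densely linked collection.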

Chroma

The Chroma DB and Local DB components read and write to Chroma vector stores using an instance of Chroma vector store. These components support remote or in-memory instances, with or without persistence.

Chroma DB

You can use the Chroma DB component to read and write to a Chroma database in local storage or a remote Chroma server with options for persistence and caching. When writing, the component can create a new database or collection at the specified location.

tip

An ephemeral (non-persistent) local Chroma vector store is helpful for testing vector search flows where you don't need to retain the database.

The following example flow uses one Chroma DB component for both reads and writes:

  • When writing, it splits Data from a URL component into chunks, computes embeddings with the attached Embedding Model component, and then loads the chunks and embeddings into the Chroma vector store. To trigger writes, click Run component on the Chroma DB component.

  • When reading, it uses chat input to perform a similarity search on the vector store, and then prints the search results to the chat. To trigger reads, open the Playground and enter a chat message.

After running the flow once, you can click Inspect Output on each component to understand how the data transformed as it passed from component to component.

[Figure: ChromaDB receiving split text]

Chroma DB parameters
Name | Type | Description
Collection Name (collection_name) | String | Input parameter. The name of your Chroma vector store collection. Default: langflow.
Persist Directory (persist_directory) | String | Input parameter. To persist the Chroma database, enter a relative or absolute path to a directory to store the chroma.sqlite3 file. Leave empty for an ephemeral database. When reading or writing to an existing persistent database, specify the path to the persistent directory.
Ingest Data (ingest_data) | Data or DataFrame | Input parameter. Data or DataFrame input containing the records to write to the vector store. Only relevant for writes.
Search Query (search_query) | String | Input parameter. The query to use for vector search. Only relevant for reads.
Cache Vector Store (cache_vector_store) | Boolean | Input parameter. If true, the component caches the vector store in memory for faster reads. Default: Enabled (true).
Embedding (embedding) | Embeddings | Input parameter. The embedding function to use for the vector store. By default, Chroma DB uses its built-in embeddings model, or you can attach an Embedding Model component to use a different provider or model.
CORS Allow Origins (chroma_server_cors_allow_origins) | String | Input parameter. The CORS allow origins for the Chroma server.
Chroma Server Host (chroma_server_host) | String | Input parameter. The host for the Chroma server.
Chroma Server HTTP Port (chroma_server_http_port) | Integer | Input parameter. The HTTP port for the Chroma server.
Chroma Server gRPC Port (chroma_server_grpc_port) | Integer | Input parameter. The gRPC port for the Chroma server.
Chroma Server SSL Enabled (chroma_server_ssl_enabled) | Boolean | Input parameter. Enable SSL for the Chroma server.
Allow Duplicates (allow_duplicates) | Boolean | Input parameter. If true (default), writes don't check for existing duplicates in the collection, allowing you to store multiple copies of the same content. If false, writes skip documents that match documents already present in the collection. Deduplication is strict if the entire collection is searched, or approximate if only the number of records specified in limit is searched. Only relevant for writes.
Search Type (search_type) | String | Input parameter. The type of search to perform, either Similarity or MMR. Only relevant for reads.
Number of Results (number_of_results) | Integer | Input parameter. The number of search results to return. Default: 10. Only relevant for reads.
Limit (limit) | Integer | Input parameter. Limit the number of records to compare when Allow Duplicates is false. This can help improve performance when writing to large collections, but it can result in some duplicate records. Only relevant for writes.
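The trade-off between Allow Duplicates and Limit can be sketched with plain lists. This is an illustration under stated assumptions, not the component's code: `filter_new_documents` is a hypothetical helper, documents are compared by exact text equality, and `limit` is assumed to select the first records in the collection.

```python
def filter_new_documents(existing: list, incoming: list,
                         allow_duplicates: bool = True, limit=None) -> list:
    """Return the incoming documents that would be written to the collection."""
    if allow_duplicates:
        return list(incoming)
    # Compare against the whole collection, or only the first `limit` records.
    compared = existing if limit is None else existing[:limit]
    seen = set(compared)
    fresh = []
    for document in incoming:
        if document not in seen:
            fresh.append(document)
            seen.add(document)
    return fresh


existing = ["a", "b", "c"]
incoming = ["c", "d", "a"]
print(filter_new_documents(existing, incoming))
print(filter_new_documents(existing, incoming, allow_duplicates=False))
print(filter_new_documents(existing, incoming, allow_duplicates=False, limit=2))
```

In the last call, only the first two existing records are checked, so "c" is written again even though it is already in the collection. This is the kind of duplicate that a low limit can let through in exchange for faster writes.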

Local DB

The Local DB component reads and writes to a persistent, in-memory Chroma DB instance intended for use with Langflow. It has separate modes for reads and writes, automatic collection management, and default persistence in your Langflow cache directory.

[Figure: A basic flow with a Local DB component in Retrieve mode]

Set the Mode parameter to reflect the operation you want the component to perform, and then configure the other parameters accordingly. Some parameters are only available for one mode.

To create or write to your local Chroma vector store, use Ingest mode.

The following parameters are available in Ingest mode:

Name | Type | Description
Name Your Collection (collection_name) | String | Input parameter. The name for your Chroma vector store collection. Default: langflow. Only available in Ingest mode.
Persist Directory (persist_directory) | String | Input parameter. The base directory where you want to create and persist the vector store. If you use the Local DB component in multiple flows or to create multiple collections, collections are stored at $PERSISTENT_DIRECTORY/vector_stores/$COLLECTION_NAME. If not specified, the default location is your Langflow cache directory (LANGFLOW_CONFIG_DIR). For more information, see Memory management options.
Embedding (embedding) | Embeddings | Input parameter. The embedding function to use for the vector store.
Allow Duplicates (allow_duplicates) | Boolean | Input parameter. If true (default), writes don't check for existing duplicates in the collection, allowing you to store multiple copies of the same content. If false, writes skip documents that match documents already present in the collection. Deduplication is strict if the entire collection is searched, or approximate if only the number of records specified in limit is searched. Only available in Ingest mode.
Ingest Data (ingest_data) | Data or DataFrame | Input parameter. The records to write to the collection. Records are embedded and indexed for semantic search. Only available in Ingest mode.
Limit (limit) | Integer | Input parameter. Limit the number of records to compare when Allow Duplicates is false. This can help improve performance when writing to large collections, but it can result in some duplicate records. Only available in Ingest mode.
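The storage layout described for Persist Directory can be expressed as a small path helper. This sketch only restates the documented pattern, `$PERSISTENT_DIRECTORY/vector_stores/$COLLECTION_NAME`. The `collection_path` function and the example directory are hypothetical, and POSIX-style paths are assumed.

```python
from pathlib import PurePosixPath


def collection_path(persist_directory: str, collection_name: str = "langflow") -> PurePosixPath:
    """Build the directory for one collection under the base persist directory."""
    return PurePosixPath(persist_directory) / "vector_stores" / collection_name


print(collection_path("/data/langflow", "product_docs"))
```

Because each collection gets its own subdirectory, several flows can share one base directory without overwriting each other's data, as long as their collection names differ.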

Clickhouse

The Clickhouse component reads and writes to a Clickhouse vector store using an instance of Clickhouse vector store.

Clickhouse parameters
Name | Display Name | Info
host | hostname | Input parameter. The Clickhouse server hostname. Required. Default: localhost.
port | port | Input parameter. The Clickhouse server port. Required. Default: 8123.
database | database | Input parameter. The Clickhouse database name. Required.
table | Table name | Input parameter. The Clickhouse table name. Required.
username | Username | Input parameter. Clickhouse username for authentication. Required.
password | Password | Input parameter. Clickhouse password for authentication. Required.
index_type | index_type | Input parameter. Type of the index, either annoy (default) or vector_similarity.
metric | metric | Input parameter. Metric to compute distance for similarity search. The options are angular (default), euclidean, manhattan, hamming, dot.
secure | Use HTTPS/TLS | Input parameter. If true, enables HTTPS/TLS for the Clickhouse server and overrides inferred values for interface or port arguments. Default: false.
index_param | Param of the index | Input parameter. Index parameters. Default: 100,'L2Distance'.
index_query_params | index query params | Input parameter. Additional index query parameters.
search_query | Search Query | Input parameter. The query string for similarity search. Only relevant for reads.
ingest_data | Ingest Data | Input parameter. The records to load into the vector store.
cache_vector_store | Cache Vector Store | Input parameter. If true, the component caches the vector store in memory for faster reads. Default: Enabled (true).
embedding | Embedding | Input parameter. The embedding model to use.
number_of_results | Number of Results | Input parameter. The number of search results to return. Default: 4. Only relevant for reads.
score_threshold | Score threshold | Input parameter. The threshold for similarity score comparison. Default: Unset (no threshold). Only relevant for reads.

Couchbase

The Couchbase component reads and writes to a Couchbase vector store using an instance of CouchbaseSearchVectorStore.

Couchbase parameters
Name | Type | Description
couchbase_connection_string | SecretString | Input parameter. Couchbase Cluster connection string. Required.
couchbase_username | String | Input parameter. Couchbase username for authentication. Required.
couchbase_password | SecretString | Input parameter. Couchbase password for authentication. Required.
bucket_name | String | Input parameter. Name of the Couchbase bucket. Required.
scope_name | String | Input parameter. Name of the Couchbase scope. Required.
collection_name | String | Input parameter. Name of the Couchbase collection. Required.
index_name | String | Input parameter. Name of the Couchbase index. Required.
ingest_data | Data | Input parameter. The records to load into the vector store. Only relevant for writes.
search_query | String | Input parameter. The query string for vector search. Only relevant for reads.
cache_vector_store | Boolean | Input parameter. If true, the component caches the vector store in memory for faster reads. Default: Enabled (true).
embedding | Embeddings | Input parameter. The embedding function to use for the vector store.
number_of_results | Integer | Input parameter. Maximum number of search results to return. Default: 4. Only relevant for reads.

DataStax

The following components support DataStax vector stores.

Astra DB

The Astra DB component reads and writes to Astra DB Serverless databases, using an instance of AstraDBVectorStore to call the Data API and DevOps API.

important

It is recommended that you create any databases, keyspaces, and collections you need before configuring the Astra DB component.

You can create new databases and collections through this component, but this is only possible in the Langflow visual editor, not at runtime, and you must wait while the database or collection initializes before proceeding with flow configuration. Additionally, not all database and collection configuration options are available through the Astra DB component, such as hybrid search options, PCU groups, vectorize integration management, and multi-region deployments.

Astra DB parameters
Name | Display Name | Info
token | Astra DB Application Token | Input parameter. An Astra application token with permission to access your vector database. Once the connection is verified, additional fields are populated with your existing databases and collections. If you want to create a database through this component, the application token must have Organization Administrator permissions.
environment | Environment | Input parameter. The environment for the Astra DB API endpoint. Always use prod.
database_name | Database | Input parameter. The name of the database that you want this component to connect to. Or, you can select New Database to create a new database, and then wait for the database to initialize.
keyspace | Keyspace | Input parameter. The keyspace in your database that contains the collection specified in collection_name. Default: default_keyspace.
collection_name | Collection | Input parameter. The name of the collection that you want to use with this flow. Or, select New Collection to create a new collection with limited configuration options. To ensure your collection is configured with the correct embedding provider and search capabilities, it is recommended to create the collection in the Astra Portal or with the Data API before configuring this component. For more information, see Manage collections in Astra DB Serverless.
embedding_model | Embedding Model | Input parameter. Attach an Embedding Model component to generate embeddings. Only available if the specified collection doesn't have a vectorize integration. If a vectorize integration exists, the component automatically uses the collection's integrated model.
ingest_data | Ingest Data | Input parameter. The documents to load into the specified collection.
search_query | Search Query | Input parameter. The query string for vector search.
cache_vector_store | Cache Vector Store | Input parameter. Whether to cache the vector store in Langflow memory for faster reads. Default: Enabled (true).
search_method | Search Method | Input parameter. The search method to use, either Hybrid Search or Vector Search. Your collection must be configured to support the chosen option, and the default depends on what your collection supports. All collections in Astra DB Serverless (Vector) databases support vector search, but hybrid search requires that you set specific collection settings when creating the collection. These options are only available when creating a collection programmatically. For more information, see Ways to find data in Astra DB Serverless and Create a collection that supports hybrid search.
reranker | Reranker | Input parameter. The re-ranker model to use for hybrid search, depending on the collection configuration. This parameter shows the default reranker even if the selected collection doesn't support hybrid search. To verify if a collection supports hybrid search, get collection metadata, and then check that lexical and rerank both have "enabled": true.
lexical_terms | Lexical Terms | Input parameter. A space-separated string of keywords for hybrid search, like features, data, attributes, characteristics. This parameter is only available if the collection supports hybrid search. For more information, see the following Hybrid search example.
number_of_results | Number of Search Results | Input parameter. The number of search results to return. Default: 4.
search_type | Search Type | Input parameter. The search type to use: Similarity (default), Similarity with score threshold, or MMR (Max Marginal Relevance).
search_score_threshold | Search Score Threshold | Input parameter. The minimum similarity score threshold for vector search results with the Similarity with score threshold search type. Default: 0.
advanced_search_filter | Search Metadata Filter | Input parameter. An optional dictionary of metadata filters to apply in addition to vector or hybrid search.
autodetect_collection | Autodetect Collection | Input parameter. Whether to automatically fetch a list of available collections after providing an application token and API endpoint.
content_field | Content Field | Input parameter. For writes, this parameter specifies the name of the field in the documents that contains text strings for which you want to generate embeddings.
deletion_field | Deletion Based On Field | Input parameter. When provided, documents in the target collection with metadata field values matching the input metadata field value are deleted before new records are loaded. Use this setting for writes with upserts (overwrites).
ignore_invalid_documents | Ignore Invalid Documents | Input parameter. Whether to ignore invalid documents during writes. If disabled (false), then an error is raised for invalid documents. Default: Enabled (true).
astradb_vectorstore_kwargs | AstraDBVectorStore Parameters | Input parameter. An optional dictionary of additional parameters for the AstraDBVectorStore instance. For more information, see Vector store instances.
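The MMR (Max Marginal Relevance) search type trades some relevance for diversity in the results. The following sketch shows the commonly used greedy formulation of MMR. It is a generic illustration, and it is an assumption that the component or the underlying vector store behaves the same way. The `mmr` function, the `lambda_mult` weighting parameter, and the example vectors are all hypothetical.

```python
import math


def cosine(a: list, b: list) -> float:
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def unit(degrees: float) -> list:
    """2D unit vector at the given angle, used to build readable test data."""
    return [math.cos(math.radians(degrees)), math.sin(math.radians(degrees))]


def mmr(query_vector: list, candidates: list, k: int = 4, lambda_mult: float = 0.5) -> list:
    """Greedily pick k document IDs, balancing relevance against redundancy."""
    remaining = list(candidates)
    selected = []
    while remaining and len(selected) < k:
        def mmr_score(item):
            _, vector = item
            relevance = cosine(query_vector, vector)
            # Redundancy is the highest similarity to any already selected result.
            redundancy = max((cosine(vector, chosen) for _, chosen in selected), default=0.0)
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy
        best = max(remaining, key=mmr_score)
        selected.append(best)
        remaining.remove(best)
    return [doc_id for doc_id, _ in selected]


# A and B are near-duplicates; C is less similar to the query but different from A.
candidates = [("A", unit(10)), ("B", unit(12)), ("C", unit(-30))]
print(mmr(unit(0), candidates, k=2, lambda_mult=0.5))
print(mmr(unit(0), candidates, k=2, lambda_mult=1.0))
```

With `lambda_mult=1.0`, the function reduces to a plain similarity ranking and returns the two near-duplicates. With `lambda_mult=0.5`, the second pick favors the document that adds new information.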

Hybrid search example

The Astra DB component supports the Data API's hybrid search feature. Hybrid search performs a vector similarity search and a lexical search, compares the results of both searches, and then returns the most relevant results overall.

To use hybrid search through the Astra DB component, do the following:

  1. Use the Data API to create a collection that supports hybrid search if you haven't already created one.

    Although you can create a collection through the Astra DB component, you have more control and insight into the collection settings when using the Data API for this operation.

  2. Create a flow based on the Hybrid Search RAG template, which includes an Astra DB component that is pre-configured for hybrid search.

  3. In the Language Model components, add your OpenAI API key.

  4. Delete the Language Model component that is connected to the Structured Output component's Input Message port, and then connect the Chat Input component to that port.

  5. Configure the Astra DB vector store component:

    1. Enter your Astra DB application token.

    2. In the Database field, select your database.

    3. In the Collection field, select your collection with hybrid search enabled.

      Once you select a collection that supports hybrid search, the other parameters automatically update to allow hybrid search options.

  6. In the component's header menu, click Controls, find the Lexical Terms field, enable the Show toggle, and then click Close.

  7. Connect the first Parser component's Parsed Text output to the Astra DB component's Lexical Terms input. This input only appears after connecting a collection that supports hybrid search with reranking.

  8. Click the Structured Output component to expose the component's header menu, click Controls, find the Format Instructions row, click Expand, and then replace the prompt with the following text:


    You are a database query planner that takes a user's requests, and then converts to a search against the subject matter in question.
    You should convert the query into:
    1. A list of keywords to use against a Lucene text analyzer index, no more than 4. Strictly unigrams.
    2. A question to use as the basis for a QA embedding engine.
    Avoid common keywords associated with the user's subject matter.

  9. Click Finish Editing, and then click Close to save your changes to the component.

  10. Open the Playground, and then enter a natural language question that you would ask about your database.

    In this example, your input is sent to both the Astra DB and Structured Output components:

    • The input sent directly to the Astra DB component's Search Query port is used as a string for similarity search. An embedding is generated from the query string using the collection's Astra DB vectorize integration.

    • The input sent to the Structured Output component is processed by the Structured Output, Language Model, and Parser components to extract space-separated keywords used for the lexical search portion of the hybrid search.

    The complete hybrid search query is executed against your database using the Data API's find_and_rerank command. The API's response is output as a DataFrame that is transformed into a text string Message by another Parser component. Finally, the Chat Output component prints the Message response to the Playground.

  11. Optional: Exit the Playground, and then click Inspect Output on each individual component to understand how lexical keywords were constructed and view the raw response from the Data API. This is helpful for debugging flows where a certain component isn't receiving input as expected from another component.

    • Structured Output component: The output is the Data object produced by applying the output schema to the LLM's response to the input message and format instructions. The following example is based on the aforementioned instructions for keyword extraction:


      1. Keywords: features, data, attributes, characteristics
      2. Question: What characteristics can be identified in my data?

    • Parser component: The output is the string of keywords extracted from the structured output Data, and then used as lexical terms for the hybrid search.

    • Astra DB component: The output is the DataFrame containing the results of the hybrid search as returned by the Data API.
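
The keyword extraction and query assembly described in steps 10 and 11 can be sketched in plain Python. This is a minimal illustration, not Langflow's implementation: `parse_planner_output` and `build_hybrid_sort` are hypothetical helpers, and the `$hybrid` sort shape with `$vectorize` and `$lexical` keys is an assumption about how a `find_and_rerank` request separates the similarity query from the lexical terms.

```python
import re


def parse_planner_output(text: str) -> tuple[str, str]:
    """Split the planner's answer into (space-separated keywords, question)."""
    keywords_match = re.search(r"Keywords:\s*(.+)", text)
    question_match = re.search(r"Question:\s*(.+)", text)
    if not keywords_match or not question_match:
        raise ValueError("Expected both 'Keywords:' and 'Question:' lines")
    keywords = [k.strip() for k in keywords_match.group(1).split(",") if k.strip()]
    return " ".join(keywords), question_match.group(1).strip()


def build_hybrid_sort(query: str, lexical_terms: str) -> dict:
    # Assumed request shape: one string for the vector (vectorize) part
    # and one string of keywords for the lexical part.
    return {"$hybrid": {"$vectorize": query, "$lexical": lexical_terms}}


planner_output = """1. Keywords: features, data, attributes, characteristics
2. Question: What characteristics can be identified in my data?"""

lexical_terms, question = parse_planner_output(planner_output)
# The chat input goes to Search Query; the keywords go to Lexical Terms.
sort_clause = build_hybrid_sort("What can you tell me about my data?", lexical_terms)
print(lexical_terms)
print(sort_clause)
```

Comparing the printed keywords with the Parser component's inspected output is one way to check that the Lexical Terms input receives what you expect.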

Astra DB Graph

The Astra DB Graph component uses an AstraDBGraphVectorStore instance for graph traversal and graph-based document retrieval in an Astra DB collection. It also supports writing to the vector store. For more information, see Build a Graph RAG system with LangChain and GraphRetriever.

Astra DB Graph parameters

| Name | Display Name | Info |
| --- | --- | --- |
| token | Astra DB Application Token | Input parameter. An Astra application token with permission to access your vector database. Once the connection is verified, additional fields are populated with your existing databases and collections. If you want to create a database through this component, the application token must have Organization Administrator permissions. |
| api_endpoint | API Endpoint | Input parameter. Your database's API endpoint. |
| keyspace | Keyspace | Input parameter. The keyspace in your database that contains the collection specified in collection_name. Default: default_keyspace. |
| collection_name | Collection | Input parameter. The name of the collection that you want to use with this flow. For write operations, if a matching collection doesn't exist, a new one is created. |
| metadata_incoming_links_key | Metadata Incoming Links Key | Input parameter. The metadata key for the incoming links in the vector store. |
| ingest_data | Ingest Data | Input parameter. Records to load into the vector store. Only relevant for writes. |
| search_input | Search Query | Input parameter. Query string for similarity search. Only relevant for reads. |
| cache_vector_store | Cache Vector Store | Input parameter. Whether to cache the vector store in Langflow memory for faster reads. Default: Enabled (true). |
| embedding_model | Embedding Model | Input parameter. Attach an Embedding Model component to generate embeddings. If the collection has a vectorize integration, don't attach an Embedding Model component. |
| metric | Metric | Input parameter. The metric to use for similarity search calculations, either cosine (default), dot_product, or euclidean. This is a collection setting. |
| batch_size | Batch Size | Input parameter. Optional number of records to process in a single batch. |
| bulk_insert_batch_concurrency | Bulk Insert Batch Concurrency | Input parameter. Optional concurrency level for bulk write operations. |
| bulk_insert_overwrite_concurrency | Bulk Insert Overwrite Concurrency | Input parameter. Optional concurrency level for bulk write operations that allow upserts (overwriting existing records). |
| bulk_delete_concurrency | Bulk Delete Concurrency | Input parameter. Optional concurrency level for bulk delete operations. |
| setup_mode | Setup Mode | Input parameter. Configuration mode for setting up the vector store, either Sync (default) or Off. |
| pre_delete_collection | Pre Delete Collection | Input parameter. Whether to delete the collection before creating a new one. Default: Disabled (false). |
| metadata_indexing_include | Metadata Indexing Include | Input parameter. A list of metadata fields to index if you want to enable selective indexing only when creating a collection. Doesn't apply to existing collections. Only one *_indexing_* parameter can be set per collection. If all *_indexing_* parameters are unset, then all fields are indexed (default indexing). |
| metadata_indexing_exclude | Metadata Indexing Exclude | Input parameter. A list of metadata fields to exclude from indexing if you want to enable selective indexing only when creating a collection. Doesn't apply to existing collections. Only one *_indexing_* parameter can be set per collection. If all *_indexing_* parameters are unset, then all fields are indexed (default indexing). |
| collection_indexing_policy | Collection Indexing Policy | Input parameter. A dictionary to define the indexing policy if you want to enable selective indexing only when creating a collection. Doesn't apply to existing collections. Only one *_indexing_* parameter can be set per collection. If all *_indexing_* parameters are unset, then all fields are indexed (default indexing). Use the collection_indexing_policy dictionary when you need to set indexing on subfields or a complex indexing definition that can't be expressed as a list. |
| number_of_results | Number of Results | Input parameter. Number of search results to return. Default: 4. Only relevant to reads. |
| search_type | Search Type | Input parameter. Search type to use, one of Similarity, Similarity with score threshold, MMR (Max Marginal Relevance), Graph Traversal, or MMR (Max Marginal Relevance) Graph Traversal (default). Only relevant to reads. |
| search_score_threshold | Search Score Threshold | Input parameter. Minimum similarity score threshold for search results if the search_type is Similarity with score threshold. Default: 0. |
| search_filter | Search Metadata Filter | Input parameter. Optional dictionary of metadata filters to apply in addition to vector search. |
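
The rule that only one *_indexing_* parameter can be set per collection can be sketched as a small validation step. This is an illustrative sketch rather than the component's code: `resolve_indexing_policy` is a hypothetical helper, and the `allow`/`deny` dictionary shape is an assumption about how the include and exclude lists map to a collection indexing policy.

```python
from typing import Optional


def resolve_indexing_policy(
    metadata_indexing_include: Optional[list] = None,
    metadata_indexing_exclude: Optional[list] = None,
    collection_indexing_policy: Optional[dict] = None,
) -> Optional[dict]:
    """Return one indexing policy, or None for default indexing."""
    provided = [
        name
        for name, value in (
            ("metadata_indexing_include", metadata_indexing_include),
            ("metadata_indexing_exclude", metadata_indexing_exclude),
            ("collection_indexing_policy", collection_indexing_policy),
        )
        if value
    ]
    if len(provided) > 1:
        raise ValueError(f"Set only one indexing parameter, got: {provided}")
    if metadata_indexing_include:
        return {"allow": list(metadata_indexing_include)}  # assumed shape
    if metadata_indexing_exclude:
        return {"deny": list(metadata_indexing_exclude)}  # assumed shape
    if collection_indexing_policy:
        return dict(collection_indexing_policy)
    return None  # all unset: every field is indexed


print(resolve_indexing_policy(metadata_indexing_include=["source", "author"]))
print(resolve_indexing_policy())
```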

Graph RAG

The Graph RAG component uses an instance of GraphRetriever for Graph RAG traversal enabling graph-based document retrieval in an Astra DB vector store. For more information, see the DataStax Graph RAG documentation.

tip

This component was meant as a Graph RAG extension for the Astra DB vector store component. However, the Astra DB Graph component includes both the vector store connection and Graph RAG functionality.

Graph RAG parameters

| Name | Display Name | Info |
| --- | --- | --- |
| embedding_model | Embedding Model | Input parameter. Specify the embedding model to use. Not required if the connected vector store has a vectorize integration. |
| vector_store | Vector Store Connection | Input parameter. A vector_store instance inherited from an Astra DB component's Vector Store Connection output. |
| edge_definition | Edge Definition | Input parameter. Edge definition for the graph traversal. |
| strategy | Traversal Strategies | Input parameter. The strategy to use for graph traversal. Strategy options are dynamically loaded from available strategies. |
| search_query | Search Query | Input parameter. The query to search for in the vector store. |
| graphrag_strategy_kwargs | Strategy Parameters | Input parameter. Optional dictionary of additional parameters for the retrieval strategy. |
| search_results | Search Results or DataFrame | Output parameter. The results of the graph-based document retrieval as a list of Data objects or as a tabular DataFrame. You can set the desired output type near the component's output port. |
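
To make the idea of an edge definition more concrete, the following sketch traverses documents whose metadata values overlap. It is a simplified, self-contained illustration and not the GraphRetriever implementation: the `traverse` helper, the `(source_key, target_key)` edge tuples, and the sample documents are all assumptions, and the seed documents that a real retriever would find by vector search are passed in directly.

```python
from collections import deque


def as_set(value) -> set:
    """Normalize a metadata value (scalar, list, or missing) to a set."""
    if value is None:
        return set()
    if isinstance(value, (list, tuple, set)):
        return set(value)
    return {value}


def traverse(docs: dict, seed_ids: list, edges: list, max_depth: int = 1) -> list:
    """Breadth-first traversal over documents linked by shared metadata values."""
    visited = list(seed_ids)
    seen = set(seed_ids)
    queue = deque((doc_id, 0) for doc_id in seed_ids)
    while queue:
        doc_id, depth = queue.popleft()
        if depth >= max_depth:
            continue
        for source_key, target_key in edges:
            source_values = as_set(docs[doc_id].get(source_key))
            for other_id, other_meta in docs.items():
                if other_id in seen:
                    continue
                # Two documents are linked when the values overlap.
                if source_values & as_set(other_meta.get(target_key)):
                    seen.add(other_id)
                    visited.append(other_id)
                    queue.append((other_id, depth + 1))
    return visited


docs = {
    "a": {"keywords": ["vector", "search"]},
    "b": {"keywords": ["search", "graph"]},
    "c": {"keywords": ["graph"]},
    "d": {"keywords": ["billing"]},
}
edges = [("keywords", "keywords")]
print(traverse(docs, ["a"], edges, max_depth=1))  # ['a', 'b']
print(traverse(docs, ["a"], edges, max_depth=2))  # ['a', 'b', 'c']
```

Increasing the depth pulls in documents that are only indirectly related to the seed, which is the main behavior a traversal strategy controls.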

Hyper-Converged Database (HCD)

The Hyper-Converged Database (HCD) component uses your cluster's Data API server to read and write to an HCD vector store. Because the underlying functions call the Data API, which originated from Astra DB, the component uses an instance of AstraDBVectorStore.

A flow using the HCD component to load vector data.

For more information about using the Data API with an HCD deployment, see Get started with the Data API in HCD 1.2.

HCD parameters

| Name | Display Name | Info |
| --- | --- | --- |
| collection_name | Collection Name | Input parameter. The name of a vector store collection in HCD. For write operations, if the collection doesn't exist, then a new one is created. Required. |
| username | HCD Username | Input parameter. Username for authenticating to your HCD deployment. Default: hcd-superuser. Required. |
| password | HCD Password | Input parameter. Password for authenticating to your HCD deployment. Required. |
| api_endpoint | HCD API Endpoint | Input parameter. Your deployment's HCD Data API endpoint, formatted as http[s]://CLUSTER_HOST:GATEWAY_PORT, where CLUSTER_HOST is the IP address of any node in your cluster and GATEWAY_PORT is the port number of your API gateway service. For example, http://192.0.2.250:8181. Required. |
| ingest_data | Ingest Data | Input parameter. Records to load into the vector store. Only relevant for writes. |
| search_input | Search Input | Input parameter. Query string for similarity search. Only relevant for reads. |
| namespace | Namespace | Input parameter. The namespace in HCD that contains or will contain the collection specified in collection_name. Default: default_namespace. |
| ca_certificate | CA Certificate | Input parameter. Optional CA certificate for TLS connections to HCD. |
| metric | Metric | Input parameter. The metric to use for similarity search calculations, either cosine, dot_product, or euclidean. This is a collection setting. If calling an existing collection, leave unset to use the collection's metric. If a write operation creates a new collection, specify the desired similarity metric setting. |
| batch_size | Batch Size | Input parameter. Optional number of records to process in a single batch. |
| bulk_insert_batch_concurrency | Bulk Insert Batch Concurrency | Input parameter. Optional concurrency level for bulk write operations. |
| bulk_insert_overwrite_concurrency | Bulk Insert Overwrite Concurrency | Input parameter. Optional concurrency level for bulk write operations that allow upserts (overwriting existing records). |
| bulk_delete_concurrency | Bulk Delete Concurrency | Input parameter. Optional concurrency level for bulk delete operations. |
| setup_mode | Setup Mode | Input parameter. Configuration mode for setting up the vector store, either Sync (default), Async, or Off. |
| pre_delete_collection | Pre Delete Collection | Input parameter. Whether to delete the collection before creating a new one. |
| metadata_indexing_include | Metadata Indexing Include | Input parameter. A list of metadata fields to index if you want to enable selective indexing only when creating a collection. Doesn't apply to existing collections. Only one *_indexing_* parameter can be set per collection. If all *_indexing_* parameters are unset, then all fields are indexed (default indexing). |
| metadata_indexing_exclude | Metadata Indexing Exclude | Input parameter. A list of metadata fields to exclude from indexing if you want to enable selective indexing only when creating a collection. Doesn't apply to existing collections. Only one *_indexing_* parameter can be set per collection. If all *_indexing_* parameters are unset, then all fields are indexed (default indexing). |
| collection_indexing_policy | Collection Indexing Policy | Input parameter. A dictionary to define the indexing policy if you want to enable selective indexing only when creating a collection. Doesn't apply to existing collections. Only one *_indexing_* parameter can be set per collection. If all *_indexing_* parameters are unset, then all fields are indexed (default indexing). Use the collection_indexing_policy dictionary when you need to set indexing on subfields or a complex indexing definition that can't be expressed as a list. |
| embedding | Embedding or Astra Vectorize | Input parameter. The embedding model to use by attaching an Embedding Model component. This component doesn't support additional vectorize authentication headers, so it isn't possible to use a vectorize integration with this component, even if you have enabled one on an existing HCD collection. |
| number_of_results | Number of Results | Input parameter. Number of search results to return. Default: 4. Only relevant to reads. |
| search_type | Search Type | Input parameter. Search type to use, either Similarity (default), Similarity with score threshold, or MMR (Max Marginal Relevance). Only relevant to reads. |
| search_score_threshold | Search Score Threshold | Input parameter. Minimum similarity score threshold for search results if the search_type is Similarity with score threshold. Default: 0. |
| search_filter | Search Metadata Filter | Input parameter. Optional dictionary of metadata filters to apply in addition to vector search. |
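
The MMR (Max Marginal Relevance) search type trades relevance against diversity. The following sketch shows the general idea in plain Python. It is not the vector store's implementation: the `mmr` helper, its `lambda_mult` weighting parameter, and the sample vectors are assumptions chosen for illustration.

```python
import math


def cosine(a, b) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr(query, candidates, k=4, lambda_mult=0.5) -> list:
    """Return indexes of up to k candidates, balancing relevance and diversity."""
    selected = []
    remaining = list(range(len(candidates)))

    def score(i):
        relevance = cosine(query, candidates[i])
        # Penalize similarity to anything already selected.
        redundancy = max(
            (cosine(candidates[i], candidates[j]) for j in selected), default=0.0
        )
        return lambda_mult * relevance - (1 - lambda_mult) * redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


query = [1.0, 0.0]
candidates = [[1.0, 0.0], [0.9, 0.1], [0.5, 0.5]]
print(mmr(query, candidates, k=2, lambda_mult=1.0))  # pure relevance: [0, 1]
print(mmr(query, candidates, k=2, lambda_mult=0.3))  # favors diversity: [0, 2]
```

With `lambda_mult=1.0` the result matches a plain similarity search, while lower values skip near-duplicates of results that were already selected.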

Elasticsearch

The Elasticsearch component reads and writes to an Elasticsearch instance using ElasticsearchStore.

Elasticsearch parameters

| Name | Type | Description |
| --- | --- | --- |
| es_url | String | Input parameter. Elasticsearch server URL. |
| es_user | String | Input parameter. Username for Elasticsearch authentication. |
| es_password | SecretString | Input parameter. Password for Elasticsearch authentication. |
| index_name | String | Input parameter. Name of the Elasticsearch index. |
| strategy | String | Input parameter. Strategy for vector search, either approximate_k_nearest_neighbors or script_scoring. |
| distance_strategy | String | Input parameter. Strategy for distance calculation, either COSINE, EUCLIDEAN_DISTANCE, or DOT_PRODUCT. |
| search_query | String | Input parameter. Query string for similarity search. |
| ingest_data | Data | Input parameter. Records to load into the vector store. |
| embedding | Embeddings | Input parameter. The embedding model to use. |
| number_of_results | Integer | Input parameter. Number of search results to return. Default: 4. |
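
The three distance_strategy options correspond to standard vector comparisons. The following sketch computes each one in plain Python for two small vectors. The `DISTANCE_FUNCTIONS` mapping is a hypothetical name used only to tie the option names to the math, and Elasticsearch itself may normalize or transform these values when scoring.

```python
import math


def dot_product(a, b) -> float:
    return sum(x * y for x, y in zip(a, b))


def euclidean_distance(a, b) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b) -> float:
    norm = math.sqrt(dot_product(a, a)) * math.sqrt(dot_product(b, b))
    return dot_product(a, b) / norm if norm else 0.0


# Hypothetical mapping from the option names to the functions above.
DISTANCE_FUNCTIONS = {
    "COSINE": cosine_similarity,
    "EUCLIDEAN_DISTANCE": euclidean_distance,
    "DOT_PRODUCT": dot_product,
}

a, b = [3.0, 4.0], [4.0, 3.0]
for name, func in DISTANCE_FUNCTIONS.items():
    print(name, round(func(a, b), 4))
```

Cosine similarity ignores vector length, the dot product does not, and Euclidean distance is a distance rather than a similarity, so smaller values mean closer vectors.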

FAISS

The FAISS component provides access to the Facebook AI Similarity Search (FAISS) library through an instance of FAISS vector store.

FAISS parameters

| Name | Type | Description |
| --- | --- | --- |
| index_name | String | Input parameter. The name of the FAISS index. Default: "langflow_index". |
| persist_directory | String | Input parameter. Path to save the FAISS index. It is relative to where Langflow is running. |
| search_query | String | Input parameter. The query to search for in the vector store. |
| ingest_data | Data | Input parameter. The list of data to ingest into the vector store. |
| allow_dangerous_deserialization | Boolean | Input parameter. Set to True to allow loading pickle files from untrusted sources. Default: True. |
| embedding | Embeddings | Input parameter. The embedding function to use for the vector store. |
| number_of_results | Integer | Input parameter. Number of results to return from the search. Default: 4. |
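
Because persist_directory is relative to where Langflow is running, it can help to resolve the path before looking for the saved index. The following sketch is an assumption-laden illustration: `faiss_index_files` is a hypothetical helper, and the `.faiss` and `.pkl` file names assume the index is saved as one index file plus one pickled store file named after index_name. The pickled file is the part that allow_dangerous_deserialization applies to.

```python
from pathlib import Path
from typing import Optional


def faiss_index_files(
    persist_directory: str,
    index_name: str = "langflow_index",
    base_dir: Optional[str] = None,
) -> tuple[Path, Path]:
    """Resolve where the index and its pickled store would be saved."""
    # base_dir stands in for the directory where Langflow is running.
    base = Path(base_dir) if base_dir else Path.cwd()
    folder = (base / persist_directory).resolve()
    # Assumed naming convention: <index_name>.faiss and <index_name>.pkl
    return folder / f"{index_name}.faiss", folder / f"{index_name}.pkl"


index_file, store_file = faiss_index_files("data/faiss", base_dir="/tmp")
print(index_file)
print(store_file)
```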

Milvus

The Milvus component reads and writes to Milvus vector stores using an instance of Milvus vector store.

Milvus parameters

| Name | Type | Description |
| --- | --- | --- |
| collection_name | String | Input parameter. Name of the Milvus collection. |
| collection_description | String | Input parameter. Description of the Milvus collection. |
| uri | String | Input parameter. Connection URI for Milvus. |
| password | SecretString | Input parameter. Password for Milvus. |
| username | SecretString | Input parameter. Username for Milvus. |
| batch_size | Integer | Input parameter. Number of records to process in a single batch. |
| search_query | String | Input parameter. Query for similarity search. |
| ingest_data | Data | Input parameter. Data to be ingested into the vector store. |
| embedding | Embeddings | Input parameter. Embedding function to use. |
| number_of_results | Integer | Input parameter. Number of results to return in search. |
| search_type | String | Input parameter. Type of search to perform. |
| search_score_threshold | Float | Input parameter. Minimum similarity score for search results. |
| search_filter | Dict | Input parameter. Metadata filters for search query. |
| setup_mode | String | Input parameter. Configuration mode for setting up the vector store. |
| vector_dimensions | Integer | Input parameter. Number of dimensions of the vectors. |
| pre_delete_collection | Boolean | Input parameter. Whether to delete the collection before creating a new one. |
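
As a rough illustration of what batch_size controls, the following sketch splits a list of records into fixed-size batches before they would be written. The `batched` helper is hypothetical and is not taken from the Milvus client or from Langflow.

```python
from typing import Iterator, Sequence


def batched(records: Sequence, batch_size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of at most batch_size records."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]


records = [f"doc-{i}" for i in range(5)]
for batch in batched(records, batch_size=2):
    print(batch)  # each batch would be embedded and inserted together
```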

MongoDB Atlas

The MongoDB Atlas component reads and writes to MongoDB Atlas vector stores using an instance of MongoDBAtlasVectorSearch.

MongoDB Atlas parameters

| Name | Type | Description |
| --- | --- | --- |
| mongodb_atlas_cluster_uri | SecretString | Input parameter. The connection URI for your MongoDB Atlas cluster. Required. |
| enable_mtls | Boolean | Input parameter. Enable mutual TLS authentication. Default: false. |
| mongodb_atlas_client_cert | SecretString | Input parameter. Client certificate combined with private key for mTLS authentication. Required if mTLS is enabled. |
| db_name | String | Input parameter. The name of the database to use. Required. |
| collection_name | String | Input parameter. The name of the collection to use. Required. |
| index_name | String | Input parameter. The name of the Atlas Search index, which should be a Vector Search index. Required. |
| insert_mode | String | Input parameter. How to insert new documents into the collection. The options are "append" or "overwrite". Default: "append". |
| embedding | Embeddings | Input parameter. The embedding model to use. |
| number_of_results | Integer | Input parameter. Number of results to return in similarity search. Default: 4. |
| index_field | String | Input parameter. The field to index. Default: "embedding". |
| filter_field | String | Input parameter. The field to filter the index. |
| number_dimensions | Integer | Input parameter. Embedding context length. Default: 1536. |
| similarity | String | Input parameter. The method used to measure similarity between vectors. The options are "cosine", "euclidean", or "dotProduct". Default: "cosine". |
| quantization | String | Input parameter. Quantization reduces memory costs by converting 32-bit floats to smaller data types. The options are "scalar" or "binary". |
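
The index-related parameters above map naturally onto an Atlas Vector Search index definition. The following sketch builds such a definition as a plain dictionary. The `build_vector_index_definition` helper is hypothetical, and the exact definition that the component creates might differ, so treat the output as an approximation of the expected shape.

```python
from typing import Optional


def build_vector_index_definition(
    index_field: str = "embedding",
    number_dimensions: int = 1536,
    similarity: str = "cosine",
    quantization: Optional[str] = None,
    filter_field: Optional[str] = None,
) -> dict:
    """Assemble a vector search index definition from the component's inputs."""
    if similarity not in {"cosine", "euclidean", "dotProduct"}:
        raise ValueError(f"Unsupported similarity: {similarity}")
    vector_field = {
        "type": "vector",
        "path": index_field,
        "numDimensions": number_dimensions,
        "similarity": similarity,
    }
    if quantization:
        if quantization not in {"scalar", "binary"}:
            raise ValueError(f"Unsupported quantization: {quantization}")
        vector_field["quantization"] = quantization
    fields = [vector_field]
    if filter_field:
        # Filter fields allow metadata pre-filtering during vector search.
        fields.append({"type": "filter", "path": filter_field})
    return {"fields": fields}


definition = build_vector_index_definition(quantization="scalar", filter_field="category")
print(definition)
```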

OpenSearch

The OpenSearch component reads and writes to OpenSearch instances using OpenSearchVectorSearch.

OpenSearch parameters

| Name | Type | Description |
| --- | --- | --- |
| opensearch_url | String | Input parameter. URL for the OpenSearch cluster, such as https://192.168.1.1:9200. |
| index_name | String | Input parameter. The name of the index where the vectors are stored in the OpenSearch cluster. |
| search_input | String | Input parameter. Enter a search query. Leave empty to retrieve all documents or if hybrid search is being used. |
| ingest_data | Data | Input parameter. The data to be ingested into the vector store. |
| embedding | Embeddings | Input parameter. The embedding function to use. |
| search_type | String | Input parameter. The options are "similarity", "similarity_score_threshold", or "mmr". |
| number_of_results | Integer | Input parameter. The number of results to return in search. |
| search_score_threshold | Float | Input parameter. The minimum similarity score threshold for search results. |
| username | String | Input parameter. The username for the OpenSearch cluster. |
| password | SecretString | Input parameter. The password for the OpenSearch cluster. |
| use_ssl | Boolean | Input parameter. Use SSL. |
| verify_certs | Boolean | Input parameter. Verify certificates. |
| hybrid_search_query | String | Input parameter. Provide a custom hybrid search query in JSON format. This allows you to combine vector similarity and keyword matching. |
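
As a hedged example of what a custom hybrid search query might look like, the following sketch serializes a query that pairs a keyword match with a k-NN clause. The `build_hybrid_query` helper is hypothetical, and the `text` and `vector_field` field names are assumptions about how the index is laid out. Check the query shape against your own index mapping and cluster configuration before using it.

```python
import json


def build_hybrid_query(
    keywords: str,
    query_vector: list,
    k: int = 4,
    text_field: str = "text",  # assumed name of the text field
    vector_field: str = "vector_field",  # assumed name of the vector field
) -> dict:
    """Combine a keyword match and a k-NN clause in one hybrid query."""
    return {
        "size": k,
        "query": {
            "hybrid": {
                "queries": [
                    {"match": {text_field: {"query": keywords}}},
                    {"knn": {vector_field: {"vector": list(query_vector), "k": k}}},
                ]
            }
        },
    }


# The component expects a JSON string, so serialize the dictionary.
hybrid_search_query = json.dumps(build_hybrid_query("vector database", [0.1, 0.2, 0.3]))
print(hybrid_search_query)
```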

PGVector

The PGVector component reads and writes to PostgreSQL vector stores using an instance of PGVector.

PGVector parameters

| Name | Type | Description |
| --- | --- | --- |
| pg_server_url | SecretString | Input parameter. The PostgreSQL server connection string. |
| collection_name | String | Input parameter. The table name for the vector store. |
| search_query | String | Input parameter. The query for similarity search. |
| ingest_data | Data | Input parameter. The data to be ingested into the vector store. |
| embedding | Embeddings | Input parameter. The embedding function to use. |
| number_of_results | Integer | Input parameter. The number of results to return in search. |

Pinecone

The Pinecone component reads and writes to Pinecone vector stores using an instance of PineconeVectorStore.

Pinecone parameters

| Name | Type | Description |
| --- | --- | --- |
| index_name | String | Input parameter. The name of the Pinecone index. |
| namespace | String | Input parameter. The namespace for the index. |
| distance_strategy | String | Input parameter. The strategy for calculating distance between vectors. |
| pinecone_api_key | SecretString | Input parameter. The API key for Pinecone. |
| text_key | String | Input parameter. The key in the record to use as text. |
| search_query | String | Input parameter. The query for similarity search. |
| ingest_data | Data | Input parameter. The data to be ingested into the vector store. |
| embedding | Embeddings | Input parameter. The embedding function to use. |
| number_of_results | Integer | Input parameter. The number of results to return in search. |

Qdrant

The Qdrant component reads and writes to Qdrant vector stores using an instance of QdrantVectorStore.

Qdrant parameters

| Name | Type | Description |
| --- | --- | --- |
| collection_name | String | Input parameter. The name of the Qdrant collection. |
| host | String | Input parameter. The Qdrant server host. |
| port | Integer | Input parameter. The Qdrant server port. |
| grpc_port | Integer | Input parameter. The Qdrant gRPC port. |
| api_key | SecretString | Input parameter. The API key for Qdrant. |
| prefix | String | Input parameter. The prefix for Qdrant. |
| timeout | Integer | Input parameter. The timeout for Qdrant operations. |
| path | String | Input parameter. The path for Qdrant. |
| url | String | Input parameter. The URL for Qdrant. |
| distance_func | String | Input parameter. The distance function for vector similarity. |
| content_payload_key | String | Input parameter. The content payload key. |
| metadata_payload_key | String | Input parameter. The metadata payload key. |
| search_query | String | Input parameter. The query for similarity search. |
| ingest_data | Data | Input parameter. The data to be ingested into the vector store. |
| embedding | Embeddings | Input parameter. The embedding function to use. |
| number_of_results | Integer | Input parameter. The number of results to return in search. |

Redis

The Redis component reads and writes to Redis vector stores using an instance of Redis vector store.

Redis parameters

| Name | Type | Description |
| --- | --- | --- |
| redis_server_url | SecretString | Input parameter. The Redis server connection string. |
| redis_index_name | String | Input parameter. The name of the Redis index. |
| code | String | Input parameter. The custom code for Redis (advanced). |
| schema | String | Input parameter. The schema for Redis index. |
| search_query | String | Input parameter. The query for similarity search. |
| ingest_data | Data | Input parameter. The data to be ingested into the vector store. |
| number_of_results | Integer | Input parameter. The number of results to return in search. |
| embedding | Embeddings | Input parameter. The embedding function to use. |

Supabase

The Supabase component reads and writes to Supabase vector stores using an instance of SupabaseVectorStore.

Supabase parameters

| Name | Type | Description |
| --- | --- | --- |
| supabase_url | String | Input parameter. The URL of the Supabase instance. |
| supabase_service_key | SecretString | Input parameter. The service key for Supabase authentication. |
| table_name | String | Input parameter. The name of the table in Supabase. |
| query_name | String | Input parameter. The name of the query to use. |
| search_query | String | Input parameter. The query for similarity search. |
| ingest_data | Data | Input parameter. The data to be ingested into the vector store. |
| embedding | Embeddings | Input parameter. The embedding function to use. |
| number_of_results | Integer | Input parameter. The number of results to return in search. |

Upstash

The Upstash component reads and writes to Upstash vector stores using an instance of UpstashVectorStore.

Upstash parameters

| Name | Type | Description |
| --- | --- | --- |
| index_url | String | Input parameter. The URL of the Upstash index. |
| index_token | SecretString | Input parameter. The token for the Upstash index. |
| text_key | String | Input parameter. The key in the record to use as text. |
| namespace | String | Input parameter. The namespace for the index. |
| search_query | String | Input parameter. The query for similarity search. |
| metadata_filter | String | Input parameter. Filter documents by metadata. |
| ingest_data | Data | Input parameter. The data to be ingested into the vector store. |
| embedding | Embeddings | Input parameter. The embedding function to use. |
| number_of_results | Integer | Input parameter. The number of results to return in search. |

Vectara Platform

The Vectara and Vectara RAG components support Vectara vector store, search, and RAG functionality using instances of Vectara vector store.

Vectara

The Vectara component reads and writes to Vectara vector stores, and then produces search results output.

Vectara parameters

| Name | Type | Description |
| --- | --- | --- |
| vectara_customer_id | String | Input parameter. The Vectara customer ID. |
| vectara_corpus_id | String | Input parameter. The Vectara corpus ID. |
| vectara_api_key | SecretString | Input parameter. The Vectara API key. |
| embedding | Embeddings | Input parameter. The embedding function to use (optional). |
| ingest_data | List[Document/Data] | Input parameter. The data to be ingested into the vector store. |
| search_query | String | Input parameter. The query for similarity search. |
| number_of_results | Integer | Input parameter. The number of results to return in search. |

Vectara RAG

This component enables Vectara's full end-to-end RAG capabilities with reranking options.

This component uses a Vectara vector store to execute the vector search and reranking functions, and then outputs an Answer string in Message format.

Weaviate

The Weaviate component reads and writes to Weaviate vector stores using an instance of Weaviate vector store.

Weaviate parameters

| Name | Type | Description |
| --- | --- | --- |
| weaviate_url | String | Input parameter. The default instance URL. |
| search_by_text | Boolean | Input parameter. Indicates whether to search by text. |
| api_key | SecretString | Input parameter. The optional API key for authentication. |
| index_name | String | Input parameter. The optional index name. |
| text_key | String | Input parameter. The default text extraction key. |
| input | Document | Input parameter. The document or record. |
| embedding | Embeddings | Input parameter. The embedding model used. |
| attributes | List[String] | Input parameter. Optional additional attributes. |