DataStax

Bundles contain custom components that support specific third-party integrations with Langflow.

This page describes the components that are available in the DataStax bundle, including components that read and write to Astra DB databases.

Astra DB

important

It is recommended that you create any databases, keyspaces, and collections you need before configuring the Astra DB component.

You can create new databases and collections through this component, but this is only possible in the Langflow visual editor (not at runtime), and you must wait while the database or collection initializes before proceeding with flow configuration. Additionally, some database and collection configuration options, such as hybrid search options, PCU groups, vectorize integration management, and multi-region deployments, aren't available through the Astra DB component.

The Astra DB component reads and writes to Astra DB Serverless databases, using an instance of AstraDBVectorStore to call the Data API and DevOps API.

About vector store instances

Because Langflow is based on LangChain, vector store components use an instance of LangChain vector store to drive the underlying read and write functions. These instances are provider-specific and configured according to the component's parameters, such as the connection string, index name, and schema.

In component code, this is often instantiated as vector_store, but some vector store components use a different name, such as the provider name.

Some LangChain classes don't expose all possible options as component parameters. Depending on the provider, these options might use default values or allow modification through environment variables, if they are supported in Langflow. For information about specific options, see the LangChain API reference and vector store provider's documentation.
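
For example, the following minimal sketch shows how such an instance is typically created outside of Langflow with the langchain-astradb package. All values are placeholders, and parameter names can vary by package version.

  # Minimal sketch: a LangChain vector store instance for Astra DB.
  # Values are placeholders; confirm parameter names for your package version.
  from langchain_astradb import AstraDBVectorStore
  from langchain_openai import OpenAIEmbeddings  # requires OPENAI_API_KEY

  vector_store = AstraDBVectorStore(
      collection_name="my_collection",
      embedding=OpenAIEmbeddings(),  # omit if the collection has a vectorize integration
      api_endpoint="https://DB_ID-REGION.apps.astra.datastax.com",
      token="AstraCS:...",  # application token
      namespace="default_keyspace",
  )

  # Reads and writes go through the same instance:
  vector_store.add_texts(["Some document text"])
  results = vector_store.similarity_search("example query", k=4)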

Astra DB parameters

You can inspect a vector store component's parameters to learn more about the inputs it accepts, the features it supports, and how to configure it.

Some parameters are hidden by default in the visual editor. You can modify all parameters through the Controls in the component's header menu.

Some parameters are conditional, and they are only available after you set other parameters or select specific options for other parameters. Conditional parameters may not be visible on the Controls pane until you set the required dependencies.

For information about accepted values and functionality, see the Astra DB Serverless documentation or inspect component code.

Name | Display Name | Info
token | Astra DB Application Token | Input parameter. An Astra application token with permission to access your vector database. Once the connection is verified, additional fields are populated with your existing databases and collections. If you want to create a database through this component, the application token must have Organization Administrator permissions.
environment | Environment | Input parameter. The environment for the Astra DB API endpoint. Typically prod.
database_name | Database | Input parameter. The name of the database that you want this component to connect to. Or, you can select New Database to create a new database, and then wait for the database to initialize before setting the remaining parameters.
endpoint | Astra DB API Endpoint | Input parameter. For multi-region databases, select the API endpoint for your nearest datacenter. To get the list of regions for a multi-region database, see List database regions. This field is automatically populated when you select a database, and it defaults to the primary region's endpoint.
keyspace | Keyspace | Input parameter. The keyspace in your database that contains the collection specified in collection_name. Default: default_keyspace.
collection_name | Collection | Input parameter. The name of the collection that you want to use with this flow. Or, select New Collection to create a new collection with limited configuration options. To ensure your collection is configured with the correct embedding provider and search capabilities, it is recommended to create the collection in the Astra Portal or with the Data API before configuring this component. For more information, see Manage collections in Astra DB Serverless.
embedding_model | Embedding Model | Input parameter. Attach an embedding model component to generate embeddings. Only available if the specified collection doesn't have a vectorize integration. If a vectorize integration exists, the component automatically uses the collection's integrated model.
ingest_data | Ingest Data | Input parameter. The documents to load into the specified collection. Accepts Data or DataFrame input.
search_query | Search Query | Input parameter. The query string for vector search.
cache_vector_store | Cache Vector Store | Input parameter. Whether to cache the vector store in Langflow memory for faster reads. Default: Enabled (true).
search_method | Search Method | Input parameter. The search method to use, either Hybrid Search or Vector Search. Your collection must be configured to support the chosen option, and the default depends on what your collection supports. All vector-enabled collections in Astra DB Serverless (Vector) databases support vector search, but hybrid search requires that you set specific collection settings when creating the collection. These options are only available when creating a collection programmatically. For more information, see Ways to find data in Astra DB Serverless and Create a collection that supports hybrid search.
reranker | Reranker | Input parameter. The re-ranker model to use for hybrid search, depending on the collection configuration. This parameter is only available for collections that support hybrid search. To determine if a collection supports hybrid search, get collection metadata, and then check that lexical and rerank both have "enabled": true.
lexical_terms | Lexical Terms | Input parameter. A space-separated string of keywords for hybrid search, like features, data, attributes, characteristics. This parameter is only available if the collection supports hybrid search. For more information, see the Hybrid search example.
number_of_results | Number of Search Results | Input parameter. The number of search results to return. Default: 4.
search_type | Search Type | Input parameter. The search type to use, either Similarity (default), Similarity with score threshold, or MMR (Max Marginal Relevance).
search_score_threshold | Search Score Threshold | Input parameter. The minimum similarity score threshold for vector search results with the Similarity with score threshold search type. Default: 0.
advanced_search_filter | Search Metadata Filter | Input parameter. An optional dictionary of metadata filters to apply in addition to vector or hybrid search.
autodetect_collection | Autodetect Collection | Input parameter. Whether to automatically fetch a list of available collections after providing an application token and API endpoint.
content_field | Content Field | Input parameter. For writes, this parameter specifies the name of the field in the documents that contains text strings for which you want to generate embeddings.
deletion_field | Deletion Based On Field | Input parameter. When provided, documents in the target collection with metadata field values matching the input metadata field value are deleted before new records are loaded. Use this setting for writes with upserts (overwrites).
ignore_invalid_documents | Ignore Invalid Documents | Input parameter. Whether to ignore invalid documents during writes. If disabled (false), then an error is raised for invalid documents. Default: Enabled (true).
astradb_vectorstore_kwargs | AstraDBVectorStore Parameters | Input parameter. An optional dictionary of additional parameters for the AstraDBVectorStore instance.
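
For example, the Search Metadata Filter (advanced_search_filter) parameter accepts a dictionary that uses Data API filter operators. The following is a hedged sketch with hypothetical field names:

  # Hypothetical metadata filter for advanced_search_filter.
  # Field names are examples; operators follow Data API filter syntax.
  advanced_search_filter = {
      "metadata.category": {"$eq": "product-docs"},  # exact match
      "metadata.year": {"$gt": 2022},                # greater-than range filter
  }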

Astra DB examples

Example: Vector RAG
tip

For a tutorial that uses vector data in a flow, see Create a vector RAG chatbot.

The following example demonstrates how to use vector store components in flows alongside related components like embedding model and language model components. These steps walk through important configuration details, functionality, and best practices for using these components effectively. This is only one example; it isn't a prescriptive guide to all possible use cases or configurations.

  1. Create a flow with the Vector Store RAG template.

    This template has two subflows. The Load Data subflow loads embeddings and content into a vector database, and the Retriever subflow runs a vector search to retrieve relevant context based on a user's query.

  2. Configure the database connection for both Astra DB components, or replace them with another pair of vector store components of your choice. Make sure the components connect to the same vector store, and that the component in the Retriever subflow is able to run a similarity search.

    The parameters you set in each vector store component depend on the component's role in your flow. In this example, the Load Data subflow writes to the vector store, whereas the Retriever subflow reads from the vector store. Therefore, search-related parameters are only relevant to the Vector Search component in the Retriever subflow.

    For information about specific parameters, see the documentation for your chosen vector store component.

  3. To configure the embedding model, do one of the following:

    • Use an OpenAI model: In both OpenAI Embeddings components, enter your OpenAI API key. You can use the default model or select a different OpenAI embedding model.

    • Use another provider: Replace the OpenAI Embeddings components with another pair of embedding model components of your choice, and then configure the parameters and credentials accordingly.

    • Use Astra DB vectorize: If you are using an Astra DB vector store that has a vectorize integration, you can remove both OpenAI Embeddings components. If you do this, the vectorize integration automatically generates embeddings from the Ingest Data (in the Load Data subflow) and Search Query (in the Retriever subflow).

    tip

    If your vector store already contains embeddings, make sure your embedding model components use the same model as your previous embeddings. Mixing embedding models in the same vector store can produce inaccurate search results.

  4. Recommended: In the Split Text component, optimize the chunking settings for your embedding model. For example, if your embedding model has a token limit of 512, then the Chunk Size parameter must not exceed that limit.

    Additionally, because the Retriever subflow passes the chat input directly to the vector store component for vector search, make sure that your chat input string doesn't exceed your embedding model's limits. For this example, you can enter a query that is within the limits; however, in a production environment, you might need to implement additional checks or preprocessing steps to ensure compliance. For example, use additional components to prepare the chat input before running the vector search, or enforce chat input limits in your application code.
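
    As a sketch of the application-code approach, the following example rejects oversized queries before they reach the flow. It assumes the tiktoken package, and it treats the 512-token limit and encoding name as example values for a hypothetical embedding model.

      # Sketch: enforce a chat input limit in application code before
      # running the flow. The limit and encoding are example values.
      import tiktoken

      MAX_TOKENS = 512
      encoding = tiktoken.get_encoding("cl100k_base")

      def check_query(query: str) -> str:
          if len(encoding.encode(query)) > MAX_TOKENS:
              raise ValueError(f"Query exceeds the {MAX_TOKENS}-token limit.")
          return query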

  5. In the Language Model component, enter your OpenAI API key, or select a different provider and model to use for the chat portion of the flow.

  6. Run the Load Data subflow to populate your vector store. In the File component, select one or more files, and then click Run component on the vector store component in the Load Data subflow.

    The Load Data subflow loads files from your local machine, chunks them, generates embeddings for the chunks, and then stores the chunks and their embeddings in the vector database.

    Embedding data into a vector store

    The Load Data subflow is separate from the Retriever subflow because you probably won't run it every time you use the chat. You can run the Load Data subflow as needed to preload or update the data in your vector store. Then, your chat interactions only use the components that are necessary for chat.

    If your vector store already contains data that you want to use for vector search, then you don't need to run the Load Data subflow.

  7. Open the Playground and start chatting to run the Retriever subflow.

    The Retriever subflow generates an embedding from chat input, runs a vector search to retrieve similar content from your vector store, parses the search results into supplemental context for the LLM, and then uses the LLM to generate a natural language response to your query. The LLM uses the vector search results along with its internal training data and tools, such as basic web search and datetime information, to produce the response.

    Retrieval from a vector store

    To avoid passing the entire block of raw search results to the LLM, the Parser component extracts text strings from the search results Data object, and then passes them to the Prompt Template component in Message format. From there, the strings and other template content are compiled into natural language instructions for the LLM.

    You can use other components for this transformation, such as the Data Operations component, depending on how you want to use the search results.

    To view the raw search results, click Inspect output on the vector store component after running the Retriever subflow.

Example: Hybrid search

The Astra DB component supports the Data API's hybrid search feature. Hybrid search performs a vector similarity search and a lexical search, compares the results of both searches, and then returns the most relevant results overall.

To use hybrid search through the Astra DB component, do the following:

  1. Use the Data API to create a collection that supports hybrid search if you don't already have one.

    Although you can create a collection through the Astra DB component, you have more control and insight into the collection settings when using the Data API for this operation.
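
    As a hedged sketch, the following raw Data API call creates a collection with lexical search and reranking enabled, which hybrid search requires. The analyzer, vectorize provider, and model names are example values; confirm the exact options in the Data API documentation.

      # Sketch: create a hybrid-search-capable collection with a raw
      # Data API call. Option values are examples to verify in the docs.
      import requests

      API_ENDPOINT = "https://DB_ID-REGION.apps.astra.datastax.com"
      TOKEN = "AstraCS:..."  # application token

      payload = {
          "createCollection": {
              "name": "hybrid_collection",
              "options": {
                  # Example vectorize integration for embedding generation
                  "vector": {"metric": "cosine",
                             "service": {"provider": "nvidia",
                                         "modelName": "NV-Embed-QA"}},
                  "lexical": {"enabled": True, "analyzer": "standard"},
                  "rerank": {"enabled": True},
              },
          }
      }

      response = requests.post(
          f"{API_ENDPOINT}/api/json/v1/default_keyspace",
          headers={"Token": TOKEN, "Content-Type": "application/json"},
          json=payload,
      )
      response.raise_for_status()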

  2. Create a flow based on the Hybrid Search RAG template, which includes an Astra DB component that is pre-configured for hybrid search.

    After loading the template, check for Upgrade available alerts on the components. If any components have an upgrade pending, upgrade and reconnect them before continuing.

  3. In the Language Model components, add your OpenAI API key. If you want to use a different provider or model, see Language model components.

  4. Delete the Language Model component that is connected to the Structured Output component's Input Message port, and then connect the Chat Input component to that port.

  5. Configure the Astra DB vector store component:

    1. Enter your Astra DB application token.
    2. In the Database field, select your database.
    3. In the Collection field, select your collection with hybrid search enabled.

    Once you select a collection that supports hybrid search, the other parameters automatically update to allow hybrid search options.

  6. Connect the first Parser component's Parsed Text output to the Astra DB component's Lexical Terms input. This input only appears after connecting a collection that supports hybrid search with reranking.

  7. Update the Structured Output template:

    1. Click the Structured Output component to expose the component's header menu, and then click Controls.

    2. Find the Format Instructions row, click Expand, and then replace the prompt with the following text:


      You are a database query planner that takes a user's requests, and then converts to a search against the subject matter in question.
      You should convert the query into:
      1. A list of keywords to use against a Lucene text analyzer index, no more than 4. Strictly unigrams.
      2. A question to use as the basis for a QA embedding engine.
      Avoid common keywords associated with the user's subject matter.

    3. Click Finish Editing, and then click Close to save your changes to the component.

  8. Open the Playground, and then enter a natural language question that you would ask about your database.

    In this example, your input is sent to both the Astra DB and Structured Output components:

    • The input sent directly to the Astra DB component's Search Query port is used as a string for similarity search. An embedding is generated from the query string using the collection's Astra DB vectorize integration.

    • The input sent to the Structured Output component is processed by the Structured Output, Language Model, and Parser components to extract space-separated keywords used for the lexical search portion of the hybrid search.

    The complete hybrid search query is executed against your database using the Data API's find_and_rerank command. The API's response is output as a DataFrame that is transformed into a text string Message by another Parser component. Finally, the Chat Output component prints the Message response to the Playground.
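
    As a hedged illustration, the equivalent raw Data API command combines both inputs in a single $hybrid sort, roughly as follows. The field names follow the Data API's hybrid search syntax; the values echo the example output shown in the next step.

      # Sketch of the underlying findAndRerank command for this flow's query.
      # POST to {API_ENDPOINT}/api/json/v1/default_keyspace/hybrid_collection
      payload = {
          "findAndRerank": {
              "sort": {
                  "$hybrid": {
                      # Vector search input (embedded via vectorize)
                      "$vectorize": "What characteristics can be identified in my data?",
                      # Lexical search input (space-separated keywords)
                      "$lexical": "features data attributes characteristics",
                  }
              },
              "options": {"limit": 4},
          }
      }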

  9. Optional: Exit the Playground, and then click Inspect Output on each individual component to understand how lexical keywords were constructed and view the raw response from the Data API. This is helpful for debugging flows where a certain component isn't receiving input as expected from another component.

    • Structured Output component: The output is the Data object produced by applying the output schema to the LLM's response to the input message and format instructions. The following example is based on the aforementioned instructions for keyword extraction:


      1. Keywords: features, data, attributes, characteristics
      2. Question: What characteristics can be identified in my data?

    • Parser component: The output is the string of keywords extracted from the structured output Data, and then used as lexical terms for the hybrid search.

    • Astra DB component: The output is the DataFrame containing the results of the hybrid search as returned by the Data API.

Astra DB output

If you use a vector store component to query your vector database, it produces search results that you can pass to downstream components in your flow as a list of Data objects or a tabular DataFrame. If both types are supported, you can set the format near the vector store component's output port in the visual editor.

Vector Store Connection port

The Astra DB component has an additional Vector Store Connection output. This output can only connect to a VectorStore input port, and it was intended for use with dedicated Graph RAG components.

The only non-legacy component that accepts this input is the Graph RAG component, which acts as a Graph RAG extension to the Astra DB component. Alternatively, use the Astra DB Graph component, which includes both the vector store connection and Graph RAG functionality in a single component.

Astra DB CQL

The Astra DB CQL component allows agents to query data from CQL tables in Astra DB.

The output is a list of Data objects containing the query results from the Astra DB CQL table. Each Data object contains the document fields specified by the projection fields. The number of results is limited by the number_of_results parameter.

Astra DB CQL parameters

Some parameters are hidden by default in the visual editor. You can modify all parameters through the Controls in the component's header menu.

Name | Type | Description
Tool Name | String | Input parameter. The name used to reference the tool in the agent's prompt.
Tool Description | String | Input parameter. A brief description of the tool to guide the model in using it.
Keyspace | String | Input parameter. The name of the keyspace.
Table Name | String | Input parameter. The name of the Astra DB CQL table to query.
Token | SecretString | Input parameter. The authentication token for Astra DB.
API Endpoint | String | Input parameter. The Astra DB API endpoint.
Projection Fields | String | Input parameter. The attributes to return, separated by commas. Default: "*".
Partition Keys | Dict | Input parameter. Required parameters that the model must fill to query the tool.
Clustering Keys | Dict | Input parameter. Optional parameters the model can fill to refine the query. Required parameters should be marked with an exclamation mark, for example, !customer_id.
Static Filters | Dict | Input parameter. Attribute-value pairs used to filter query results.
Limit | String | Input parameter. The number of records to return.
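
As a hedged illustration, the following sketch shows how the Dict parameters might be configured and the approximate shape of the query the component builds from them. The table and field names are hypothetical.

  # Hypothetical Dict parameter values for the Astra DB CQL component.
  # Keys are column names; values describe the parameter to the model.
  partition_keys = {"customer_id": "The customer's unique identifier"}
  clustering_keys = {"!order_date": "The order date ('!' marks it as required)"}
  static_filters = {"status": "active"}

  # Conceptually, these settings produce a query of roughly this shape:
  #   SELECT * FROM default_keyspace.orders
  #   WHERE customer_id = ? AND order_date = ? AND status = 'active'
  #   LIMIT 10;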

Astra DB Tool

The Astra DB Tool component enables searching data in Astra DB collections, including hybrid search, vector search, and regular filter-based search. Specialized searches require that the collection is pre-configured with the required parameters.

The output is a list of Data objects containing the query results from Astra DB. Each Data object contains the document fields specified by the projection attributes. The number of results is limited by the number_of_results parameter and by the Astra DB Data API's upper limit, which depends on the type of search.

You can use the component to execute queries directly as isolated steps in a flow, or you can connect it as a tool for an agent to allow the agent to query data from Astra DB collections as needed to respond to user queries. For more information, see Use Langflow agents.

Astra DB Tool component connected as a tool to an Agent component

Astra DB Tool parameters

The following parameters are for the Astra DB Tool component overall.

The values for Collection Name, Astra DB Application Token, and Astra DB API Endpoint are found in your Astra DB deployment. For more information, see the Astra DB Serverless documentation.

Name | Type | Description
Tool Name | String | Input parameter. The name used to reference the tool in the agent's prompt.
Tool Description | String | Input parameter. A brief description of the tool. This helps the model decide when to use it.
Keyspace Name | String | Input parameter. The name of the keyspace in Astra DB. Default: default_keyspace.
Collection Name | String | Input parameter. The name of the Astra DB collection to query.
Token | SecretString | Input parameter. The authentication token for accessing Astra DB.
API Endpoint | String | Input parameter. The Astra DB API endpoint.
Projection Fields | String | Input parameter. Comma-separated list of attributes to return from matching documents. The default is the default projection, *, which returns all attributes except reserved fields like $vector.
Tool Parameters | Dict | Input parameter. Astra DB Data API find filters that become tools for an agent. These filters may be used in a search if the agent selects them. See Define tool-specific parameters.
Static Filters | Dict | Input parameter. Attribute-value pairs used to filter query results. Equivalent to Astra DB Data API find filters. Static Filters are included with every query. Use Static Filters without semantic search to perform a regular filter search.
Number of Results | Int | Input parameter. The maximum number of documents to return.
Semantic Search | Boolean | Input parameter. Whether to run a similarity search by generating a vector embedding from the chat input and following the Semantic Search Instruction. Default: false. If true, you must attach an embedding model component or have vectorize pre-enabled on your collection.
Use Astra DB Vectorize | Boolean | Input parameter. Whether to use the Astra DB vectorize feature for embedding generation when running a semantic search. Default: false. If true, you must have vectorize pre-enabled on your collection.
Embedding Model | Embedding | Input parameter. A port to attach an embedding model component to generate a vector from input text for semantic search. This can be used when Semantic Search is true, with or without vectorize. Be sure to use a model that aligns with the dimensions of the embeddings already present in the collection.
Semantic Search Instruction | String | Input parameter. The query to use for similarity search. Default: "Find documents similar to the query." This instruction is used to guide the model in performing semantic search.

Define tool-specific parameters

tip

Tool Parameters are small functions that you create within the Astra DB Tool component. They give the LLM pre-defined ways to interact with the data in your collection.

Without these filters, the LLM has no concept of the data in your collection or which attributes are important.

At runtime, the LLM can decide which filters are relevant to the current query.

Filters in Tool Parameters aren't always applied. If you want to enforce filters for every query, use the Static Filters parameter. You can use both Tool Parameters and Static Filters to set some required filters and some optional filters.

In the Astra DB Tool component's Tool Parameters field, you can create filters to query documents in your collection.

When used in Tool Mode with an agent, these filters tell the agent which document attributes are most important, which are required in searches, and which operators to use on certain attributes. The filters become available as parameters that the LLM can use when calling the tool, with a better understanding of each parameter provided by the Description field.

In the Tool Parameters pane, click Add a new row, and then edit each cell in the row. For example, the following filter allows an LLM to filter by unique customer_id values:

  • Name: customer_id
  • Attribute Name: Leave empty if the attribute matches the field name in the database.
  • Description: "The unique identifier of the customer to filter by".
  • Is Metadata: Select False unless the value is stored in the metadata field.
  • Is Mandatory: Set to True to make the filter required.
  • Is Timestamp: For this example, select False because the value is an ID, not a timestamp.
  • Operator: $eq to look for an exact match.

The following fields are available for each row in the Tool Parameters pane:

Parameter | Description
Name | The name of the parameter that is exposed to the LLM. It can be the same as the underlying field name or a more descriptive label. The LLM uses this name, along with the description, to infer what value to provide during execution.
Attribute Name | When the parameter name shown to the LLM differs from the actual field or property in the database, use this setting to map the user-facing name to the correct attribute. For example, to apply a range filter to the timestamp field, define two separate parameters, such as start_date and end_date, that both reference the same timestamp attribute.
Description | Provides instructions to the LLM on how the parameter should be used. Clear and specific guidance helps the LLM provide valid input. For example, if a field such as specialty is stored in lowercase, the description should indicate that the input must be lowercase.
Is Metadata | When loading data using LangChain or Langflow, additional attributes may be stored under a metadata object. If the target attribute is stored this way, enable this option. It adjusts the query by generating a filter in the format {"metadata.<attribute_name>": "<value>"}.
Is Timestamp | For date or time-based filters, enable this option to automatically convert values to the timestamp format that the Astrapy client expects. This ensures compatibility with the underlying API without requiring manual formatting.
Operator | Defines the filtering logic applied to the attribute. You can use any valid Data API filter operator. For example, to filter a time range on the timestamp attribute, use two parameters: one with the $gt operator for "greater than", and another with the $lt operator for "less than".
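
As a hedged illustration, the following sketch shows the filters that the customer_id example above and a timestamp range might produce at query time, based on the formats described in this table.

  # Filter produced by the customer_id example (Is Metadata = False,
  # Operator = $eq), following Data API filter syntax:
  customer_filter = {"customer_id": {"$eq": "cust-123"}}

  # If Is Metadata were True, the attribute would be prefixed instead:
  metadata_filter = {"metadata.customer_id": {"$eq": "cust-123"}}

  # A range built from two parameters (start_date with $gt and end_date
  # with $lt) that both reference the same timestamp attribute. The
  # epoch-millisecond values below are placeholders.
  START_TIMESTAMP = 1704067200000
  END_TIMESTAMP = 1706745600000
  range_filter = {"timestamp": {"$gt": START_TIMESTAMP, "$lt": END_TIMESTAMP}}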

Astra DB Graph

The Astra DB Graph component uses AstraDBGraphVectorStore, an instance of LangChain graph vector store, for graph traversal and graph-based document retrieval in an Astra DB collection. It also supports writing to the vector store. For more information, see Build a Graph RAG system with LangChain and GraphRetriever.

If you use a vector store component to query your vector database, it produces search results that you can pass to downstream components in your flow as a list of Data objects or a tabular DataFrame. If both types are supported, you can set the format near the vector store component's output port in the visual editor.

Astra DB Graph parameters

You can inspect a vector store component's parameters to learn more about the inputs it accepts, the features it supports, and how to configure it.

Some parameters are hidden by default in the visual editor. You can modify all parameters through the Controls in the component's header menu.

Some parameters are conditional, and they are only available after you set other parameters or select specific options for other parameters. Conditional parameters may not be visible on the Controls pane until you set the required dependencies.

For information about accepted values and functionality, see the Astra DB Serverless documentation or inspect component code.

Name | Display Name | Info
token | Astra DB Application Token | Input parameter. An Astra application token with permission to access your vector database. Once the connection is verified, additional fields are populated with your existing databases and collections. If you want to create a database through this component, the application token must have Organization Administrator permissions.
api_endpoint | API Endpoint | Input parameter. Your database's API endpoint.
keyspace | Keyspace | Input parameter. The keyspace in your database that contains the collection specified in collection_name. Default: default_keyspace.
collection_name | Collection | Input parameter. The name of the collection that you want to use with this flow. For write operations, if a matching collection doesn't exist, a new one is created.
metadata_incoming_links_key | Metadata Incoming Links Key | Input parameter. The metadata key for the incoming links in the vector store.
ingest_data | Ingest Data | Input parameter. Records to load into the vector store. Only relevant for writes.
search_input | Search Query | Input parameter. Query string for similarity search. Only relevant for reads.
cache_vector_store | Cache Vector Store | Input parameter. Whether to cache the vector store in Langflow memory for faster reads. Default: Enabled (true).
embedding_model | Embedding Model | Input parameter. Attach an embedding model component to generate embeddings. If the collection has a vectorize integration, don't attach an embedding model component.
metric | Metric | Input parameter. The metric to use for similarity search calculations, either cosine (default), dot_product, or euclidean. This is a collection setting.
batch_size | Batch Size | Input parameter. Optional number of records to process in a single batch.
bulk_insert_batch_concurrency | Bulk Insert Batch Concurrency | Input parameter. Optional concurrency level for bulk write operations.
bulk_insert_overwrite_concurrency | Bulk Insert Overwrite Concurrency | Input parameter. Optional concurrency level for bulk write operations that allow upserts (overwriting existing records).
bulk_delete_concurrency | Bulk Delete Concurrency | Input parameter. Optional concurrency level for bulk delete operations.
setup_mode | Setup Mode | Input parameter. Configuration mode for setting up the vector store, either Sync (default) or Off.
pre_delete_collection | Pre Delete Collection | Input parameter. Whether to delete the collection before creating a new one. Default: Disabled (false).
metadata_indexing_include | Metadata Indexing Include | Input parameter. A list of metadata fields to index if you want to enable selective indexing only when creating a collection. Doesn't apply to existing collections. Only one *_indexing_* parameter can be set per collection. If all *_indexing_* parameters are unset, then all fields are indexed (default indexing).
metadata_indexing_exclude | Metadata Indexing Exclude | Input parameter. A list of metadata fields to exclude from indexing if you want to enable selective indexing only when creating a collection. Doesn't apply to existing collections. Only one *_indexing_* parameter can be set per collection. If all *_indexing_* parameters are unset, then all fields are indexed (default indexing).
collection_indexing_policy | Collection Indexing Policy | Input parameter. A dictionary to define the indexing policy if you want to enable selective indexing only when creating a collection. Doesn't apply to existing collections. Only one *_indexing_* parameter can be set per collection. If all *_indexing_* parameters are unset, then all fields are indexed (default indexing). The collection_indexing_policy dictionary is used when you need to set indexing on subfields or a complex indexing definition that isn't compatible as a list.
number_of_results | Number of Results | Input parameter. Number of search results to return. Default: 4. Only relevant to reads.
search_type | Search Type | Input parameter. Search type to use, either Similarity, Similarity with score threshold, MMR (Max Marginal Relevance), Graph Traversal, or MMR (Max Marginal Relevance) Graph Traversal (default). Only relevant to reads.
search_score_threshold | Search Score Threshold | Input parameter. Minimum similarity score threshold for search results if the search_type is Similarity with score threshold. Default: 0.
search_filter | Search Metadata Filter | Input parameter. Optional dictionary of metadata filters to apply in addition to vector search.

Graph RAG

The Graph RAG component uses an instance of GraphRetriever for Graph RAG traversal, enabling graph-based document retrieval in an Astra DB vector store. For more information, see the DataStax Graph RAG documentation.

info

This component can be a Graph RAG extension for the Astra DB vector store component. However, the Astra DB Graph component includes both the vector store connection and Graph RAG functionality.

Graph RAG parameters

You can inspect a vector store component's parameters to learn more about the inputs it accepts, the features it supports, and how to configure it.

Some parameters are hidden by default in the visual editor. You can modify all parameters through the Controls in the component's header menu.

Some parameters are conditional, and they are only available after you set other parameters or select specific options for other parameters. Conditional parameters may not be visible on the Controls pane until you set the required dependencies.

Name | Display Name | Info
embedding_model | Embedding Model | Input parameter. Specify the embedding model to use. Not required if the connected vector store has a vectorize integration.
vector_store | Vector Store Connection | Input parameter. An instance of AstraDBVectorStore inherited from the Astra DB component's Vector Store Connection output.
edge_definition | Edge Definition | Input parameter. Edge definition for the graph traversal.
strategy | Traversal Strategies | Input parameter. The strategy to use for graph traversal. Strategy options are dynamically loaded from available strategies.
search_query | Search Query | Input parameter. The query to search for in the vector store.
graphrag_strategy_kwargs | Strategy Parameters | Input parameter. Optional dictionary of additional parameters for the retrieval strategy.
search_results | Search Results or DataFrame | Output parameter. The results of the graph-based document retrieval as a list of Data objects or as a tabular DataFrame. You can set the desired output type near the component's output port.
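
As a hedged sketch, the underlying retriever is typically constructed like the following, using the langchain-graph-retriever package and assuming vector_store is an existing AstraDBVectorStore instance. The edge definition and strategy values are illustrative; see the DataStax Graph RAG documentation for the supported options.

  # Sketch: GraphRetriever over an existing AstraDBVectorStore instance.
  # Edge definition and strategy values are examples only.
  from graph_retriever.strategies import Eager
  from langchain_graph_retriever import GraphRetriever

  retriever = GraphRetriever(
      store=vector_store,           # the Vector Store Connection input
      edges=[("mentions", "$id")],  # traverse from a metadata field to document IDs
      strategy=Eager(k=5, start_k=1, max_depth=2),
  )
  docs = retriever.invoke("example query")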

Hyper-Converged Database (HCD)

The Hyper-Converged Database (HCD) component uses your cluster's Data API server to read and write to your HCD vector store. Because the underlying functions call the Data API, which originated from Astra DB, the component uses an instance of AstraDBVectorStore.

A flow using the HCD component to load vector data.

About vector store instances

Because Langflow is based on LangChain, vector store components use an instance of LangChain vector store to drive the underlying read and write functions. These instances are provider-specific and configured according to the component's parameters, such as the connection string, index name, and schema.

In component code, this is often instantiated as vector_store, but some vector store components use a different name, such as the provider name.

Some LangChain classes don't expose all possible options as component parameters. Depending on the provider, these options might use default values or allow modification through environment variables, if they are supported in Langflow. For information about specific options, see the LangChain API reference and vector store provider's documentation.

If you use a vector store component to query your vector database, it produces search results that you can pass to downstream components in your flow as a list of Data objects or a tabular DataFrame. If both types are supported, you can set the format near the vector store component's output port in the visual editor.

For more information about HCD, see Get started with HCD 1.2 and Get started with the Data API in HCD 1.2.
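
As a hedged sketch, connecting to HCD from LangChain directly looks roughly like the following. It assumes the langchain-astradb package and the HCD Data API's Cassandra:<base64 username>:<base64 password> token format; confirm both in the HCD documentation.

  # Sketch: AstraDBVectorStore against an HCD Data API endpoint.
  # The token format and environment flag are assumptions to verify.
  import base64
  from langchain_astradb import AstraDBVectorStore
  from langchain_openai import OpenAIEmbeddings  # requires OPENAI_API_KEY

  def hcd_token(username: str, password: str) -> str:
      encode = lambda s: base64.b64encode(s.encode()).decode()
      return f"Cassandra:{encode(username)}:{encode(password)}"

  vector_store = AstraDBVectorStore(
      collection_name="my_collection",
      embedding=OpenAIEmbeddings(),  # vectorize integrations aren't supported here
      api_endpoint="http://192.0.2.250:8181",
      token=hcd_token("hcd-superuser", "password"),
      namespace="default_namespace",
      environment="hcd",  # non-Astra Data API deployment
  )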

HCD parameters

You can inspect a vector store component's parameters to learn more about the inputs it accepts, the features it supports, and how to configure it.

Some parameters are hidden by default in the visual editor. You can modify all parameters through the Controls in the component's header menu.

Some parameters are conditional, and they are only available after you set other parameters or select specific options for other parameters. Conditional parameters may not be visible on the Controls pane until you set the required dependencies.

Name | Display Name | Info
collection_name | Collection Name | Input parameter. The name of a vector store collection in HCD. For write operations, if the collection doesn't exist, then a new one is created. Required.
username | HCD Username | Input parameter. Username for authenticating to your HCD deployment. Default: hcd-superuser. Required.
password | HCD Password | Input parameter. Password for authenticating to your HCD deployment. Required.
api_endpoint | HCD API Endpoint | Input parameter. Your deployment's HCD Data API endpoint, formatted as http[s]://CLUSTER_HOST:GATEWAY_PORT where CLUSTER_HOST is the IP address of any node in your cluster and GATEWAY_PORT is the port number for your API gateway service. For example, http://192.0.2.250:8181. Required.
ingest_data | Ingest Data | Input parameter. Records to load into the vector store. Only relevant for writes.
search_input | Search Input | Input parameter. Query string for similarity search. Only relevant for reads.
namespace | Namespace | Input parameter. The namespace in HCD that contains or will contain the collection specified in collection_name. Default: default_namespace.
ca_certificate | CA Certificate | Input parameter. Optional CA certificate for TLS connections to HCD.
metric | Metric | Input parameter. The metric to use for similarity search calculations, either cosine, dot_product, or euclidean. This is a collection setting. If calling an existing collection, leave unset to use the collection's metric. If a write operation creates a new collection, specify the desired similarity metric setting.
batch_size | Batch Size | Input parameter. Optional number of records to process in a single batch.
bulk_insert_batch_concurrency | Bulk Insert Batch Concurrency | Input parameter. Optional concurrency level for bulk write operations.
bulk_insert_overwrite_concurrency | Bulk Insert Overwrite Concurrency | Input parameter. Optional concurrency level for bulk write operations that allow upserts (overwriting existing records).
bulk_delete_concurrency | Bulk Delete Concurrency | Input parameter. Optional concurrency level for bulk delete operations.
setup_mode | Setup Mode | Input parameter. Configuration mode for setting up the vector store, either Sync (default), Async, or Off.
pre_delete_collection | Pre Delete Collection | Input parameter. Whether to delete the collection before creating a new one.
metadata_indexing_include | Metadata Indexing Include | Input parameter. A list of metadata fields to index if you want to enable selective indexing only when creating a collection. Doesn't apply to existing collections. Only one *_indexing_* parameter can be set per collection. If all *_indexing_* parameters are unset, then all fields are indexed (default indexing).
metadata_indexing_exclude | Metadata Indexing Exclude | Input parameter. A list of metadata fields to exclude from indexing if you want to enable selective indexing only when creating a collection. Doesn't apply to existing collections. Only one *_indexing_* parameter can be set per collection. If all *_indexing_* parameters are unset, then all fields are indexed (default indexing).
collection_indexing_policy | Collection Indexing Policy | Input parameter. A dictionary to define the indexing policy if you want to enable selective indexing only when creating a collection. Doesn't apply to existing collections. Only one *_indexing_* parameter can be set per collection. If all *_indexing_* parameters are unset, then all fields are indexed (default indexing). The collection_indexing_policy dictionary is used when you need to set indexing on subfields or a complex indexing definition that isn't compatible as a list.
embedding | Embedding or Astra Vectorize | Input parameter. The embedding model to use by attaching an Embedding Model component. This component doesn't support additional vectorize authentication headers, so it isn't possible to use a vectorize integration with this component, even if you have enabled one on an existing HCD collection.
number_of_results | Number of Results | Input parameter. Number of search results to return. Default: 4. Only relevant to reads.
search_type | Search Type | Input parameter. Search type to use, either Similarity (default), Similarity with score threshold, or MMR (Max Marginal Relevance). Only relevant to reads.
search_score_threshold | Search Score Threshold | Input parameter. Minimum similarity score threshold for search results if the search_type is Similarity with score threshold. Default: 0.
search_filter | Search Metadata Filter | Input parameter. Optional dictionary of metadata filters to apply in addition to vector search.

Other DataStax components

The following components are also included in the DataStax bundle.

Astra DB Chat Memory

The Astra DB Chat Memory component retrieves and stores chat messages using an Astra DB database.

Chat memories are passed between memory storage components as the Memory data type. Specifically, the component creates an instance of AstraDBChatMessageHistory, which is a LangChain chat message history class that uses Astra DB for storage.

important

The Astra DB Chat Memory component isn't recommended for most memory storage because memories tend to be long JSON objects or strings, often exceeding the maximum size of a document or object supported by Astra DB.

However, Langflow's Agent component includes built-in chat memory that is enabled by default. Your agentic flows don't need an external database to store chat memory. For more information, see Memory management options.

For more information about using external chat memory in flows, see the Message History component.
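
For reference, the following is a minimal sketch of using this class directly with the langchain-astradb package; all values are placeholders.

  # Sketch: direct use of AstraDBChatMessageHistory. Values are placeholders.
  from langchain_astradb import AstraDBChatMessageHistory

  history = AstraDBChatMessageHistory(
      session_id="user-123",
      collection_name="chat_history",
      api_endpoint="https://DB_ID-REGION.apps.astra.datastax.com",
      token="AstraCS:...",
  )

  history.add_user_message("Hello!")
  history.add_ai_message("Hi! How can I help?")
  print(history.messages)  # retrieves stored messages for the session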

Astra DB Chat Memory parameters

Some parameters are hidden by default in the visual editor. You can modify all parameters through the Controls in the component's header menu.

Name | Type | Description
collection_name | String | Input parameter. The name of the Astra DB collection for storing messages. Required.
token | SecretString | Input parameter. The authentication token for Astra DB access. Required.
api_endpoint | SecretString | Input parameter. The API endpoint URL for the Astra DB service. Required.
namespace | String | Input parameter. The optional namespace within Astra DB for the collection.
session_id | MessageText | Input parameter. The unique identifier for the chat session. Uses the current session ID if not provided.

Assistants API

The following DataStax components are used to create and manage Assistants API functions in a flow:

  • Astra Assistant Agent
  • Create Assistant
  • Create Assistant Thread
  • Get Assistant Name
  • List Assistants
  • Run Assistant

Environment variables

The following DataStax components are used to load and retrieve environment variables in a flow:

  • Dotenv
  • Get Environment Variable

Legacy DataStax components

Legacy components are no longer supported and may be removed in a future release. You can continue to use them in existing flows, but it is recommended that you replace them with supported components as soon as possible. Suggested replacements are included in the Legacy banner on components in your flows. They are also given in release notes and Langflow documentation whenever possible.

If you aren't sure how to replace a legacy component, search for components by provider, service, or component name. The component may have been deprecated in favor of a completely new component, a similar component, or a new version of the same component in a different category.

If there is no obvious replacement, consider whether another component can be adapted to your use case. For example, many Core components provide generic functionality that can support multiple providers and use cases, such as the API Request component.

If neither of these options is viable, you could use the legacy component's code to create your own custom component, or start a discussion about the legacy component.

To discourage use of legacy components in new flows, these components are hidden by default. In the visual editor, you can click Component settings to toggle the Legacy filter.

The following DataStax components are in legacy status:

Astra Vectorize

This component was deprecated in Langflow version 1.1.2. Replace it with the Astra DB component.

The Astra DB Vectorize component was used to generate embeddings with Astra DB's vectorize feature in conjunction with an Astra DB component.

The vectorize functionality is now built into the Astra DB component. You no longer need a separate component for vectorize embedding generation.
