
Embedding Model

Embedding model components in Langflow generate text embeddings using a specified embedding model and provider.

Langflow includes an Embedding Model core component with built-in support for several embedding model providers. Alternatively, you can use any other embedding model component in place of the Embedding Model core component.
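Under the hood, an embedding model maps each input text to a fixed-length numeric vector that captures its meaning. The following minimal sketch, which assumes the langchain-openai package and an OpenAI API key in your environment (it is not a Langflow API), shows the kind of output an embedding model component produces:

```python
# Minimal sketch of what an embedding model produces: a fixed-length vector
# for each input text. Assumes the langchain-openai package and an
# OPENAI_API_KEY environment variable.
from langchain_openai import OpenAIEmbeddings

embedder = OpenAIEmbeddings(model="text-embedding-3-small")

vector = embedder.embed_query("What is semantic search?")
print(len(vector))   # dimensionality of the embedding, for example 1536
print(vector[:5])    # the first few floats of the vector
```

Texts with similar meanings produce vectors that are close together, which is what makes similarity search over embeddings possible.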

Use embedding model components in a flow

Use embedding model components anywhere you need to generate embeddings in a flow.

This example shows how to use an embedding model component in a flow to create a semantic search system. This flow loads a text file, splits the text into chunks, generates embeddings for each chunk, and then loads the chunks and embeddings into a vector store. The input and output components allow a user to query the vector store through a chat interface.

A semantic search flow that uses Embedding Model, File, Split Text, Chroma DB, Chat Input, and Chat Output components

  1. Create a flow, add a File component, and then select a file containing text data, such as a PDF, that you can use to test the flow.

  2. Add the Embedding Model core component, and then provide a valid OpenAI API key. You can enter the API key directly or use a global variable.

    My preferred provider or model isn't listed

    If your preferred embedding model provider or model isn't supported by the Embedding Model core component, you can use any other embedding model component in place of the core component.

    Browse Bundles or Search for your preferred provider to find additional embedding models, such as the Hugging Face Embeddings Inference component.

  3. Add a Split Text component to your flow. This component splits text input into smaller chunks to be processed into embeddings.

  4. Add a vector store component, such as the Chroma DB component, to your flow, and then configure the component to connect to your vector database. This component stores the generated embeddings so they can be used for similarity search.

  5. Connect the components:

    • Connect the File component's Loaded Files output to the Split Text component's Data or DataFrame input.
    • Connect the Split Text component's Chunks output to the vector store component's Ingest Data input.
    • Connect the Embedding Model component's Embeddings output to the vector store component's Embedding input.
  6. To query the vector store, add Chat Input and Chat Output components:

    • Connect the Chat Input component to the vector store component's Search Query input.
    • Connect the vector store component's Search Results output to the Chat Output component.
  7. Click Playground, and then enter a search query to retrieve text chunks that are most semantically similar to your query.
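Outside the visual editor, the same pipeline can be sketched in a few lines of Python. This is an illustrative sketch only, not how Langflow executes the flow; it assumes the langchain-openai, langchain-chroma, and langchain-text-splitters packages, a local document.txt file, and an OPENAI_API_KEY environment variable:

```python
# Rough programmatic equivalent of the flow above: load a file, split it into
# chunks, embed the chunks, store them in Chroma, then run a similarity search.
from langchain_openai import OpenAIEmbeddings
from langchain_chroma import Chroma
from langchain_text_splitters import RecursiveCharacterTextSplitter

text = open("document.txt", encoding="utf-8").read()   # File component

# Split Text component: break the document into overlapping chunks.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_text(text)

# Embedding Model + Chroma DB components: embed the chunks and store them.
store = Chroma.from_texts(
    texts=chunks,
    embedding=OpenAIEmbeddings(model="text-embedding-3-small"),
    collection_name="semantic_search_demo",
)

# Chat Input -> Search Query: retrieve the chunks most similar to a query.
results = store.similarity_search("What does the document say about pricing?", k=4)
for doc in results:
    print(doc.page_content[:120])
```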

Embedding Model parameters

The following parameters are for the Embedding Model core component. Other embedding model components can have additional or different parameters.

Some parameters are hidden by default in the visual editor. You can modify all parameters through the Controls in the component's header menu.

| Name | Display Name | Type | Description |
|---|---|---|---|
| provider | Model Provider | List | Input parameter. Select the embedding model provider. |
| model | Model Name | List | Input parameter. Select the embedding model to use. |
| api_key | OpenAI API Key | Secret[String] | Input parameter. The API key required for authenticating with the provider. |
| api_base | API Base URL | String | Input parameter. Base URL for the API. Leave empty for default. |
| dimensions | Dimensions | Integer | Input parameter. The number of dimensions for the output embeddings. |
| chunk_size | Chunk Size | Integer | Input parameter. The size of text chunks to process. Default: 1000. |
| request_timeout | Request Timeout | Float | Input parameter. Timeout for API requests. |
| max_retries | Max Retries | Integer | Input parameter. Maximum number of retry attempts. Default: 3. |
| show_progress_bar | Show Progress Bar | Boolean | Input parameter. Whether to display a progress bar during embedding generation. |
| model_kwargs | Model Kwargs | Dictionary | Input parameter. Additional keyword arguments to pass to the model. |
| embeddings | Embeddings | Embeddings | Output parameter. An instance for generating embeddings using the selected provider. |
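For the OpenAI provider, the component's Embeddings output behaves much like a LangChain embeddings object configured with the parameters above. The following sketch is illustrative only and assumes the langchain-openai package; the keyword names are LangChain's, not Langflow's internal field names:

```python
# Illustrative mapping of the parameter table onto a LangChain OpenAI
# embeddings object. Assumes the langchain-openai package; this is not
# Langflow's internal implementation.
import os
from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(
    model="text-embedding-3-small",        # model
    api_key=os.environ["OPENAI_API_KEY"],  # api_key
    base_url=None,                         # api_base (None uses the default endpoint)
    dimensions=1536,                       # dimensions
    chunk_size=1000,                       # chunk_size
    timeout=30.0,                          # request_timeout
    max_retries=3,                         # max_retries
    show_progress_bar=False,               # show_progress_bar
    model_kwargs={},                       # model_kwargs
)

vector = embeddings.embed_query("hello world")  # what the Embeddings output is used for
```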

Additional embedding models

If your provider or model isn't supported by the Embedding Model core component, you can replace this component with any other component that generates embeddings.

To find additional embedding model components, browse Bundles or Search for your preferred provider.

Pair models with vector stores

Vector data is essential to many LLM applications, such as chatbots and agents.

While you can use an LLM alone for generic chat interactions and common tasks, adding context sensitivity (such as retrieval-augmented generation (RAG)) and custom datasets (such as internal business data) takes your application further. This often requires integrating vector databases and vector search to provide the additional context and support meaningful queries.

Langflow includes vector store components that can read and write vector data, including embedding storage, similarity search, Graph RAG traversals, and dedicated search instances like OpenSearch. Because of their interdependent functionality, it is common to use vector store, language model, and embedding model components in the same flow or in a series of dependent flows.
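The following sketch illustrates why embedding model and vector store components are paired: the same embedding model must be used when vectors are written and when they are queried, or similarity scores become meaningless. It assumes the langchain-openai and langchain-chroma packages, with Chroma standing in for any supported vector store:

```python
# Write and read sides of a vector store, sharing one embedding model.
# Assumes langchain-openai and langchain-chroma; illustrative only.
from langchain_openai import OpenAIEmbeddings
from langchain_chroma import Chroma

embedding = OpenAIEmbeddings(model="text-embedding-3-small")

# Ingestion side: write content and its embeddings to the store.
store = Chroma(
    collection_name="kb",
    embedding_function=embedding,
    persist_directory="./chroma_kb",
)
store.add_texts([
    "Invoices are archived for seven years.",
    "Support tickets are answered within one business day.",
])

# Search side: query the same collection with the same embedding model.
retriever = store.as_retriever(search_kwargs={"k": 1})
print(retriever.invoke("How long are invoices kept?"))
```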

To find available vector store components, browse Bundles or Search for your preferred vector database provider.

Example: Vector search flow
Tip: For a tutorial that uses vector data in a flow, see Create a vector RAG chatbot.

The following example demonstrates how to use vector store components in flows alongside related components like embedding model and language model components. These steps walk through important configuration details, functionality, and best practices for using these components effectively. This is only one example; it isn't a prescriptive guide to all possible use cases or configurations.

  1. Create a flow with the Vector Store RAG template.

    This template has two subflows. The Load Data subflow loads embeddings and content into a vector database, and the Retriever subflow runs a vector search to retrieve relevant context based on a user's query.

  2. Configure the database connection for both Astra DB components, or replace them with another pair of vector store components of your choice. Make sure the components connect to the same vector store, and that the component in the Retriever subflow is able to run a similarity search.

    The parameters you set in each vector store component depend on the component's role in your flow. In this example, the Load Data subflow writes to the vector store, whereas the Retriever subflow reads from the vector store. Therefore, search-related parameters are only relevant to the Vector Search component in the Retriever subflow.

    For information about specific parameters, see the documentation for your chosen vector store component.

  3. To configure the embedding model, do one of the following:

    • Use an OpenAI model: In both OpenAI Embeddings components, enter your OpenAI API key. You can use the default model or select a different OpenAI embedding model.

    • Use another provider: Replace the OpenAI Embeddings components with another pair of embedding model components of your choice, and then configure the parameters and credentials accordingly.

    • Use Astra DB vectorize: If you are using an Astra DB vector store that has a vectorize integration, you can remove both OpenAI Embeddings components. If you do this, the vectorize integration automatically generates embeddings from the Ingest Data input (in the Load Data subflow) and the Search Query input (in the Retriever subflow).

    Tip: If your vector store already contains embeddings, make sure your embedding model components use the same model that generated the existing embeddings. Mixing embedding models in the same vector store can produce inaccurate search results.

  4. Recommended: In the Split Text component, optimize the chunking settings for your embedding model. For example, if your embedding model has a token limit of 512, then the Chunk Size parameter must not exceed that limit.

    Additionally, because the Retriever subflow passes the chat input directly to the vector store component for vector search, make sure that your chat input string doesn't exceed your embedding model's limits. For this example, you can enter a query that is within the limits; however, in a production environment, you might need to implement additional checks or preprocessing steps to ensure compliance. For example, use additional components to prepare the chat input before running the vector search, or enforce chat input limits in your application code.

  5. In the Language Model component, enter your OpenAI API key, or select a different provider and model to use for the chat portion of the flow.

  6. Run the Load Data subflow to populate your vector store. In the File component, select one or more files, and then click Run component on the vector store component in the Load Data subflow.

    The Load Data subflow loads files from your local machine, chunks them, generates embeddings for the chunks, and then stores the chunks and their embeddings in the vector database.

    Embedding data into a vector store

    The Load Data subflow is separate from the Retriever subflow because you probably won't run it every time you use the chat. You can run the Load Data subflow as needed to preload or update the data in your vector store. Then, your chat interactions only use the components that are necessary for chat.

    If your vector store already contains data that you want to use for vector search, then you don't need to run the Load Data subflow.

  7. Open the Playground and start chatting to run the Retriever subflow.

    The Retriever subflow generates an embedding from the chat input, runs a vector search to retrieve similar content from your vector store, parses the search results into supplemental context for the LLM, and then uses the LLM to generate a natural language response to your query. The LLM uses the vector search results along with its internal training data and tools, such as basic web search and datetime information, to produce the response. A rough code-level sketch of this sequence follows these steps.

    Retrieval from a vector store

    To avoid passing the entire block of raw search results to the LLM, the Parser component extracts text strings from the search results Data object, and then passes them to the Prompt Template component in Message format. From there, the strings and other template content are compiled into natural language instructions for the LLM.

    You can use other components for this transformation, such as the Data Operations component, depending on how you want to use the search results.

    To view the raw search results, click Inspect output on the vector store component after running the Retriever subflow.
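The following sketch approximates what the Retriever subflow does at query time: embed the chat input, run a vector search, flatten the results into context, and ask the LLM. It assumes the langchain-openai and langchain-chroma packages and a collection already populated by a Load Data step; the names are illustrative and are not Langflow APIs:

```python
# Illustrative query-time sequence: vector search -> parse results -> prompt -> LLM.
# Assumes langchain-openai, langchain-chroma, an OPENAI_API_KEY environment
# variable, and an existing "kb" collection populated earlier.
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_chroma import Chroma

embedding = OpenAIEmbeddings(model="text-embedding-3-small")
store = Chroma(
    collection_name="kb",
    embedding_function=embedding,
    persist_directory="./chroma_kb",
)

question = "How long are invoices kept?"

# Vector store component: embed the query and retrieve similar chunks.
docs = store.similarity_search(question, k=4)

# Parser component: keep only the text of each search result.
context = "\n\n".join(doc.page_content for doc in docs)

# Prompt Template + Language Model components: build instructions and respond.
prompt = (
    "Answer the question using only the context below.\n\n"
    f"Context:\n{context}\n\n"
    f"Question: {question}"
)
answer = ChatOpenAI(model="gpt-4o-mini").invoke(prompt)
print(answer.content)
```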
