Data
You can use Langflow's Data components to bring data into your flows from various sources like files, API endpoints, and URLs. For example:
-
Load files: Import data from a file or directory with the File component and Directory component.
-
Search the web: Fetch data from the web with components like the News Search component, RSS Reader component, Web Search component, and URL component.
-
Make API calls: Use APIs to trigger flows or perform actions with the API Request component and Webhook component.
-
Run SQL queries: Query an SQL database with the SQL Database component.
Each component runs different commands for retrieval, processing, and type checking. Some components are a minimal wrapper for a command that you provide, and others include built-in scripts to fetch and process data based on variable inputs. Additionally, some components return raw data, whereas others can convert, restructure, or validate the data before outputting it. This means that some similar components might produce different results.
Data components pair well with Processing components that can perform additional parsing, transformation, and validation after retrieving the data.
This can include basic operations, like saving a file in a specific format, or more complex tasks, like using a Text Splitter component to break down a large document into smaller chunks before generating embeddings for vector search.
Use Data components in flows
Data components are used often in flows because they offer a versatile way to perform common, basic functions.
You can use Data components to perform their base functions as isolated steps in your flow, or you can connect them to an Agent component as tools.
For examples of Data components in flows, see the following:
-
Create a chatbot that can ingest files: Learn how to use a File component to load a file as context for a chatbot. The file and user input are both passed to the LLM so you can ask questions about the file you uploaded.
-
Create a vector RAG chatbot: Learn how to ingest files for use in Retrieval-Augmented Generation (RAG), and then set up a chatbot that can use the ingested files as context. The two flows in this tutorial prepare files for RAG, and then let your LLM use vector search to retrieve contextually relevant data during a chat session.
-
Configure tools for agents: Learn how to use any component as a tool for an agent. When used as tools, the agent autonomously decides when to call a component based on the user's query.
-
Trigger flows with webhooks: Learn how to use the Webhook component to trigger a flow run in response to an external event.
API Request
The API Request component constructs and sends an HTTP requests using URLs or curl commands:
- URL mode: Enter one or more comma-separated URLs, and then select the method for the request to each URL.
- curl mode: Enter the curl command to execute.
You can enable additional request options and fields in the component's parameters.
Returns a Data
object containing the response.
For provider-specific API components, see Bundles.
API Request parameters
Most API Request component input parameters are hidden by default in the visual editor. You can toggle parameters through the Controls in the component's header menu.
Name | Display Name | Info |
---|---|---|
mode | Mode | Input parameter. Set the mode to either URL or curl. |
urls | URL | Input parameter. Enter one or more comma-separated URLs for the request. |
curl | curl | Input parameter. curl mode only. Enter a complete curl command. Other component parameters are populated from the command arguments. |
method | Method | Input parameter. The HTTP method to use. |
query_params | Query Parameters | Input parameter. The query parameters to append to the URL. |
body | Body | Input parameter. The body to send with POST, PATCH, and PUT requests as a dictionary. |
headers | Headers | Input parameter. The headers to send with the request as a dictionary. |
timeout | Timeout | Input parameter. The timeout to use for the request. |
follow_redirects | Follow Redirects | Input parameter. Whether to follow HTTP redirects. The default is enabled (true). If disabled (false), HTTP redirects aren't followed. |
save_to_file | Save to File | Input parameter. Whether to save the API response to a temporary file. Default: Disabled/false |
include_httpx_metadata | Include HTTPx Metadata | Input parameter. Whether to include properties such as headers , status_code , response_headers , and redirection_history in the output. Default: Disabled/false |
Directory
The Directory component recursively loads files from a directory, with options for file types, depth, and concurrency.
Files must be of a supported type and size to be loaded.
Outputs either a Data
or DataFrame
object, depending on the directory contents.
Directory parameters
Many Directory component input parameters are hidden by default in the visual editor. You can toggle parameters through the Controls in the component's header menu.
Name | Type | Description |
---|---|---|
path | MessageTextInput | Input parameter. The path to the directory to load files from. Default: Current directory (. ) |
types | MessageTextInput | Input parameter. The file types to load. Select one or more, or leave empty to attempt to load all files. |
depth | IntInput | Input parameter. The depth to search for files. |
max_concurrency | IntInput | Input parameter. The maximum concurrency for loading multiple files. |
load_hidden | BoolInput | Input parameter. If true, hidden files are loaded. |
recursive | BoolInput | Input parameter. If true, the search is recursive. |
silent_errors | BoolInput | Input parameter. If true, errors don't raise an exception. |
use_multithreading | BoolInput | Input parameter. If true, multithreading is used. |
File
The File component loads and parses files, converts the content into a Data
, DataFrame
, or Message
object.
It supports multiple file types and provides parameters for parallel processing and error handling.
You can add files to the File component in the visual editor or at runtime, and you can upload multiple files at once. For more information about uploading files and working with files in flows, see File management and Create a chatbot that can ingest files.
File type and size limits
By default, the maximum file size is 100 MB.
To modify this value, change the --max-file-size-upload
environment variable.
Supported file types
The following file types are supported by the File component. Use archive and compressed formats to bundle multiple files together, or use the Directory component to load all files in a directory.
.bz2
.csv
.docx
.gz
.htm
.html
.json
.js
.md
.mdx
.pdf
.py
.sh
.sql
.tar
.tgz
.ts
.tsx
.txt
.xml
.yaml
.yml
.zip
If you need to load an unsupported file type, you must use a different component that supports that file type and, potentially, parses it outside Langflow, or you must convert it to a supported type before uploading it.
For images, see Upload images.
For videos, see the Twelve Labs and YouTube bundles in the Langflow Components menu.
File parameters
Most File component input parameters are hidden by default in the visual editor. You can toggle parameters through the Controls in the component's header menu.
Name | Display Name | Info |
---|---|---|
path | Files | Input parameter. The path to files to load. Can be local or in Langflow file management. Supports individual files and bundled archives. |
file_path | Server File Path | Input parameter. A Data object with a file_path property pointing to a file in Langflow file management or a Message object with a path to the file. Supersedes Files (path ) but supports the same file types. |
separator | Separator | Input parameter. The separator to use between multiple outputs in Message format. |
silent_errors | Silent Errors | Input parameter. If true, errors in the component don't raise an exception. The default is false/disabled. |
delete_server_file_after_processing | Delete Server File After Processing | Input parameter. If true (default), the Server File Path (file_path ) is deleted after processing. |
ignore_unsupported_extensions | Ignore Unsupported Extensions | Input parameter. If enabled (true), files with unsupported extensions are accepted but not processed. If disabled (false), the File component either can throw an error if an unsupported file type is provided. The default is true. |
ignore_unspecified_files | Ignore Unspecified Files | Input parameter. If true, Data with no file_path property is ignored. If false (default), the component errors when a file isn't specified. |
concurrency_multithreading | Processing Concurrency | Input parameter. The number of files to process concurrently if multiple files are uploaded. Default is 1. Values greater than 1 enable parallel processing for 2 or more files. |
File output
The output of the File component depends on the number and type of files loaded:
-
No files: Throws an error or, if Silent Errors is enabled, produces no output.
-
One file: Produces one of the following depending on the file type. If multiple types are available, you can select the output type by clicking the output field (near the component's output port).
- Structured Content: Available for some tabular and structured data.
For
.csv
files, produces aDataFrame
representing the table data. For.json
files, produces aData
object with the parsed JSON data. - Raw Content: A
Message
containing the file's raw text content. - File Path: A
Message
containing the path to the file in Langflow file management.
- Structured Content: Available for some tabular and structured data.
For
-
Multiple files: Produces a Files
DataFrame
containing the content and metadata of all selected files.
News Search
The News Search component searches Google News through RSS, and then returns clean article data as a DataFrame
containing article titles, links, publication dates, and summaries.
The component's clean_html
method parses the HTML content with the BeautifulSoup library, removes HTML markup, and strips whitespace to output clean data.
For other RSS feeds, use the RSS Reader component, and for other searches use the Web Search component or a provider-specific bundle.
When used as a standard component in a flow, the News Search component must be connected to a component that accepts DataFrame
input.
You can connect the News Search component directly to a compatible component, or you can use a Processing component to convert or extract data of a different type between components.
When used in Tool Mode with an Agent component, the News Search component can be connected directly to the Agent component's Tools port without converting the data.
The agent decides whether to use the News Search component based on the user's query, and it can process the DataFrame
output directly.
News Search parameters
Most News Search component input parameters are hidden by default in the visual editor. You can toggle parameters through the Controls in the component's header menu.
Name | Display Name | Info |
---|---|---|
query | Search Query | Input parameter. Search keywords for news articles. |
hl | Language (hl) | Input parameter. Language code, such as en-US, fr, de. Default: en-US . |
gl | Country (gl) | Input parameter. Country code, such as US, FR, DE. Default: US . |
ceid | Country:Language (ceid) | Input parameter. Language, such as US:en, FR:fr. Default: US:en . |
topic | Topic | Input parameter. One of: WORLD , NATION , BUSINESS , TECHNOLOGY , ENTERTAINMENT , SCIENCE , SPORTS , HEALTH . |
location | Location (Geo) | Input parameter. City, state, or country for location-based news. Leave blank for keyword search. |
timeout | Timeout | Input parameter. Timeout for the request in seconds. |
articles | News Articles | Output parameter. A DataFrame with the key columns title , link , published and summary . |
RSS Reader
The RSS Reader component fetches and parses RSS feeds from any valid RSS feed URL, and then returns the feed content as a DataFrame
containing article titles, links, publication dates, and summaries.
When used as a standard component in a flow, the RSS Reader component must be connected to a component that accepts DataFrame
input.
You can connect the RSS Reader component directly to a compatible component, or you can use a Processing component to convert or extract data of a different type between components.
When used in Tool Mode with an Agent component, the RSS Reader component can be connected directly to the Agent component's Tools port without converting the data.
The agent decides whether to use the RSS Reader component based on the user's query, and it can process the DataFrame
output directly.
RSS Reader parameters
Name | Display Name | Info |
---|---|---|
rss_url | RSS Feed URL | Input parameter. URL of the RSS feed to parse, such as https://rss.nytimes.com/services/xml/rss/nyt/HomePage.xml . |
timeout | Timeout | Input parameter. Timeout for the RSS feed request in seconds. Default: 5 . |
articles | Articles | Output parameter. A DataFrame containing the key columns title , link , published and summary . |
SQL Database
The SQL Database component executes SQL queries on SQLAlchemy-compatible databases. It supports any SQLAlchemy-compatible database, such as PostgreSQL, MySQL, and SQLite.
For CQL queries, see the DataStax bundle.
Query an SQL database with natural language prompts
The following example demonstrates how to use the SQL Database component in a flow, and then modify the component to support natural language queries through an Agent component.
This allows you to use the same SQL Database component for any query, rather than limiting it to a single manually entered query or requiring the user, application, or another component to provide valid SQL syntax as input. Users don't need to master SQL syntax because the Agent component translates the users' natural language prompts into SQL queries, passes the query to the SQL Database component, and then returns the results to the user.
Additionally, input from applications and other components doesn't have to be extracted and transformed to exact SQL queries. Instead, you only need to provide enough context for the agent to understand that it should create and run a SQL query according to the incoming data.
-
Use your own sample database or create a test database.
Create a test SQL database
-
Create a database called
test.db
:_10sqlite3 test.db -
Add some values to the database:
_13sqlite3 test.db "_13CREATE TABLE users (_13id INTEGER PRIMARY KEY,_13name TEXT,_13email TEXT,_13age INTEGER_13);_13_13INSERT INTO users (name, email, age) VALUES_13('John Doe', 'john@example.com', 30),_13('Jane Smith', 'jane@example.com', 25),_13('Bob Johnson', 'bob@example.com', 35);_13" -
Verify that the database has been created and contains your data:
_10sqlite3 test.db "SELECT * FROM users;"The result should list the text data you entered in the previous step:
_101|John Doe|john@example.com_102|Jane Smith|jane@example.com_103|John Doe|john@example.com_104|Jane Smith|jane@example.com
-
-
Add an SQL Database component to your flow.
-
In the Database URL field, add the connection string for your database, such as
sqlite:///test.db
.At this point, you can enter an SQL query in the SQL Query field or use the port to pass a query from another component, such as a Chat Input component. If you need more space, click Expand to open a full-screen text field.
However, to make this component more dynamic in an agentic context, use an Agent component to transform natural language input to SQL queries, as explained in the following steps.
-
Click the SQL Database component to expose the component's header menu, and then enable Tool Mode.
You can now use this component as a tool for an agent. In Tool Mode, no query is set in the SQL Database component because the agent will generate and send one if it determines that the tool is required to complete the user's request. For more information, see Configure tools for agents.
-
Add an Agent component to your flow, and then enter your OpenAI API key.
The default model is an OpenAI model. If you want to use a different model, edit the Model Provider, Model Name, and API Key fields accordingly.
If you need to execute highly specialized queries, consider selecting a model that is trained for tasks like advanced SQL queries. If your preferred model isn't in the Agent component's built-in model list, select the Custom model provider, and then use a Language Model component to attach a specific model.
-
Connect the SQL Database component's Toolset output to the Agent component's Tools input.
-
Click Playground, and then ask the agent a question about the data in your database, such as
Which users are in my database?
The agent determines that it needs to query the database to answer the question, uses the LLM to generate an SQL query, and then uses the SQL Database component's
RUN_SQL_QUERY
action to run the query on your database. Finally, it returns the results in a conversational format, unless you provide instructions to return raw results or a different format.The following example queried a test database with little data, but with a more robust dataset you could ask more detailed or complex questions.
_10Here are the users in your database:_10_101. **John Doe** - Email: john@example.com_102. **Jane Smith** - Email: jane@example.com_103. **John Doe** - Email: john@example.com_104. **Jane Smith** - Email: jane@example.com_10_10It seems there are duplicate entries for the users.
SQL Database parameters
Some SQL Database component input parameters are hidden by default in the visual editor. You can toggle parameters through the Controls in the component's header menu.
Name | Display Name | Info |
---|---|---|
database_url | Database URL | Input parameter. The SQLAlchemy-compatible database connection URL. |
query | SQL Query | Input parameter. The SQL query to execute, which can be entered directly, passed in from another component, or, in Tool Mode, automatically provided by an Agent component. |
include_columns | Include Columns | Input parameter. Whether to include column names in the result. The default is enabled (true). |
add_error | Add Error | Input parameter. If enabled, adds any error messages to the result, if any are returned. The default is disabled (false). |
run_sql_query | Result Table | Output parameter. The query results as a DataFrame . |
URL
The URL component fetches content from one or more URLs, processes the content, and returns it in various formats. It follows links recursively to a given depth, and it supports output in plain text or raw HTML.
URL parameters
Most URL component input parameters are hidden by default in the visual editor. You can toggle parameters through the Controls in the component's header menu.
Some of the available parameters include the following:
Name | Display Name | Info |
---|---|---|
urls | URLs | Input parameter. One or more URLs to crawl recursively. In the visual editor, click Add URL to add multiple URLs. |
max_depth | Depth | Input parameter. Controls link traversal: how many "clicks" away from the initial page the crawler will go. A depth of 1 limits the crawl to the first page at the given URL only. A depth of 2 means the crawler crawls the first page plus each page directly linked from the first page, then stops. This setting exclusively controls link traversal; it doesn't limit the number of URL path segments or the domain. |
prevent_outside | Prevent Outside | Input parameter. If enabled, only crawls URLs within the same domain as the root URL. This prevents the crawler from accessing sites outside the given URL's domain, even if they are linked from one of the crawled pages. |
use_async | Use Async | Input parameter. If enabled, uses asynchronous loading which can be significantly faster but might use more system resources. |
format | Output Format | Input parameter. Sets the desired output format as Text or HTML. The default is Text. For more information, see URL output. |
timeout | Timeout | Input parameter. Timeout for the request in seconds. |
headers | Headers | Input parameter. The headers to send with the request if needed for authentication or otherwise. |
Additional input parameters are available for error handling and encoding.
URL output
There are two settings that control the output of the URL component at different stages:
-
Output Format: This optional parameter controls the content extracted from the crawled pages:
- Text (default): The component extracts only the text from the HTML of the crawled pages.
- HTML: The component extracts the entire raw HTML content of the crawled pages.
-
Output data type: In the component's output field (near the output port) you can select the structure of the outgoing data when it is passed to other components:
When used as a standard component in a flow, the URL component must be connected to a component that accepts the selected output data type (DataFrame
or Message
).
You can connect the URL component directly to a compatible component, or you can use a Type Convert component to convert the output to another type before passing the data to other components if the data types aren't directly compatible.
Processing components, like the Type Convert component, are useful with the URL component because it can extract a large amount of data from the crawled pages. For example, if you only want to pass specific fields to other components, you can use a Parser component to extract only that data from the crawled pages before passing the data to other components.
When used in Tool Mode with an Agent component, the URL component can be connected directly to the Agent component's Tools port without converting the data.
The agent decides whether to use the URL component based on the user's query, and it can process the DataFrame
or Message
output directly.
Web Search
The Web Search component performs a basic web search using DuckDuckGo's HTML scraping interface. For other search APIs, see Bundles.
The Web Search component uses web scraping that can be subject to rate limits.
For production use, consider using another search component with more robust API support, such as a provider-specific bundle.
Use the Web Search component in a flow
The following steps demonstrate one way that you can use a Web Search component in a flow:
-
Create a flow based on the Basic Prompting template.
-
Add a Web Search component, and then enter a search query, such as
environmental news
. -
Add a Type Convert component, set the Output Type to Message, and then connect the Web Search component's output to the Type Convert component's input.
By default, the Web Search component outputs a
DataFrame
. Because the Prompt Template component only acceptsMessage
data, this conversion is required so that the flow can pass the search results to the Prompt Template component. For more information, see Web Search output. -
In the Prompt Template component's Template field, add a variable like
{searchresults}
or{context}
.This adds a field to the Prompt Template component that you can use to pass the converted search results to the prompt.
-
Connect the Type Convert component's output to the new variable field on the Prompt Template component.
-
In the Language Model component, add your OpenAI API key, or select a different provider and model.
-
Click Playground, and then enter
latest news
.The LLM processes the request, including the context passed through the Prompt Template component, and then prints the response in the Playground chat interface.
Result
The following is an example of a possible response. Your response may vary based on the current state of the web, your specific query, the model, and other factors.
_10Here are some of the latest news articles related to the environment:_10Ozone Pollution and Global Warming: A recent study highlights that ozone pollution is a significant global environmental concern, threatening human health and crop production while exacerbating global warming. Read more_10...
Web Search parameters
Name | Display Name | Info |
---|---|---|
query | Search Query | Input parameter. Keywords to search for. |
timeout | Timeout | Input parameter. Timeout for the web search request in seconds. Default: 5 . |
results | Search Results | Output parameter. Returns a DataFrame containing title , links , and snippets . For more information, see Web Search output. |
Web Search output
The Web Search component outputs a DataFrame
containing the key columns title
, links
, and snippets
.
When used as a standard component in a flow, the Web Search component must be connected to a component that accepts DataFrame
input, or you must use a Type Convert component to convert the output to Data
or Message
type before passing the data to other components.
When used in Tool Mode with an Agent component, the Web Search component can be connected directly to the Agent component's Tools port without converting the data.
The agent decides whether to use the Web Search component based on the user's query, and it can process the DataFrame
output directly.
Webhook
The Webhook component defines a webhook trigger that runs a flow when it receives an HTTP POST request.
Trigger the webhook
When you add a Webhook component to your flow, a Webhook curl tab is added to the flow's API Access pane. This tab automatically generates an HTTP POST request code snippet that you can use to trigger your flow through the Webhook component. For example:
_10curl -X POST \_10 "http://$LANGFLOW_SERVER_ADDRESS/api/v1/webhook/$FLOW_ID" \_10 -H 'Content-Type: application/json' \_10 -H 'x-api-key: $LANGFLOW_API_KEY' \_10 -d '{"any": "data"}'
For more information, see Trigger flows with webhooks.
Webhook parameters
Name | Display Name | Description |
---|---|---|
data | Payload | Input parameter. Receives a payload from external systems through HTTP POST requests. |
curl | curl | Input parameter. The curl command template for making requests to this webhook. |
endpoint | Endpoint | Input parameter. The endpoint URL where this webhook receives requests. |
output_data | Data | Output parameter. The processed data from the webhook input. Returns an empty Data object if no input is provided. If the input isn't valid JSON, the Webhook component wraps it in a payload object so that it can be accepted as input to trigger the flow. |
Additional Data components
Langflow's core components are meant to be generic and support a range of use cases. Core components typically aren't limited to a single provider.
If the core Data components don't meet your needs, you can find provider-specific components in the Bundles section of the Components menu.
For example, the DataStax bundle includes components for CQL queries, and the Google bundle includes components for Google Search APIs.
Legacy Data components
The Load CSV and Load JSON components are legacy components. You can still use them in your flows, but they are no longer maintained and can be removed in a future release.
Replace these components with the File component, which supports loading CSV and JSON files, as well as many other file types.