Dataframes

Dataframes are a great way to display and edit data in a tabular format. Working with Pandas DataFrames and other tabular data structures is key to data science workflows. To display this data in Streamlit, developers and data scientists have two options: st.dataframe and st.data_editor. If you only want to display data in a table-like UI, st.dataframe is the way to go. If you want to interactively edit data, use st.data_editor. We explore the use cases and advantages of each option in the following sections.

Streamlit can display dataframes in a table-like UI via st.dataframe:

import streamlit as st
import pandas as pd

df = pd.DataFrame(
    [
        {"command": "st.selectbox", "rating": 4, "is_widget": True},
        {"command": "st.balloons", "rating": 5, "is_widget": False},
        {"command": "st.time_input", "rating": 3, "is_widget": True},
    ]
)

st.dataframe(df, use_container_width=True)

st.dataframe provides additional functionality by using glide-data-grid under the hood:

  • Column sorting: Sort columns by clicking on their headers.
  • Column resizing: Resize columns by dragging and dropping column header borders.
  • Table resizing: Resize tables by dragging and dropping the bottom right corner.
  • Fullscreen view: Enlarge tables to fullscreen by clicking the fullscreen icon in the toolbar.
  • Search: Click the search icon in the toolbar or use hotkeys (⌘+F or Ctrl+F) to search through the data.
  • Download: Click the download icon in the toolbar to download the data as a CSV file.
  • Copy to clipboard: Select one or multiple cells, copy them to the clipboard (⌘+C or Ctrl+C), and paste them into your favorite spreadsheet software.

Try out all the UI features using the embedded app from the prior section.

In addition to Pandas DataFrames, st.dataframe also supports other common Python types, e.g., list, dict, or NumPy array. It also supports Snowpark and PySpark DataFrames, which allow you to lazily evaluate and pull data from databases. This can be useful for working with large datasets.

Streamlit supports editable dataframes via the st.data_editor command. Check out its API in st.data_editor. It shows the dataframe in a table, similar to st.dataframe. But in contrast to st.dataframe, this table isn't static! The user can click on cells and edit them. The edited data is then returned on the Python side. Here's an example:

import streamlit as st
import pandas as pd

df = pd.DataFrame(
    [
        {"command": "st.selectbox", "rating": 4, "is_widget": True},
        {"command": "st.balloons", "rating": 5, "is_widget": False},
        {"command": "st.time_input", "rating": 3, "is_widget": True},
    ]
)
edited_df = st.data_editor(df)  # 👈 An editable dataframe

# Look up the command in the row with the highest rating.
favorite_command = edited_df.loc[edited_df["rating"].idxmax()]["command"]
st.markdown(f"Your favorite command is **{favorite_command}** 🎈")

Try it out by double-clicking on any cell. You'll notice you can edit all cell values. Try editing the values in the rating column and observe how the text output at the bottom changes.

st.data_editor also supports a few additional things:

  • Add and delete rows: You can do this by setting num_rows="dynamic" when calling st.data_editor. This will allow users to add and delete rows as needed.
  • Copy and paste support: Copy and paste both between st.data_editor and spreadsheet software like Google Sheets and Excel.
  • Access edited data: Access only the individual edits instead of the entire edited data structure via Session State.
  • Bulk edits: Similar to Excel, just drag a handle to edit neighboring cells.
  • Automatic input validation: Column Configuration provides strong data type support and other configurable options. For example, there's no way to enter letters into a number cell. Number cells can have a designated min and max.
  • Edit common data structures: st.data_editor supports lists, dicts, NumPy ndarray, and more!

With st.data_editor, viewers can add or delete rows via the table UI. This mode can be activated by setting the num_rows parameter to "dynamic":

edited_df = st.data_editor(df, num_rows="dynamic")
  • To add new rows, click the plus icon in the toolbar. Alternatively, click inside a shaded cell below the bottom row of the table.
  • To delete rows, select one or more rows using the checkboxes on the left. Click the delete icon or press the delete key on your keyboard.

The data editor supports pasting in tabular data from Google Sheets, Excel, Notion, and many other similar tools. You can also copy-paste data between st.data_editor instances. This functionality, powered by the Clipboard API, can be a huge time saver for users who need to work with data across multiple platforms. To try it out:

  1. Copy data from this Google Sheets document to your clipboard.
  2. Single click any cell in the name column in the app above. Paste it in using hotkeys (⌘+V or Ctrl+V).

Note

Every cell of the pasted data will be evaluated individually and inserted into the cells if the data is compatible with the column type. For example, non-numerical text pasted into a number column will be ignored.
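
The per-cell rule can be pictured with a rough sketch in plain Python. This is only an illustration of the behavior described above, not Streamlit's implementation; the paste_row helper and the column_types mapping are hypothetical names invented for this example.

```python
def paste_row(row, pasted, column_types):
    """Return a copy of row with each compatible pasted value applied."""
    updated = dict(row)
    for column, raw in pasted.items():
        if column_types.get(column) == "number":
            try:
                updated[column] = float(raw)
            except ValueError:
                # Incompatible value: skip it and leave the cell unchanged.
                continue
        else:
            updated[column] = raw
    return updated


# Hypothetical row and column types, used only for this illustration.
row = {"name": "st.selectbox", "rating": 4.0}
column_types = {"name": "text", "rating": "number"}

# The text value is accepted, the non-numerical rating is ignored.
pasted = {"name": "st.slider", "rating": "great"}
result = paste_row(row, pasted, column_types)
```

Each value is checked on its own, so one incompatible cell does not prevent the rest of the pasted data from being applied.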

Tip

If you embed your apps with iframes, you'll need to allow the iframe to access the clipboard if you want to use the copy-paste functionality. To do so, give the iframe clipboard-write and clipboard-read permissions. For example:

<iframe allow="clipboard-write;clipboard-read;" ... src="https://your-app-url"></iframe>

As developers, ensure the app is served with a valid, trusted certificate when using TLS. If users encounter issues with copying and pasting data, direct them to check if their browser has activated clipboard access permissions for the Streamlit application, either when prompted or through the browser's site settings.

Sometimes, it is more convenient to know which cells have been changed rather than getting the entire edited dataframe back. Streamlit makes this easy through the use of Session State. If a key parameter is set, Streamlit will store any changes made to the dataframe in Session State.

This snippet shows how you can access changed data using Session State:

st.data_editor(df, key="my_key", num_rows="dynamic")  # 👈 Set a key
st.write("Here's the value in Session State:")
st.write(st.session_state["my_key"])  # 👈 Show the value in Session State

In this code snippet, the key parameter is set to "my_key". After the data editor is created, the value associated with "my_key" in Session State is displayed in the app using st.write. This shows the additions, edits, and deletions that were made.

This can be useful when you are working with large dataframes and only need to know which cells have changed, rather than access the entire edited dataframe.

Apply what we've learned so far to the embedded app above. Try editing cells, adding new rows, and deleting rows.

Notice how edits to the table are reflected in Session State. When you make any edits, a rerun is triggered which sends the edits to the backend. The widget's state is a JSON object containing three properties: edited_rows, added_rows, and deleted_rows.

Warning

When going from st.experimental_data_editor to st.data_editor in 1.23.0, the data editor's representation in st.session_state was changed. The edited_cells dictionary is now called edited_rows and uses a different format ({0: {"column name": "edited value"}} instead of {"0:1": "edited value"}). You may need to adjust your code if your app uses st.experimental_data_editor in combination with st.session_state.

  • edited_rows is a dictionary containing all edits. Keys are zero-based row indices and values are dictionaries that map column names to edits (e.g. {0: {"col1": ..., "col2": ...}}).
  • added_rows is a list of newly added rows. Each value is a dictionary with the same format as above (e.g. [{"col1": ..., "col2": ...}]).
  • deleted_rows is a list of row numbers that have been deleted from the table (e.g. [0, 2]).

st.data_editor does not support reordering rows, so added rows are always appended to the end of the dataframe, while edits and deletions apply to the original rows.
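
To make the three properties concrete, here is a minimal sketch in plain Python that replays a change dictionary of this shape onto a list of row records. It is an illustration under stated assumptions, not Streamlit's own code: apply_editor_changes is a hypothetical helper, and the sample changes dictionary is made up to match the format described above.

```python
from copy import deepcopy


def apply_editor_changes(rows, changes):
    """Return a new list of row dicts with the editor's changes applied."""
    result = deepcopy(rows)

    # Cell edits: keys are zero-based indices into the original rows.
    for row_index, edits in changes.get("edited_rows", {}).items():
        result[int(row_index)].update(edits)

    # Deletions also refer to positions in the original rows.
    deleted = set(changes.get("deleted_rows", []))
    result = [row for i, row in enumerate(result) if i not in deleted]

    # Added rows are always appended to the end.
    result.extend(deepcopy(changes.get("added_rows", [])))
    return result


original = [
    {"command": "st.selectbox", "rating": 4, "is_widget": True},
    {"command": "st.balloons", "rating": 5, "is_widget": False},
    {"command": "st.time_input", "rating": 3, "is_widget": True},
]

# Made-up state in the format described above.
changes = {
    "edited_rows": {0: {"rating": 5}},
    "added_rows": [{"command": "st.slider", "rating": 4, "is_widget": True}],
    "deleted_rows": [1],
}

updated = apply_editor_changes(original, changes)
```

In an app, the value stored under the editor's key in Session State (for example st.session_state["my_key"]) would take the place of the made-up changes dictionary.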

The data editor includes a feature that allows for bulk editing of cells. Similar to Excel, you can drag a handle across a selection of cells to edit their values in bulk. You can even apply commonly used keyboard shortcuts in spreadsheet software. This is useful when you need to make the same change across multiple cells, rather than editing each cell individually.

Editing doesn't just work for Pandas DataFrames! You can also edit lists, tuples, sets, dictionaries, NumPy arrays, or Snowpark & PySpark DataFrames. Most data types will be returned in their original format. But some types (e.g. Snowpark and PySpark) are converted to Pandas DataFrames. To learn about all the supported types, read the st.data_editor API.

For example, you can easily let the user add items to a list:

edited_list = st.data_editor(["red", "green", "blue"], num_rows="dynamic")
st.write("Here are all the colors you entered:")
st.write(edited_list)

Or NumPy arrays:

import numpy as np
import streamlit as st

st.data_editor(np.array([
    ["st.text_area", "widget", 4.92],
    ["st.markdown", "element", 47.22],
]))

Or lists of records:

st.data_editor([
    {"name": "st.text_area", "type": "widget"},
    {"name": "st.markdown", "type": "element"},
])

Or dictionaries and many more types!

st.data_editor({
    "st.text_area": "widget",
    "st.markdown": "element",
})

The data editor includes automatic input validation to help prevent errors when editing cells. For example, if you have a column that contains numerical data, the input field will automatically restrict the user to only entering numerical data. This helps to prevent errors that could occur if the user were to accidentally enter a non-numerical value. Additional input validation can be configured through the Column configuration API. Keep reading below for an overview of column configuration, including validation options.

You can configure the display and editing behavior of columns in st.dataframe and st.data_editor via the Column configuration API. We have developed the API to let you add images, charts, and clickable URLs in dataframe and data editor columns. Additionally, you can make individual columns editable, set columns as categorical and specify which options they can take, hide the index of the dataframe, and much more.

Column configuration includes the following column types: Text, Number, Checkbox, Selectbox, Date, Time, Datetime, List, Link, Image, Line chart, Bar chart, and Progress. There is also a generic Column option. See the embedded app below to view these different column types. Each column type is individually previewed in the Column configuration API documentation.

A format parameter is available in column configuration for Text, Date, Time, and Datetime columns. Chart-like columns can also be formatted. Line chart and Bar chart columns have y_min and y_max parameters to set the vertical bounds. For a Progress column, you can declare the horizontal bounds with min_value and max_value.

When specifying a column configuration, you can declare not only the data type of the column but also value restrictions. All column configuration elements allow you to make a column required with the keyword parameter required=True.

For Text and Link columns, you can specify the maximum number of characters with max_chars or use regular expressions to validate entries through validate. Numerical columns, including Number, Date, Time, and Datetime, have min_value and max_value parameters. Selectbox columns have a configurable list of options.
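
Before passing a pattern to validate, it can help to try it against a few sample values. The sketch below does this in plain Python. The pattern, the character limit, and the is_valid_entry helper are all assumptions made for illustration, and Python's re module may treat some syntax differently from the engine that evaluates the pattern in the app, so treat the result as an approximation.

```python
import re

# Hypothetical rule for a text column that holds Streamlit command names.
COMMAND_PATTERN = r"^st\.[a-z_]+$"
MAX_CHARS = 50


def is_valid_entry(value, pattern=COMMAND_PATTERN, max_chars=MAX_CHARS):
    """Return True if value fits the length limit and matches the pattern."""
    if len(value) > max_chars:
        return False
    return re.search(pattern, value) is not None


samples = ["st.selectbox", "st.time_input", "selectbox", "st.Balloons"]

# Map each sample value to whether it would pass both checks.
results = {value: is_valid_entry(value) for value in samples}
```

The same two values could then be passed as validate=COMMAND_PATTERN and max_chars=MAX_CHARS when configuring a Text column.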

The data type for Number columns is float by default. Passing a value of type int to any of min_value, max_value, step, or default will set the type for the column as int.
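
The rule above can be restated as a small plain-Python sketch. The number_column_type function is hypothetical and only mirrors the documented behavior; it is not how Streamlit determines the type internally.

```python
def number_column_type(min_value=None, max_value=None, step=None, default=None):
    """Mirror the documented rule: any int argument makes the column int."""
    for value in (min_value, max_value, step, default):
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, int) and not isinstance(value, bool):
            return int
    return float


age_type = number_column_type(min_value=0, max_value=122)   # int bounds
price_type = number_column_type(min_value=0.0, step=0.01)   # float bounds
unset_type = number_column_type()                           # nothing passed
```

By this rule, a Number column configured with min_value=0 and max_value=122 would collect integers, while one configured only with float values stays float.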

You can use st.data_editor to collect tabular input from a user. When starting from an empty dataframe, default column types are text. Use column configuration to specify the data types you want to collect from users.

import streamlit as st
import pandas as pd

df = pd.DataFrame(columns=['name', 'age', 'color'])
colors = ['red', 'orange', 'yellow', 'green', 'blue', 'indigo', 'violet']
config = {
    'name': st.column_config.TextColumn('Full Name (required)', width='large', required=True),
    'age': st.column_config.NumberColumn('Age (years)', min_value=0, max_value=122),
    'color': st.column_config.SelectboxColumn('Favorite Color', options=colors),
}

result = st.data_editor(df, column_config=config, num_rows='dynamic')

if st.button('Get results'):
    st.write(result)

In addition to column configuration, st.dataframe and st.data_editor have a few more parameters to customize the display of your dataframe.

  • hide_index: Set to True to hide the dataframe's index.
  • column_order: Pass a list of column labels to specify the order of display.
  • disabled: Pass a list of column labels to disable them from editing. This lets you avoid disabling them individually.
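
As a minimal sketch of how these three parameters might be combined, the example below keeps the options in a dictionary and passes them to st.data_editor. The sample rows, the editor_options name, and the render_editor function are assumptions made for illustration; Streamlit is imported inside the function only so the options can be inspected without it.

```python
# Illustrative rows; st.data_editor also accepts a list of records.
rows = [
    {"command": "st.selectbox", "rating": 4, "is_widget": True},
    {"command": "st.balloons", "rating": 5, "is_widget": False},
    {"command": "st.time_input", "rating": 3, "is_widget": True},
]

# The three display parameters described in the list above.
editor_options = {
    "hide_index": True,                                   # hide the index column
    "column_order": ["rating", "command", "is_widget"],   # display order
    "disabled": ["command", "is_widget"],                 # read-only columns
}


def render_editor():
    """Show the editor; only the rating column remains editable."""
    import streamlit as st

    return st.data_editor(rows, **editor_options)
```

Calling render_editor() at the end of a Streamlit script would display the table with the rating column first and the other two columns locked.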

st.dataframe and st.data_editor have been designed to theoretically handle tables with millions of rows thanks to their highly performant implementation using the glide-data-grid library and HTML canvas. However, the maximum amount of data that an app can realistically handle will depend on several other factors, including:

  1. The maximum size of WebSocket messages: Streamlit's WebSocket messages are configurable via the server.maxMessageSize config option, which limits the amount of data that can be transferred via the WebSocket connection at once.
  2. The server memory: The amount of data that your app can handle will also depend on the amount of memory available on your server. If the server's memory is exceeded, the app may become slow or unresponsive.
  3. The user's browser memory: Since all the data needs to be transferred to the user's browser for rendering, the amount of memory available on the user's device can also affect the app's performance. If the browser's memory is exceeded, it may crash or become unresponsive.

In addition to these factors, a slow network connection can also significantly slow down apps that handle large datasets.

When handling large datasets with more than 150,000 rows, Streamlit applies additional optimizations and disables column sorting. This can help to reduce the amount of data that needs to be processed at once and improve the app's performance.

  • Streamlit casts all column names to strings internally, so st.data_editor will return a DataFrame where all column names are strings.
  • The dataframe toolbar is not currently configurable.
  • While Streamlit's data editing capabilities offer a lot of functionality, editing is enabled for a limited set of column types (TextColumn, NumberColumn, LinkColumn, CheckboxColumn, SelectboxColumn, DateColumn, TimeColumn, and DatetimeColumn). We are actively working on supporting editing for other column types as well, such as images, lists, and charts.
  • Almost all editable datatypes are supported for index editing. However, pandas.CategoricalIndex and pandas.MultiIndex are not supported for editing.
  • Sorting is not supported for st.data_editor when num_rows="dynamic".
  • Sorting is deactivated to optimize performance on large datasets with more than 150,000 rows.

We are continually working to improve Streamlit's handling of DataFrames and add functionality to data editing, so keep an eye out for updates.