Display a scatterplot chart.

This is syntax-sugar around st.altair_chart. The main difference is this command uses the data's own column and indices to figure out the chart's spec. As a result this is easier to use for many "just plot this" scenarios, while being less customizable.

If st.scatter_chart does not guess the data specification correctly, try specifying your desired chart using st.altair_chart.

Function signature[source]

st.scatter_chart(data=None, *, x=None, y=None, color=None, size=None, width=0, height=0, use_container_width=True)

Parameters

data (pandas.DataFrame, pandas.Styler, pyarrow.Table, numpy.ndarray, pyspark.sql.DataFrame, snowflake.snowpark.dataframe.DataFrame, snowflake.snowpark.table.Table, Iterable, dict or None)

Data to be plotted.

x (str or None)

Column name to use for the x-axis. If None, uses the data index for the x-axis.

y (str, Sequence of str, or None)

Column name(s) to use for the y-axis. If a Sequence of strings, draws several series on the same chart by melting your wide-format table into a long-format table behind the scenes. If None, draws the data of all remaining columns as data series.

color (str, tuple, Sequence of str, Sequence of tuple, or None)

The color of the circles representing each datapoint.

This can be:

  • None, to use the default color.

  • A hex string like "#ffaa00" or "#ffaa0088".

  • An RGB or RGBA tuple with the red, green, blue, and alpha components specified as ints from 0 to 255 or floats from 0.0 to 1.0.

  • The name of a column in the dataset where the color of that datapoint will come from.

    If the values in this column are in one of the color formats above (hex string or color tuple), then that color will be used.

    Otherwise, the color will be automatically picked from the default palette.

    For example: if the dataset has 1000 rows, but this column only contains the values "adult", "child", and "baby", then those 1000 datapoints be shown using three colors from the default palette.

    But if this column only contains floats or ints, then those 1000 datapoints will be shown using a colors from a continuous color gradient.

    Finally, if this column only contains the values "#ffaa00", "#f0f", "#0000ff", then then each of those 1000 datapoints will be assigned "#ffaa00", "#f0f", or "#0000ff" as appropriate.

If the dataframe is in wide format (that is, y is a Sequence of columns), this can also be:

  • A list of string colors or color tuples to be used for each of the series in the chart. This list should have the same length as the number of y values (e.g. color=["#fd0", "#f0f", "#04f"] for three series).

size (str, float, int, or None)

The size of the circles representing each point.

This can be:

  • A number like 100, to specify a single size to use for all datapoints.
  • The name of the column to use for the size. This allows each datapoint to be represented by a circle of a different size.

width (int)

The chart width in pixels. If 0, selects the width automatically.

height (int)

The chart height in pixels. If 0, selects the height automatically.

use_container_width (bool)

If True, set the chart width to the column width. This takes precedence over the width argument.

Examples

import streamlit as st
import pandas as pd
import numpy as np

chart_data = pd.DataFrame(np.random.randn(20, 3), columns=["a", "b", "c"])

st.scatter_chart(chart_data)

You can also choose different columns to use for x and y, as well as set the color dynamically based on a 3rd column (assuming your dataframe is in long format):

import streamlit as st
import pandas as pd
import numpy as np

chart_data = pd.DataFrame(np.random.randn(20, 3), columns=["col1", "col2", "col3"])
chart_data['col4'] = np.random.choice(['A','B','C'], 20)

st.scatter_chart(
    chart_data,
    x='col1',
    y='col2',
    color='col4',
    size='col3',
)

Finally, if your dataframe is in wide format, you can group multiple columns under the y argument to show multiple series with different colors:

import streamlit as st
import pandas as pd
import numpy as np

chart_data = pd.DataFrame(np.random.randn(20, 4), columns=["col1", "col2", "col3", "col4"])

st.scatter_chart(
    chart_data,
    x='col1',
    y=['col2', 'col3'],
    size='col4',
    color=['#FF0000', '#0000FF'],  # Optional
)

Concatenate a dataframe to the bottom of the current one.

Function signature[source]

element.add_rows(data=None, **kwargs)

Parameters

data (pandas.DataFrame, pandas.Styler, pyarrow.Table, numpy.ndarray, pyspark.sql.DataFrame, snowflake.snowpark.dataframe.DataFrame, Iterable, dict, or None)

Table to concat. Optional.

**kwargs (pandas.DataFrame, numpy.ndarray, Iterable, dict, or None)

The named dataset to concat. Optional. You can only pass in 1 dataset (including the one in the data parameter).

Example

import streamlit as st
import pandas as pd
import numpy as np

df1 = pd.DataFrame(np.random.randn(50, 20), columns=("col %d" % i for i in range(20)))

my_table = st.table(df1)

df2 = pd.DataFrame(np.random.randn(50, 20), columns=("col %d" % i for i in range(20)))

my_table.add_rows(df2)
# Now the table shown in the Streamlit app contains the data for
# df1 followed by the data for df2.

You can do the same thing with plots. For example, if you want to add more data to a line chart:

# Assuming df1 and df2 from the example above still exist...
my_chart = st.line_chart(df1)
my_chart.add_rows(df2)
# Now the chart shown in the Streamlit app contains the data for
# df1 followed by the data for df2.

And for plots whose datasets are named, you can pass the data with a keyword argument where the key is the name:

my_chart = st.vega_lite_chart({
    'mark': 'line',
    'encoding': {'x': 'a', 'y': 'b'},
    'datasets': {
      'some_fancy_name': df1,  # <-- named dataset
     },
    'data': {'name': 'some_fancy_name'},
}),
my_chart.add_rows(some_fancy_name=df2)  # <-- name used as keyword
forum

Still have questions?

Our forums are full of helpful information and Streamlit experts.