{{ header }}

.. ipython:: python

    import pandas as pd
    import matplotlib.pyplot as plt

Data used for this tutorial:
  • .. ipython:: python
    
        air_quality = pd.read_csv("data/air_quality_no2.csv", index_col=0, parse_dates=True)
        air_quality.head()
    
    

    Note

    The index_col and parse_dates parameters of the read_csv function define the first (0th) column as the index of the resulting DataFrame and convert the dates in that column to :class:`Timestamp` objects, respectively.
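
To see the effect of these two parameters without the tutorial's CSV file, the following minimal sketch reads a small in-memory CSV. The column names and values are made up for illustration and are not taken from the air quality data.

.. code-block:: python

    import io

    import pandas as pd

    # Hypothetical CSV content; the real tutorial file is not needed here.
    csv_text = (
        "datetime,station_a,station_b\n"
        "2019-05-07 01:00:00,20.5,23.0\n"
        "2019-05-07 02:00:00,19.0,21.5\n"
    )

    # index_col=0 uses the first column as the index;
    # parse_dates=True asks pandas to parse that index as dates.
    df = pd.read_csv(io.StringIO(csv_text), index_col=0, parse_dates=True)

    print(type(df.index).__name__)
    print(df.index[0])

Without parse_dates=True, the index would hold plain strings, and time-based plotting and selection would not behave as in the rest of this tutorial.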

How do I create plots in pandas?

  • I want a quick visual check of the data.

    .. ipython:: python
    
        @savefig 04_airqual_quick.png
        air_quality.plot()
    
    

    With a DataFrame, pandas by default creates one line plot for each of the columns with numeric data.

  • I want to visually compare the NO_2 values measured in London versus Paris.

    .. ipython:: python
    
        @savefig 04_airqual_scatter.png
        air_quality.plot.scatter(x="station_london", y="station_paris", alpha=0.5)
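
Both requests above can be tried on synthetic data. The sketch below uses made-up station columns plus a text column (all hypothetical), and it selects the non-interactive Agg backend, an assumption made so that the code runs without a display.

.. code-block:: python

    import matplotlib

    matplotlib.use("Agg")  # non-interactive backend, assumed for headless runs

    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd

    rng = np.random.default_rng(seed=0)
    df = pd.DataFrame(
        {
            "station_a": rng.uniform(10, 40, size=24),
            "station_b": rng.uniform(10, 40, size=24),
            "comment": ["ok"] * 24,  # non-numeric column
        },
        index=pd.date_range("2019-05-07", periods=24, freq="D"),
    )

    # Default plot: one line per numeric column; "comment" is left out.
    ax_lines = df.plot()
    print(len(ax_lines.get_lines()))

    # Scatter plot comparing the two numeric columns point by point.
    ax_scatter = df.plot.scatter(x="station_a", y="station_b", alpha=0.5)
    print(ax_scatter.get_xlabel(), ax_scatter.get_ylabel())

    plt.close("all")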
    
    

Apart from the default line plot when using the plot function, a number of alternatives are available to plot data. Let’s use some standard Python to get an overview of the available plot methods:

.. ipython:: python

    [
        method_name
        for method_name in dir(air_quality.plot)
        if not method_name.startswith("_")
    ]

Note

In many development environments, as well as in IPython and Jupyter Notebook, pressing the TAB key gives an overview of the available methods, for example air_quality.plot. + TAB.

One of the options is :meth:`DataFrame.plot.box`, which creates a boxplot. The box method can be applied to the air quality example data:

.. ipython:: python

    @savefig 04_airqual_boxplot.png
    air_quality.plot.box()
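
Each accessor method such as ``plot.box`` can also be reached through the ``kind`` argument of ``plot`` itself. A minimal sketch on synthetic data (the column names are hypothetical, and the Agg backend is an assumption for running without a display):

.. code-block:: python

    import matplotlib

    matplotlib.use("Agg")  # non-interactive backend, assumed for headless runs

    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd

    rng = np.random.default_rng(seed=1)
    df = pd.DataFrame(
        {
            "station_a": rng.normal(25, 5, size=100),
            "station_b": rng.normal(30, 8, size=100),
        }
    )

    # Two spellings of the same request: one box per column.
    ax_accessor = df.plot.box()
    ax_kind = df.plot(kind="box")

    print(type(ax_accessor).__name__, type(ax_kind).__name__)

    plt.close("all")

The accessor form has the advantage that each plot kind has its own docstring and tab completion.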

To user guide

For an introduction to plots other than the default line plot, see the user guide section about :ref:`supported plot styles <visualization.other>`.

  • I want each of the columns in a separate subplot.

    .. ipython:: python
    
        @savefig 04_airqual_area_subplot.png
        axs = air_quality.plot.area(figsize=(12, 4), subplots=True)
    
    

    The subplots argument of the plot functions creates a separate subplot for each of the data columns. The built-in options available in each of the pandas plot functions are worth reviewing.
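
With subplots=True, the plot call returns one Matplotlib Axes per column rather than a single Axes. The following sketch checks this on synthetic, strictly positive data (the column names are hypothetical, and the Agg backend is an assumption for running without a display):

.. code-block:: python

    import matplotlib

    matplotlib.use("Agg")  # non-interactive backend, assumed for headless runs

    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd

    rng = np.random.default_rng(seed=2)
    df = pd.DataFrame(
        {
            "station_a": rng.uniform(10, 40, size=30),
            "station_b": rng.uniform(10, 40, size=30),
            "station_c": rng.uniform(10, 40, size=30),
        },
        index=pd.date_range("2019-05-01", periods=30, freq="D"),
    )

    # One area plot per column, stacked vertically in a single figure.
    axs = df.plot.area(figsize=(12, 4), subplots=True)
    print(len(axs))

    # Each element is a regular Matplotlib Axes and can be customized on its own.
    for ax, column in zip(axs, df.columns):
        ax.set_ylabel(column)

    plt.close("all")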

To user guide

Some more formatting options are explained in the user guide section on :ref:`plot formatting <visualization.formatting>`.

  • I want to further customize, extend or save the resulting plot.

    .. ipython:: python
    
        fig, axs = plt.subplots(figsize=(12, 4))
        air_quality.plot.area(ax=axs)
        @savefig 04_airqual_customized.png
        axs.set_ylabel("NO$_2$ concentration")
        fig.savefig("no2_concentrations.png")
    
    
    .. ipython:: python
       :suppress:
    
       import os
    
       os.remove("no2_concentrations.png")
    
    

Each of the plot objects created by pandas is a Matplotlib object. Because Matplotlib provides plenty of options to customize plots, making the link between pandas and Matplotlib explicit makes the full power of Matplotlib available for the plot. This strategy is applied in the previous example:

.. code-block:: python

    fig, axs = plt.subplots(figsize=(12, 4))        # Create an empty Matplotlib Figure and Axes
    air_quality.plot.area(ax=axs)                   # Use pandas to put the area plot on the prepared Figure/Axes
    axs.set_ylabel("NO$_2$ concentration")          # Do any Matplotlib customization you like
    fig.savefig("no2_concentrations.png")           # Save the Figure/Axes using the existing Matplotlib method.
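
The same strategy can be sketched on synthetic data, writing to a temporary directory so that no file is left behind. The column names, the title text and the output file name are illustrative assumptions, as is the Agg backend.

.. code-block:: python

    import tempfile
    from pathlib import Path

    import matplotlib

    matplotlib.use("Agg")  # non-interactive backend, assumed for headless runs

    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd

    rng = np.random.default_rng(seed=3)
    df = pd.DataFrame(
        {
            "station_a": rng.uniform(10, 40, size=30),
            "station_b": rng.uniform(10, 40, size=30),
        },
        index=pd.date_range("2019-05-01", periods=30, freq="D"),
    )

    fig, ax = plt.subplots(figsize=(12, 4))
    returned_ax = df.plot.area(ax=ax)  # pandas draws on the Axes passed in

    # From here on, plain Matplotlib calls customize the plot.
    ax.set_ylabel("NO$_2$ concentration")
    ax.set_title("Synthetic example")

    with tempfile.TemporaryDirectory() as tmp_dir:
        out_path = Path(tmp_dir) / "example_plot.png"  # hypothetical file name
        fig.savefig(out_path)
        saved_size = out_path.stat().st_size

    plt.close(fig)
    print(returned_ax is ax, saved_size > 0)

Passing ax explicitly keeps a handle on both the Figure and the Axes, which is what makes the later customization and saving steps possible.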

REMEMBER

  • The .plot.* methods can be used on both Series and DataFrames.
  • By default, each of the columns is plotted as a different element (line, boxplot,…).
  • Any plot created by pandas is a Matplotlib object.

To user guide

A full overview of plotting in pandas is provided in the :ref:`visualization pages <visualization>`.
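
The points listed under REMEMBER can be checked with a short sketch on synthetic data. The column names are hypothetical, and the Agg backend is assumed so that the code runs without a display.

.. code-block:: python

    import matplotlib

    matplotlib.use("Agg")  # non-interactive backend, assumed for headless runs

    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd

    rng = np.random.default_rng(seed=4)
    df = pd.DataFrame(
        {
            "station_a": rng.uniform(10, 40, size=30),
            "station_b": rng.uniform(10, 40, size=30),
        },
        index=pd.date_range("2019-05-01", periods=30, freq="D"),
    )

    # Explicit Axes keep the two plots apart.
    fig, (ax_top, ax_bottom) = plt.subplots(nrows=2, figsize=(12, 6))

    ax_series = df["station_a"].plot(ax=ax_top)  # Series: a single line
    ax_frame = df.plot(ax=ax_bottom)             # DataFrame: one line per column

    print(len(ax_series.get_lines()), len(ax_frame.get_lines()))
    print(isinstance(ax_series, plt.Axes), isinstance(ax_frame, plt.Axes))

    plt.close(fig)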