Add colormap= argument to DataFrame plotting methods #3860

Merged
merged 1 commit into from
Jun 27, 2013
2 changes: 2 additions & 0 deletions doc/source/release.rst
@@ -52,6 +52,8 @@ pandas 0.12
- A ``filter`` method on grouped Series or DataFrames returns a subset of
the original (:issue:`3680`, :issue:`919`)
- Access to historical Google Finance data in pandas.io.data (:issue:`3814`)
- DataFrame plotting methods can sample column colors from a Matplotlib
colormap via the ``colormap`` keyword. (:issue:`3860`)

**Improvements to existing features**

6 changes: 6 additions & 0 deletions doc/source/v0.12.0.txt
@@ -96,6 +96,12 @@ API changes
and thus you should cast to an appropriate numeric dtype if you need to
plot something.

- Add ``colormap`` keyword to DataFrame plotting methods. Accepts either a
matplotlib colormap object (e.g., ``matplotlib.cm.jet``) or a string name of such
an object (e.g., ``'jet'``). The colormap is sampled to select the color for each
column. Please see :ref:`visualization.colormaps` for more information.
(:issue:`3860`)

- ``DataFrame.interpolate()`` is now deprecated. Please use
``DataFrame.fillna()`` and ``DataFrame.replace()`` instead. (:issue:`3582`,
:issue:`3675`, :issue:`3676`)
62 changes: 62 additions & 0 deletions doc/source/visualization.rst
@@ -531,3 +531,65 @@ be colored differently.

@savefig radviz.png width=6in
radviz(data, 'Name')

.. _visualization.colormaps:

Colormaps
~~~~~~~~~

A potential issue when plotting a large number of columns is that it can be difficult to distinguish some series due to repetition in the default colors. To remedy this, DataFrame plotting supports the use of the ``colormap=`` argument, which accepts either a Matplotlib `colormap <http://matplotlib.org/api/cm_api.html>`__ or a string that is a name of a colormap registered with Matplotlib. A visualization of the default matplotlib colormaps is available `here <http://wiki.scipy.org/Cookbook/Matplotlib/Show_colormaps>`__.

As matplotlib does not directly support colormaps for line-based plots, the colors are selected based on an even spacing determined by the number of columns in the DataFrame. There is no consideration made for background color, so some colormaps will produce lines that are not easily visible.
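
The even spacing described above can be sketched in plain Python. This is a minimal illustration, not the code added by this pull request: ``sample_positions``, ``sample_colormap`` and ``toy_gray`` are hypothetical names, and ``toy_gray`` stands in for a real Matplotlib colormap object such as ``matplotlib.cm.jet``, which is likewise callable on a value in [0, 1] and returns an RGBA tuple.

```python
def sample_positions(n):
    """Return n evenly spaced positions in [0, 1], like numpy.linspace(0, 1, n)."""
    if n < 1:
        raise ValueError("need at least one column")
    if n == 1:
        return [0.0]
    return [i / (n - 1) for i in range(n)]


def sample_colormap(cmap, n):
    """Pick one RGBA color per column by calling cmap at evenly spaced positions."""
    return [cmap(x) for x in sample_positions(n)]


def toy_gray(x):
    """Hypothetical stand-in colormap: maps x in [0, 1] to a gray RGBA tuple."""
    return (x, x, x, 1.0)


# One color per column for a five-column frame.
colors = sample_colormap(toy_gray, 5)
for position, rgba in zip(sample_positions(5), colors):
    print(position, rgba)
```

Because the first and last columns always take the two ends of the colormap, a colormap whose end color is close to the background color will make that column hard to see.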

To use the jet colormap, we can simply pass ``'jet'`` to ``colormap=``:

.. ipython:: python

   df = DataFrame(randn(1000, 10), index=ts.index)
   df = df.cumsum()

   plt.figure()

   @savefig jet.png width=6in
   df.plot(colormap='jet')

or we can pass the colormap itself:

.. ipython:: python

   from matplotlib import cm

   plt.figure()

   @savefig jet_cm.png width=6in
   df.plot(colormap=cm.jet)
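
Accepting either a name or a colormap object amounts to a small resolution step before sampling. The sketch below shows one way this could look. It is an assumption for illustration only: ``TOY_COLORMAPS`` and ``resolve_colormap`` are hypothetical, and the dictionary stands in for Matplotlib's registry of named colormaps. The only behavior taken from this pull request is that an unknown name results in a ``ValueError``, as its ``test_invalid_colormap`` test expects.

```python
# Hypothetical registry standing in for Matplotlib's named colormaps.
TOY_COLORMAPS = {
    "gray": lambda x: (x, x, x, 1.0),
    "red_ramp": lambda x: (x, 0.0, 0.0, 1.0),
}


def resolve_colormap(colormap):
    """Return a callable colormap given either a registered name or a callable."""
    if isinstance(colormap, str):
        try:
            return TOY_COLORMAPS[colormap]
        except KeyError:
            # Unknown names are reported as ValueError, mirroring the test.
            raise ValueError("Colormap %r is not recognized" % colormap)
    if callable(colormap):
        # Already a colormap-like object: use it unchanged.
        return colormap
    raise TypeError("colormap must be a name or a callable")


by_name = resolve_colormap("gray")
by_object = resolve_colormap(TOY_COLORMAPS["gray"])
print(by_name is by_object)
```

Resolving once up front means the rest of the plotting code only ever deals with a callable, whichever form the caller passed.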

Colormaps can also be used with other plot types, like bar charts:

.. ipython:: python

   dd = DataFrame(randn(10, 10)).applymap(abs)
   dd = dd.cumsum()

   plt.figure()

   @savefig greens.png width=6in
   dd.plot(kind='bar', colormap='Greens')

Parallel coordinates charts:

.. ipython:: python

   plt.figure()

   @savefig parallel_gist_rainbow.png width=6in
   parallel_coordinates(data, 'Name', colormap='gist_rainbow')

Andrews curves charts:

.. ipython:: python

   plt.figure()

   @savefig andrews_curve_winter.png width=6in
   andrews_curves(data, 'Name', colormap='winter')
62 changes: 62 additions & 0 deletions pandas/tests/test_graphics.py
@@ -103,6 +103,35 @@ def test_bar_colors(self):
            self.assert_(xp == rs)

        plt.close('all')

        from matplotlib import cm

        # Test str -> colormap functionality
        ax = df.plot(kind='bar', colormap='jet')

        rects = ax.patches

        rgba_colors = map(cm.jet, np.linspace(0, 1, 5))
        for i, rect in enumerate(rects[::5]):
            xp = rgba_colors[i]
            rs = rect.get_facecolor()
            self.assert_(xp == rs)

        plt.close('all')

        # Test colormap functionality
        ax = df.plot(kind='bar', colormap=cm.jet)

        rects = ax.patches

        rgba_colors = map(cm.jet, np.linspace(0, 1, 5))
        for i, rect in enumerate(rects[::5]):
            xp = rgba_colors[i]
            rs = rect.get_facecolor()
            self.assert_(xp == rs)

        plt.close('all')

        df.ix[:, [0]].plot(kind='bar', color='DodgerBlue')

    @slow
@@ -600,6 +629,7 @@ def test_andrews_curves(self):
    def test_parallel_coordinates(self):
        from pandas import read_csv
        from pandas.tools.plotting import parallel_coordinates
        from matplotlib import cm
        path = os.path.join(curpath(), 'data/iris.csv')
        df = read_csv(path)
        _check_plot_works(parallel_coordinates, df, 'Name')
@@ -611,6 +641,7 @@ def test_parallel_coordinates(self):
                          colors=('#556270', '#4ECDC4', '#C7F464'))
        _check_plot_works(parallel_coordinates, df, 'Name',
                          colors=['dodgerblue', 'aquamarine', 'seagreen'])
        _check_plot_works(parallel_coordinates, df, 'Name', colormap=cm.jet)

        df = read_csv(
            path, header=None, skiprows=1, names=[1, 2, 4, 8, 'Name'])
@@ -622,9 +653,11 @@ def test_parallel_coordinates(self):
    def test_radviz(self):
        from pandas import read_csv
        from pandas.tools.plotting import radviz
        from matplotlib import cm
        path = os.path.join(curpath(), 'data/iris.csv')
        df = read_csv(path)
        _check_plot_works(radviz, df, 'Name')
        _check_plot_works(radviz, df, 'Name', colormap=cm.jet)

    @slow
    def test_plot_int_columns(self):
@@ -666,6 +699,7 @@ def test_line_colors(self):
        import matplotlib.pyplot as plt
        import sys
        from StringIO import StringIO
        from matplotlib import cm

        custom_colors = 'rgcby'

@@ -691,6 +725,30 @@ def test_line_colors(self):
        finally:
            sys.stderr = tmp

        plt.close('all')

        ax = df.plot(colormap='jet')

        rgba_colors = map(cm.jet, np.linspace(0, 1, len(df)))

        lines = ax.get_lines()
        for i, l in enumerate(lines):
            xp = rgba_colors[i]
            rs = l.get_color()
            self.assert_(xp == rs)

        plt.close('all')

        ax = df.plot(colormap=cm.jet)

        rgba_colors = map(cm.jet, np.linspace(0, 1, len(df)))

        lines = ax.get_lines()
        for i, l in enumerate(lines):
            xp = rgba_colors[i]
            rs = l.get_color()
            self.assert_(xp == rs)

        # make color a list if plotting one column frame
        # handles cases like df.plot(color='DodgerBlue')
        plt.close('all')
@@ -862,6 +920,10 @@ def test_option_mpl_style(self):
        except ValueError:
            pass

    def test_invalid_colormap(self):
        df = DataFrame(np.random.randn(500, 2), columns=['A', 'B'])

        self.assertRaises(ValueError, df.plot, colormap='invalid_colormap')

def _check_plot_works(f, *args, **kwargs):
    import matplotlib.pyplot as plt