DataFrame.plot() produces incorrect legend labels when plotting multiple series on the same axis #18222

Closed
scottlawsonbc opened this issue Nov 10, 2017 · 9 comments · Fixed by #27808

@scottlawsonbc

scottlawsonbc commented Nov 10, 2017

Code Sample, a copy-pastable example if possible

import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame(data=[[1, 1, 1, 1], [2, 2, 4, 8]], columns=['x', 'r', 'g', 'b'])
fig, ax = plt.subplots(nrows=1, ncols=3)
# Left plot
df.plot(x='x', y='r', linewidth=0, marker='o', color='r', ax=ax[0])
df.plot(x='x', y='g', linewidth=1, marker='x', color='g', ax=ax[0])
df.plot(x='x', y='b', linewidth=1, marker='o', color='b', ax=ax[0])
# Center plot
df.plot(x='x', y='b', linewidth=1, marker='o', color='b', ax=ax[1])
df.plot(x='x', y='r', linewidth=0, marker='o', color='r', ax=ax[1])
df.plot(x='x', y='g', linewidth=1, marker='x', color='g', ax=ax[1])
# Right plot
df.plot(x='x', y='g', linewidth=1, marker='x', color='g', ax=ax[2])
df.plot(x='x', y='b', linewidth=1, marker='o', color='b', ax=ax[2])
df.plot(x='x', y='r', linewidth=0, marker='o', color='r', ax=ax[2])
# Produces correct output when uncommented:
# ax[0].legend()
# ax[1].legend()
# ax[2].legend()
plt.show()

Problem description

Note: Possibly related to #14958, #17939, #14563; however, this issue discusses how the behaviour depends on the order in which df.plot() is called.

  • In certain situations, df.plot() may generate incorrect legend labels (see example)
  • Incorrect legend labels may appear when df.plot() plots multiple series on the same axis
  • Plotting only one series per axis always appears to produce the correct legend label
  • When multiple series are plotted on the same axis:
    • Only the last df.plot() call will produce the correct legend label
    • Calling ax.legend() after the final df.plot() will produce the correct legend entry
  • Calling df.plot() with markerstyle='' always appears to produce the correct legend label for that series, regardless of the order in which df.plot() is called or the number of other series on the same axis
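
A minimal sketch of the ax.legend() workaround described in the list above. It assumes a single shared axis and a headless Agg backend; both are choices made for this sketch and are not part of the original report. The loop and the names legend and labels are illustrative only.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame(data=[[1, 1, 1, 1], [2, 2, 4, 8]], columns=['x', 'r', 'g', 'b'])
fig, ax = plt.subplots()

# Plot three series on the same axis, one df.plot() call per series,
# using the same styles as the left plot in the sample code.
for col, marker, width in [('r', 'o', 0), ('g', 'x', 1), ('b', 'o', 1)]:
    df.plot(x='x', y=col, linewidth=width, marker=marker, color=col, ax=ax)

# Workaround: rebuild the legend from the labelled lines after the final plot call.
legend = ax.legend()
labels = [text.get_text() for text in legend.get_texts()]
print(labels)
```

Rebuilding the legend once, after the last df.plot() call, avoids depending on the legend that each individual call creates.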

Expected Output

This is the incorrect output produced by the sample code
Note that in each plot, only the last series to be plotted has a correct legend label
image
This is the expected output
image

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.5.4.final.0
python-bits: 64
OS: Linux
OS-release: 4.10.0-37-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.21.0
pytest: None
pip: 9.0.1
setuptools: 36.5.0.post20170921
Cython: None
numpy: 1.13.3
scipy: 0.19.1
pyarrow: None
xarray: None
IPython: 6.1.0
sphinx: None
patsy: None
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.1.0
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

@scottlawsonbc scottlawsonbc changed the title from "DataFrame.plot() produces incorrect legend labels" to "DataFrame.plot() produces incorrect legend labels when plotting multiple series on the same axis" Nov 10, 2017
@gfyoung gfyoung added the Visualization plotting label Nov 11, 2017
@gfyoung
Member

gfyoung commented Nov 11, 2017

cc @TomAugspurger

@scottlawsonbc : Could you provide snapshots of what you're seeing currently?

@scottlawsonbc
Author

@gfyoung there are two screenshots under the expected output section

@gfyoung
Member

gfyoung commented Nov 12, 2017

there are two screenshots under the expected output section

Yes, I see that, but which, if any, represent what you are seeing currently? Or are both snapshots representative of what you would expect to see?

@scottlawsonbc
Author

scottlawsonbc commented Nov 12, 2017

@gfyoung I understand your question now. I see how the plots are confusing, and I should have explained their meaning more clearly.

At the bottom of my example code, I commented out several lines:

# Produces correct output when uncommented:
# ax[0].legend()
# ax[1].legend()
# ax[2].legend()

When these lines are commented out, the sample code produces a plot with incorrect legend labels. If you run the sample code exactly as I provided it in the original issue, it will produce the incorrect output.

When these lines are uncommented, the sample code produces a plot with the correct legend labels. However, the issue is that pandas should be producing this output without requiring these lines to be run.

In the expected output section of my issue, the first plot shows the output which is incorrect, and the second plot shows what the correct output should be (intended as a reference).

Does this answer your question? I will update my issue to make this more clear.

@gfyoung
Member

gfyoung commented Nov 12, 2017

Ah, okay. That makes sense. If you could update the issue to reflect your answer, that would be great.

@TomAugspurger
Contributor

@scottlawsonbc do you have time to debug where things are going wrong?

_make_legend(self) and _post_plot_logic_common(self, ax, data) may be good starting points (or maybe not, I haven't touched that code in a while).

@timisid

timisid commented May 7, 2020

If you are still plotting multiple series on one axis (for example, a bar plot with x, y, and hue options) and the legend is still incorrect, you can modify the legend using get_texts() after plotting.
In this case, sns.barplot() takes the place of df.plot().

Example:
First line: sns.barplot(data=data, x=..., y=..., hue=..., ax=ax)
Next lines:
L = ax.legend()
L.set_title('modified legend')
L.get_texts()[0].set_text('b')
L.get_texts()[1].set_text('r')
L.get_texts()[2].set_text('g')
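
A self-contained sketch of the same get_texts() relabelling idea. To keep it runnable without seaborn, it uses plain matplotlib lines instead of sns.barplot(), which is a substitution made for this sketch. The placeholder labels series_0 to series_2 and the replacement labels are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no display is needed
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

# Three lines with placeholder labels standing in for incorrect legend entries.
for name in ("series_0", "series_1", "series_2"):
    ax.plot([1, 2], [1, 2], marker="o", label=name)

# Build the legend, then overwrite its title and each entry's text in place.
legend = ax.legend()
legend.set_title("modified legend")
for text, new_label in zip(legend.get_texts(), ["b", "r", "g"]):
    text.set_text(new_label)

relabelled = [text.get_text() for text in legend.get_texts()]
print(relabelled)
```

Note that this only changes the displayed text. The handles keep their original order, so the new labels must be listed in the order the artists were added.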

@comready

If you need another example of this problem let me know

@MarcoGorelli
Member

Hi @jzlcdh - if you're aware of a new bug then please open a new issue, thanks

@MarcoGorelli MarcoGorelli mentioned this issue Apr 4, 2021