
dataframe.plot accepts undocumented 'by=' keyword but ignores it. #9274

Open
cswarth opened this issue Jan 16, 2015 · 3 comments

@cswarth
Contributor

cswarth commented Jan 16, 2015

Maybe this is a request for enhancement, or documentation of a bug - I don't know.

A tutorial showed that DataFrame.boxplot() can take a by= keyword to produce plots stratified by the values in one column. I naively assumed I could apply that to other plots as well. DataFrame.plot() accepts by= but seems to silently ignore it. This seems to be an undocumented keyword for DataFrame.plot().

Is this intentional? Why not include the groupby functionality from boxplot into plot so it can do something reasonable with all the other plot types as well? If not, then why does plot accept the by= keyword?

Example code and output below:

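import matplotlib.pyplot as plt
import pandas as pd
from numpy.random import rand  # rand() is used below to build the sample data

pd.show_versions()

plt.rcParams.update(pd.tools.plotting.mpl_stylesheet)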

# taken from http://pandas.pydata.org/pandas-docs/stable/visualization.html#box-plots
df = pd.DataFrame(rand(10,2), columns=['Col1', 'Col2'] )
df['X'] = pd.Series(['A','A','A','A','A','B','B','B','B','B'])
plt.figure();
print(df)

# boxplot produces two subplots segregated by the values in X
df.boxplot(by='X')

# Other kinds of plots seem to silently ignore 'by='
df.plot(kind='box', by='X')

# Other kinds of plots seem to silently ignore 'by='
df.plot(kind='scatter', x=0, y=1, by='X')

# Other kinds of plots seem to silently ignore 'by='
df.plot(kind='line', by='X')

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.6.final.0
python-bits: 64
OS: Linux
OS-release: 3.13.0-32-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.15.2
nose: 1.3.4
Cython: 0.20.1post0
numpy: 1.9.1
scipy: 0.13.3
statsmodels: 0.6.0
IPython: 2.3.1
sphinx: 1.2.3
patsy: 0.3.0
dateutil: 2.4.0
pytz: 2014.10
bottleneck: None
tables: 3.1.1
numexpr: 2.2.2
matplotlib: 1.4.2
openpyxl: 1.8.6
xlrd: 0.9.3
xlwt: 0.7.5
xlsxwriter: None
lxml: 3.3.3
bs4: 4.3.2
html5lib: 0.999
httplib2: 0.9
apiclient: None
rpy2: 2.5.2
sqlalchemy: 0.8.4
pymysql: None
psycopg2: None
       Col1      Col2  X
0  0.561433  0.329668  A
1  0.502967  0.111894  A
2  0.607194  0.565945  A
3  0.006764  0.617442  A
4  0.912123  0.790524  A
5  0.992081  0.958802  B
6  0.791964  0.285251  B
7  0.624917  0.478094  B
8  0.195675  0.382317  B
9  0.053874  0.451648  B

[Four images attached: the plots produced by the calls above]

@TomAugspurger
Contributor

DataFrame.boxplot was united under DataFrame.plot() with the kind='boxplot' keyword argument a while back. boxplot was the only plot type to ever support faceting (with the by keyword).

There aren't any plans to add faceting for other plot types right now. Other libraries like seaborn or ggplot do it and handle DataFrames nicely.
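Until then, faceting for other plot types can be approximated by hand with groupby and matplotlib subplots. The following is only a rough sketch, not anything pandas provides: `facet_plot` is a hypothetical helper name, the sample data mirrors the frame from the report, and `drop(columns=...)` assumes a pandas release newer than the one in the report.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Sample data shaped like the frame in the original report.
rng = np.random.RandomState(0)
df = pd.DataFrame(rng.rand(10, 2), columns=["Col1", "Col2"])
df["X"] = ["A"] * 5 + ["B"] * 5


def facet_plot(frame, by, **plot_kwargs):
    """Hypothetical helper: draw one subplot per distinct value of `by`."""
    groups = list(frame.groupby(by))
    fig, axes = plt.subplots(1, len(groups), sharey=True, squeeze=False)
    for ax, (key, group) in zip(axes[0], groups):
        # Drop the grouping column so only the numeric columns are drawn.
        group.drop(columns=[by]).plot(ax=ax, title=f"{by} = {key}", **plot_kwargs)
    return fig, axes[0]


fig, axes = facet_plot(df, by="X", kind="line")
```

Each group gets its own axes, which is roughly what `df.boxplot(by='X')` does for box plots.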

A pull request to document the by keyword, noting that it only has an effect with kind=boxplot, would be welcome! There's an issue here about validating keyword arguments to df.plot and warning / raising when things are unused.
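As an illustration of what such validation might look like, here is a minimal sketch. It is not pandas code: `check_by_keyword` and `KINDS_SUPPORTING_BY` are hypothetical names, and the set of kinds that honor by= is an assumption taken from the comment above.

```python
import warnings

# Assumption for this sketch: only the box plot path honors by=.
KINDS_SUPPORTING_BY = {"box"}


def check_by_keyword(kind, by=None):
    """Hypothetical validator: warn when by= would be silently ignored.

    Returns True if by= is unset or meaningful for `kind`, False otherwise.
    """
    if by is not None and kind not in KINDS_SUPPORTING_BY:
        warnings.warn(
            f"by={by!r} has no effect for kind={kind!r}",
            UserWarning,
            stacklevel=2,
        )
        return False
    return True


# Record warnings so the behaviour can be inspected without printing them.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    ok_box = check_by_keyword("box", by="X")
    ok_line = check_by_keyword("line", by="X")
```

Raising a TypeError instead of warning would be the stricter variant; which one fits is a design decision for the maintainers.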

@TomAugspurger TomAugspurger added this to the 0.16.0 milestone Jan 16, 2015
@shoyer
Member

shoyer commented Jan 16, 2015

This is a hazard of our current API design, which uses a single plot method to dispatch to many different underlying plot types. The ultimate fix is probably to encourage more specific methods like df.plot.scatter() rather than df.plot(). See #9124 for more discussion of that design.
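For comparison, the two call styles look like this. This sketch assumes a pandas release newer than the 0.15.2 in the report, one where the `.plot` accessor exposes kind-specific methods such as `df.plot.scatter()`. The sample data is made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import numpy as np
import pandas as pd

rng = np.random.RandomState(0)
df = pd.DataFrame(rng.rand(10, 2), columns=["Col1", "Col2"])

# Generic dispatch: one method, with the plot type chosen by a string.
ax_generic = df.plot(kind="scatter", x="Col1", y="Col2")

# Kind-specific method: the call names the plot type directly, so its
# documentation can list only the arguments that apply to scatter plots.
ax_specific = df.plot.scatter(x="Col1", y="Col2")
```

Both calls return a matplotlib Axes. The difference is in the API surface, not the result.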

@cswarth
Contributor Author

cswarth commented Jan 16, 2015

Thanks Tom, I'll dig into the code to see how it's used and document the 'by=' keyword.

@jreback jreback modified the milestones: 0.16.0, Next Major Release Mar 5, 2015
@mroeschke mroeschke removed this from the Contributions Welcome milestone Oct 13, 2022