Maybe this is a request for enhancement, or documentation of a bug - I don't know.
A tutorial showed that DataFrame.boxplot() can take a by= keyword to produce plots stratified by the values in one column. I naively assumed I could apply that to other plots as well. DataFrame.plot() accepts by= but seems to silently ignore it. This seems to be an undocumented keyword for DataFrame.plot().
Is this intentional? Why not include the groupby functionality from boxplot into plot so it can do something reasonable with all the other plot types as well? If not, then why does plot accept the by= keyword?
Example code below:
import matplotlib.pyplot as plt
import pandas as pd
from numpy.random import rand
pd.show_versions()
plt.rcParams.update(pd.tools.plotting.mpl_stylesheet)
# taken from http://pandas.pydata.org/pandas-docs/stable/visualization.html#box-plots
df = pd.DataFrame(rand(10, 2), columns=['Col1', 'Col2'])
df['X'] = pd.Series(['A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B'])
plt.figure()
print(df)
# boxplot produces two subplots segregated by the values in X
df.boxplot(by='X')
# kind='box' seems to silently ignore 'by='
df.plot(kind='box', by='X')
# kind='scatter' seems to silently ignore 'by='
df.plot(kind='scatter', x=0, y=1, by='X')
# kind='line' seems to silently ignore 'by='
df.plot(kind='line', by='X')
INSTALLED VERSIONS
------------------
commit: None
python: 2.7.6.final.0
python-bits: 64
OS: Linux
OS-release: 3.13.0-32-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
pandas: 0.15.2
nose: 1.3.4
Cython: 0.20.1post0
numpy: 1.9.1
scipy: 0.13.3
statsmodels: 0.6.0
IPython: 2.3.1
sphinx: 1.2.3
patsy: 0.3.0
dateutil: 2.4.0
pytz: 2014.10
bottleneck: None
tables: 3.1.1
numexpr: 2.2.2
matplotlib: 1.4.2
openpyxl: 1.8.6
xlrd: 0.9.3
xlwt: 0.7.5
xlsxwriter: None
lxml: 3.3.3
bs4: 4.3.2
html5lib: 0.999
httplib2: 0.9
apiclient: None
rpy2: 2.5.2
sqlalchemy: 0.8.4
pymysql: None
psycopg2: None
       Col1      Col2  X
0  0.561433  0.329668  A
1  0.502967  0.111894  A
2  0.607194  0.565945  A
3  0.006764  0.617442  A
4  0.912123  0.790524  A
5  0.992081  0.958802  B
6  0.791964  0.285251  B
7  0.624917  0.478094  B
8  0.195675  0.382317  B
9  0.053874  0.451648  B
DataFrame.boxplot was unified under DataFrame.plot() with the kind='box' keyword argument a while back. boxplot was the only plot type that ever supported faceting (with the by keyword).
There aren't any plans to add faceting for other plot types right now. Other libraries like seaborn or ggplot do it and handle DataFrames nicely.
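In the meantime, faceting can be approximated by hand with groupby and matplotlib subplots. The following is a minimal sketch, not anything pandas provides: facet_scatter is a hypothetical helper name, the sample data mirrors the example in the report, and the Agg backend is selected only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Sample data shaped like the frame in the report (values are random).
rng = np.random.RandomState(0)
df = pd.DataFrame(rng.rand(10, 2), columns=["Col1", "Col2"])
df["X"] = ["A"] * 5 + ["B"] * 5


def facet_scatter(frame, x, y, by):
    """Hypothetical helper: draw one scatter subplot per distinct value of `by`."""
    groups = list(frame.groupby(by))
    fig, axes = plt.subplots(
        1, len(groups), sharex=True, sharey=True, squeeze=False
    )
    # Plot each group's rows onto its own axes, titled with the group key.
    for ax, (key, sub) in zip(axes[0], groups):
        sub.plot(kind="scatter", x=x, y=y, ax=ax, title="{} = {}".format(by, key))
    return fig, axes[0]


fig, axes = facet_scatter(df, "Col1", "Col2", "X")
```

Sharing the x and y axes keeps the panels comparable, which is roughly what boxplot(by=...) does for its subplots.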
A pull request to document the by keyword, noting that it only has an effect for box plots, would be welcome! There's an issue here about validating keyword arguments to df.plot and warning / raising when things are unused.
This is a hazard of our current API design, which uses a single plot method to dispatch to many different underlying plot types. The ultimate fix is probably to encourage more specific methods like df.plot.scatter() rather than df.plot(). See #9124 for more discussion of that design.
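For illustration, the kind-specific style looks like the sketch below. It assumes a pandas release that ships the df.plot accessor methods (they are not available in the 0.15.2 version from the report); the column names and random data are made up for the example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.RandomState(0).rand(10, 2), columns=["Col1", "Col2"])

# Kind-specific method: x and y are explicit parameters of scatter itself,
# rather than generic keywords passed through one catch-all plot() call.
ax = df.plot.scatter(x="Col1", y="Col2")
```

Because each method has its own signature, keywords that make no sense for a given plot type are easier to document and validate.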