ENH: Adding DataFrame.boxplot() #287

Closed
wants to merge 1 commit into from

Conversation

lodagro
Contributor

@lodagro lodagro commented Oct 25, 2011

Added DataFrame.boxplot(), in a similar way to DataFrame.plot() and DataFrame.hist().
Some syntactic sugar ;-)

I'm thinking about adding DataFrameGroupBy.boxplot(); for now this call is transferred to DataFrame.boxplot().
But I would prefer the behavior to be different. When groupby is used, I'm interested in seeing, in a single plot, the boxplots of one particular column for all groupby keys, rather than one plot per groupby key containing boxplots of all columns. An input argument could be used so that both ways of boxplotting are available.

@wesm
Member

wesm commented Nov 2, 2011

I'll think about this a bit. There are two cases:

  • Box plots of each column
  • Box plot of one column, grouped by the values in another

I guess it would make sense to support both of these use cases in a single function. You could imagine a plot with multiple box plots generated from multiple columns, which would be pretty cool.
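The two cases listed above can be sketched as follows. This is a minimal illustration that assumes a recent pandas release with matplotlib installed, not the code under review in this pull request; the sample data and the names `df`, `ax_all` and `grouped` are made up for the example.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical sample data: two numeric columns and one grouping column.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "A": rng.normal(size=40),
    "B": rng.normal(size=40),
    "indic": ["x", "y"] * 20,
})

# Case 1: one box per column, all drawn on the same axes.
ax_all = df[["A", "B"]].boxplot()

# Case 2: a single column, with one box per value of another column.
# With `by` given and no `ax`, pandas draws on a new figure.
grouped = df.boxplot(column="A", by="indic")
```

Both cases go through the same method here; the `column` and `by` arguments select between them.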

@wesm
Member

wesm commented Nov 14, 2011

So I'm skipping your patch: I hacked up a DataFrame.boxplot with groupby and column selection options, and it's very nice! See e.g. http://imgur.com/1oko5

Take a look at the latest code and try out the new boxplot function... let me know if there's anything else you would want to add.

@wesm wesm closed this Nov 14, 2011
@lbeltrame
Contributor

I noticed an inconsistency in the boxplot documentation. It's supposed to return an AxesSubplot instance, but at least without grouping, it returns None.
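For comparison, a small sketch of what a returned axes object makes possible. It assumes a recent pandas release, where the ungrouped call returns the matplotlib Axes it drew on; the behavior of the 2011 code discussed here may differ. The data and the `lower` and `upper` limits are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical sample data.
rng = np.random.default_rng(1)
df = pd.DataFrame(rng.normal(size=(30, 2)), columns=["A", "B"])

# Pass an explicit Axes in, and keep whatever boxplot hands back.
fig, ax = plt.subplots()
returned = df.boxplot(ax=ax)

# With the Axes in hand, extra annotations are easy to add,
# for example two horizontal reference lines (assumed limits).
lower, upper = -2.0, 2.0
returned.axhline(lower, linestyle="--")
returned.axhline(upper, linestyle="--")
```

Passing `ax` explicitly also works as a fallback when the return value cannot be relied on, since the caller already holds the Axes.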

@lodagro
Contributor Author

lodagro commented Nov 15, 2011

Just gave it a spin; this is what I had in mind, and it looks good.

Some comments:

  • As cswegger mentioned, DataFrame.boxplot has no return statement (pandas.tools.plotting.boxplot does). When I create boxplots I usually add two horizontal lines to the plot indicating the data boundaries; getting the ax back from boxplot makes this easy. I did not try the ax input argument.
  • Either I don't understand the column argument or it has some issues.
    repo/scripts/boxplot_test..py is where I started from:
    • df.boxplot(by=['indic', 'indic2'], fontsize=8, rot=90) -- works fine, this is what is in the test script
    • df.boxplot() -- ok
    • df.boxplot(column=['A']) -- I would expect this to create a boxplot only for column A, but boxplots for all columns are generated
    • df.boxplot(by=['indic', 'indic2'], column=['A']) -- I would expect a boxplot similar to the first example, but only for column A; instead it gives TypeError: unhashable type: 'list'
  • I mainly use MultiIndex together with DataFrame. The by argument of DataFrame.boxplot works on columns, not on the index. The workaround is to use DataFrame.delevel(), but then the index levels are also boxplotted (if they hold numeric data). Deleveling also makes it possible to group by multiple levels, which is not (yet :-) ) possible on the MultiIndex itself.
  • The subplot layout is improved (not always using a rectangular layout), which is nice. DataFrame.hist() could benefit from this too.
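The column selection and MultiIndex points above can be sketched like this. It assumes a recent pandas release, where `DataFrame.reset_index()` is the current name for what `delevel()` did and where `column` and `by` both accept lists; it says nothing about the 2011 code. The frame contents are made up, and only the names `indic`, `indic2` and `A` come from the calls quoted above.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical data with two grouping keys and two numeric columns.
rng = np.random.default_rng(2)
flat = pd.DataFrame({
    "indic": np.repeat(["x", "y"], 20),
    "indic2": np.tile(np.repeat(["p", "q"], 10), 2),
    "A": rng.normal(size=40),
    "B": rng.normal(size=40),
})

# A frame keyed by a two-level MultiIndex, as in the use case described.
indexed = flat.set_index(["indic", "indic2"])

# Move the index levels back into ordinary columns so `by` can see them.
leveled = indexed.reset_index()

# Naming the column explicitly keeps the former index levels out of the plot.
result = leveled.boxplot(column=["A"], by=["indic", "indic2"], rot=90, fontsize=8)
```

Selecting `column` explicitly is what prevents numeric index levels from being boxplotted after the reset.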

dan-nadler pushed a commit to dan-nadler/pandas that referenced this pull request Sep 23, 2019
support for has_symbol and updates to list_symbols