Skip to content

Sharey keyword for boxplot #20968

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 18 commits into from
Jun 8, 2018
Merged

Sharey keyword for boxplot #20968

merged 18 commits into from
Jun 8, 2018

Conversation

sorenwacker
Copy link

…to _subplots().

.. code-block:: jupyter-notebook

%pylab inline
import pandas as pd

N = 100
rand = random.random(N)
clas = random.binomial(5,.5, N)
df = pd.DataFrame({'Rand': rand-clas,
                'Rand2': rand,
                'Class': clas},
                index= np.arange(N))

df.groupby('Class').boxplot(sharey=True, sharex=False)
>>> TypeError: boxplot() got an unexpected keyword argument 'sharey'

New Behavior:

.. ipython:: jpyter-notebook:

...

df.groupby('Class').boxplot(sharey=True, sharex=True)
df.groupby('Class').boxplot(sharey=True, sharex=False)
df.groupby('Class').boxplot(sharey=False, sharex=True)
df.groupby('Class').boxplot(sharey=False, sharex=False)

All leads to different behaviour. The shareing of axes both x and y
can be turned on and off separately. 

To restore previous behavior, use boxplot() without keywords.

Default is the previous behavior of sharey

@codecov
Copy link

codecov bot commented May 6, 2018

Codecov Report

Merging #20968 into master will increase coverage by 0.03%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #20968      +/-   ##
==========================================
+ Coverage   91.85%   91.88%   +0.03%     
==========================================
  Files         153      153              
  Lines       49570    49564       -6     
==========================================
+ Hits        45533    45543      +10     
+ Misses       4037     4021      -16
Flag Coverage Δ
#multiple 90.28% <ø> (+0.03%) ⬆️
#single 41.86% <ø> (ø) ⬆️
Impacted Files Coverage Δ
pandas/plotting/_core.py 83.53% <ø> (+1.14%) ⬆️
pandas/core/reshape/pivot.py 96.97% <0%> (-0.06%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update abfac97...fd76955. Read the comment docs.

@TomAugspurger
Copy link
Contributor

We'll want tests for this. See the e.g. test_sharex_and_ax in pandas/tests/plotting/test_frame.py for an example.

@sorenwacker
Copy link
Author

Thanks. I will look into that and give it a try.

@pep8speaks
Copy link

pep8speaks commented May 9, 2018

Hello @soerendip! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on June 08, 2018 at 11:26 Hours UTC

@@ -898,6 +898,48 @@ yourself. To revert to the old setting, you can run this line:

pd.options.display.max_columns = 20

.. _whatsnew_0230.boxplot.sharexy:

Optional sharing of x/y-axis by pandas.DataFrame().groupby().boxplot()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't an API change, so we don't need this long of a release note. Probably a single line saying that boxplot now accepts the sharey keyword, and a link to a more detailed example in the docstring or plotting.rst

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the docstring and the the part in v0.23.0.txt.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you move this to 0.23.1.txt now?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And I think the example can be trimmed down. I'd rather update the docstring for DataFrame.plot.box since that's a longer-term thing. The release note are for people checking changes. They'll see that it now supports sharey, and click through if interested.

@@ -898,6 +898,48 @@ yourself. To revert to the old setting, you can run this line:

pd.options.display.max_columns = 20

.. _whatsnew_0230.boxplot.sharexy:

Optional sharing of x/y-axis by pandas.DataFrame().groupby().boxplot()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you move this to 0.23.1.txt now?

@@ -898,6 +898,48 @@ yourself. To revert to the old setting, you can run this line:

pd.options.display.max_columns = 20

.. _whatsnew_0230.boxplot.sharexy:

Optional sharing of x/y-axis by pandas.DataFrame().groupby().boxplot()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And I think the example can be trimmed down. I'd rather update the docstring for DataFrame.plot.box since that's a longer-term thing. The release note are for people checking changes. They'll see that it now supports sharey, and click through if interested.


Previous Behavior:

.. code-block:: jupyter-notebook
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the other examples in /pandas/plotting/_core.py for how these should be formatted.


...

df.groupby('Class').boxplot(sharey=True, sharex=True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't need all of these. Probably just one will convey the point.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, 0.23.1.txt does not exist yet. When I fetch from upstream I don't receive it. Does that mean I should create that file? I thought I already removed the example. Not sure why it still there.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@soerendip did you merge after fetching? The file is there already on master

@sorenwacker
Copy link
Author

Done. Hope it is correct this time.

@@ -2567,6 +2567,12 @@ def boxplot_frame_groupby(grouped, subplots=True, column=None, fontsize=None,
figsize : A tuple (width, height) in inches
layout : tuple (optional)
(rows, columns) for the layout of the plot
sharex :
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you take a look at http://pandas.pydata.org/pandas-docs/stable/contributing_docstring.html for how to format these? Something like

sharex : bool, default False
    Whether the horizontal axes is shared among subplots

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, did that. Should this part of the doc-string then be corrected as well?

subplots :
    * ``False`` - no subplots will be used
    * ``True`` - create a subplot for each group


# standart behavior
axes = df.groupby('c').boxplot()
self._check_visible(axes[0].get_yticklabels(), visible=True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests are a bit wordy. I wonder if the following is clear?

expected = pd.Series([True, False, True, False])
result = axes.apply(lambda ax: ax.yaxis.get_visible())
tm.assert_frame_equal(result, expected)

Is that the same? Do you find it clearer?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see where you want to go. Would this be acceptable? It uses a function that you might want somewhere else though in case you want to use it in other tests? Where would be a good position to put it?

def test_groupby_boxplot_sharex(self):

    def _assert_xtickslabels_visibility(axes, expected):
        for ax, exp in zip(axes, expected):
            self._check_visible(ax.get_xticklabels(), visible=exp)

    df = DataFrame({'a': [-1.43, -0.15, -3.70, -1.43, -0.14],
                    'b': [0.56, 0.84, 0.29, 0.56, 0.85],
                    'c': [0, 1, 2, 3, 1]},
                   index=[0, 1, 2, 3, 4])

    # standart behavior
    axes = df.groupby('c').boxplot()
    expected = [True, True, True, True]
    _assert_xtickslabels_visibility(axes, expected)
    # set sharex=False should be identical
    axes = df.groupby('c').boxplot(sharex=False)
    expected = [True, True, True, True]
    _assert_xtickslabels_visibility(axes, expected)        
    # sharex=True, xticklabels should be visible only for bottom plots
    axes = df.groupby('c').boxplot(sharex=True)
    expected = [False, False, True, True]
    _assert_xtickslabels_visibility(axes, expected)´

@sorenwacker
Copy link
Author

sorenwacker commented May 28, 2018

Any updates on this? I shortened the tests in the style of your examples using the assert function that was implemented to check for visibility. However, I had to create two helper functions, and was not sure where the optimal place for those is.

@@ -2567,6 +2567,10 @@ def boxplot_frame_groupby(grouped, subplots=True, column=None, fontsize=None,
figsize : A tuple (width, height) in inches
layout : tuple (optional)
(rows, columns) for the layout of the plot
sharex : bool, default False
Whether x-axes will be shared among subplots
Copy link
Contributor

@jreback jreback May 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a versionadded tag (0.23.1)

Copy link
Author

@sorenwacker sorenwacker May 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like this? I added this to v.0.23.1.txt.

Plotting
^^^^^^^^

- Optional sharing of x/y-axis by pandas.DataFrame().groupby().boxplot() (:issue: `20968`)

If not, could you be more specific about what I should add where, please?

Copy link
Contributor

@TomAugspurger TomAugspurger May 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

    sharey : bool, default False
        Whether y-axes will be shared among subplots 

        .. versionadded:: 0.23.1

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you.

# sharey can now be switched check whether the right
# pair of axes is turned on or off

def _assert_ytickslabels_visibility(axes, expected):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we have similar functions at a top level? maybe we should

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I am new to pandas. I am not sure whether I understand what you mean. Are you saying, I should move the function somewhere else? If yes, where exactly should the function go?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can make this a function on the class itself rather than defininig it inside the function

# standart behavior
axes = df.groupby('c').boxplot()
expected = [True, False, True, False]
_assert_ytickslabels_visibility(axes, expected)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use a blank line between cases

@jreback jreback added this to the 0.23.1 milestone May 29, 2018
@sorenwacker
Copy link
Author

All changes should be done.


Optional sharing of x/y-axis by pandas.DataFrame().groupby().boxplot()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(:issue:`15184`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this file needs to be reverted

# sharey can now be switched check whether the right
# pair of axes is turned on or off

def _assert_ytickslabels_visibility(axes, expected):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can make this a function on the class itself rather than defininig it inside the function

'c': [0, 1, 2, 3, 1]},
index=[0, 1, 2, 3, 4])

# standart behavior
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

standard

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is 'standard' mean here, can you update the comment

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@TomAugspurger
Copy link
Contributor

@soerendip looks like tests are failing in the latest commit.

@sorenwacker
Copy link
Author

sorenwacker commented Jun 6, 2018 via email

@@ -76,7 +76,7 @@ I/O
Plotting
^^^^^^^^

-
- Optional sharing of x/y-axis by pandas.DataFrame().groupby().boxplot() (:issue: `20968`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its not clear to me what is being fixed / enhancement here, can you re-word

@@ -2921,6 +2972,14 @@ def test_secondary_axis_font_size(self, method):
self._check_ticks_props(axes=ax.right_ax,
ylabelsize=fontsize)

def _assert_ytickslabels_visibility(self, axes, expected):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you you move these to the very top of the class (I think we hvae other checker functions there)

@sorenwacker
Copy link
Author

Done.

@jreback jreback merged commit 4f1704e into pandas-dev:master Jun 8, 2018
@jreback
Copy link
Contributor

jreback commented Jun 8, 2018

thanks @soerendip

daminisatya pushed a commit to daminisatya/pandas that referenced this pull request Jun 8, 2018
@jorisvandenbossche jorisvandenbossche modified the milestones: 0.23.1, 0.24.0 Jun 8, 2018
david-liu-brattle-1 pushed a commit to david-liu-brattle-1/pandas that referenced this pull request Jun 18, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Please, add sharey keyword to boxplot.
7 participants