Inconsistent indexes for tick label plotting #28733

nrebena · 2019-10-01T17:08:33Z

The tick positions for BarPlot can be defined using matplotlib's unit conversion tools.

The main advantage is that the same positions are reused when the axis values are text. Previously, the tick position was determined by the order of the given elements, so ['A', 'B'] was given positions [0, 1], and updating the plot with the order ['B', 'A'] would not draw the bars at the right positions.
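The difference between the two behaviours can be sketched in plain Python. This is only an illustration, not the pandas or matplotlib implementation; `positions_by_order` and `CategoryMap` are hypothetical names made up for the example.

```python
def positions_by_order(labels):
    """Old behaviour: the position is simply the rank of each label in the input."""
    return list(range(len(labels)))


class CategoryMap:
    """Hypothetical stand-in for a per-axis label -> position mapping."""

    def __init__(self):
        self._mapping = {}

    def convert(self, labels):
        # New labels get the next free position; known labels keep theirs.
        for label in labels:
            self._mapping.setdefault(label, len(self._mapping))
        return [self._mapping[label] for label in labels]


axis = CategoryMap()
first = axis.convert(["A", "B"])   # first plot registers A and B
second = axis.convert(["B", "A"])  # second plot reuses the stored positions
```

With the order-based approach both calls give the same positions regardless of the labels, while the mapping-based approach keeps each label at the position it was first assigned.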

The resulting plot for each issue would be:

closes Inconsitent index for plot #26186
closes X axis weirdness with DataFrame.plot(kind='bar') #11465
tests added / passed
passes black pandas
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry

nrebena · 2019-10-01T21:24:35Z

One of the examples in the docs revealed a bug in this PR.
I added it to the test suite and will hopefully fix it. It concerns using a MultiIndex of strings (or one containing a string in a level?) as axis labels.

nrebena · 2019-10-03T19:55:07Z

This PR solves the bug related to Index, but handling MultiIndex in the same way requires more work and I consider it outside the scope of this PR, so I expressly did not handle it.

The best way to handle MultiIndex plots would be via label grouping and multi-level axis labels, as described in the matplotlib issue matplotlib/matplotlib#6321 and the pandas issue #2088. I may try to tackle this one day, but definitely not in this PR.

pandas/plotting/_matplotlib/core.py

WillAyd · 2019-11-07T23:37:03Z

pandas/tests/plotting/test_frame.py

@@ -1129,14 +1129,18 @@ def test_bar_categorical(self):
        for df in [df1, df2]:
            ax = df.plot.bar()
            ticks = ax.xaxis.get_ticklocs()
-            tm.assert_numpy_array_equal(ticks, np.array([0, 1, 2, 3, 4, 5]))
+            tm.assert_numpy_array_equal(
+                ticks, np.array([0, 1, 2, 3, 4, 5], dtype=np.float)


Why did this need to change?

The tick positions are handled by the matplotlib function convert_xunits, which returns a float array rather than the int array previously used.
I would argue that a float array for tick positions makes more sense, since all plots then share the same type.
I do not think the point of the test was to check the type of the tick positions, so I changed it.

Can you help me understand what part of this PR would have changed this? Still not sure on this

Before the PR, the tick positions were defined using self.tick_pos = np.arange(len(data)).
In this PR, I let matplotlib handle the positions by calling

    ax.xaxis.update_units(self.ax_index)
    self.tick_pos = ax.convert_xunits(self.ax_index)

This uses the matplotlib conversion interface. For converting string categories, it uses a numpy float array (see https://matplotlib.org/_modules/matplotlib/category.html#StrCategoryConverter), so the tick positions are now float instead of int.

I could cast this back to an int array, but it made more sense IMO to change the test here than to try to force the tick positions to be integers.
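For reference, the behaviour described above can be reproduced outside pandas with a small sketch. It assumes matplotlib and NumPy are installed and uses plain lists of strings in place of `self.ax_index`. It uses the builtin `int` for the cast, since the `np.int` and `np.float` aliases have been removed from recent NumPy releases.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so no display is needed
import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots()

# Register the string labels with the x-axis, then convert them to positions.
ax.xaxis.update_units(["A", "B"])
first = np.asarray(ax.convert_xunits(["A", "B"]))

# Converting the same labels in another order reuses the stored mapping.
ax.xaxis.update_units(["B", "A"])
second = np.asarray(ax.convert_xunits(["B", "A"]))

# The category converter yields floats; casting recovers integer positions.
first_as_int = first.astype(int)

plt.close(fig)
```

Whether to keep the float positions or cast them is exactly the backwards-compatibility question raised in this thread; the sketch only shows that both are possible.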

I think this should stay as integer to not break backwards compatibility - how hard would it be to accomplish that?

pandas/tests/plotting/test_frame.py

jreback · 2019-12-27T19:50:13Z

can you merge master and will look again

WillAyd · 2020-02-12T00:38:06Z

doc/source/whatsnew/v1.0.0.rst

@@ -1090,6 +1090,7 @@ Plotting
 - Bug in color validation incorrectly raising for non-color styles (:issue:`29122`).
 - Allow :meth:`DataFrame.plot.scatter` to plot ``objects`` and ``datetime`` type data (:issue:`18755`, :issue:`30391`)
 - Bug in :meth:`DataFrame.hist`, ``xrot=0`` does not work with ``by`` and subplots (:issue:`30288`).
+- Bug in BarPlot. Tick position where assigned by value order instead of using the value for numeric, or a smart order for sting. (:issue:`26186` and issue:`11465`)


Can you move this to 1.1?

nrebena · 2020-02-15T13:29:59Z

I changed the dtype of the tick array to int, as pointed out by @WillAyd; see the last commit.
I also rebased on master.

WillAyd · 2020-03-14T22:16:34Z

@nrebena I think this should be good. Can you fix the merge conflict?

nrebena · 2020-03-14T23:01:05Z

Rebased and green 🍏 @WillAyd

WillAyd

lgtm - @jreback care to look?

jreback · 2020-03-14T23:25:57Z

doc/source/whatsnew/v1.1.0.rst

@@ -345,6 +345,7 @@ Plotting
 - :func:`.plot` for line/bar now accepts color by dictonary (:issue:`8193`).
 -
 - Bug in :meth:`DataFrame.boxplot` and :meth:`DataFrame.plot.boxplot` lost color attributes of ``medianprops``, ``whiskerprops``, ``capprops`` and ``medianprops`` (:issue:`30346`)
+- Bug in BarPlot. Tick position where assigned by value order instead of using the value for numeric, or a smart order for sting. (:issue:`26186` and issue:`11465`)


if u can add the proper reference here to plot.bar

looks like the notes are duplicated (2nd one looks good)

pandas/plotting/_matplotlib/core.py

jreback · 2020-03-14T23:27:22Z

pandas/plotting/_matplotlib/core.py

@@ -1345,6 +1348,16 @@ def _make_plot(self):

        for i, (label, y) in enumerate(self._iter_data(fillna=0)):
            ax = self._get_ax(i)
+
+            if self.orientation == "vertical":


is there any assert in orientation that it takes on only certain values?

No assert. Orientation is never a parameter. It is defined by the BarPlot class as "vertical" and overridden by the BarhPlot class as "horizontal". It is also defined in the LinePlot class.
Globally, it is used by the _post_plot_logic_common method of the base class MPLPlot.
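As a rough illustration of that pattern (class attribute set on the base plot class and overridden in the subclass, never passed in by the user), here is a simplified sketch. The class names ending in `Sketch` and the `tick_axis_name` method are hypothetical and do not exist in pandas.

```python
class PlotSketch:
    """Hypothetical, heavily simplified stand-in for a plot base class."""

    orientation = None  # subclasses set this; it is never a user parameter

    def tick_axis_name(self):
        # Pick the axis that carries the tick labels for this orientation.
        if self.orientation == "vertical":
            return "xaxis"
        if self.orientation == "horizontal":
            return "yaxis"
        raise ValueError(f"unexpected orientation: {self.orientation!r}")


class BarSketch(PlotSketch):
    orientation = "vertical"


class BarhSketch(BarSketch):
    orientation = "horizontal"  # overrides the value inherited from BarSketch
```

Because the value only ever comes from the class definitions, an explicit assert is less critical than it would be for user input, though a final `raise` like the one above still guards against a subclass forgetting to set it.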

charlesdong1991

Hi, @nrebena

nice change, and I think it's very close to being merged;

just two nits, since there are conflicts anyway, if you are still interested.

charlesdong1991 · 2020-05-18T16:24:38Z

pandas/plotting/_matplotlib/core.py

+            elif self.orientation == "horizontal":
+                ax.yaxis.update_units(self.ax_index)
+                self.tick_pos = ax.convert_yunits(self.ax_index).astype(np.int)
+                self.ax_pos = self.tick_pos - self.tickoffset


I think self.ax_pos = self.tick_pos - self.tickoffset can be taken out of the if/else clause, so there is no need to write it twice. The main purpose of this clause is to get self.tick_pos for bar and barh separately.

Sure thing.
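The suggested restructuring could look roughly like the following standalone sketch. The function and its parameters are hypothetical; `convert_x` and `convert_y` stand in for the axis-specific conversion calls, and plain lists stand in for the arrays used in the PR.

```python
def tick_and_bar_positions(orientation, index, convert_x, convert_y, tickoffset):
    """Compute tick positions per orientation, then the shared bar offset."""
    # Only the choice of conversion differs between bar and barh.
    if orientation == "vertical":
        tick_pos = convert_x(index)
    elif orientation == "horizontal":
        tick_pos = convert_y(index)
    else:
        raise ValueError(f"unexpected orientation: {orientation!r}")

    # The offset is identical in both branches, so it is computed once here.
    ax_pos = [pos - tickoffset for pos in tick_pos]
    return tick_pos, ax_pos


# Usage with trivial stand-in converters that number the labels in order.
ticks, bars = tick_and_bar_positions(
    "vertical",
    ["A", "B", "C"],
    convert_x=lambda idx: list(range(len(idx))),
    convert_y=lambda idx: list(range(len(idx))),
    tickoffset=0.25,
)
```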

charlesdong1991 · 2020-05-18T16:26:51Z

pandas/tests/plotting/test_frame.py

+        errors = gp3.std()
+
+        # No assertion we just ensure that we can plot a MultiIndex bar plot
+        means.plot.bar(yerr=errors, capsize=4)


don't we have user warning here? shall catch it?

Good catch. Done.
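For readers unfamiliar with catching warnings in tests, here is a generic sketch using only the standard library. `plot_multiindex_bar` is a hypothetical stand-in for the plotting call, and its warning message is invented for the example. The pandas test suite has its own helper for this (`tm.assert_produces_warning`), which is presumably what the test in this PR would use.

```python
import warnings


def plot_multiindex_bar():
    """Hypothetical stand-in for a plotting call that emits a UserWarning."""
    warnings.warn("example warning about MultiIndex bar plots", UserWarning)
    return "axes"


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure the warning is not suppressed
    result = plot_multiindex_bar()

# Keep only the warnings of the category the test cares about.
user_warnings = [w for w in caught if issubclass(w.category, UserWarning)]
```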

MarcoGorelli · 2020-06-03T13:53:15Z

Hi @nrebena - sorry to chase you up, just wanted to ask if this is still active

nrebena · 2020-06-03T14:00:08Z

Hi. No worries. I may come around to finishing it, but if someone wants to take it over, no problem.

charlesdong1991

thanks @nrebena for sticking to it, sorry for such long time! the CI error doesn't seem related to this PR

cc @jreback for comments

arw2019 · 2020-11-21T03:35:53Z

@nrebena can you merge master once more?

ping @jreback

charlesdong1991 · 2020-11-21T12:37:14Z

thanks @nrebena , very nice PR!

thanks @MarcoGorelli for resolving the conflicts!

jreback · 2021-01-18T15:21:15Z

this PR was reverted, but I guess we don't have a tracking issue. @nrebena can you open one.

nrebena added a commit to nrebena/pandas that referenced this pull request Oct 3, 2019

DOC: Add whatsnew entry for PR pandas-dev#28733

b6dc7e9

WillAyd requested changes Nov 7, 2019

nrebena added a commit to nrebena/pandas that referenced this pull request Nov 19, 2019

DOC: Add whatsnew entry for PR pandas-dev#28733

16589c3

nrebena force-pushed the issue_26186_tick_label branch from 38d5392 to 2b8373b Compare November 19, 2019 19:54

jreback added the Visualization plotting label Nov 20, 2019

jreback changed the title ~~Issue 26186 tick label~~ Inconsistent indexes for tick label plotting Nov 20, 2019

nrebena requested a review from WillAyd December 16, 2019 20:09

WillAyd requested changes Feb 12, 2020

nrebena force-pushed the issue_26186_tick_label branch from bf6fefc to 82ef684 Compare February 12, 2020 19:58

nrebena added a commit to nrebena/pandas that referenced this pull request Feb 12, 2020

DOC: Add whatsnew entry for PR pandas-dev#28733

ef5d9dc

nrebena force-pushed the issue_26186_tick_label branch from 22608a2 to bcb47bb Compare February 16, 2020 10:37

nrebena added a commit to nrebena/pandas that referenced this pull request Feb 16, 2020

DOC: Add whatsnew entry for PR pandas-dev#28733

81625d3

nrebena added a commit to nrebena/pandas that referenced this pull request Feb 16, 2020

DOC: Add whatsnew entry for PR pandas-dev#28733

284942c

nrebena force-pushed the issue_26186_tick_label branch from bcb47bb to 57e09cc Compare February 16, 2020 18:04

nrebena force-pushed the issue_26186_tick_label branch from 57e09cc to 02a52a6 Compare March 14, 2020 22:26

nrebena added a commit to nrebena/pandas that referenced this pull request Mar 14, 2020

DOC: Add whatsnew entry for PR pandas-dev#28733

306c82a

WillAyd approved these changes Mar 14, 2020

jreback requested changes Mar 14, 2020

nrebena force-pushed the issue_26186_tick_label branch from c6c23ac to 5b40eed Compare March 15, 2020 14:56

charlesdong1991 suggested changes May 18, 2020

nrebena added a commit to nrebena/pandas that referenced this pull request Jun 13, 2020

DOC: Add whatsnew entry for PR pandas-dev#28733

1ffda9b

nrebena force-pushed the issue_26186_tick_label branch from 5b40eed to c66af8e Compare June 13, 2020 18:03

nrebena added 11 commits September 12, 2020 18:13

CLN: Clean up in code and doc

0ef2dd5

CLN: Clean up test_bar_numeric

e22710a

DOC Move to whatsnew v1.1

52cfacb

FIX: Make tick dtype int for backwards compatibility

0ac15a0

DOC: Improve whatsnew message

70683b8

ENH: Add UserWarning when plotting bar plot with MultiIndex

3c8b54f

CLN: Remove duplicate code line

28f06fc

TST: Capture UserWarning for Bar plot with MultiIndex

c19ef4b

TST: Improve test explanation

a28db9c

ENH: Raise UserWarning only if redrawing on existing axis with data

1e39ad9

DOC: Move to whatsnew v1.2.9

23635d4

nrebena force-pushed the issue_26186_tick_label branch from 0355c8a to 23635d4 Compare September 12, 2020 16:26

charlesdong1991 approved these changes Sep 19, 2020

Merge remote-tracking branch 'upstream/master' into pr/nrebena-issue_…

9593256

…26186_tick_label

charlesdong1991 merged commit fb379d8 into pandas-dev:master Nov 21, 2020

pcolazurdo mentioned this pull request Dec 28, 2020

BUG: Change of behavior between 1.1.5 and 1.2.0 in plot functions #38736

Closed

3 tasks

This was referenced Dec 31, 2020

REGR: Sorted pandas series is not plotted in the expected order in pandas 1.2.0 #38865

Closed

REGR: Barplot broken on Index(dtype='object') #38947

Closed

jorisvandenbossche mentioned this pull request Jan 5, 2021

REGR: Bar plot from Series with IntervalIndex fails in pandas 1.2.0 #38969

Closed

3 tasks

simonjayhawkins added a commit to simonjayhawkins/pandas that referenced this pull request Jan 17, 2021

Revert "Inconsistent indexes for tick label plotting (pandas-dev#28733)"

bfa923e

This reverts commit fb379d8.

simonjayhawkins mentioned this pull request Jan 17, 2021

Revert "Inconsistent indexes for tick label plotting (#28733)" #39235

Merged

jreback pushed a commit that referenced this pull request Jan 18, 2021

Revert "Inconsistent indexes for tick label plotting (#28733)" (#39235)

358c614

meeseeksmachine pushed a commit to meeseeksmachine/pandas that referenced this pull request Jan 18, 2021

Backport PR pandas-dev#39235: Revert "Inconsistent indexes for tick l…

d036e19

…abel plotting (pandas-dev#28733)"

meeseeksmachine mentioned this pull request Jan 18, 2021

Backport PR #39235 on branch 1.2.x (Revert "Inconsistent indexes for tick label plotting (#28733)") #39252

Merged

simonjayhawkins mentioned this pull request Jan 18, 2021

X axis weirdness with DataFrame.plot(kind='bar') #11465

Open

simonjayhawkins added a commit that referenced this pull request Jan 18, 2021

Backport PR #39235: Revert "Inconsistent indexes for tick label plott…

3cad03f

…ing (#28733)" (#39252) Co-authored-by: Simon Hawkins <[email protected]>

nofarm3 pushed a commit to nofarm3/pandas that referenced this pull request Jan 21, 2021

Revert "Inconsistent indexes for tick label plotting (pandas-dev#28733)…

e35866c

…" (pandas-dev#39235)
