-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: Fixes #36918 boxplots with matplotlib #37107
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will need a test and a whatnew note
Sorry I've now added a test and a note in the 1.2.0 rst |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, the change looks very nice!!
btw, just a question to clarify, are you basically solving two issues in this PR: one is error raised when vert=False
, and the other one is overlapping/duplicate labels when sharex/y=True
if so, maybe clarify it in the whatsnew, and better to make two tests, one to have df and a single plot with vert=False
to prove it works, the other test which is similar to what you have now to show the duplicate/overlapping labels issue is gone?
doc/source/whatsnew/v1.2.0.rst
Outdated
@@ -437,6 +437,7 @@ Plotting | |||
- Bug in :meth:`DataFrame.plot` was rotating xticklabels when ``subplots=True``, even if the x-axis wasn't an irregular time series (:issue:`29460`) | |||
- Bug in :meth:`DataFrame.plot` where a marker letter in the ``style`` keyword sometimes causes a ``ValueError`` (:issue:`21003`) | |||
- Twinned axes were losing their tick labels which should only happen to all but the last row or column of 'externally' shared axes (:issue:`33819`) | |||
- Bug in :meth:`DataFrame.boxplot` was raising ``ValueError`` when plotting with ``vert=True`` on a subplot with shared axes (:issue:`36918`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ehh, isn't the original issue about erroring when vert=False
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, sorry my bad I will correct it
axis.set_major_formatter(FixedFormatter(keys)) | ||
ax.tick_params(axis=axis.axis_name, which="major", rotation=rot) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice clean-up!
def test_boxplot_shared_axis(self, vert): | ||
# GH 37107 | ||
df1 = DataFrame(np.random.random((100, 5)), columns=["A", "B", "C", "D", "E"]) | ||
df2 = DataFrame(np.random.random((100, 5)), columns=["A", "B", "C", "D", "E"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
df2 = DataFrame(np.random.random((100, 5)), columns=["A", "B", "C", "D", "E"]) | |
df2 = df1.copy() |
ahh, if it's to generate two dfs with different random numbers, ignore, but still, probably one df is enough
# In order for the ticklabels to be placed, the plot has to be drawn | ||
fig.canvas.draw() | ||
|
||
for ax in axes: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i might not fully understand, why needs two dfs and two plots (in each of axes
)? can we just use one?
assert len(labels) % 5 == 0 | ||
assert len(labels) // (len(labels) // 5) == 5 | ||
assert labels[:5] == ["A", "B", "C", "D", "E"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you maybe simply it a bit? seems we want to test there are 10 labels while the first 5 is those column names, and last 5 are empty?
Thanks for the feedback about the tests, I looked better into the issue of the duplicated and empty ticks and checked better if it was really related to the shared axis or not. Turns out that it was not. The problem is that when This is really why there are double ticks when
So now I think that the duplicated ticks are not a feature of the shared axis as I initially thought, but rather a "feature" of
The problem can be solved by preventing this redundant ticks to show up when multiple call of
However I have got some questions:
There could be other solutions (such as removing the duplicates in the FixedLocator) I do not know if a more complex solution is acceptable if it can work in a wider range of cases, Sorry for the wall of text and for changing the reviewed code (TODO: still to update the whatsnew and test) |
This pull request is stale because it has been open for thirty days with no activity. Please update or respond to this comment if you're still interested in working on this. |
@francibm97 if you can merge master and update we can get this reviewed and in |
closing as stale. if you want to continue, pls ping and can re-open. |
Fixes #36918
I believe that the @TomAugspurger in #35393 found out that, when
sharex
orsharey
are enabled, matplotlib counts the number of ticks of an axis in a subplot as the total number of ticks of the shared axis in all subplots. Since matplotlib became stricter withset_xticks
, the proposed workaround was replicating the labels n times where n is "total number of ticks"/"number of ticks of this subplot"="number of subplots".However, that fix was implemented only for vertical boxplots and not for horizontal boxplots, hence why the issue.
Also, that fix causes the labels of the shared axis to be drawn n times (where n is the number of subplots) in some cases, see below (matplotlib 3.3.2, code adapted from the one posted by the author of issue 36918)
In this fix I used
set_major_formatter
with aFixedFormatter
which fixes the issue and does not overdraw the labels: