-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: bar and line plots are not aligned on the x-axis/xticks #56653
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
8ac6b19
to
c8eb0b4
Compare
@rhshadrach or anyone interested in reviewing, welcoming comments |
c8eb0b4
to
cf55e0c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for submitting a PR
df = DataFrame(plot_data) | ||
ax_bar = df["bars"].plot(kind="bar") | ||
df["pct"].plot(kind="line") | ||
actual_bar_x = [ax.get_x() + ax.get_width() / 2.0 for ax in ax_bar.patches] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This part of the code base is not my area of expertise but what does ax.get_x()
return here? I'm a bit surprised we have to do this much work to get the labels
pandas/plotting/_matplotlib/core.py
Outdated
@@ -1805,6 +1805,18 @@ def _kind(self) -> Literal["bar", "barh"]: | |||
def orientation(self) -> PlottingOrientation: | |||
return "vertical" | |||
|
|||
# GH56460 checks similar to LinePlot's _is_ts_plot and _use_dynamic_x |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the OP it seems like the issue was LinePlot
not having the proper labels - is that not the case? If so I think we need to change the logic in that class or maybe even remove it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, thanks for the comment.
I think the related issue here explains it better than I possibly can: #55508
This is my attempt to explain:
When a BarPlot and a LinePlot are combined, because BarPlot sets sequential indices starting from 0 (the line of code: self.tick_pos = np.arange(len(data))
) while LinePlot uses what it gets, the combined plot uses the correct labels on BarPlot's incorrect xticks.
If the index starts from 3 instead, the line plot is "shifted" by 3 and partially cut off: #56460 (comment)
(I think the displayed labels are reconciled in ticker._compute_offset
with self.locs
containing the entire range when it enters, and then not when it exits, but I am not 100% sure as it took quite a bit of stepping through the code to get there and I was interrupted by Christmas festivities around that time.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR! It appears to me that if the xticks are not numeric, the current approach here will fail.
pandas/plotting/_matplotlib/core.py
Outdated
if self._is_series and not self._is_ts_plot(data): | ||
return np.array(self._get_xticks(), dtype=int) | ||
else: | ||
return np.arange(len(data)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like this would fail on the input:
df = pd.DataFrame(
{
"bars": {
"a": 0.5,
"b": 1.0,
},
"pct": {
"a": 4.0,
"b": 2.0,
},
}
)
Is that correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, thanks for catching this. That would fail completely.
In the current code, that dataframe would return a graph with xticks of [0,1], since BarPlot currently sets a numpy integer array for xticks with self.tick_pos = np.arange(len(data))
.
In the context of this bug, using is_integer_dtype(data.index)
should handle the above case within the confines of the integer array BarPlot expects:
def _set_tick_pos(self, data) -> np.ndarray:
# only set the tick_pos to the index, if index is an integer series
# else return a data-length array of integers
if is_integer_dtype(data.index) and self._is_series:
return data.index
else:
return np.arange(len(data))
Would it, however, be a good idea to expand the scope of this bug fix, to encompass also
- When the index is a timestamp
- When the index is a set of floats
This would not so much be a bug fix as a change in the way BarPlot handles plotting, though. As a user, I have always just reset_index()
in order to handle these cases. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I saw your comments on the bug report, I also need to address them.
patch.get_bbox().ymax | ||
for patch in sorted(ax.patches, key=lambda patch: patch.get_bbox().xmax) | ||
] | ||
result = [patch.get_bbox().ymax for patch in ax.patches] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as is, this test expects [1,2,3,4] and subsequently [10, 50, 20, 30]. it actually passes on both current BarPlot and PR implementations.
but on the actual displayed graph, the xticks are sorted in ascending order. to investigate what's causing this behaviour and to fix the test (or create a new one to test fixed logic)
aa546e4
to
44b69e9
Compare
This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this. |
44b69e9
to
152374a
Compare
I think this fix is actually irrelevant now because of the discussion in the issue. |
@sharonwoo - no problem, will close for now. If you'd like to take up the fix discussed in the issue, feel free to either reuse this PR (we can reopen) or open a new one. |
doc/source/whatsnew/vX.X.X.rst
file if fixing a bug or adding a new feature.Added a test which will fail on main's self.tick_pos = np.arange(len(data)), which in this test will result in actual_bar_x of [0,1,2,3,4].