automatic inversion of x axis by pandas.plot(...) #10118
Comments
A cleaner example (note the xticklabels):
In [6]: df = pd.DataFrame({'a': [1, 2, 3], 'b': [3, 2, 1], 'c': [10, 20, 30]})
In [7]: fig, (ax1, ax2) = plt.subplots(ncols=2)
In [8]: df.plot(x='a', y='c', ax=ax1)
Out[8]: <matplotlib.axes._subplots.AxesSubplot at 0x1123fe2b0>
In [9]: df.plot(x='b', y='c', ax=ax2)
Out[9]: <matplotlib.axes._subplots.AxesSubplot at 0x10a32b978>
In [10]: plt.savefig('/Users/tom.augspurger/Desktop/gh.png')
So what's going on here is we call
In [32]: df.set_index('b').plot(kind='bar')
Out[32]: <matplotlib.axes._subplots.AxesSubplot at 0x114609438>
did sort the x-axis to go small to large. Thoughts?
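The session above can be condensed into a standalone script. This is only a sketch: it assumes matplotlib's non-interactive Agg backend so that it runs without a display, and it prints each axis's limits instead of saving a figure. The printed values depend on the installed pandas version, so no expected output is given.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run, no GUI needed
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [3, 2, 1], "c": [10, 20, 30]})

fig, (ax1, ax2) = plt.subplots(ncols=2)
df.plot(x="a", y="c", ax=ax1)  # x values increase from row to row
df.plot(x="b", y="c", ax=ax2)  # x values decrease from row to row

# Report the x limits and whether matplotlib considers each x axis inverted.
for name, ax in (("a", ax1), ("b", ax2)):
    print(name, ax.get_xlim(), ax.xaxis_inverted())
```

An axis is inverted when its left limit is larger than its right limit, which is what `xaxis_inverted()` reports.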
By looking at your example of a bar plot, I now understand the source of confusion. You want the horizontal locations of your bars specified by the index (i.e. row label), and each bar labeled by the value of column 'b'. In a scatter plot, I meant the horizontal location of each point to be specified by the value of column 'b', and the x axis labeled by the value of column 'b'.
Determining the horizontal position needs to be handled separately from choosing the x tick labels. Apparently the two are conflated, since the API reference says "x : label or position, default None". If you need to specify position by the index, you can use
To clearly separate the way to set position from the way to set labels, a new way to specify x tick labels is necessary, I think. For example,
The "x: label or position" refers to what you actually pass in, i.e. a label (
df.sort('b').plot(x='b', y='c', ax=ax2)
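For readers on a current pandas release: `DataFrame.sort` was deprecated and later removed, and `DataFrame.sort_values` is its replacement. The sketch below rewrites the call above on that assumption; the Agg backend and the single-axes figure are choices made here for illustration, not part of the original comment.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run, no GUI needed
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [3, 2, 1], "c": [10, 20, 30]})

fig, ax2 = plt.subplots()
# Sort rows by column 'b' first so the plotted x values are increasing.
df.sort_values("b").plot(x="b", y="c", ax=ax2)

line = ax2.get_lines()[0]
print(line.get_xdata(), line.get_ydata())
```

Sorting reorders whole rows, so each 'c' value stays paired with its 'b' value; only the order in which the points are joined changes.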
Since I don't know the entire design philosophy of pandas and DataFrame, your way is probably more suitable. I can use plt.plot(...) directly. I liked dataframe.plot(...) though. It provides parameters like x, y, and ax, and automatically adds axis labels and a grid. However, let me point out a few more things.

What "label or position" means.
By "x: label or position", you mean:
What I thought is:
I think, when visualizing, you care more about how it is displayed, i.e. its effect.

How the coordinates of a point are specified.
With the current design, x coordinates are specified by the row index whereas y coordinates are specified by the value of a column when you use plot(x='a', y='b'). This could be a pitfall for people who don't know the internals of dataframe.plot. It would be best if a user could control these two methods explicitly.

Sorting is another thing.
By sorting, you change the mapping between the row index and the value of a column. Thus, if the row index is used as the x coordinate, the resulting graph can look different when the plotted points are connected by lines. As shown below; compare the second row and the third row of the subplots.

Range of x axis
The topmost two subplots show that the range of the x axis is not selected well when the values of a column are not monotonic. I think this is a separate issue that needs to be fixed.
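The point about sorting and connected lines can be illustrated with matplotlib directly, where both coordinates are taken explicitly from column values. This is a minimal sketch with hypothetical non-monotonic data (the column values below are not from the thread), again assuming the Agg backend.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run, no GUI needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data: column 'b' is not monotonic.
df = pd.DataFrame({"b": [1, 3, 2], "c": [10, 20, 30]})

fig, (ax_rows, ax_sorted) = plt.subplots(ncols=2)

# Row order: points are joined as (1, 10) -> (3, 20) -> (2, 30),
# so the line doubles back on itself.
(line_rows,) = ax_rows.plot(df["b"].to_numpy(), df["c"].to_numpy(), marker="o")

# Sorted by 'b': points are joined as (1, 10) -> (2, 30) -> (3, 20).
by_b = df.sort_values("b")
(line_sorted,) = ax_sorted.plot(by_b["b"].to_numpy(), by_b["c"].to_numpy(), marker="o")
```

Both panels contain the same three points; only the order in which they are connected differs, which is why a line plot can change shape after sorting while a scatter plot would not.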
This is fixed in v0.21.0.
Thanks, #16600 was the fix.
The x axis was inverted automatically and unexpectedly when plotting one series of data against another using pandas.
My example code below creates three plots, and only some of them show an inverted x axis. I think this behavior is very confusing for users, even if there was some rationale behind it. IMHO, automatic inversion of the x axis is unnecessary because a user can call invert_xaxis() when an inverted axis is wanted. On Stack Overflow, a workaround was suggested, but no direct solution.
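For reference, an explicit inversion with matplotlib's `Axes.invert_xaxis()` might look like the sketch below. The small DataFrame is illustrative and the Agg backend is an assumption so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run, no GUI needed
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "c": [10, 20, 30]})

fig, ax = plt.subplots()
df.plot(x="a", y="c", ax=ax)

before = ax.xaxis_inverted()  # state of the axis as pandas left it
ax.invert_xaxis()             # opt in to an inverted x axis explicitly
after = ax.xaxis_inverted()
print(before, after, ax.get_xlim())
```

Because the inversion is a single explicit call on the returned axes, the plotting function does not need to guess the direction on the user's behalf.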