DOC: updated the pandas.DataFrame.plot.hexbin docstring #20121
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2874,25 +2874,62 @@ def scatter(self, x, y, s=None, c=None, **kwds): | |
def hexbin(self, x, y, C=None, reduce_C_function=None, gridsize=None, | ||
**kwds): | ||
""" | ||
Hexbin plot | ||
Make hexagonal binning plots. | ||
|
||
Make an hexagonal binning plot of `x` versus `y`, where `x`, | ||
`y` are 1-D sequences of the same length, `N`. If `C` is `None` | ||
(the default), this is an histogram of the number of occurrences | ||
LGTM. If you can find a reference on Wikipedia, it might be nice to link it.
I'm afraid Wikipedia has no article on hexagonal binning. The closest topic I found is the data binning article. It may be a little too generic to include in the hexbin docstring, since it applies equally to histogram and histogram2d.
||
of the observations at (x[i],y[i]). | ||
I believe we should quote this using double backticks since it's code and not just a variable name:
But I cannot find this in the guide. @datapythonista?
We discussed the backticks really last minute, and the guide wasn't very clear, but I think that's the standard in general, yes. As it's code, we can also respect PEP-8 and have the space after the comma. :)
||
|
||
If `C` is specified, specifies values at given coordinates | ||
(x[i],y[i]). These values are accumulated for each hexagonal | ||
bin and then reduced according to `reduce_C_function`, | ||
having as default | ||
the numpy's mean function (np.mean). (If *C* is | ||
specified, it must also be a 1-D sequence of the same length | ||
as `x` and `y`.) | ||
Can you reflow the text a bit so that each line is just under 79 characters?
Done!
||
|
||
Parameters | ||
---------- | ||
x, y : label or position, optional | ||
Coordinates for each point. | ||
x : label or position, optional | ||
Coordinates for x point. | ||
y : label or position, optional | ||
Coordinates for y point. | ||
C : label or position, optional | ||
The value at each `(x, y)` point. | ||
reduce_C_function : callable, optional | ||
reduce_C_function : callable, optional, default `mean` | ||
You can remove the 'optional' part here (and the same below).
||
Function of one argument that reduces all the values in a bin to | ||
a single number (e.g. `mean`, `max`, `sum`, `std`). | ||
gridsize : int, optional | ||
Number of bins. | ||
`**kwds` : optional | ||
gridsize : int, optional, default 100 | ||
The number of hexagons in the x-direction. | ||
The corresponding number of hexagons in the y-direction is | ||
chosen in a way that the hexagons are approximately regular. | ||
Alternatively, | ||
gridsize can be a tuple with two elements specifying the number of | ||
hexagons in the x-direction and the y-direction. | ||
kwds : optional | ||
Should be
Yes please, we'll fix the script one of these days. :)
||
Keyword arguments to pass on to :py:meth:`pandas.DataFrame.plot`. | ||
Can you make this explanation into "Additional keyword arguments are documented in DataFrame.plot"? Also, the
||
|
||
Returns | ||
------- | ||
axes : matplotlib.AxesSubplot or np.array of them | ||
axes : matplotlib.AxesSubplot or np.array of them. | ||
Can you indicate here when an array is returned?
I'm not sure if it ever is an ndarray.
It looks like the pandas wrapper can only get
||
|
||
See Also | ||
-------- | ||
matplotlib.pyplot.hexbin : hexagonal binning plot using matplotlib. | ||
I would say something like "the matplotlib function that is used under the hood"?
Done.
||
|
||
Examples | ||
-------- | ||
|
||
.. plot:: | ||
Can we add some plain English text before the example to explain what the example is about?
||
:context: close-figs | ||
|
||
>>> from sklearn.datasets import load_iris | ||
>>> iris = load_iris() | ||
I am not fully sure if we want to rely on sklearn for the example data. In this case, I think it is actually fine to generate some random data with
Yes, please don't use sklearn imports.
OK, if there is no problem with using
Yes, you can use random data in this case.
I will update the docstrings guide to explain when random data might be OK in examples, and also explain how to set a random seed to avoid changing examples. :)
||
>>> df = pd.DataFrame(iris.data, columns=iris.feature_names) | ||
>>> hexbin = df.plot.hexbin(x='sepal length (cm)', | ||
... y='sepal width (cm)', | ||
... gridsize=10, cmap='viridis') | ||
""" | ||
if reduce_C_function is not None: | ||
kwds['reduce_C_function'] = reduce_C_function | ||
|
I would remove " where `x`, `y` are 1-D sequences of the same length" since they are references to columns.
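
For reference, here is a minimal sketch of what the random-data example requested in the review could look like. It is not the code that was merged in this PR. The column names `x`, `y`, and `observations`, the seed, the sample size, and the headless `Agg` backend are all assumptions chosen for illustration. It uses seeded NumPy data instead of the sklearn import and shows both the default count-based binning and a `C` column reduced with `reduce_C_function`.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display

import numpy as np
import pandas as pd

# A fixed seed keeps the generated example data stable between runs.
rng = np.random.RandomState(42)
n = 1000  # illustrative sample size

# Hypothetical column names; any numeric columns would work the same way.
df = pd.DataFrame({
    "x": rng.randn(n),
    "y": rng.randn(n),
    "observations": rng.randint(1, 5, size=n),
})

# With C omitted, each hexagon is coloured by the number of points it contains.
ax_counts = df.plot.hexbin(x="x", y="y", gridsize=20)

# With C given, the values falling in each hexagon are reduced by
# reduce_C_function (here np.sum instead of the default mean).
ax_sum = df.plot.hexbin(x="x", y="y", C="observations",
                        reduce_C_function=np.sum,
                        gridsize=10, cmap="viridis")
```

Both calls return a matplotlib axes object, which is consistent with the doubt raised in the review about whether an ndarray is ever returned.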