-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
DOC: updated the pandas.DataFrame.plot.hexbin docstring #20121
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 3 commits
dc9aabf
17d653b
9f94841
c686467
4977860
697e9fa
439721e
b31b687
4d7b73c
347a012
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2938,26 +2938,59 @@ def scatter(self, x, y, s=None, c=None, **kwds): | |
def hexbin(self, x, y, C=None, reduce_C_function=None, gridsize=None, | ||
**kwds): | ||
""" | ||
Hexbin plot | ||
Make hexagonal binning plot. | ||
|
||
Make an hexagonal binning plot of `x` versus `y`. If `C` is `None` | ||
(the default), this is an histogram of the number of occurrences | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. lgtm. if you can find a referene on wikipedia might be nice to link. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm afraid there is no reference in wikipedia to hexagonal binning. The closed topic I found in wiki is the data binning article. Maybe it's a little bit to generic to include it in the hexbin docstring since it is also suitable for histogram and histogram2d |
||
of the observations at (x[i],y[i]). | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I believe we should quote this using double backticks since it's code and not just a variable name:
But I cannot find this in the guide. @datapythonista? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We discussed the backticks really last minute, and the guide wasn't very clear, but I think that's the standard in general, yes. As it's code, we can also respect PEP-8 and have the space after the comma. :) |
||
|
||
If `C` is specified, specifies values at given coordinates | ||
(x[i],y[i]). These values are accumulated for each hexagonal | ||
bin and then reduced according to `reduce_C_function`, | ||
having as default the numpy's mean function (np.mean). | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can we link to the numpy function? I believe:
should do the job. |
||
(If `C` is specified, it must also be a 1-D sequence | ||
of the same length as `x` and `y`.) | ||
|
||
Parameters | ||
---------- | ||
x, y : label or position, optional | ||
Coordinates for each point. | ||
x : label or position | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. In the guide we ask people to use types like, There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. In my opinion I'd use There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The reason we might make an exception is because 'str' is too restrictive for labels, as a column labels can in principle be any hashable object ... (but ok, in most cases str will be what people use) But I am fine here with using There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, I forgot about the type. I have a doubt with |
||
Coordinates for x points. | ||
y : label or position | ||
Coordinates for y points. | ||
C : label or position, optional | ||
The value at each `(x, y)` point. | ||
reduce_C_function : callable, optional | ||
reduce_C_function : callable, optional, default `mean` | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. you can remove the 'optional' part here (and the same below) |
||
Function of one argument that reduces all the values in a bin to | ||
a single number (e.g. `mean`, `max`, `sum`, `std`). | ||
gridsize : int, optional | ||
Number of bins. | ||
`**kwds` : optional | ||
Additional keyword arguments are documented in | ||
:meth:`pandas.DataFrame.plot`. | ||
gridsize : int, optional, default 100 | ||
The number of hexagons in the x-direction. | ||
The corresponding number of hexagons in the y-direction is | ||
chosen in a way that the hexagons are approximately regular. | ||
Alternatively, | ||
gridsize can be a tuple with two elements specifying the number | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Then we should write: |
||
of hexagons in the x-direction and the y-direction. | ||
kwds : optional | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Should be There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. yes please, we'll fix the script one of these days. :) |
||
Keyword arguments to pass on to :py:meth:`pandas.DataFrame.plot`. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can you make this explanation into "Additional keyword arguments are documented in DataFrame.plot" ? also, the |
||
|
||
Returns | ||
------- | ||
axes : matplotlib.AxesSubplot or np.array of them | ||
axes : matplotlib.AxesSubplot. | ||
|
||
See Also | ||
-------- | ||
matplotlib.pyplot.hexbin : hexagonal binning plot using matplotlib, | ||
the matplotlib function that is used under the hood. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is the indentation level of this line OK? I thought it should be at the level of the previous one +4. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think sphinx doesn't care, but let's indeed use '+4' to be consistent |
||
|
||
Examples | ||
-------- | ||
|
||
.. plot:: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can we add some plain English text before the example to explain what the example is about? |
||
:context: close-figs | ||
|
||
>>> n = 100000 | ||
>>> df = pd.DataFrame({'x':np.random.randn(n), | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. For examples using random data I think we should add a seed. Otherwise the example will change everytime we build the documentation, which is a bit weird IMO. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I've been told that it's OK-ish to use random data in plots without seed if the exact plot result is not important, so nevermind. |
||
... 'y':np.random.randn(n)}) | ||
>>> hexbin = df.plot.hexbin(x='x', y='y', cmap='viridis') | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Can we add an example using There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, I'm working on one that is self explanatory |
||
""" | ||
if reduce_C_function is not None: | ||
kwds['reduce_C_function'] = reduce_C_function | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd write here also
Make an hexagonal binning plot.
so it's correct English, but a bit nit-pick.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And, something else, in the other PRs I think they used "Generate" instead of "Make" ? I don't really care which ones, but maybe best to be consistent
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, Done