Skip to content

DOC: update the pd.Index.asof docstring #20344

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

DataOmbudsman
Copy link
Contributor

Checklist for the pandas documentation sprint (ignore this if you are doing
an unrelated PR):

  • PR title is "DOC: update the docstring"
  • The validation script passes: scripts/validate_docstrings.py <your-function-or-method>
  • The PEP8 style check passes: git diff upstream/master -u -- "*.py" | flake8 --diff
  • The html version looks good: python doc/make.py --single <your-function-or-method>
  • It has been proofread on language by another sprint participant

Please include the output of the validation script below between the "```" ticks:

################################################################################
######################## Docstring (pandas.Index.asof)  ########################
################################################################################

Return the latest index label up to and including the passed label.

For sorted indexes, return the index label that is the latest among
the labels that are not later than the passed index label.

Parameters
----------
label : object
    The label up to which the method returns the latest index label.

Returns
-------
object : The index label that is the latest as of the passed label,
    or NaN if there is no such label.

See also
--------
Index.get_loc : `asof` is a thin wrapper around `get_loc`
    with method='pad'.

Examples
--------
13 is the latest index label up to 14.

>>> pd.Index([13, 18, 20]).asof(14)
13

If the label is in the index, the method returns the passed label.

>>> pd.Index([13, 18, 20]).asof(18)
18

If all of the labels in the index are later than the passed label,
NaN is returned.

>>> pd.Index([13, 18, 20]).asof(1)
nan

If the index is not sorted, an error is raised.

>>> pd.Index([13, 20, 18]).asof(18)
Traceback (most recent call last):
ValueError: index must be monotonic increasing or decreasing

################################################################################
################################## Validation ##################################
################################################################################

Docstring for "pandas.Index.asof" correct. :)

If the validation script still gives errors, but you think there is a good reason
to deviate in this case (and there are certainly such cases), please state this
explicitly.

Return the latest index label up to and including the passed label.

For sorted indexes, return the index label that is the latest among
the labels that are not later than the passed index label.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find this whole docstring a bit confusing to be honest. I think it's a complex method to document, but I think we can do better.

For the examples, I find this much clearer:

import pandas

idx = pandas.Index(['a', 'b', 'd', 'e'])
idx.asof('c')

For the description, it's difficult, but something like "Return the label from the index, or the previous if not found" would be something easier to understand for me.

For the rest looks quite good.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's hard to be completely understandable within a single line. But I'll try again using your suggestions. Thanks!

@sinhrks sinhrks added the Docs label Mar 15, 2018
@codecov
Copy link

codecov bot commented Mar 19, 2018

Codecov Report

Merging #20344 into master will decrease coverage by 0.04%.
The diff coverage is 90.47%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #20344      +/-   ##
==========================================
- Coverage   91.82%   91.77%   -0.05%     
==========================================
  Files         152      153       +1     
  Lines       49248    49270      +22     
==========================================
- Hits        45222    45220       -2     
- Misses       4026     4050      +24
Flag Coverage Δ
#multiple 90.17% <90.47%> (-0.05%) ⬇️
#single 41.89% <90.47%> (-0.01%) ⬇️
Impacted Files Coverage Δ
pandas/core/indexes/base.py 96.63% <90.47%> (-0.06%) ⬇️
pandas/plotting/_converter.py 63.2% <0%> (-1.88%) ⬇️
pandas/io/clipboard/clipboards.py 30.58% <0%> (-1.6%) ⬇️
pandas/core/nanops.py 95.13% <0%> (-1.57%) ⬇️
pandas/core/config_init.py 99.24% <0%> (-0.76%) ⬇️
pandas/core/arrays/categorical.py 95.7% <0%> (-0.49%) ⬇️
pandas/core/resample.py 96.06% <0%> (-0.37%) ⬇️
pandas/util/_decorators.py 82.25% <0%> (-0.15%) ⬇️
pandas/plotting/_core.py 82.39% <0%> (-0.12%) ⬇️
pandas/io/pytables.py 92.41% <0%> (-0.05%) ⬇️
... and 34 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update cdfce2b...9bc532a. Read the comment docs.

@DataOmbudsman
Copy link
Contributor Author

Hello @datapythonista I've updated the documentation based on your suggestions. Do you think it's okay now to be merged?

>>> idx_not_sorted = pd.Index([13, 20, 18])
>>> idx_not_sorted.asof(18)
Traceback (most recent call last):
ValueError: index must be monotonic increasing or decreasing
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great. The only think that comes to my mind is if it'd make sense to use dates in the examples. I think the example with integers is simpler, and easier to understand. But I'd say in practice the method is only used with dates.

Like, if you want to know the price of a stock, and you say, give me the price asof 1st of January. The 1st of January the markets are closed, so the price returned would be the one of the 31st of December or the last day the markets were open.

I'm not quite sure what would be the best. If you think you can improve the docstring with the user case I just mentioned, that would be great. Otherwise looks good to me.

And may be we could also add Series.asof and pandas.merge_asof to the See Also section.

Thanks for the good work.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the feedback! I've updated the docs using your suggestions.

-------
object : The passed label if it is in the index. The previous label
if the passed label is not in the sorted index, or NaN if there
is no such label.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, just one minor formatting thing. The description goes to the next line, and the colon is not required. Also you can add backticks around the NaN. For the rest is perfect, really like this final result.

Copy link
Member

@datapythonista datapythonista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. Great job!

@DataOmbudsman
Copy link
Contributor Author

Not exactly sure how to proceed now. Should I ping a core developer and ask for merge?

@datapythonista
Copy link
Member

@DataOmbudsman no need for that. Project maintainers and core developers are in most cases volunteers, that use their spare time to work on the projects. They will take a look at this when they have the time. In many cases that will be in few hours or couple of days. In other cases it can take weeks. You just need to wait. If you check it, you'll see that there are 165 pull requests like yours open. Be patient, and work in something else if you've got the time. :)

@DataOmbudsman
Copy link
Contributor Author

@datapythonista All right, thanks for the clarification. :)

@mroeschke mroeschke added this to the 0.24.0 milestone Jul 7, 2018
@mroeschke mroeschke merged commit 3dd0920 into pandas-dev:master Jul 7, 2018
@mroeschke
Copy link
Member

Thanks @DataOmbudsman! Sorry for the delay on our end. Great docstring!

Sup3rGeo pushed a commit to Sup3rGeo/pandas that referenced this pull request Oct 1, 2018
* DOC: update the pd.Index.asof docstring

* clearer descriptions, restructured examples.

* extended See Also section, dates instead of integers in Examples

* minor formatting
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants