Skip to content

DOC: Rephrased doc for Series.asof. Added examples #21034

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Nov 4, 2018
Merged

DOC: Rephrased doc for Series.asof. Added examples #21034

merged 4 commits into from
Nov 4, 2018

Conversation

dragosthealex
Copy link
Contributor

@dragosthealex dragosthealex commented May 14, 2018

Used code from #20652 as example, to illustrate different behaviours.

--------
Take all columns into consideration

>>> data = u'''dt,a,b
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably best to construct this directly, rather than using read_csv. Can you do

df = pd.DataFrame({'a': [...], 'b': [...], index=pd.DatetimeIndex([...])

instead? Also, these floating point values are a bit hard to understand when just glancing. Could you change to integers instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Contributor

@TomAugspurger TomAugspurger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We recently added a docstring guide http://pandas-docs.github.io/pandas-docs-travis/contributing_docstring.html

Can you validate that your changes pass this guide?

@codecov
Copy link

codecov bot commented May 14, 2018

Codecov Report

Merging #21034 into master will decrease coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #21034      +/-   ##
==========================================
- Coverage   92.23%   92.23%   -0.01%     
==========================================
  Files         161      161              
  Lines       51187    51197      +10     
==========================================
+ Hits        47211    47220       +9     
- Misses       3976     3977       +1
Flag Coverage Δ
#multiple 90.61% <100%> (-0.01%) ⬇️
#single 42.27% <100%> (ø) ⬆️
Impacted Files Coverage Δ
pandas/core/generic.py 96.81% <100%> (ø) ⬆️
pandas/core/arrays/timedeltas.py 93.75% <0%> (-0.56%) ⬇️
pandas/util/testing.py 86.64% <0%> (-0.2%) ⬇️
pandas/core/indexes/api.py 99% <0%> (ø) ⬆️
pandas/core/series.py 93.87% <0%> (ø) ⬆️
pandas/core/indexes/datetimelike.py 98.01% <0%> (ø) ⬆️
pandas/core/indexes/timedeltas.py 90.74% <0%> (+0.12%) ⬆️
pandas/core/indexes/frozen.py 92.1% <0%> (+0.43%) ⬆️
pandas/core/arrays/period.py 98.08% <0%> (+0.59%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c30c07f...65c1a2b. Read the comment docs.

@pep8speaks
Copy link

pep8speaks commented May 14, 2018

Hello @dragosthealex! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on May 14, 2018 at 20:44 Hours UTC

Used code from #20652 as
example, to illustrate different behaviours.
@dragosthealex
Copy link
Contributor Author

@TomAugspurger Yes, the validation passes

@@ -6065,19 +6071,43 @@ def asof(self, where, subset=None):

Returns
-------
where is scalar
`where` is scalar
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be formatted as

possible types
    case 1
    case 2

so somehting like

scalar, Series, or DataFrame

* scalar : when `self` is a Series and `where` is a scalar
* Series: when `self` is a Series and `where` is an array-like, or when `self` is a DataFrame and `where` is a scalar
* DataFrame : when `self` is a DataFrame and `where` is an array-like


where is Index: same shape object as input
`where` is :class:`pandas.Index`:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I understand this. Is an Index for where treated differently than other inputes?


Examples
--------
Take all columns into consideration
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add additional examples for

  • Series w/ scalar where
  • Series w/ array-like where
  • DataFrame w/ scalar where

@gfyoung gfyoung added the Docs label May 21, 2018
@jreback
Copy link
Contributor

jreback commented Jun 19, 2018

can you update for comments

@datapythonista datapythonista self-assigned this Jul 22, 2018
@datapythonista datapythonista mentioned this pull request Aug 22, 2018
9 tasks
@jreback
Copy link
Contributor

jreback commented Nov 1, 2018

closing as stale. if you'd like to continue pls ping.

@jreback jreback closed this Nov 1, 2018
@jreback jreback reopened this Nov 1, 2018
@jreback
Copy link
Contributor

jreback commented Nov 1, 2018

@datapythonista this looks close.

@TomAugspurger
Copy link
Contributor

Pushed a quick update. Probably good to merge.

Copy link
Member

@datapythonista datapythonista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pushed couple of additional formatting fixes, merging on green

@jreback jreback added this to the 0.24.0 milestone Nov 4, 2018
@jreback jreback merged commit 17e3c5c into pandas-dev:master Nov 4, 2018
@jreback
Copy link
Contributor

jreback commented Nov 4, 2018

thanks @dragosthealex

Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@datapythonista I was fixing some doc warnings in #23636, and came across those ones.

In the PR I simply removed all quotes on "NaN" (as that also avoids the problem with "NaNs"), but can certainly discuss further.

@@ -6495,40 +6495,98 @@ def interpolate(self, method='linear', axis=0, limit=None, inplace=False,

def asof(self, where, subset=None):
"""
The last row without any NaN is taken (or the last row without
NaN considering only the subset of columns in the case of a DataFrame)
Return the last row(s) without any `NaN`s before `where`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something like `NaN`s fails in sphinx, we can only use ` on things that are quoted completely, without trailing 's' or so

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Further, I am not fully sure we should quote NaN. It is not a pandas function name or keyword argument or so. I would rather say it is "code"-like, but then we should actually use the proper name like np.nan. Therefore, maybe a compromise to simply not quote?

if not None use these columns for NaN propagation
where : date or array-like of dates
Date(s) before which the last row(s) are returned.
subset : str or array-like of str, default `None`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in general we don't quote on the parameter type line?

@datapythonista
Copy link
Member

Yes, those are good points I wanted to discuss at some point.

This is something I fixed in the past, as the rendering in sphinx was broken:

`DataFrame`s -> `DataFrame` objects

For what I know, single backticks should be used for argument names, variables, functions, methods, classes... So, NaN as a value or concept, could use double backticks, considering it code, normal quotes, or nothing. I don't have a preference.

In the type, we don't usually use backticks (but I've seen some cases where we do):

fill_value : object, default None

fill_value : object, default ``None``

I don't have a preference either, but we should start to be consistent.

May be we should create an issue, have some discussion, and then start enforcing it? Not sure how difficult can be to add validation for this in validate_docstrings.py at this point, but surely something can (and should) be done.

JustinZhengBC pushed a commit to JustinZhengBC/pandas that referenced this pull request Nov 14, 2018
tm9k1 pushed a commit to tm9k1/pandas that referenced this pull request Nov 19, 2018
Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019
Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

DataFrame.asof() fails when some columns are NaN
7 participants