DOC: Rephrased doc for Series.asof. Added examples #21034

Merged
merged 4 commits into from
Nov 4, 2018
88 changes: 73 additions & 15 deletions pandas/core/generic.py
@@ -6495,40 +6495,98 @@ def interpolate(self, method='linear', axis=0, limit=None, inplace=False,

def asof(self, where, subset=None):
"""
The last row without any NaN is taken (or the last row without
NaN considering only the subset of columns in the case of a DataFrame)
Return the last row(s) without any `NaN`s before `where`.
Member

Something like `NaN`s fails in Sphinx; we can only use backticks on things that are quoted completely, without a trailing 's' or so.

Member

Further, I am not fully sure we should quote NaN. It is not a pandas function name or keyword argument or so. I would rather say it is "code"-like, but then we should actually use the proper name, like np.nan. So maybe the compromise is simply not to quote it?


The last row (for each element in `where`, if list) without any
`NaN` is taken.
In case of a :class:`~pandas.DataFrame`, the last row without `NaN`
considering only the subset of columns (if not `None`)

.. versionadded:: 0.19.0 For DataFrame

If there is no good value, NaN is returned for a Series
If there is no good value, `NaN` is returned for a Series or
a Series of NaN values for a DataFrame

Parameters
----------
where : date or array of dates
subset : string or list of strings, default None
if not None use these columns for NaN propagation
where : date or array-like of dates
Date(s) before which the last row(s) are returned.
subset : str or array-like of str, default `None`
Member

I think in general we don't quote on the parameter type line?

For DataFrame, if not `None`, only use these columns to
check for `NaN`s.

Notes
-----
Dates are assumed to be sorted
Raises if this is not the case
Dates are assumed to be sorted. Raises if this is not the case.

Returns
-------
where is scalar

- value or NaN if input is Series
- Series if input is DataFrame
scalar, Series, or DataFrame

where is Index: same shape object as input
* scalar : when `self` is a Series and `where` is a scalar
* Series: when `self` is a Series and `where` is an array-like,
or when `self` is a DataFrame and `where` is a scalar
* DataFrame : when `self` is a DataFrame and `where` is an
array-like

See Also
--------
merge_asof
merge_asof : Perform an asof merge. Similar to left join.

"""
Examples
--------
A Series and a scalar `where`.

>>> s = pd.Series([1, 2, np.nan, 4], index=[10, 20, 30, 40])
>>> s
10    1.0
20    2.0
30    NaN
40    4.0
dtype: float64

>>> s.asof(20)
2.0

For a sequence `where`, a Series is returned. The first value is
``NaN``, because the first element of `where` is before the first
index value.

>>> s.asof([5, 20])
5     NaN
20    2.0
dtype: float64

Missing values are not considered. The following is ``2.0``, not
``NaN``, even though ``NaN`` is at the index location for ``30``.

>>> s.asof(30)
2.0

Take all columns into consideration
Contributor

Can you add additional examples for

  • Series w/ scalar where
  • Series w/ array-like where
  • DataFrame w/ scalar where
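
For reference, a minimal sketch of the third requested case (a DataFrame with a scalar `where`). It reuses the timestamps from the PR's DataFrame example but assumes float columns, and the names `ts`, `row_all`, and `row_a` are illustrative rather than taken from the PR.

```python
import numpy as np
import pandas as pd

# Same layout as the DataFrame example in the docstring; float columns are
# an assumption made here so that dtypes stay stable when NaN is filled in.
df = pd.DataFrame(
    {
        "a": [10.0, 20.0, 30.0, 40.0, 50.0],
        "b": [np.nan, np.nan, np.nan, np.nan, 500.0],
    },
    index=pd.DatetimeIndex(
        [
            "2018-02-27 09:01:00",
            "2018-02-27 09:02:00",
            "2018-02-27 09:03:00",
            "2018-02-27 09:04:00",
            "2018-02-27 09:05:00",
        ]
    ),
)

ts = pd.Timestamp("2018-02-27 09:03:30")  # scalar `where`

# All columns are considered: only the 09:05 row has no NaN, and it lies
# after `ts`, so every entry of the returned Series is NaN.
row_all = df.asof(ts)

# Only column 'a' is checked for NaN: the 09:03 row qualifies, and its
# values (including the NaN in 'b') are returned as a Series.
row_a = df.asof(ts, subset=["a"])

print(type(row_all).__name__, type(row_a).__name__)
print(row_a)
```

With a scalar `where`, both calls return a Series indexed by the column names, which matches the "Series: when `self` is a DataFrame and `where` is a scalar" entry in the Returns section.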


>>> df = pd.DataFrame({'a': [10, 20, 30, 40, 50],
...                    'b': [None, None, None, None, 500]},
...                   index=pd.DatetimeIndex(['2018-02-27 09:01:00',
...                                           '2018-02-27 09:02:00',
...                                           '2018-02-27 09:03:00',
...                                           '2018-02-27 09:04:00',
...                                           '2018-02-27 09:05:00']))
>>> df.asof(pd.DatetimeIndex(['2018-02-27 09:03:30',
...                           '2018-02-27 09:04:30']))
                      a   b
2018-02-27 09:03:30 NaN NaN
2018-02-27 09:04:30 NaN NaN

Take a single column into consideration

>>> df.asof(pd.DatetimeIndex(['2018-02-27 09:03:30',
...                           '2018-02-27 09:04:30']),
...         subset=['a'])
                        a   b
2018-02-27 09:03:30  30.0 NaN
2018-02-27 09:04:30  40.0 NaN
"""
if isinstance(where, compat.string_types):
from pandas import to_datetime
where = to_datetime(where)
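
The return shapes listed in the docstring's Returns section, and the note that the index must be sorted, can be checked with a short sketch like the one below. It assumes a recent pandas release in which an unsorted index raises `ValueError`; the integer-indexed data and all variable names are illustrative and not part of the PR.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 2, np.nan, 4], index=[10, 20, 30, 40])
df = pd.DataFrame({"a": [1.0, 2.0, 3.0, 4.0]}, index=[10, 20, 30, 40])

# Series + scalar `where` -> scalar
scalar_result = s.asof(20)

# Series + array-like `where` -> Series
series_from_series = s.asof([5, 20])

# DataFrame + scalar `where` -> Series (one entry per column)
series_from_frame = df.asof(20)

# DataFrame + array-like `where` -> DataFrame (one row per element)
frame_result = df.asof([20, 35])

for label, result in [
    ("Series, scalar where", scalar_result),
    ("Series, array-like where", series_from_series),
    ("DataFrame, scalar where", series_from_frame),
    ("DataFrame, array-like where", frame_result),
]:
    print(f"{label}: {type(result).__name__}")

# The Notes section says the index is assumed to be sorted; with an
# unsorted index the call is expected to raise instead of returning a value.
unsorted = pd.Series([1.0, 2.0], index=[20, 10])
try:
    unsorted.asof(15)
    raised = False
except ValueError:
    raised = True
print("unsorted index raised:", raised)
```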