DOC: update the pandas.core.resample.Resampler.fillna docstring #20379
Changes from 1 commit
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -624,18 +624,134 @@ def backfill(self, limit=None): | |
|
||
def fillna(self, method, limit=None): | ||
""" | ||
Fill missing values | ||
Fill the new missing values in the resampled data using different | ||
methods. | ||
Review comment: Can you try to get this on a single line? |
||
|
||
In statistics, imputation is the process of replacing missing data with | ||
substituted values [1]_. When resampling data, missing values may | ||
appear (e.g., when the resampling frequency is higher than the original | ||
frequency). | ||
|
||
The backward fill ('bfill') will replace NaN values that appeared in | ||
Review comment: Aside from the last sentence, this can be folded into the Parameters section. |
||
the resampled data with the next value in the original sequence. The | ||
forward fill ('ffill'), on the other hand, will replace NaN values | ||
that appeared in the resampled data with the previous value in the | ||
original sequence. Missing values that existed in the original data will | ||
not be modified. | ||
|
||
Parameters | ||
---------- | ||
method : str, method of resampling ('ffill', 'bfill') | ||
Review comment: Can you change the type description to
Review comment: Note that ffill is an alias for pad and bfill is an alias for backfill. Can you check that And move the descriptions from above here. |
||
Method to use for filling holes in resampled data | ||
* ffill: use previous valid observation to fill gap (forward | ||
Review comment: No indentation is needed here (compared to the "Method ..." on the line above), but Sphinx needs a blank line between those two lines.
Review comment: Done
Review comment: Quote these, since they're strings. |
||
fill). | ||
* bfill: use next valid observation to fill gap (backward | ||
fill). | ||
limit : integer, optional | ||
limit of how many values to fill | ||
Limit of how many values to fill. | ||
Review comment: Say consecutive values to fill |
||
|
||
Returns | ||
------- | ||
Series, DataFrame | ||
Review comment: I think |
||
An upsampled Series or DataFrame with backward or forward filled | ||
NaN values. | ||
|
||
Examples | ||
-------- | ||
|
||
Resampling a Series: | ||
|
||
>>> s = pd.Series([1, 2, 3], | ||
... index=pd.date_range('20180101', periods=3, freq='h')) | ||
>>> s | ||
2018-01-01 00:00:00 1 | ||
2018-01-01 01:00:00 2 | ||
2018-01-01 02:00:00 3 | ||
Freq: H, dtype: int64 | ||
|
||
>>> s.resample('30min').fillna("bfill") | ||
Review comment: I would maybe first show what it does without filling (which is |
||
2018-01-01 00:00:00 1 | ||
2018-01-01 00:30:00 2 | ||
2018-01-01 01:00:00 2 | ||
2018-01-01 01:30:00 3 | ||
2018-01-01 02:00:00 3 | ||
Freq: 30T, dtype: int64 | ||
|
||
>>> s.resample('15min').fillna("bfill", limit=2) | ||
2018-01-01 00:00:00 1.0 | ||
2018-01-01 00:15:00 NaN | ||
2018-01-01 00:30:00 2.0 | ||
2018-01-01 00:45:00 2.0 | ||
2018-01-01 01:00:00 2.0 | ||
2018-01-01 01:15:00 NaN | ||
2018-01-01 01:30:00 3.0 | ||
2018-01-01 01:45:00 3.0 | ||
2018-01-01 02:00:00 3.0 | ||
Freq: 15T, dtype: float64 | ||
|
||
>>> s.resample('30min').fillna("ffill") | ||
2018-01-01 00:00:00 1 | ||
2018-01-01 00:30:00 1 | ||
2018-01-01 01:00:00 2 | ||
2018-01-01 01:30:00 2 | ||
2018-01-01 02:00:00 3 | ||
Freq: 30T, dtype: int64 | ||
|
||
Resampling a DataFrame that has missing values: | ||
|
||
>>> df = pd.DataFrame({'a': [2, np.nan, 6], 'b': [1, 3, 5]}, | ||
... index=pd.date_range('20180101', periods=3, | ||
... freq='h')) | ||
>>> df | ||
a b | ||
2018-01-01 00:00:00 2.0 1 | ||
2018-01-01 01:00:00 NaN 3 | ||
Review comment: I would add such an example with a missing value above for Series as well (or instead of this example). In the end, we can limit the number of examples for DataFrame and basically say that for a DataFrame everything works similarly to Series, column by column.
Review comment: Done |
||
2018-01-01 02:00:00 6.0 5 | ||
|
||
>>> df.resample('30min').fillna("bfill") | ||
a b | ||
2018-01-01 00:00:00 2.0 1 | ||
2018-01-01 00:30:00 NaN 3 | ||
2018-01-01 01:00:00 NaN 3 | ||
2018-01-01 01:30:00 6.0 5 | ||
2018-01-01 02:00:00 6.0 5 | ||
|
||
>>> df.resample('15min').fillna("bfill", limit=2) | ||
a b | ||
2018-01-01 00:00:00 2.0 1.0 | ||
2018-01-01 00:15:00 NaN NaN | ||
2018-01-01 00:30:00 NaN 3.0 | ||
2018-01-01 00:45:00 NaN 3.0 | ||
2018-01-01 01:00:00 NaN 3.0 | ||
2018-01-01 01:15:00 NaN NaN | ||
2018-01-01 01:30:00 6.0 5.0 | ||
2018-01-01 01:45:00 6.0 5.0 | ||
2018-01-01 02:00:00 6.0 5.0 | ||
|
||
>>> df.resample('30min').fillna("ffill") | ||
a b | ||
2018-01-01 00:00:00 2.0 1 | ||
2018-01-01 00:30:00 2.0 1 | ||
2018-01-01 01:00:00 NaN 3 | ||
2018-01-01 01:30:00 NaN 3 | ||
2018-01-01 02:00:00 6.0 5 | ||
|
||
See Also | ||
-------- | ||
Series.fillna | ||
DataFrame.fillna | ||
backfill : Backward fill NaN values in the resampled data. | ||
pad : Forward fill NaN values in the resampled data. | ||
bfill : Alias of backfill. | ||
ffill : Alias of pad. | ||
Review comment: Can remove these aliases I think, since they go to the same page. |
||
nearest : Fill NaN values in the resampled data | ||
with nearest neighbor starting from center. | ||
pandas.Series.fillna : Fill NaN values in the Series using the | ||
specified method, which can be 'bfill' and 'ffill'. | ||
pandas.DataFrame.fillna : Fill NaN values in the DataFrame using the | ||
specified method, which can be 'bfill' and 'ffill'. | ||
|
||
References | ||
---------- | ||
.. [1] https://en.wikipedia.org/wiki/Imputation_(statistics) | ||
""" | ||
return self._upsample(method, limit=limit) | ||
|
||
|
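The review comments above ask for an example that first shows the upsampled data without filling, and for a Series that already contains a missing value. A minimal sketch of that idea follows. It is not part of the PR: the data is illustrative, and it calls `asfreq()`, `bfill()` and `ffill()` on the resampler directly rather than `fillna(method)`, on the assumption that they produce the same fills as the `fillna(method, limit)` in this diff and because newer pandas releases appear to favor those methods.

```python
import numpy as np
import pandas as pd

# Hourly series with one pre-existing missing value (illustrative data).
index = pd.date_range("2018-01-01", periods=3, freq="h")
s = pd.Series([1.0, np.nan, 3.0], index=index)

# Upsample to 30 minutes without filling: the new slots appear as NaN.
unfilled = s.resample("30min").asfreq()

# Backward fill: each new slot takes the value at the next original timestamp.
back = s.resample("30min").bfill()

# Forward fill: each new slot takes the value at the previous original timestamp.
forward = s.resample("30min").ffill()

# limit caps how many consecutive new slots are filled in each gap.
limited = s.resample("15min").bfill(limit=2)

print(pd.DataFrame({"unfilled": unfilled, "bfill": back, "ffill": forward}))
print(limited)
```

The NaN at 01:00 comes from the original data, so it survives both fills, and the new slot next to it also stays NaN because the fill copies whatever value sits at the neighboring original timestamp.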
Review comment: This should be a single line. Does
sound good?
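The `limit` semantics discussed in the review ("consecutive values to fill") can be modeled without pandas. The sketch below is a plain-Python illustration, not the pandas implementation: `backfill_upsampled` and its `factor` argument are hypothetical names, and it assumes the new frequency divides the original interval evenly.

```python
import math


def backfill_upsampled(values, factor, limit=None):
    """Model a backward fill after upsampling by an integer factor.

    Hypothetical helper, not pandas code. `values` are the original
    observations; `factor` is how many slots each original interval is
    split into (2 for hourly -> 30 minutes, 4 for hourly -> 15 minutes).
    """
    out = []
    last = len(values) - 1
    for i, value in enumerate(values):
        # Original slots are copied unchanged, including any NaN.
        out.append(value)
        if i == last:
            break
        next_value = values[i + 1]
        for k in range(1, factor):
            # Number of slots from this new position to the next original one.
            distance = factor - k
            if limit is None or distance <= limit:
                out.append(next_value)
            else:
                out.append(math.nan)
    return out


print(backfill_upsampled([1, 2, 3], factor=4, limit=2))
print(backfill_upsampled([2.0, math.nan, 6.0], factor=2))
```

With `factor=4` and `limit=2`, only the two new slots closest to the next original observation are filled, which matches the `15min` example in the docstring. The second call shows that a NaN in the original data is propagated rather than replaced.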