DOC: update the pandas.core.resample.Resampler.fillna docstring #20379
Thanks for the PR!
Added some inline comments. Further:
- could you add some more explanation between the examples?
pandas/core/resample.py
Outdated
forward fill ('ffill'), on the other hand, will replace NaN values
that appeared in the resampled data with the previous value in the
original sequence. Missing values that existed in the orginal data will
not be modified.

Parameters
----------
method : str, method of resampling ('ffill', 'bfill')
Can you change the type description to method : {'ffill', 'bfill'}? ("method of resampling" belongs on the next line, and is already there)
pandas/core/resample.py
Outdated
Parameters
----------
method : str, method of resampling ('ffill', 'bfill')
Method to use for filling holes in resampled data
* ffill: use previous valid observation to fill gap (forward
No indentation is needed here (compared to the "Method ..." on the line above), but Sphinx needs a blank line between those two lines.
Done
pandas/core/resample.py
Outdated
2018-01-01 02:00:00 3
Freq: H, dtype: int64

>>> s.resample('30min').fillna("bfill")
I would maybe first show what it does without filling (which is s.resample().asfreq()). Of course this is another method, but it will then be easier to see which values have actually been filled by fillna().
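A minimal sketch of that suggestion, assuming the three-element hourly Series from the docstring examples (the values 1, 2, 3 are inferred from the output quoted later in this thread). It calls `Resampler.bfill()` rather than `fillna("bfill")` because, as far as I know, recent pandas releases deprecate `Resampler.fillna`; under the version in this PR both should go through the same upsampling path.

```python
import pandas as pd

s = pd.Series([1, 2, 3],
              index=pd.date_range("20180101", periods=3, freq="h"))

# Upsample without filling: the new 30-minute slots show up as NaN.
unfilled = s.resample("30min").asfreq()

# Same upsampling, with each new slot taking the next original value.
backfilled = s.resample("30min").bfill()

# Side by side, the rows where "asfreq" is NaN are the ones that were filled.
print(pd.DataFrame({"asfreq": unfilled, "bfill": backfilled}))
```

Putting the unfilled result first makes it obvious which rows the fill touched.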
pandas/core/resample.py
Outdated
@@ -624,18 +624,134 @@ def backfill(self, limit=None):

def fillna(self, method, limit=None):
"""
Fill missing values
Fill the new missing values in the resampled data using different
This should be a single line. Does "Fill missing values introduced by upsampling." sound good?
pandas/core/resample.py
Outdated
appear (e.g., when the resampling frequency is higher than the original
frequency).

The backward fill ('bfill') will replace NaN values that appeared in
Aside from the last sentence, this can be folded into the Parameters section.
pandas/core/resample.py
Outdated
forward fill ('ffill'), on the other hand, will replace NaN values
that appeared in the resampled data with the previous value in the
original sequence. Missing values that existed in the orginal data will
not be modified.

Parameters
----------
method : str, method of resampling ('ffill', 'bfill')
method : {'ffill', 'pad', 'bfill', 'backfill', 'nearest'}
Note that ffill is an alias for pad and bfill is an alias for backfill.
Can you check that 'nearest' works as expected?
And move the descriptions from above here.
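One way to check 'nearest' against the other two directions is a quick comparison like the sketch below. It assumes the same hourly Series with values 1, 2, 3, and upsamples to 20 minutes so that no new timestamp is equidistant from two original ones (tie-breaking is deliberately left out of the picture). The dedicated `ffill()`, `bfill()` and `nearest()` resampler methods stand in for the string options here.

```python
import pandas as pd

s = pd.Series([1, 2, 3],
              index=pd.date_range("20180101", periods=3, freq="h"))

# One column per fill strategy, all on the same 20-minute grid.
comparison = pd.DataFrame({
    "ffill": s.resample("20min").ffill(),      # previous original value
    "bfill": s.resample("20min").bfill(),      # next original value
    "nearest": s.resample("20min").nearest(),  # closest original value
})
print(comparison)
```

On this grid, 'nearest' should agree with ffill for slots closer to the previous observation and with bfill for slots closer to the next one.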
pandas/core/resample.py
Outdated
Parameters
----------
method : str, method of resampling ('ffill', 'bfill')
Method to use for filling holes in resampled data
* ffill: use previous valid observation to fill gap (forward
Quote these, since they're strings.
pandas/core/resample.py
Outdated
limit : integer, optional
limit of how many values to fill
Limit of how many values to fill.
Say "consecutive values to fill".
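To illustrate why "consecutive" matters, here is a small sketch, again assuming the hourly Series with values 1, 2, 3. Upsampling to 15 minutes leaves three new slots between each pair of observations, and `limit=1` fills only the first slot of each run.

```python
import pandas as pd

s = pd.Series([1, 2, 3],
              index=pd.date_range("20180101", periods=3, freq="h"))

# Each gap has three new slots; limit=1 forward-fills only the first
# of them and leaves the remaining two as NaN.
limited = s.resample("15min").ffill(limit=1)
print(limited)
```

The limit applies per run of missing values, not to the total number of values filled across the whole result.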
pandas/core/resample.py
Outdated
Returns
-------
Series, DataFrame
I think Series or DataFrame, can't recall.
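Taken together, the formatting comments so far (one-line summary, brace-set type, blank line before the bullet list, quoted string values, "consecutive", "Series or DataFrame") could lead to a layout like the sketch below. `fillna_sketch` is a hypothetical stand-in function, and the wording is an illustration of the numpydoc layout, not the text that was merged.

```python
def fillna_sketch(method, limit=None):
    """
    Fill missing values introduced by upsampling.

    Parameters
    ----------
    method : {'pad', 'backfill', 'ffill', 'bfill', 'nearest'}
        Method to use for filling holes in resampled data

        * 'pad' or 'ffill': use previous valid observation to fill gap
          (forward fill).
        * 'backfill' or 'bfill': use next valid observation to fill gap.
        * 'nearest': use nearest valid observation to fill gap.

    limit : int, optional
        Limit of how many consecutive missing values to fill.

    Returns
    -------
    Series or DataFrame
        An upsampled Series or DataFrame with missing values filled.
    """
    # Hypothetical placeholder: only the docstring layout matters here.
    raise NotImplementedError("layout illustration only")
```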
pandas/core/resample.py
Outdated
backfill : Backward fill NaN values in the resampled data.
pad : Forward fill NaN values in the resampled data.
bfill : Alias of backfill.
ffill: Alias of pad.
Can remove these aliases I think, since they go to the same page.
pandas/core/resample.py
Outdated
@@ -624,18 +624,134 @@ def backfill(self, limit=None):

def fillna(self, method, limit=None):
"""
Fill missing values
Fill the new missing values in the resampled data using different
methods.
Can you try to get this on a single line?
>>> df
                       a  b
2018-01-01 00:00:00  2.0  1
2018-01-01 01:00:00  NaN  3
I would add such an example with a missing value above for Series as well (or instead of this example).
I think using a Series will make it easier to understand and easier to focus on that specific behaviour.
In the end, we can limit the number of examples for DataFrame and basically say that for a DataFrame everything works the same as for a Series, column by column.
Done
The nice thing about the DataFrame example for `limit` is that you can show
side-by-side how only newly-introduced
missing values are filled.
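A sketch of that side-by-side view, using the DataFrame from the diff in this PR (column `a` has a NaN in the original data, column `b` does not). The 15-minute frequency and `limit=2` are choices made for illustration, and `bfill()` is used in place of `fillna("bfill")`.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [2, np.nan, 6], "b": [1, 3, 5]},
                  index=pd.date_range("20180101", periods=3, freq="h"))

# Back-fill at most two consecutive new slots before each original row.
filled = df.resample("15min").bfill(limit=2)
print(filled)

# Column "b" shows which slots the limit allowed to be filled.
# In column "a" the same slots before 01:00 receive the original NaN,
# so the pre-existing missing value stays missing instead of being
# replaced by the 6.0 that follows it.
```

Comparing the two columns row by row separates the effect of `limit` from the effect of a NaN that was already in the data.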
…On Fri, Mar 16, 2018 at 10:54 AM, Joris Van den Bossche < ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In pandas/core/resample.py
<#20379 (comment)>:
> @@ -624,18 +624,134 @@ def backfill(self, limit=None):
def fillna(self, method, limit=None):
"""
- Fill missing values
+ Fill the new missing values in the resampled data using different
+ methods.
Can you try to get this on a single line?
------------------------------
In pandas/core/resample.py
<#20379 (comment)>:
> + 2018-01-01 00:00:00 1
+ 2018-01-01 00:30:00 1
+ 2018-01-01 01:00:00 2
+ 2018-01-01 01:30:00 2
+ 2018-01-01 02:00:00 3
+ Freq: 30T, dtype: int64
+
+ Resampling a DataFrame that has missing values:
+
+ >>> df = pd.DataFrame({'a': [2, np.nan, 6], 'b': [1, 3, 5]},
+ ... index=pd.date_range('20180101', periods=3,
+ ... freq='h'))
+ >>> df
+ a b
+ 2018-01-01 00:00:00 2.0 1
+ 2018-01-01 01:00:00 NaN 3
I would add such an example with a missing value above for Series as well
(or instead of this example).
I think using a Series will make it easier to understand and easier to
focus on that specific behaviour.
In the end, we can limit the number of examples for DataFrame and
basically say that for a DataFrame everything works similar as for Series
column-by-column
Codecov Report
@@            Coverage Diff             @@
##           master   #20379      +/-   ##
==========================================
- Coverage   91.79%   91.77%   -0.03%
==========================================
  Files         152      152
  Lines       49184    49184
==========================================
- Hits        45150    45138      -12
- Misses       4034     4046      +12
Made the requested changes, also adding a little more info between examples.
[ci skip]
Moved the See Also up. Thanks @prcastro.
…ame_describe * upstream/master: (158 commits) Add link to "Craft Minimal Bug Report" blogpost (pandas-dev#20431) BUG: fixed json_normalize for subrecords with NoneTypes (pandas-dev#20030) (pandas-dev#20399) BUG: ExtensionArray.fillna for scalar values (pandas-dev#20412) DOC" update the Pandas core window rolling count docstring" (pandas-dev#20264) DOC: update the pandas.DataFrame.plot.hist docstring (pandas-dev#20155) DOC: Only use ~ in class links to hide prefixes. (pandas-dev#20402) Bug: Allow np.timedelta64 objects to index TimedeltaIndex (pandas-dev#20408) DOC: add disallowing of Series construction of len-1 list with index to whatsnew (pandas-dev#20392) MAINT: Remove weird pd file DOC: update the Index.isin docstring (pandas-dev#20249) BUG: Handle all-NA blocks in concat (pandas-dev#20382) DOC: update the pandas.core.resample.Resampler.fillna docstring (pandas-dev#20379) BUG: Don't raise exceptions splitting a blank string (pandas-dev#20067) DOC: update the pandas.DataFrame.cummax docstring (pandas-dev#20336) DOC: update the pandas.core.window.x.mean docstring (pandas-dev#20265) DOC: update the api.types.is_number docstring (pandas-dev#20196) Fix linter (pandas-dev#20389) DOC: Improved the docstring of pandas.Series.dt.to_pytimedelta (pandas-dev#20142) DOC: update the pandas.Series.dt.is_month_end docstring (pandas-dev#20181) DOC: update the window.Rolling.min docstring (pandas-dev#20263) ...
scripts/validate_docstrings.py pandas.core.resample.Resampler.fillna
git diff upstream/master -u -- "*.py" | flake8 --diff
python doc/make.py --single pandas.core.resample.Resampler.fillna
Please include the output of the validation script below between the "```" ticks: