-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
DOC: update the pandas.Series.shift docstring #20472
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
pandas/core/generic.py
Outdated
@@ -6815,16 +6815,21 @@ def mask(self, cond, other=np.nan, inplace=False, axis=None, level=None, | |||
errors=errors) | |||
|
|||
_shared_docs['shift'] = (""" | |||
Shift index by desired number of periods with an optional time freq | |||
Shift index by desired number of periods with an optional time freq. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
References to parameters in the descriptions / notes should be put in backticks, so `freq` wherever applicable (couple of other instances below)
pandas/core/generic.py
Outdated
Shift index by desired number of periods with an optional time freq | ||
Shift index by desired number of periods with an optional time freq. | ||
|
||
When freq is not passed, shift the index without realign the data. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"realigning" instead of "realign"
pandas/core/generic.py
Outdated
Shift index by desired number of periods with an optional time freq. | ||
|
||
When freq is not passed, shift the index without realign the data. | ||
If freq is passed (in this case, the index must be date or datetime), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if freq
is passed and the index is not datetime-like? Perhaps this should have a Raises section?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is the docs to "Raises section"? I could not find it in https://python-sprints.github.io/pandas/guide/pandas_docstring.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would look something like:
Returns
--------
same_class : yada yada
Raises
------
ValueError : something went terribly horribly wrong
pandas/core/generic.py
Outdated
freq : DateOffset, timedelta, or time rule string, optional | ||
Increment to use from the tseries module or time rule (e.g. 'EOM'). | ||
See Notes. | ||
axis : %(axes_single_arg)s | ||
axis : int or str |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The axis
parameter has it's own standard (see the guide):
axis : {0 or ‘index’, 1 or ‘columns’, None}, default None
pandas/core/generic.py
Outdated
|
||
Examples | ||
-------- | ||
>>> df = pd.DataFrame({'Col1': [10, 20, 30, 20, 15, 30, 45], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a bad example but I think we could save some space - could you limit the example to 5 rows of data? Shouldn't drastically affect your examples below
pandas/core/generic.py
Outdated
>>> df = pd.DataFrame({'Col1': [10, 20, 30, 20, 15, 30, 45], | ||
... 'Col2': [13, 23, 33, 23, 18, 33, 48], | ||
... 'Col3': [17, 27, 37, 27, 22, 37, 52]}) | ||
>>> print(df) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm well your DataFrame may be simple enough that you don't need to explicitly print it. That said, if you do need to print you can just say df
instead of print(df)
.
I'll leave it as a judgment call on your end if you want this, but if so definitely get rid of the surrounding print
statement
pandas/core/generic.py
Outdated
|
||
See Also | ||
-------- | ||
slice_shift: Equivalent to shift without copying data. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need whitespace before each colon in this section
pandas/core/generic.py
Outdated
|
||
Parameters | ||
---------- | ||
periods : int | ||
Number of periods to move, can be positive or negative | ||
Number of periods to move, can be positive or negative. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very minor but you want a semicolon instead of a comma here. Perhaps saying "shift" instead of move would be better
pandas/core/generic.py
Outdated
|
||
See Also | ||
-------- | ||
slice_shift: Equivalent to shift without copying data. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't list slice_shift, this is being deprecated / removed
Codecov Report
@@ Coverage Diff @@
## master #20472 +/- ##
==========================================
+ Coverage 92.22% 92.23% +<.01%
==========================================
Files 161 161
Lines 51187 51197 +10
==========================================
+ Hits 47209 47220 +11
+ Misses 3978 3977 -1
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couple more changes. Can you also add a Raises section to describe what happens when freq
is provided for something other than a DateTimeIndex?
pandas/core/generic.py
Outdated
|
||
>>> import datetime | ||
>>> names = ['João', 'Maria', 'Emanuel', 'Jussara', 'José'] | ||
>>> dates = [datetime.datetime(2018, 3, 1, 11, 15), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can also be done in a more succinct manner by using pd.date_range
. Trying something like:
pd.date_range(start='3/1/18 11:15:00', freq='3D', periods=5)
Can modify date / freq to whatever you want. Can get rid of the date time import then as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good one!
pandas/core/generic.py
Outdated
|
||
Notes | ||
----- | ||
If freq is specified then the index values are shifted but the data | ||
is not realigned. That is, use freq if you would like to extend the | ||
If `freq` is specified then the index values are shifted but the data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To reinforce this concept it would be good to add into the examples where appropriate (see below)
pandas/core/generic.py
Outdated
3 NaN 30.0 33.0 | ||
4 NaN 45.0 48.0 | ||
|
||
>>> import datetime |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Related to above comment, this example could use an introduction along the lines of "When using the freq
parameter with a :class:pandas.DateTimeIndex
the index values are shifted but the data is not realigned."
pandas/core/generic.py
Outdated
If freq is specified then the index values are shifted but the data | ||
is not realigned. That is, use freq if you would like to extend the | ||
If `freq` is specified then the index values are shifted but the data | ||
is not realigned. That is, use `freq` if you would like to extend the | ||
index when shifting and preserve the original data. | ||
|
||
Returns | ||
------- | ||
shifted : %(klass)s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you are just returning a single element then you do not need to list the name of that variable in the returns, so here get rid of "shifted :" The returned object also needs a description indented on the subsequent line
pandas/core/generic.py
Outdated
|
||
Parameters | ||
---------- | ||
periods : int | ||
Number of periods to move, can be positive or negative | ||
Number of periods to shift; can be positive or negative. | ||
freq : DateOffset, timedelta, or time rule string, optional | ||
Increment to use from the tseries module or time rule (e.g. 'EOM'). | ||
See Notes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the entire Notes section is specific to freq
, I'd prefer to remove it and move the content here.
pandas/core/generic.py
Outdated
|
||
Parameters | ||
---------- | ||
periods : int | ||
Number of periods to move, can be positive or negative | ||
Number of periods to shift; can be positive or negative. | ||
freq : DateOffset, timedelta, or time rule string, optional | ||
Increment to use from the tseries module or time rule (e.g. 'EOM'). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is "Increment" supposed to be a DateOffest? If so, let's use that term consistently.
Could you also link to the frequency aliases in timeseries.rst
? timeseries.offset_aliases
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I didn't get it... Can you explain in other words?
pandas/core/generic.py
Outdated
4 NaN 45.0 48.0 | ||
|
||
>>> import datetime | ||
>>> names = ['João', 'Maria', 'Emanuel', 'Jussara', 'José'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The non-ASCII characters will cause encoding issues on Py2 without an encoding declaration. Probably best to stcik to ASCII till we're py3 only.
pandas/core/generic.py
Outdated
2018-03-10 11:15:00 Emanuel | ||
2018-03-15 11:15:00 Jussara | ||
2018-03-20 11:15:00 José | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a bit of an explanation between these? Especially explaining the freq
argument, as people may not know what 'D'
or 'min'
refer to.
pandas/core/generic.py
Outdated
2018-03-15 12:30:00 Jussara | ||
2018-03-20 12:30:00 José | ||
|
||
See Also |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See Also goes before Examples
@YitzhakAndrade seems like you've got other changes besides yours in this PR. Can you do the next in your branch:
Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DOC: update the pandas.Series.shift docstring
Thanks for the update @YitzhakAndrade. Seems like you still have unrelated changes in this PR. Could you try what I said in my last comment? For the next time, I think the problem can be related to using master for the changes, it's better to use a branch. |
can you merge master and update |
c6274d3
to
8854ee3
Compare
Hello @YitzhakAndrade! Thanks for updating the PR.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jbrockmendel one more if you don't mind.
PeriodIndex.shift : Shift values of PeriodIndex. | ||
Number of periods to shift. Can be positive or negative. | ||
freq : DateOffset, timedelta, or str, optional | ||
Offset to use from the tseries module or time rule (e.g. 'EOM'). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
worth specifying tseries.offsets?
pandas/core/generic.py
Outdated
If `freq` is specified then the index values are shifted but the | ||
data is not realigned. That is, use `freq` if you would like to | ||
extend the index when shifting and preserve the original data. | ||
axis : {0 or ‘index’, 1 or ‘columns’, None}, default None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't we usually avoid fancy quotes?
Couple of comments, agreement with points from Will and Tom. Generally looks nice. |
…licate content with the freq param description
Thanks for the review @jbrockmendel, very useful. The docstring can surely be improved, with the raises section, and more examples, but I prefer to leave that for someone else in a separate PR, and just merge this as it is, as the PR is abandoned. |
thanks @YitzhakAndrade |
Checklist for the pandas documentation sprint (ignore this if you are doing an unrelated PR):
scripts/validate_docstrings.py <your-function-or-method>
git diff upstream/master -u -- "*.py" | flake8 --diff
python doc/make.py --single <your-function-or-method>
Please include the output of the validation script below between the "```" ticks: