DOC: update docstring of pandas.Series.add_prefix docstring #20313
Nice PR. Added a couple of comments.
pandas/core/generic.py
Outdated
Examples
--------
>>> s = pd.Series([1,2,3,4])
Missing spaces after commas to pass PEP-8
Interesting, the PEP-8 check passed without any comments. Could it be updated to cover this case?
pandas/core/generic.py
Outdated
See Also
--------
pandas.Series.add_suffix: Add a suffix string to panel items names.
I think the preferred option is to not use the pandas. prefix.
I took it away. :)
@@ -2967,11 +2967,33 @@ def add_prefix(self, prefix):
Panel is deprecated; we may want to use Series or DataFrame, or something generic.
I'd try to find use cases on the internet for this method and, if possible, explain in the extended summary when this method can be useful.
Thanks! I used Series as it is a Series module in this case.
For this and #20315, these apply to both Series and DataFrame (and Panel, but we don't care about that). So perhaps
Prefix row labels with a string `prefix`.
I want to avoid "items", as (to me) it makes me think of dict.items, so key-value pairs. But we're just touching the row labels here.
That summary is strange since we use "Prefix" as both a verb and a noun. Maybe "Prepend" would be better?
I see, thank you. My doubt is that on DataFrames add_prefix() adds a prefix to the column names, not to the rows; for example: df = pd.DataFrame({'A':[1,2,3,4], 'B':[3,4,5,6]}), df.add_suffix('_item'). For this reason I decided to keep it to Series.
Perhaps I could update the docstring for pandas.DataFrame.add_prefix with a corresponding example to avoid confusion, and leave this one as it is.
Let me know what sounds better. :)
Ahh, good catch! I would say splitting the docstring is more complexity than warranted. How about something like
"""
Prefix labels with string `prefix`.
For Series, the row labels are prefixed. For DataFrame, the column labels are prefixed.
...
"""
Yes, I do agree, thanks!
I didn't know about the issue of splitting the docstring. So in this case, is it better to have the same one for both pandas.Series.add_prefix and pandas.DataFrame.add_prefix?
Yes, the docstring you've been editing in core/generic.py is used by both Series.add_prefix and DataFrame.add_prefix.
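For readers following the thread, here is a minimal sketch of the behavior being described, assuming a reasonably recent pandas is installed. The variable names, data, and prefixes are made up for illustration and are not taken from the PR's docstring.

```python
import pandas as pd

# Illustrative data only; not the example used in the docstring under review.
s = pd.Series([1, 2, 3, 4])
df = pd.DataFrame({'A': [1, 2, 3, 4], 'B': [3, 4, 5, 6]})

# On a Series, the row labels (the index) are prefixed.
s_prefixed = s.add_prefix('item_')

# On a DataFrame, the column labels are prefixed; the row index is untouched.
df_prefixed = df.add_prefix('col_')

print(list(s_prefixed.index))     # ['item_0', 'item_1', 'item_2', 'item_3']
print(list(df_prefixed.columns))  # ['col_A', 'col_B']
print(list(df_prefixed.index))    # [0, 1, 2, 3]
```

This is why a single shared docstring has to mention both cases rather than speaking only of "row labels".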
Docstring updated, thanks for the comments!
Codecov Report
@@            Coverage Diff            @@
##           master   #20313    +/-   ##
=========================================
+ Coverage    91.7%   91.73%   +0.02%
=========================================
  Files         150      150
  Lines       49165    49168      +3
=========================================
+ Hits        45087    45102     +15
+ Misses       4078     4066     -12
Thanks. Do you have time to update this and #20315 with similar changes @astrastefania?
pandas/core/generic.py
Outdated
Returns
-------
with_prefix : type of caller
Series
Series or DataFrame
Same type as the calling object, with updated row labels.
pandas/core/generic.py
Outdated
See Also
--------
Series.add_suffix: Add a suffix string to Series items names.
"items names" -> "row labels"
I see your point about "items names". Please check the comment above about whether to include DataFrame or not (if not, I could make the change to "row labels" suggested here).
So anywhere we say "item names", let's say "labels", "row labels", or "column labels".
item_1    2
item_2    3
item_3    4
dtype: int64
Add a blank line and then an example with DataFrame.
pandas/core/generic.py
Outdated
Returns
-------
with_prefix : type of caller
Series or DataFrame
Original Series or DataFrame with updated labels.
This makes it sound a bit like the object is modified inplace. Could you instead say
Series or DataFrame
New Series or DataFrame with updated labels.
Actually, it is modified inplace; should that be specified?
Hmm I don't think so
In [6]: s = pd.Series(1, ['a', 'b'])
In [7]: s
Out[7]:
a    1
b    1
dtype: int64
In [8]: s.add_prefix('foo_')
Out[8]:
foo_a    1
foo_b    1
dtype: int64
In [9]: s
Out[9]:
a    1
b    1
dtype: int64
s is unmodified.
Sorry, I meant it's unmodified! I agree it is not modified inplace.
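The point settled in this exchange can also be checked programmatically. The following is a minimal sketch, assuming pandas is installed; the name `result` is introduced here purely for illustration.

```python
import pandas as pd

s = pd.Series(1, index=['a', 'b'])

# add_prefix returns a new object; it does not relabel `s` itself.
result = s.add_prefix('foo_')

print(result is s)         # False
print(list(s.index))       # ['a', 'b']
print(list(result.index))  # ['foo_a', 'foo_b']
```

This supports wording the Returns section as a "New Series or DataFrame" rather than the "Original" one.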
pandas/core/generic.py
Outdated
1  2  4
2  3  5
3  4  6
>>> df.add_suffix('_item')
This should be the add_prefix example I think :)
Yes!!! 😄
LGTM other than those two minor issues.
@TomAugspurger @datapythonista, if it's all fine in here now I'll proceed to modify #20315 (otherwise I'll update this one first)
Looks great! Just added a tiny commit adjusting the spacing. Thanks!
Checklist for the pandas documentation sprint (ignore this if you are doing
an unrelated PR):
scripts/validate_docstrings.py <your-function-or-method>
git diff upstream/master -u -- "*.py" | flake8 --diff
python doc/make.py --single <your-function-or-method>
Please include the output of the validation script below between the "```" ticks:
If the validation script still gives errors, but you think there is a good reason
to deviate in this case (and there are certainly such cases), please state this
explicitly.