-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
DOC: docstring to series.unique #20474
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
pandas/core/series.py
Outdated
def unique(self): | ||
""" | ||
Return unique values in the object. Uniques are returned in order |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be a single line. Can you simplify? You can use an extended summary for the bit about uniques.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
broken into two lines. and minor rewording.
pandas/core/series.py
Outdated
def unique(self): | ||
""" | ||
Return unique values in the object. Uniques are returned in order | ||
of appearance, this does NOT sort. Hash table-based unique. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure hash table-based unique will be meaningful to many users.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agree but that's the same in pandas.unique. I think it makes sense to people who understand data structure, and for people who don't understand it there is enough clarification around it. It's important to let people to know it's hash table so no sorting.
pandas/core/series.py
Outdated
unique | ||
Index.unique | ||
Series.unique | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add examples?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
aded. and removed self reference in See Also and added descriptions.
Codecov Report
@@ Coverage Diff @@
## master #20474 +/- ##
==========================================
- Coverage 91.85% 91.82% -0.03%
==========================================
Files 152 152
Lines 49231 49248 +17
==========================================
+ Hits 45220 45224 +4
- Misses 4011 4024 +13
Continue to review full report at Codecov.
|
@TomAugspurger comments welcome 👍 |
def unique(self): | ||
""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it might be worth trying to share this doc-string with pd.unique
(at least the examples) no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good point but they have some differences:
- pd.unique takes param and Series.unique doesn't take.
- pd.unique handles 1d array-like objects including Index and Series.unique applies on self.
- pd.unique examples contain more and Series.unique only series examples
do we have pattern somewhere in regards to conditionally show docstring lines?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://github.com/pandas-dev/pandas/pull/20361/files is doing something similar for factorize. It's somewhat complex, since we have pd.unique
, Series/Index.unique
, and Categorical.unique
. I'd be OK with improving the docstring here, and merging theme later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agree, this PR is mainly about improve docstring. and should be another PR synthesising docs of all unique methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's fine too
pandas/core/series.py
Outdated
|
||
See Also | ||
-------- | ||
unique : return unique values of 1d array-like objects. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you include pandas.
for this one. Otherwise it's unclear what's being referenced. The explanation can say "Top-level unique method for any 1-d array-like object."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
pandas/core/series.py
Outdated
|
||
An ordered Categorical preserves the category ordering. | ||
|
||
>>> pd.Series(pd.Categorical(list('baabc'), categories=list('abc'), \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove the \
here, and start the line with ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
pandas/core/series.py
Outdated
>>> pd.Series([pd.Timestamp('20160101') for _ in range(3)]).unique() | ||
array(['2016-01-01T00:00:00.000000000'], dtype='datetime64[ns]') | ||
|
||
>>> pd.Series([pd.Timestamp('20160101', tz='US/Eastern') \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove the \ here, and start the line with ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
pandas/core/series.py
Outdated
Returns | ||
------- | ||
unique values. | ||
- If the input is an Index, the return is an Index |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This part can be simplified, since it is now only the docstring of Series.unique.
So there is also no "input"
In this case it is always an array
, except for categorical dtype.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
@jorisvandenbossche does it look good? |
pandas/core/series.py
Outdated
|
||
Returns | ||
------- | ||
unique values : Series or Categorical |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's never a Series, only an numpy array or Categorical. I would also include something about that like
ndarray or Categorical
The unique values returned as a NumPy array. In case of categorical data type, returned as a Categorical
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
@minggli Thanks a lot! |
Thanks @jorisvandenbossche |
git diff upstream/master -u -- "*.py" | flake8 --diff
seems that no one has picked this up. moving _shared_doc['unique'] used by Series.unique() only to Series.unique.