DOC: docstring to series.unique #20474

Merged
merged 11 commits into from
Mar 27, 2018
24 changes: 0 additions & 24 deletions pandas/core/base.py
@@ -1020,30 +1020,6 @@ def value_counts(self, normalize=False, sort=True, ascending=False,
normalize=normalize, bins=bins, dropna=dropna)
return result

_shared_docs['unique'] = (
"""
Return unique values in the object. Uniques are returned in order
of appearance, this does NOT sort. Hash table-based unique.

Parameters
----------
values : 1d array-like

Returns
-------
unique values.
- If the input is an Index, the return is an Index
- If the input is a Categorical dtype, the return is a Categorical
- If the input is a Series/ndarray, the return will be an ndarray

See Also
--------
unique
Index.unique
Series.unique
""")

@Appender(_shared_docs['unique'] % _indexops_doc_kwargs)
def unique(self):
values = self._values
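
The shared docstring removed above describes a hash-table-based unique that keeps values in order of first appearance and does not sort. As a rough illustration of that idea only (a pure-Python sketch with a hypothetical helper name, not the pandas implementation):

```python
def unique_in_order(values):
    """Return the distinct items of `values` in order of first appearance."""
    seen = set()   # hash-based membership test, O(1) on average
    result = []
    for value in values:
        if value not in seen:
            seen.add(value)
            result.append(value)
    return result


uniques = unique_in_order([2, 1, 3, 3])
print(uniques)  # [2, 1, 3]
```

Because each value is looked up in a hash set rather than compared after sorting, the output order follows the input order.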

46 changes: 45 additions & 1 deletion pandas/core/series.py
@@ -1429,8 +1429,52 @@ def mode(self):
# TODO: Add option for bins like value_counts()
return algorithms.mode(self)

@Appender(base._shared_docs['unique'] % _shared_doc_kwargs)
def unique(self):
"""
Contributor

It might be worth trying to share this docstring with pd.unique (at least the examples), no?

Contributor Author

Good point, but they have some differences:

  • pd.unique takes a parameter; Series.unique takes none.
  • pd.unique handles any 1-d array-like object, including Index, while Series.unique applies to self.
  • pd.unique's examples cover more input types; Series.unique only needs Series examples.

Do we have a pattern somewhere for conditionally showing docstring lines?

Contributor

https://github.com/pandas-dev/pandas/pull/20361/files is doing something similar for factorize. It's somewhat complex, since we have pd.unique, Series/Index.unique, and Categorical.unique. I'd be OK with improving the docstring here, and merging them later.
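
One possible shape for a shared docstring with conditional sections, sketched in plain Python. The names `UNIQUE_DOC`, `with_doc`, and `MySeries` are hypothetical and are not pandas APIs; this is only similar in spirit to the `_shared_docs['unique'] % kwargs` usage in the diff, and it makes no claim about how the linked PR does it.

```python
# Hypothetical template and helper names, for illustration only.
UNIQUE_DOC = """
Return unique values of %(klass)s object.

Uniques are returned in order of appearance; the result is not sorted.
%(extra)s"""


def with_doc(template, **kwargs):
    """Decorator that fills `template` with `kwargs` and sets it as __doc__."""
    def decorator(func):
        func.__doc__ = template % kwargs
        return func
    return decorator


class MySeries:
    # The method form takes no parameters, so the extra section is empty.
    @with_doc(UNIQUE_DOC, klass="Series", extra="")
    def unique(self):
        ...


# The top-level form documents its `values` parameter via `extra`.
@with_doc(UNIQUE_DOC, klass="array-like",
          extra="\nParameters\n----------\nvalues : 1d array-like\n")
def unique(values):
    ...
```

The per-object differences live in the keyword arguments, so the common text is written once.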

Contributor Author

Agreed, this PR is mainly about improving the docstring. There should be another PR consolidating the docs of all the unique methods.

Contributor

that's fine too

Return unique values of Series object.

Uniques are returned in order of appearance. Hash table-based unique,
therefore does NOT sort.

Returns
-------
unique values : Series or Categorical
Member

It's never a Series, only a NumPy array or a Categorical. I would also include something about that, like:

ndarray or Categorical
    The unique values returned as a NumPy array. In case of categorical data type, returned as a Categorical

Contributor Author

done.
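
The return types the reviewer describes can be checked with a short sketch, assuming pandas and NumPy are installed. Newer pandas versions may return other array types for some extension dtypes, so only the two cases named in the comment are shown.

```python
import numpy as np
import pandas as pd

# Plain numeric data: the result is a NumPy array, not a Series.
plain = pd.Series([2, 1, 3, 3]).unique()

# Categorical data: the result is a Categorical.
categorical = pd.Series(pd.Categorical(list("baabc"))).unique()

print(type(plain).__name__)
print(type(categorical).__name__)
```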


See Also
--------
pandas.unique : top-level unique method for any 1-d array-like object.
Index.unique : return Index with unique values from an Index object.

Examples
--------
>>> pd.Series([2, 1, 3, 3], name='A').unique()
array([2, 1, 3])

>>> pd.Series([2] + [1] * 5).unique()
array([2, 1])

>>> pd.Series([pd.Timestamp('20160101') for _ in range(3)]).unique()
array(['2016-01-01T00:00:00.000000000'], dtype='datetime64[ns]')

>>> pd.Series([pd.Timestamp('20160101', tz='US/Eastern')
... for _ in range(3)]).unique()
array([Timestamp('2016-01-01 00:00:00-0500', tz='US/Eastern')],
dtype=object)

An unordered Categorical will return categories in the order of
appearance.

>>> pd.Series(pd.Categorical(list('baabc'))).unique()
[b, a, c]
Categories (3, object): [b, a, c]

An ordered Categorical preserves the category ordering.

>>> pd.Series(pd.Categorical(list('baabc'), categories=list('abc'),
... ordered=True)).unique()
[b, a, c]
Categories (3, object): [a < b < c]
"""
Contributor

Can you add examples?

Contributor Author

Added. Also removed the self-reference in See Also and added descriptions.

result = super(Series, self).unique()

if is_datetime64tz_dtype(self.dtype):