API: add DataFrame.nunique() and DataFrameGroupBy.nunique() #14336
When exploring a data set, I often need to df.apply(pd.Series.nunique) or df.apply(lambda x: x.nunique()). How about adding this as a nunique() method parallel to DataFrame.count()? (count and unique are also the two most basic pieces of information displayed by DataFrame.describe().)

I think there are also use cases for this as a groupby method, for example when checking whether rows that share a candidate primary key differ in their other values:
>>> import pandas as pd
>>> df = pd.DataFrame({'id': ['spam', 'eggs', 'eggs', 'spam', 'ham', 'ham'],
...                    'value1': [1, 5, 5, 2, 5, 5], 'value2': list('abbaxy')})
>>> df
     id  value1 value2
0  spam       1      a
1  eggs       5      b
2  eggs       5      b
3  spam       2      a
4   ham       5      x
5   ham       5      y
>>> df.groupby('id').filter(lambda g: (g.apply(pd.Series.nunique) > 1).any())
     id  value1 value2
0  spam       1      a
3  spam       2      a
4   ham       5      x
5   ham       5      y
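
With the proposed methods, the same check could presumably be written without apply. A minimal sketch, assuming DataFrame.nunique() counts distinct values per column just like Series.nunique():

>>> # sketch only: relies on the proposed DataFrame.nunique()
>>> df.groupby('id').filter(lambda g: (g.nunique() > 1).any())
     id  value1 value2
0  spam       1      a
3  spam       2      a
4   ham       5      x
5   ham       5      y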
Agreed, I think this would be welcome functionality.

Note that these are already defined for Series.

Of course, extending the …

Any news?
just merged.
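
For reference, a quick sketch of what the merged API looks like on the example frame above (output alignment is approximate, and whether the grouping column appears in the groupby result has varied across pandas versions):

>>> df.nunique()                 # distinct values per column; also accepts axis= and dropna=
id        3
value1    3
value2    4
dtype: int64
>>> df.groupby('id').nunique()   # distinct values per column within each group
      value1  value2
id
eggs       1       1
ham        1       2
spam       2       1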
AnkurDedania pushed a commit to AnkurDedania/pandas that referenced this issue on Mar 21, 2017:

closes pandas-dev#14336

Author: Sebastian Bank <[email protected]>

Closes pandas-dev#14376 from xflr6/nunique and squashes the following commits:

a0558e7 [Sebastian Bank] use apply()-kwargs instead of partial, more tests, better examples
c8d3ac4 [Sebastian Bank] extend docs and tests
fd0f22d [Sebastian Bank] add simple benchmarks
5c4b325 [Sebastian Bank] API: add DataFrame.nunique() and DataFrameGroupBy.nunique()
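
One of the commits above adds simple benchmarks. For illustration, an asv-style benchmark for this feature might look roughly like the following sketch (pandas' benchmark suite uses airspeed velocity, which times time_* methods on plain classes; the class and method names here are hypothetical, not taken from the PR):

import numpy as np
import pandas as pd

class FrameNunique:
    def setup(self):
        # moderately sized frame of small integers, so columns have many repeated values
        rng = np.random.RandomState(1234)
        self.df = pd.DataFrame(rng.randint(0, 100, size=(10000, 10)))

    def time_apply_series_nunique(self):
        # the workaround discussed in this issue
        self.df.apply(pd.Series.nunique)

    def time_frame_nunique(self):
        # the newly added method
        self.df.nunique()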