Fix docs in groupby.tail #9333
@@ -997,16 +997,16 @@ def tail(self, n=5):
 Examples
 --------

->>> df = DataFrame([[1, 2], [1, 4], [5, 6]],
+>>> df = DataFrame([['a', 1], ['a', 2], ['b', 1], ['b', 2]],
 columns=['A', 'B'])
 >>> df.groupby('A', as_index=False).tail(1)
I get a different index than your result:
In [9]: df.groupby("A", as_index=False).tail(1)
Out[9]:
A B
1 a 2
3 b 2
In [10]: df.groupby("B", as_index=False).head(1)
Out[10]:
A B
0 a 1
1 a 2
Is that a copy-paste error, or a bug?
You did a groupby on 'B' instead of 'A'.
So the index in the first example is indeed not correct, but the second example is OK (there Tom used the wrong groupby key).
@kidphys ping when you update this.
-0 1 2
-2 5 6
+0 a 2
+2 b 2
I think the index here is not OK:
In [16]: df.groupby('A', as_index=False).tail(1)
Out[16]:
A B
1 a 2
3 b 2
So 1,3 instead of 0,2
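The point about the index can be checked with a short script. This is a minimal sketch that assumes a recent pandas release and reuses the example frame from the diff; the variable names `last_rows` and `first_rows` are only illustrative.

```python
import pandas as pd

# Example frame from the proposed docstring
df = pd.DataFrame([['a', 1], ['a', 2], ['b', 1], ['b', 2]],
                  columns=['A', 'B'])

# tail(1) keeps the last row of each group with its original row label
last_rows = df.groupby('A').tail(1)

# head(1) keeps the first row of each group with its original row label
first_rows = df.groupby('A').head(1)

print(list(last_rows.index))   # labels of the rows kept by tail(1)
print(list(first_rows.index))  # labels of the rows kept by head(1)
```

Because the original row labels are preserved, the rows kept by `tail(1)` carry labels 1 and 3, while `head(1)` keeps labels 0 and 2.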
@jorisvandenbossche you're right, updated.
I just realized that in my case the index is displayed whether as_index is False or True. In the original doc, one of the groupby calls doesn't pass as_index=False either. Is this a bug?
@kidphys head and filter are regarded as 'filter' operations, not 'aggregation' operations, so in that case as_index is ignored (see the note at the end of this section: http://pandas.pydata.org/pandas-docs/stable/groupby.html#filtration). So it is not a bug, but it also shouldn't be in the docs, because a parameter that has no effect can only cause confusion.
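To illustrate the distinction, here is a small sketch, assuming a recent pandas release and the same example frame as in the diff. The variable names are hypothetical. It compares `tail` with and without `as_index=False`, and contrasts that with an aggregation (`sum`), where the flag does change the result.

```python
import pandas as pd

df = pd.DataFrame([['a', 1], ['a', 2], ['b', 1], ['b', 2]],
                  columns=['A', 'B'])

# Filter-style operation: as_index makes no difference to the result
with_flag = df.groupby('A', as_index=False).tail(1)
without_flag = df.groupby('A').tail(1)
print(with_flag.equals(without_flag))

# Aggregation: as_index decides whether 'A' becomes the index or stays a column
agg_indexed = df.groupby('A').sum()
agg_flat = df.groupby('A', as_index=False).sum()
print(list(agg_indexed.columns))
print(list(agg_flat.columns))
```

With an aggregation, the group key moves into the index unless `as_index=False` is passed; with `tail` the original rows come back unchanged either way, which is why the flag is only noise in the docstring example.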
@kidphys so can you remove the |
The old docs are wrong: head() and tail() return the same result. Change the example input so the grouped data is easier to see. Remove the as_index parameter, which has no effect with filters like head().
6b9b829 to ab926b6
Thanks @kidphys