Fix docs in groupby.tail #9333

Merged
merged 1 commit into from
Feb 4, 2015

Conversation

Contributor

@kidphys kidphys commented Jan 22, 2015

The old docs are wrong: the examples for head() and tail() show the same result.
Changed the example input so the grouped data is easier to see.

@@ -997,16 +997,16 @@ def tail(self, n=5):
 Examples
 --------

->>> df = DataFrame([[1, 2], [1, 4], [5, 6]],
+>>> df = DataFrame([['a', 1], ['a', 2], ['b', 1], ['b', 2]],
                    columns=['A', 'B'])
 >>> df.groupby('A', as_index=False).tail(1)

Contributor

I get a different index than your result:

In [9]: df.groupby("A", as_index=False).tail(1)
Out[9]:
   A  B
1  a  2
3  b  2

In [10]: df.groupby("B", as_index=False).head(1)
Out[10]:
   A  B
0  a  1
1  a  2

Is that a copy-paste error, or a bug?

Member

You did a groupby on 'B' instead of 'A'.

Member

So the index in the first example is indeed not correct, but the second example is OK (there Tom used the wrong groupby key).
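
To make the point about the groupby key concrete, here is a minimal sketch using the example frame from this PR. The variable names `by_a` and `by_b` are chosen for illustration only and do not appear in the thread.

```python
import pandas as pd

# Example frame from the updated docstring in this PR.
df = pd.DataFrame([['a', 1], ['a', 2], ['b', 1], ['b', 2]],
                  columns=['A', 'B'])

# Grouping by 'A': the groups are rows {0, 1} and rows {2, 3},
# so head(1) keeps the first row of each group.
by_a = df.groupby('A').head(1)

# Grouping by 'B': the groups are rows {0, 2} and rows {1, 3},
# so head(1) keeps a different pair of rows.
by_b = df.groupby('B').head(1)

print(by_a)
print(by_b)
```

Grouping by 'B' reproduces the rows with index 0 and 1 shown in the comment above, while grouping by 'A' selects index 0 and 2.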

Contributor

jreback commented Jan 25, 2015

@kidphys ping when you update this.

-0 1 2
-2 5 6
+0 a 2
+2 b 2

Member

I think the index here is not OK:

In [16]: df.groupby('A', as_index=False).tail(1)
Out[16]:
   A  B
1  a  2
3  b  2

So 1,3 instead of 0,2
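
A short sketch of the check described here, assuming the example frame from this PR. The names `first_rows` and `last_rows` are illustrative only.

```python
import pandas as pd

# Example frame from the updated docstring in this PR.
df = pd.DataFrame([['a', 1], ['a', 2], ['b', 1], ['b', 2]],
                  columns=['A', 'B'])

# head(1) keeps the first row of each group, tail(1) the last row.
# Both keep the original row labels instead of building a new index.
first_rows = df.groupby('A').head(1)
last_rows = df.groupby('A').tail(1)

print(first_rows.index.tolist())
print(last_rows.index.tolist())
```

With this input, head(1) and tail(1) select different rows, which is what the corrected docstring example needs to show.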

Contributor Author

@jorisvandenbossche you're right, updated.

Contributor Author

I just realized that in my case the index is displayed whether as_index is False or True. In the original doc, one of the groupby calls doesn't have as_index=False either. Is this a bug?

Member

@kidphys head and tail are regarded as 'filter' operations, not 'aggregation' operations, so as_index is ignored for them (see the note at the end of this section: http://pandas.pydata.org/pandas-docs/stable/groupby.html#filtration). So it is not a bug, but it also shouldn't be in the docs, since showing a parameter that has no effect can only cause confusion.
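
The difference between filter-style and aggregation-style operations can be sketched as follows. This is an illustration based on the example frame from this PR, not code from the pandas source, and the variable names are made up for the example.

```python
import pandas as pd

# Example frame from the updated docstring in this PR.
df = pd.DataFrame([['a', 1], ['a', 2], ['b', 1], ['b', 2]],
                  columns=['A', 'B'])

# Filter-style operation: tail() returns a subset of the original rows,
# so as_index makes no difference to the result.
tail_default = df.groupby('A').tail(1)
tail_no_index = df.groupby('A', as_index=False).tail(1)

# Aggregation-style operation: sum() builds one row per group,
# so as_index decides whether 'A' becomes the index or stays a column.
sum_default = df.groupby('A').sum()
sum_no_index = df.groupby('A', as_index=False).sum()

print(tail_default.equals(tail_no_index))
print(sum_default.index.name)
print(list(sum_no_index.columns))
```

Because the two tail() results are identical, dropping as_index=False from the docstring example loses nothing.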

@jorisvandenbossche jorisvandenbossche added this to the 0.16.0 milestone Jan 25, 2015
@jorisvandenbossche
Member

@kidphys so can you remove the as_index=False? And can you then also squash the commits? Apart from that, ready to merge!

The old docs are wrong: the examples for head() and tail() show the same result.
Change the example input so the grouped data is easier to see.
Remove the as_index parameter, which has no effect with filters like head().
@kidphys kidphys force-pushed the groupby.tail.docfix branch from 6b9b829 to ab926b6 Compare January 26, 2015 15:26
shoyer added a commit that referenced this pull request Feb 4, 2015
@shoyer shoyer merged commit 9929973 into pandas-dev:master Feb 4, 2015
Member

shoyer commented Feb 4, 2015

Thanks @kidphys

@kidphys kidphys deleted the groupby.tail.docfix branch February 4, 2015 14:57
5 participants