test needed: groupby(as_index=False, sort=False).aggregate formerly (?) gave unexpected results with a list-like function #18473

Closed
bcdarwin opened this issue Nov 24, 2017 · 4 comments · Fixed by #22467
Labels: Bug, good first issue, Groupby, Testing (pandas testing functions or related to the test suite)

Comments

@bcdarwin

bcdarwin commented Nov 24, 2017

Code Sample, a copy-pastable example if possible

import pandas as pd

df = pd.DataFrame({
  'group' : [str(x) for x in range(12)],
  'file'  : [str(x) for x in range(12)]})
#Out[20]: 
#   file group
#0     0     0
#1     1     1
#2     2     2
#3     3     3
#4     4     4
#5     5     5
#6     6     6
#7     7     7
#8     8     8
#9     9     9
#10   10    10
#11   11    11
df.groupby('group', as_index=False, sort=False).aggregate({ 'file' : lambda file : list(file)})
#Out[21]: 
#   group  file
#0      0   [0]
#1      1   [1]
#2      2  [10]
#3      3  [11]
#4      4   [2]
#5      5   [3]
#6      6   [4]
#7      7   [5]
#8      8   [6]
#9      9   [7]
#10    10   [8]
#11    11   [9]

Problem description

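The aggregate has clearly paired each group with the wrong rows: the `group` column keeps the original row order, but the `file` lists come back in what looks like lexicographically sorted key order (0, 1, 10, 11, 2, ...), as if some sorting of the rows had been incorrectly assumed.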
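
The `file` order in the output above (0, 1, 10, 11, 2, ...) matches a lexicographic sort of the string keys. The short check below uses only the standard library to confirm that ordering. It illustrates the symptom only; it does not show where inside pandas the mismatch arose, and the variable names are chosen here for illustration.

```python
# The group keys are strings, so sorting them is lexicographic, not numeric.
keys = [str(x) for x in range(12)]
sorted_keys = sorted(keys)

# Order of the `file` values as reported in the output above.
reported_file_order = ["0", "1", "10", "11", "2", "3", "4", "5", "6", "7", "8", "9"]

# The reported values line up with the sorted keys, while the `group`
# column stayed in the original (unsorted) order.
print(sorted_keys == reported_file_order)
print(keys == reported_file_order)
```

The first comparison holds and the second does not, which is consistent with the values being computed in sorted-key order and then attached to labels in original order.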

Expected Output

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.2.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-87-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_CA.UTF-8
LOCALE: en_CA.UTF-8

pandas: 0.21.0
pytest: 3.2.2
pip: 9.0.1
setuptools: 28.8.0
Cython: 0.26.1
numpy: 1.13.3
...

@jreback
Contributor

jreback commented Nov 25, 2017

Returning a list is the problematic part, as this works fine:

In [30]: df.groupby('group',as_index=False, sort=False).aggregate({ 'file' : lambda file : file})
Out[30]: 
   group file
0      0    0
1      1    1
2      2    2
3      3    3
4      4    4
5      5    5
6      6    6
7      7    7
8      8    8
9      9    9
10    10   10
11    11   11

If you want to have a look and see whether you can pinpoint where things are going wrong, that would be helpful. Note that aggregating to a list is pretty non-idiomatic.

jreback added this to the 0.22.0 milestone Nov 25, 2017
jreback changed the title from "groupby/aggregate gives unexpected result when sort=False passed to groupby" to "groupby().aggregate gives unexpected result with as_index=False and a returned list-like function" Nov 25, 2017
jreback changed the title from "groupby().aggregate gives unexpected result with as_index=False and a returned list-like function" to "groupby(as_index=False, sort=False).aggregate gives unexpected results with a list-like function" Nov 25, 2017
@TomAugspurger
Contributor

Is this py2 only, or was it fixed in the interim?

In [10]: df.groupby('group', as_index=False, sort=False).aggregate({ 'file' : lambda file : list(file)})
Out[10]:
   group  file
0      0   [0]
1      1   [1]
2      2   [2]
3      3   [3]
4      4   [4]
5      5   [5]
6      6   [6]
7      7   [7]
8      8   [8]
9      9   [9]
10    10  [10]
11    11  [11]

TomAugspurger modified the milestones: 0.23.0, 0.23.1 Apr 25, 2018
@bcdarwin
Author

bcdarwin commented Apr 25, 2018

This was on Python 3.5/3.6. It may (inadvertently or not) be working right now, but perhaps a test case should be added before closing.

TomAugspurger added the Testing, Effort Low, and good first issue labels and removed the Effort Medium label Apr 25, 2018
@TomAugspurger
Contributor

Indeed, we'll want to add a test.
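
A regression test along these lines might look like the sketch below. It reuses the frame from the original report and checks that each group's list contains its own row. The function names are hypothetical, and it uses the public `pd.testing.assert_frame_equal` rather than whatever helper the pandas test suite itself prefers, so treat it as a starting point rather than the test that would actually be merged.

```python
import pandas as pd


def aggregate_files_to_lists(n_groups=12):
    # String keys matter here: "10" and "11" sort before "2" lexicographically,
    # which is what exposed the misalignment in the original report.
    df = pd.DataFrame({
        "group": [str(x) for x in range(n_groups)],
        "file": [str(x) for x in range(n_groups)],
    })
    grouped = df.groupby("group", as_index=False, sort=False)
    return grouped.aggregate({"file": lambda file: list(file)})


def test_agg_list_like_func_keeps_row_alignment():
    # Hypothetical test name; GH 18473.
    result = aggregate_files_to_lists()
    expected = pd.DataFrame({
        "group": [str(x) for x in range(12)],
        "file": [[str(x)] for x in range(12)],
    })
    pd.testing.assert_frame_equal(result, expected)


if __name__ == "__main__":
    test_agg_list_like_func_keeps_row_alignment()
```

Using twelve rows rather than three is deliberate: with fewer than eleven string keys, lexicographic and original order coincide, so the test would pass even if the bug came back.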

bcdarwin changed the title from "groupby(as_index=False, sort=False).aggregate gives unexpected results with a list-like function" to "test needed: groupby(as_index=False, sort=False).aggregate formerly (?) gave unexpected results with a list-like function" Apr 25, 2018
jreback modified the milestones: 0.23.1, 0.23.2 Jun 7, 2018
jreback modified the milestones: 0.23.2, 0.23.3 Jun 26, 2018
jreback modified the milestones: 0.23.4, 0.24.0 Aug 2, 2018
3 participants