pandas 0.16.2 groupby and transform does not properly work with datetime objects #10814

Closed
agolbin opened this issue Aug 13, 2015 · 3 comments
Labels: Bug, Datetime, Duplicate Report, Groupby

Comments

@agolbin

agolbin commented Aug 13, 2015

In 0.16.2, transform returns the original values unchanged, whereas in 0.15.2 it correctly transforms the values into group counts. The problem seems to be in how pandas deals with datetime objects (here, the datetime column used as one of the groupby keys).

The only difference between the two runs is the pandas version; everything else is the same.


import pandas as pd
import datetime
df = pd.DataFrame([[pd.to_datetime(datetime.date(2015,8,10)), 'A', 10],
                   [pd.to_datetime(datetime.date(2015,8,10)), 'A', 20],
                   [pd.to_datetime(datetime.date(2015,8,11)), 'B', 30],
                   [pd.to_datetime(datetime.date(2015,8,10)), 'B', 40],
                  ], columns=['timestamp', 'name', 'value'])
grp = df.groupby(['timestamp', 'name'])['value']
grp.transform(lambda x: x.count())


0.15.2 version

0 2
1 2
2 1
3 1
Name: value, dtype: int64


0.16.2 version

0 10
1 20
2 30
3 40
Name: value, dtype: int64


Pandas versions

INSTALLED VERSIONS

commit: None
python: 3.3.5.final.0
python-bits: 32
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 62 Stepping 4, GenuineIntel
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.16.2
nose: 1.3.4
Cython: 0.22
numpy: 1.9.1
scipy: 0.15.1
statsmodels: 0.6.1
IPython: 2.4.1
sphinx: 1.2.3
patsy: 0.3.0
dateutil: 2.1
pytz: 2014.9
bottleneck: 0.8.0
tables: 3.1.1
numexpr: 2.3.1
matplotlib: 1.4.2
openpyxl: 1.8.5
xlrd: 0.9.3
xlwt: None
xlsxwriter: 0.6.6
lxml: 3.4.2
bs4: 4.3.2
html5lib: 0.999
httplib2: None
apiclient: None
sqlalchemy: 0.9.8
pymysql: None
psycopg2: 2.5.5 (dt dec pq3 ext)

@jreback
Contributor

jreback commented Aug 13, 2015

This is a dupe of #10114, which was closed by #10124. The fix is in master and will be in the forthcoming 0.17.0 release.

@jreback jreback closed this as completed Aug 13, 2015
@jreback jreback added the Bug, Datetime, Groupby, and Duplicate Report labels Aug 13, 2015
@jreback
Contributor

jreback commented Aug 13, 2015

FYI, you should use .transform('count') instead of the lambda, as this is substantially faster.
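
For reference, a minimal sketch of that suggestion, reusing the data from the report. It assumes a pandas version that already includes the fix (0.17.0 or later, per the comment above); the variable names `slow` and `fast` are only illustrative.

```python
import pandas as pd

# Same data as in the report, built column by column.
df = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(
            ["2015-08-10", "2015-08-10", "2015-08-11", "2015-08-10"]
        ),
        "name": ["A", "A", "B", "B"],
        "value": [10, 20, 30, 40],
    }
)
grp = df.groupby(["timestamp", "name"])["value"]

# Original approach: a Python lambda is called once per group.
slow = grp.transform(lambda x: x.count())

# Suggested approach: the string alias dispatches to the built-in count.
fast = grp.transform("count")

# On a fixed version both give the per-group counts, aligned to the rows.
print(fast.tolist())  # [2, 2, 1, 1]
print(slow.equals(fast))
```

The string form avoids calling back into Python for every group, which is where the speed difference comes from on frames with many groups.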

@agolbin
Author

agolbin commented Aug 13, 2015

Thanks!
