Groupby transform idxmax return floats #15306

Closed
jesrael opened this issue Feb 4, 2017 · 3 comments · Fixed by #25531
@jesrael

jesrael commented Feb 4, 2017

transform with idxmax returns the wrong output: floats instead of datetimes.

Sample:

import numpy as np
import pandas as pd

rng = pd.date_range('1/1/2011', periods=15, freq='D')
np.random.seed(4)

stocks = pd.DataFrame({
    'price': (np.random.randn(15).cumsum() + 10)}, index=rng)

stocks['week_id'] = pd.to_datetime(stocks.index).week  # used for the groupby
print(stocks)
                price  week_id
2011-01-01  10.050562       52
2011-01-02  10.550513       52
2011-01-03   9.554604        1
2011-01-04  10.248203        1
2011-01-05   9.829901        1
2011-01-06   8.245324        1
2011-01-07   7.597617        1
2011-01-08   8.196192        1
2011-01-09   8.528442        1
2011-01-10   7.380966        2
2011-01-11   7.999635        2
2011-01-12   7.911648        2
2011-01-13   8.336721        2
2011-01-14   8.668974        2
2011-01-15   7.512158        2
print (stocks.groupby(stocks['week_id'])['price'].transform('idxmax'))
2011-01-01    1.293926e+18
2011-01-02    1.293926e+18
2011-01-03    1.294099e+18
2011-01-04    1.294099e+18
2011-01-05    1.294099e+18
2011-01-06    1.294099e+18
2011-01-07    1.294099e+18
2011-01-08    1.294099e+18
2011-01-09    1.294099e+18
2011-01-10    1.294963e+18
2011-01-11    1.294963e+18
2011-01-12    1.294963e+18
2011-01-13    1.294963e+18
2011-01-14    1.294963e+18
2011-01-15    1.294963e+18
Freq: D, Name: price, dtype: float64

print (stocks.groupby(stocks['week_id'])['price'].idxmax())
week_id
1    2011-01-04
2    2011-01-14
52   2011-01-02
Name: price, dtype: datetime64[ns]
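
For reference, the floats in the transform output look like nanosecond epoch timestamps that were cast to float64. The sketch below assumes that interpretation. Its full-precision literals are inferred from the truncated printout (1.293926e+18 and so on), not copied from it. Under that assumption, casting back to int64 recovers the same dates that the plain idxmax() call reports:

```python
import pandas as pd

# Assumed full-precision versions of the three distinct floats printed above
# (nanoseconds since the Unix epoch, stored as float64).
raw = pd.Series([1.2939264e18, 1.2940992e18, 1.2949632e18])

# Cast back to int64 and let pandas read the integers as nanoseconds,
# which is the default unit for integer input to pd.to_datetime.
decoded = pd.to_datetime(raw.astype("int64"))

print(decoded)
```

If the assumption holds, the dates themselves are intact and only the dtype is wrong.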

SO question.

print (pd.show_versions())

INSTALLED VERSIONS
------------------
commit: None
python: 3.5.1.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 42 Stepping 7, GenuineIntel
byteorder: little
LC_ALL: None
LANG: sk_SK
LOCALE: None.None

pandas: 0.19.2+0.g825876c.dirty
nose: 1.3.7
pip: 8.1.1
setuptools: 20.3
Cython: 0.23.4
numpy: 1.11.0
scipy: 0.17.0
statsmodels: 0.6.1
xarray: None
IPython: 4.1.2
sphinx: 1.3.1
patsy: 0.4.0
dateutil: 2.5.1
pytz: 2016.2
blosc: None
bottleneck: 1.0.0
tables: 3.2.2
numexpr: 2.5.1
matplotlib: 1.5.1
openpyxl: 2.3.2
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: 0.8.4
lxml: 3.6.0
bs4: 4.4.1
html5lib: 0.999
httplib2: None
apiclient: None
sqlalchemy: 1.0.12
pymysql: None
psycopg2: None
jinja2: 2.8
boto: 2.39.0
pandas_datareader: 0.2.1
None
@jreback
Contributor

jreback commented Feb 4, 2017

Yeah, .idxmax() is just defined as a pass-through back to Series, and the result is then coerced. So it needs to be defined directly on GroupBy; xref #15260.

You can currently achieve this like so:

In [16]: stocks.index[stocks.reset_index().groupby(stocks.index.week).price.transform('idxmax').astype(int)]
Out[16]: 
DatetimeIndex(['2011-01-02', '2011-01-02', '2011-01-04', '2011-01-04', '2011-01-04', '2011-01-04', '2011-01-04', '2011-01-04', '2011-01-04', '2011-01-14', '2011-01-14', '2011-01-14', '2011-01-14',
               '2011-01-14', '2011-01-14'],
              dtype='datetime64[ns]', freq=None)
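
An alternative that avoids transform altogether can be sketched as follows. It is not from the thread: it relies on the observation in the report that the plain groupby idxmax() aggregation keeps the datetime dtype, and it broadcasts that result back to the rows with Series.map. The small hand-written frame is illustrative data, not the random sample from the report.

```python
import pandas as pd

# Illustrative data (hypothetical values, same layout as the report).
rng = pd.date_range("2011-01-01", periods=6, freq="D")
stocks = pd.DataFrame(
    {
        "price": [10.0, 10.5, 9.5, 10.2, 9.8, 8.2],
        "week_id": [52, 52, 1, 1, 1, 1],
    },
    index=rng,
)

# One row per week: the index label (a date) where price peaks.
per_week = stocks.groupby("week_id")["price"].idxmax()

# Broadcast the per-week result back onto every row via the week_id column.
peak_dates = stocks["week_id"].map(per_week)

print(peak_dates)
```

Because the lookup goes through the aggregated result, no integer cast or positional indexing is needed.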

@jreback
Contributor

jreback commented Feb 4, 2017

If you'd like to put up a PR, it would be appreciated.

@jreback jreback modified the milestones: 0.20.0, Next Major Release Mar 23, 2017
@dsm054
Contributor

dsm054 commented Nov 12, 2018

@jreback: would it suffice just to add idxmax/min to cython_cast_blacklist?

@rbenes rbenes mentioned this issue Mar 20, 2019
3 tasks
@jreback jreback modified the milestones: Contributions Welcome, 0.25.0 Mar 24, 2019
3 participants