Skip to content

Result of DataFrameGroupBy.apply(lambda x: pd.Series(...)) can have values incorrectly cast from object (string) to pd.Timestamp for a DataFrame with at least one datetime64[ns] dtype column #9505

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
skellys opened this issue Feb 17, 2015 · 2 comments
Labels
Bug Dtype Conversions Unexpected or buggy dtype conversions Duplicate Report Duplicate issue or pull request
Milestone

Comments

@skellys
Copy link

skellys commented Feb 17, 2015

See repro below:

import pandas as pd
df_with_ts = pd.DataFrame(data=[['BL', pd.Timestamp('2015-02-01')], ['TH', pd.Timestamp('2015-02-02')]], columns=['item', 'date'])
res = df_with_ts.groupby(['date']).apply(lambda x: pd.Series(x['item'].unique()[0]))
In []: res                                                                                                                                                            
Out[]:                                                                                                                                                                
                    0                                                                                                                                                   
date                                                                                                                                                                    
2015-02-01        NaT                                                                                                                                                   
2015-02-02 2015-02-17                                                                                                                                                   

In []: res.dtypes                                                                                                                                                     
Out[]:                                                                                                                                                                
0    datetime64[ns]                                                                                                                                                     
dtype: object              

It looks like this is happening because pd.Timestamp('TH') is valid and results in no exceptions. If you replace the string 'TH' with something else then the resulting column's dtype is object, as expected. The order of columns in the above example does not matter.

For comparison, this case works as expected:

df = pd.DataFrame(data=[['BL', '2015-02-01'], ['TH', '2015-02-02']], columns=['item', 'date'])
res = df.groupby(['date']).apply(lambda x: pd.Series(x['item'].unique()[0]))                                                                                                                                            

In []: res                                                                                                                                                            
Out[]:                                                                                                                                                                
             0                                                                                                                                                          
date                                                                                                                                                                    
2015-02-01  BL                                                                                                                                                          
2015-02-02  TH                                                                                          

In []: res.dtypes                                                                                                                                                     
Out[]:                                                                                                                                                                
0    object                                                                                                                                                             
dtype: object                                                                              

output of show_versions() below (using Miniconda):

In []: from pandas.util.print_versions import show_versions                                                                                                            

In []: show_versions()                                                                                                                                                 

INSTALLED VERSIONS                                                                                                                                                      
------------------                                                                                                                                                      
commit: None                                                                                                                                                            
python: 2.7.9.final.0                                                                                                                                                   
python-bits: 64                                                                                                                                                         
OS: Windows                                                                                                                                                             
OS-release: 7                                                                                                                                                           
machine: AMD64                                                                                                                                                          
processor: Intel64 Family 6 Model 60 Stepping 3, GenuineIntel                                                                                                           
byteorder: little                                                                                                                                                       
LC_ALL: None                                                                                                                                                            
LANG: None                                                                                                                                                              

pandas: 0.15.2                                                                                                                                                          
nose: 1.3.4                                                                                                                                                             
Cython: 0.21.2                                                                                                                                                          
numpy: 1.8.2                                                                                                                                                            
scipy: 0.14.0                                                                                                                                                           
statsmodels: 0.5.0                                                                                                                                                      
IPython: 2.3.0                                                                                                                                                          
sphinx: 1.2.3                                                                                                                                                           
patsy: 0.2.1                                                                                                                                                            
dateutil: 2.2                                                                                                                                                           
pytz: 2014.9                                                                                                                                                            
bottleneck: 0.8.0                                                                                                                                                       
tables: 3.1.1                                                                                                                                                           
numexpr: 2.3.1                                                                                                                                                          
matplotlib: 1.4.0                                                                                                                                                       
openpyxl: 2.0.2                                                                                                                                                         
xlrd: 0.9.3                                                                                                                                                             
xlwt: 0.7.5                                                                                                                                                             
xlsxwriter: 0.6.6                                                                                                                                                       
lxml: 3.4.1                                                                                                                                                             
bs4: 4.3.2                                                                                                                                                              
html5lib: 0.999                                                                                                                                                         
httplib2: None                                                                                                                                                          
apiclient: None                                                                                                                                                         
rpy2: None                                                                                                                                                              
sqlalchemy: 0.9.8                                                                                                                                                       
pymysql: None                                                                                                                                                           
psycopg2: None 
@jreback
Copy link
Contributor

jreback commented Feb 17, 2015

this is a duplicate of #9477 ; being addressed and will be in master shortly.

@jreback jreback added Bug Dtype Conversions Unexpected or buggy dtype conversions Duplicate Report Duplicate issue or pull request labels Feb 17, 2015
@jreback jreback added this to the 0.16.0 milestone Feb 17, 2015
@skellys
Copy link
Author

skellys commented Feb 17, 2015

I see; thanks! That was fast :)

@jreback jreback closed this as completed Feb 18, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Dtype Conversions Unexpected or buggy dtype conversions Duplicate Report Duplicate issue or pull request
Projects
None yet
Development

No branches or pull requests

2 participants