Left Join with timedelta64 does not produce correct nulls #5695

Closed
cancan101 opened this issue Dec 13, 2013 · 0 comments · Fixed by #5995
Labels
Bug Reshaping Concat, Merge/Join, Stack/Unstack, Explode Timedelta Timedelta data type

related example: http://stackoverflow.com/questions/20789976/python-pandas-dataframe-1st-line-issue-with-datetime-timedelta/20802902?noredirect=1#comment31195305_20802902

import datetime
import pandas as pd
parms = {'d':  datetime.datetime(2013, 11, 5, 5, 56), 't':datetime.timedelta(0, 22500)}
df = pd.DataFrame(columns=list('dt'))
df = df.append(parms, ignore_index=True)
Erroneous output (the first row's `t` is displayed as a raw nanosecond integer instead of a timedelta):
>>> df.append(parms, ignore_index=True)
                    d               t
0 2013-11-05 05:56:00  22500000000000
1 2013-11-05 05:56:00         6:15:00

Missing values are not handled correctly for timedelta64 columns when performing a left join:

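In [194]:
import numpy as np
import pandas as pd

# Left frame has rows A and B; right frame only has row A, so B has no match.
left = pd.DataFrame(pd.Series([np.timedelta64(300000000), np.timedelta64(300000000)], dtype='m8[ns]', index=["A", "B"]))
right = pd.DataFrame(pd.Series([np.timedelta64(300000000)], dtype='m8[ns]', index=["A"]))
left.join(right, rsuffix='r', how="left").info()

Out[194]: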
<class 'pandas.core.frame.DataFrame'>
Index: 2 entries, A to B
Data columns (total 2 columns):
0     2  non-null values
0r    1  non-null values
dtypes: float64(1), timedelta64[ns](1)

The joined column `0r`, which holds one timedelta64 value and one null, is cast to float64 instead of remaining timedelta64[ns].

This seems incorrect, since NaT can represent a null in a timedelta64 column:

In [196]:
pd.Series([np.timedelta64(300000000), pd.NaT],dtype='m8[ns]')

Out[196]:
0   00:00:00.300000
1               NaT
dtype: timedelta64[ns]
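
The expected behavior can be checked with a small sketch like the one below. It assumes a pandas version that includes the fix referenced in the issue header; the column name `delta` and the suffix `_r` are illustrative choices, not names from the original report. The unmatched row should come back as a null in a column that is still a timedelta dtype, not float64.

```python
import pandas as pd

# Left frame has rows A and B; right frame only has row A.
# "delta" is a hypothetical column name used for illustration.
left = pd.Series(
    [pd.Timedelta(milliseconds=300)] * 2, index=["A", "B"], name="delta"
).to_frame()
right = pd.Series(
    [pd.Timedelta(milliseconds=300)], index=["A"], name="delta"
).to_frame()

# Left join: row B has no match on the right side.
joined = left.join(right, rsuffix="_r", how="left")

# numpy dtype kind "m" means timedelta; a float64 column would report "f".
print(joined.dtypes)
print("right column kind:", joined["delta_r"].dtype.kind)
print("null mask:", joined["delta_r"].isna().tolist())
```

Checking `dtype.kind` rather than the exact dtype string keeps the check independent of the timedelta resolution a given pandas version picks.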