BUG: concat of objects with the same timezone get reset to UTC #7795

Closed
jreback opened this issue Jul 18, 2014 · 6 comments · Fixed by #13660
Labels: Bug, Indexing, Reshaping, Timezones

@jreback
Contributor

jreback commented Jul 18, 2014

http://stackoverflow.com/questions/24830952/changing-timezone-on-pandas-datetimeindex-when-concatenating-dataframes-in-pytho

ix1 = pd.DatetimeIndex(start=pd.Timestamp('20140715', tz='EST5EDT'), end=pd.Timestamp('20140717', tz='EST5EDT'), freq='D', tz='EST5EDT')
ix2 = pd.DatetimeIndex([pd.Timestamp('2014-07-11 00:00:00', tz='EST5EDT'), pd.Timestamp('2014-07-21 00:00:00', tz='EST5EDT')])
df1 = pd.DataFrame(0, index=ix1, columns=['A', 'B'])
df2 = pd.DataFrame(0, index=ix2, columns=['A', 'B'])

I think that because the 2nd index is only length 2, its freq is not inferred, causing a reset to UTC.

In [46]: concat([df1,df2])
Out[46]: 
                           A  B
2014-07-15 04:00:00+00:00  0  0
2014-07-16 04:00:00+00:00  0  0
2014-07-17 04:00:00+00:00  0  0
2014-07-11 04:00:00+00:00  0  0
2014-07-21 04:00:00+00:00  0  0

In [47]: concat([df1,df2]).index
Out[47]: 
<class 'pandas.tseries.index.DatetimeIndex'>
[2014-07-15 04:00:00+00:00, ..., 2014-07-21 04:00:00+00:00]
Length: 5, Freq: None, Timezone: UTC
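
For readers wondering where the 04:00:00+00:00 values come from: they are the same instants as local midnight, only rendered in UTC. Below is a minimal stdlib-only sketch of that arithmetic. It assumes a fixed UTC-4 offset as a stand-in for EST5EDT in July (daylight time); the name `EDT` is illustrative and is not a pandas or pytz object.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC-4 offset stands in for EST5EDT during July (EDT).
EDT = timezone(timedelta(hours=-4), "EDT")

# Local midnight, as in the first row of df1's index.
local_midnight = datetime(2014, 7, 15, 0, 0, tzinfo=EDT)

# The same instant expressed in UTC, which is what the concat result shows.
as_utc = local_midnight.astimezone(timezone.utc)

print(local_midnight.isoformat())  # 2014-07-15T00:00:00-04:00
print(as_utc.isoformat())          # 2014-07-15T04:00:00+00:00

# Aware datetimes compare by instant, so the two values are equal.
print(local_midnight == as_utc)    # True
```

So the data itself is intact after the concat; what is lost is the timezone attached to the index.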
jreback added this to the 0.15.0 milestone Jul 18, 2014
@sinhrks
Member

sinhrks commented Jul 20, 2014

This looks like a more difficult problem than I thought. Because these timezones are distinguished by their DST state, need_utc_convert becomes True. We need to define the condition under which two timezones are regarded as the same.

ix1.tz, ix2.tz
# <DstTzInfo 'EST5EDT' EDT-1 day, 20:00:00 DST>, <DstTzInfo 'EST5EDT' EST-1 day, 19:00:00 STD>
ix1.tz == ix2.tz
# False

https://github.com/pydata/pandas/blob/master/pandas/tseries/index.py#L2061
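
One possible condition, sketched here rather than taken from the pandas source, is to compare timezones by zone name instead of by object equality. pytz tzinfo objects expose a `zone` attribute, so two instances of the same zone with different DST state share that name. To keep the sketch runnable without pytz, `FakeZoneTz` is a hypothetical stand-in class and `same_zone` is a hypothetical helper; neither exists in pandas or pytz.

```python
from datetime import timedelta, tzinfo


class FakeZoneTz(tzinfo):
    """Hypothetical stand-in for a pytz-style tzinfo.

    One named zone can have several instances, one per UTC offset.
    """

    def __init__(self, zone, offset_hours, abbrev):
        self.zone = zone
        self._offset = timedelta(hours=offset_hours)
        self._abbrev = abbrev

    def utcoffset(self, dt):
        return self._offset

    def dst(self, dt):
        return None  # DST amount is not modelled in this sketch

    def tzname(self, dt):
        return self._abbrev


def same_zone(tz1, tz2):
    """Treat two tzinfo objects as the same when they name the same zone."""
    name1 = getattr(tz1, "zone", None)
    name2 = getattr(tz2, "zone", None)
    if name1 is not None and name2 is not None:
        return name1 == name2
    # No zone name available: fall back to ordinary equality.
    return tz1 == tz2


edt = FakeZoneTz("EST5EDT", -4, "EDT")  # like ix1.tz
est = FakeZoneTz("EST5EDT", -5, "EST")  # like ix2.tz

print(edt == est)           # False: distinct instances, as in the comparison above
print(same_zone(edt, est))  # True: both belong to 'EST5EDT'
```

With a check like this, the two indexes would be treated as sharing a timezone and no conversion to UTC would be needed.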

@jreback
Contributor Author

jreback commented Jul 21, 2014

@sinhrks I think this is a creation issue in the way DatetimeIndex figures out tz's. It is using an incorrect definition (I think) for ix2, though I'm not sure why.

@sinhrks
Member

sinhrks commented Jul 21, 2014

Ah, I see. Actually, each creation and inference path results in a different tz.

ix1.tz
# <DstTzInfo 'EST5EDT' EDT-1 day, 20:00:00 DST>

ix2.tz
# <DstTzInfo 'EST5EDT' EST-1 day, 19:00:00 STD>

ix3 = pd.DatetimeIndex(start='2014-07-15', end='2014-07-17', freq='D', tz='EST5EDT')
ix3.tz
# <DstTzInfo 'EST5EDT' EST-1 day, 19:00:00 STD>

@jreback
Contributor Author

jreback commented Sep 9, 2014

@sinhrks PR for this?

@Safrone

Safrone commented Nov 11, 2016

I notice this bug also happens when using the + operator on two DataFrames.

@jreback
Contributor Author

jreback commented Nov 11, 2016

You should file a new issue with a reproducible example then.


3 participants