MultiIndex with dateutil tzlocal data corruption #14106

Closed
dbivolaru opened this issue Aug 28, 2016 · 2 comments
Labels: Duplicate Report (duplicate issue or pull request), MultiIndex, Timezones (timezone data dtype)

Comments

dbivolaru commented Aug 28, 2016

Code Sample, a copy-pastable example if possible

In [1]: %paste
import datetime
from dateutil.tz import tzlocal, tzutc
import pandas

d = {
    datetime.datetime(2016, 8, 26, 20, 23, tzinfo=tzlocal()): {
        'A': 10481.0,
        'B': 12,
        'C': 'text'
    },
    datetime.datetime(2016, 8, 26, 20, 24, tzinfo=tzlocal()): {
        'A': 10480.5,
        'B': 13,
        'C': 'text'
    }
}

df = pandas.DataFrame(d).T
df2 = df.set_index(['C'], append=True)
## -- End pasted text --

In [2]: df
Out[2]: 
                                 A   B     C
2016-08-26 20:23:00+02:00    10481  12  text
2016-08-26 20:24:00+02:00  10480.5  13  text

In [3]: df2
Out[3]: 
                                      A   B
                          C                
2016-08-26 20:23:00+02:00 text    10481  12
1970-01-01 00:00:00+01:00 text  10480.5  13

Expected Output

In [3]: df2
Out[3]: 
                                      A   B
                          C                
2016-08-26 20:23:00+02:00 text    10481  12
2016-08-26 20:24:00+02:00 text  10480.5  13

Note: the issue goes away when tzinfo=tzlocal() is removed, or when tzinfo=tzutc() is used instead. It can also be reproduced by building a MultiIndex() from tuples instead of calling set_index().
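
The tuple-based reproduction mentioned in the note could look roughly like the sketch below. It assumes that "MultiIndex() with tuples" refers to `pandas.MultiIndex.from_tuples`; the names `stamps`, `mi`, `level0`, `years` and `round_trip_ok` are made up for illustration and are not from the report.

```python
import datetime

import pandas
from dateutil.tz import tzlocal

# Same two timestamps as in the report, tagged with dateutil's tzlocal().
stamps = [
    datetime.datetime(2016, 8, 26, 20, 23, tzinfo=tzlocal()),
    datetime.datetime(2016, 8, 26, 20, 24, tzinfo=tzlocal()),
]

# Assumption: build the two-level index directly from (timestamp, 'text') tuples.
mi = pandas.MultiIndex.from_tuples([(ts, 'text') for ts in stamps])

# Read the first level back and compare it with the inputs.
# Aware datetimes compare as absolute instants, so the UTC offset does not matter.
level0 = mi.get_level_values(0)
years = [ts.year for ts in level0]
round_trip_ok = [a == b for a, b in zip(level0, stamps)]

print(years)
print(round_trip_ok)
```

On a pandas version that contains the fix, both years should stay 2016 and every round-trip comparison should be true. On an affected version such as 0.18.0, the second entry would presumably come back as 1970, matching Out[3] above.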

Using pytz works

import datetime
import pytz
import tzlocal
import pandas

local_tz = tzlocal.get_localzone()

d = {
    datetime.datetime(2016, 8, 26, 20, 23, tzinfo=pytz.utc).astimezone(local_tz): {
        'A': 10481.0,
        'B': 12,
        'C': 'text'
    },
    datetime.datetime(2016, 8, 26, 20, 24, tzinfo=pytz.utc).astimezone(local_tz): {
        'A': 10480.5,
        'B': 13,
        'C': 'text'
    }
}

df = pandas.DataFrame(d).T
df2 = df.set_index(['C'], append=True)
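
Since the note above says that tzinfo=tzutc() does not trigger the problem, another possible workaround is to convert the tzlocal() keys to UTC before building the frame. This is only a sketch and has not been checked against pandas 0.18.0; `to_utc`, `local_keys`, `rows` and `utc_level` are hypothetical names introduced here.

```python
import datetime

import pandas
from dateutil.tz import tzlocal, tzutc


def to_utc(ts):
    """Return the same instant expressed with dateutil's tzutc()."""
    return ts.astimezone(tzutc())


local_keys = [
    datetime.datetime(2016, 8, 26, 20, 23, tzinfo=tzlocal()),
    datetime.datetime(2016, 8, 26, 20, 24, tzinfo=tzlocal()),
]
rows = [
    {'A': 10481.0, 'B': 12, 'C': 'text'},
    {'A': 10480.5, 'B': 13, 'C': 'text'},
]

# Key the frame by UTC timestamps instead of tzlocal() ones.
df = pandas.DataFrame(dict(zip(map(to_utc, local_keys), rows))).T
df2 = df.set_index(['C'], append=True)

# The first index level should still hold the two original instants.
utc_level = df2.index.get_level_values(0)
print(list(utc_level))
```

The trade-off is that the index is then displayed in UTC rather than local time, so it would need converting back wherever local wall-clock times are expected.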

output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.5.1.final.0
python-bits: 64
OS: Linux
OS-release: 4.6.6-300.fc24.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.utf8

pandas: 0.18.0
nose: None
pip: 8.0.2
setuptools: 20.1.1
Cython: 0.23.4
numpy: 1.11.0
scipy: 0.16.1
statsmodels: 0.6.1
xarray: None
IPython: 3.2.1
sphinx: None
patsy: 0.4.1
dateutil: 2.5.2
pytz: 2016.6.1
blosc: None
bottleneck: 0.6.0
tables: 3.2.2
numexpr: 2.6.1
matplotlib: 1.5.2rc2
openpyxl: None
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: 2.6.1 (dt dec pq3 ext lo64)
jinja2: 2.8
boto: None

@dbivolaru dbivolaru changed the title MultiIndex with local timestamp data corruption MultiIndex with dateutil tzlocal data corruption Aug 28, 2016

jreback commented Aug 28, 2016

Duplicate of, and fixed by, #13583; the fix will be in master/0.19.0 (soon).

In [19]: df
Out[19]: 
                                 A   B     C
2016-08-26 20:23:00-04:00    10481  12  text
2016-08-26 20:24:00-04:00  10480.5  13  text

In [20]: df2
Out[20]: 
                                      A   B
                          C                
2016-08-26 20:23:00-04:00 text    10481  12
2016-08-26 20:24:00-04:00 text  10480.5  13

@jreback jreback closed this as completed Aug 28, 2016
@jreback jreback added Duplicate Report Duplicate issue or pull request Timezones Timezone data dtype MultiIndex labels Aug 28, 2016
@jreback jreback added this to the No action milestone Aug 28, 2016
dbivolaru (Author) commented

Thanks @jreback
