Round trip through HDF5 with format=table and localized DatetimeIndex discards index name #13884


Closed
jzwinck opened this issue Aug 2, 2016 · 2 comments
Labels
Bug · Indexing (related to indexing on series/frames, not to indexes themselves) · IO HDF5 (read_hdf, HDFStore)

@jzwinck
Contributor

jzwinck commented Aug 2, 2016

This should work, but the assert fails:

import pandas as pd

df = pd.DataFrame({'a': [1,2,3]})
df.index = pd.DatetimeIndex([1234567890123456787, 1234567890123456788, 1234567890123456789])
df.index = df.index.tz_localize('UTC') # with this line, the index name is lost on read-back
df.index.name = 'expected'
df.to_hdf('hello.h5', 'world', format='table')

df2 = pd.read_hdf('hello.h5', 'world')
assert df2.index.name == df.index.name, "HDF5 stored name not expected: {}".format(df2.index.name)

It works fine if you don't localize the DatetimeIndex, or if you don't use format='table'.

The index name "expected" is actually stored in the "info" attribute inside the HDF5 file whether it's localized or not. But the format is slightly different. If not localized:

       (dp7
       Vindex_name
       p8
       Vexpected

If localized:

       (tRp10
       sVindex_name
       p11
       Vexpected

I don't know enough to say whether the bug is in read_hdf(), to_hdf(), or PyTables.

I'm using Pandas 0.18.1.
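
The two "info" fragments quoted above look like protocol-0 pickle text, and the difference between them may simply be that a timezone object is serialized just before the index_name key. The following is a minimal sketch of that idea using only the standard library. It assumes protocol 0 (inferred from the look of the fragments, not confirmed from PyTables) and uses datetime.timezone.utc as a stand-in for the pytz UTC object, so the exact bytes will not match the file contents.

```python
import pickle
from datetime import timezone

# Stand-ins for the 'index' entry of the HDF5 "info" attribute.
# Assumption: datetime.timezone.utc substitutes for the pytz UTC object.
naive_info = {"index_name": "expected"}
localized_info = {"tz": timezone.utc, "index_name": "expected"}

# Protocol 0 is the text-based pickle format; it is assumed here because
# the fragments in the issue resemble protocol-0 opcodes.
naive_bytes = pickle.dumps(naive_info, protocol=0)
localized_bytes = pickle.dumps(localized_info, protocol=0)

# The key itself is written the same way in both cases.
print(b"Vindex_name" in naive_bytes, b"Vindex_name" in localized_bytes)

# In the localized case, the SETITEM opcode ("s") that closes the preceding
# "tz" entry sits directly in front of the key, giving "sVindex_name".
print(b"sVindex_name" in localized_bytes)

# Either way, the name can be read back from the pickled dict.
restored = pickle.loads(localized_bytes)
print(restored["index_name"])
```

If this reading is right, the name is stored equally well in both cases, which would point at the read path rather than at how the attribute is written.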

@jreback
Contributor

jreback commented Aug 2, 2016

I thought there was already an issue about this, but I can't seem to find it.

So the attribute is saved. It must not be getting set on read-back somehow.

In [38]: store.root.world._v_attrs
Out[38]:
/world._v_attrs (AttributeSet), 15 attributes:
   [CLASS := 'GROUP',
    TITLE := '',
    VERSION := '1.0',
    data_columns := [],
    encoding := 'UTF-8',
    index_cols := [(0, 'index')],
    info := {1: {'type': 'Index', 'names': [None]}, 'index': {'tz': <UTC>, 'index_name': 'expected'}},
    levels := 1,
    metadata := [],
    nan_rep := 'nan',
    non_index_axes := [(1, ['a'])],
    pandas_type := 'frame_table',
    pandas_version := '0.15.2',
    table_type := 'appendable_frame',
    values_cols := ['values_block_0']]
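
Since the name is present in the info attribute, one stopgap until the read path is fixed could be to copy it back onto the frame by hand. The sketch below assumes an info dict shaped like the one printed above. The helper name restore_index_name is hypothetical (not part of pandas), and datetime.timezone.utc stands in for the <UTC> object.

```python
from datetime import timezone

import pandas as pd


def restore_index_name(df, info, axis_key="index"):
    """Return df with its index name filled in from an HDFStore-style info dict.

    Hypothetical helper: `info` is assumed to look like the `info` attribute
    shown above, e.g. {'index': {'tz': ..., 'index_name': 'expected'}}.
    """
    stored_name = info.get(axis_key, {}).get("index_name")
    if df.index.name is None and stored_name is not None:
        # rename_axis returns a new frame, so the input is left untouched.
        return df.rename_axis(stored_name)
    return df


# Mimic the frame as it comes back from read_hdf: tz-aware index, no name.
index = pd.DatetimeIndex(
    [1234567890123456787, 1234567890123456788, 1234567890123456789]
).tz_localize("UTC")
df2 = pd.DataFrame({"a": [1, 2, 3]}, index=index)

# Mimic the stored attribute; timezone.utc is a stand-in for <UTC>.
info = {
    1: {"type": "Index", "names": [None]},
    "index": {"tz": timezone.utc, "index_name": "expected"},
}

fixed = restore_index_name(df2, info)
print(fixed.index.name)
```

With a real file, the dict would presumably come from something like store.get_storer('world').attrs.info on an open HDFStore; that access path is an assumption here and is not exercised by the sketch.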

@jreback
Contributor

jreback commented Aug 2, 2016

@jzwinck HDFStore is a metadata layer on top of PyTables. Pull requests are welcome.

jreback added the Bug, Indexing, Difficulty Novice, and IO HDF5 labels on Aug 2, 2016
jreback added this to the Next Major Release milestone on Aug 2, 2016
jreback modified the milestones: 0.19.0, Next Major Release on Aug 3, 2016
jreback closed this as completed in 9c1e738 on Aug 4, 2016