BUG: read_column did not preserve UTC tzinfo #7790

Merged 1 commit on Jul 22, 2014

alorenzo175
Contributor

BUG: Fixes #7777, HDFStore.read_column did not preserve timezone information
when fetching a DatetimeIndex column with tz=UTC

alorenzo175 changed the title from "read_column did not preserve UTC tzinfo" to "BUG: read_column did not preserve UTC tzinfo" on Jul 18, 2014
jreback added this to the 0.15.0 milestone on Jul 18, 2014
@@ -3440,6 +3442,20 @@ def read_column(self, column, where=None, start=None, stop=None, **kwargs):
# column must be an indexable or a data column
c = getattr(self.table.cols, column)
a.set_info(self.info)

Contributor

better to refactor the tz handling here: https://github.com/pydata/pandas/blob/master/pandas/io/pytables.py#L1467
then I think you can directly reuse it

@alorenzo175
Contributor Author

Hmm, I'm not quite sure that will work. self.values probably needs to remain an Index, and when a Series is constructed from an Index, it calls _to_embed, which removes UTC tzinfo. That's why I work around _to_embed by constructing the Series from a list instead of an Index.

@jreback
Contributor

jreback commented Jul 18, 2014

that's not what I mean; refactor it out as a function which you can call from both places, rather than duplicating the code

@alorenzo175
Contributor Author

You mean something like the following that can be called in convert and when returning the series (with preserve_UTC = True)?

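def _set_tz(values, tz, preserve_UTC=False):
    # only an Index with a stored tz needs handling; anything else passes through
    if tz is not None and isinstance(values, Index):
        tz = _ensure_decoded(tz)
        # naive values are treated as UTC: localize to UTC, then convert to tz
        if values.tz is None:
            values = values.tz_localize('UTC').tz_convert(tz)
        # a list (unlike an Index) keeps UTC tzinfo when a Series is built from it
        if preserve_UTC and tz == pytz.utc:
            values = list(values)

    return values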
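
As an aside on the localize-then-convert step in the proposal above, here is a minimal standard-library sketch of the same idea: naive timestamps are treated as UTC and then converted to the target zone. It does not use pandas; `localize_stored` and the fixed-offset `eastern_std` zone are hypothetical names chosen for illustration only.

```python
from datetime import datetime, timedelta, timezone


def localize_stored(values, tz):
    """Hypothetical helper: treat naive datetimes as UTC, then convert to tz."""
    if tz is None:
        # No timezone recorded: hand the values back unchanged.
        return list(values)
    out = []
    for v in values:
        if v.tzinfo is None:
            # Attach UTC without shifting the wall-clock time.
            v = v.replace(tzinfo=timezone.utc)
        # Convert to the target zone; the instant itself is unchanged.
        out.append(v.astimezone(tz))
    return out


# Assumption: a fixed -05:00 offset stands in for a real zone (no DST rules).
eastern_std = timezone(timedelta(hours=-5), "EST")
stored = [datetime(2014, 7, 18, 12, 0)]
converted = localize_stored(stored, eastern_std)
print(converted[0].isoformat())  # 2014-07-18T07:00:00-05:00
```

The converted value still compares equal to noon UTC; only the displayed offset changes.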

@jreback
Contributor

jreback commented Jul 18, 2014

yep

though I think you can simply do tz == 'UTC' (which works for dateutil and pytz)

@jreback
Contributor

jreback commented Jul 18, 2014

cc @dbew

@alorenzo175
Contributor Author

You can't do tz == 'UTC', but you can do tslib.get_timezone(tz) == 'UTC'.
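
To illustrate why a direct comparison with the string fails, here is a rough standard-library sketch. A tzinfo object is not a string, so `tz == 'UTC'` is False even for a UTC zone. One option is to compare a property of the zone instead. `is_utc` is a hypothetical helper, not a pandas function, and it is not meant to reproduce what `tslib.get_timezone` does internally.

```python
from datetime import datetime, timedelta, timezone


def is_utc(tz):
    """Hypothetical helper: True if tz has a zero offset and is named 'UTC'."""
    if tz is None:
        return False
    # Any datetime works as a probe for a fixed-offset zone.
    probe = datetime(2014, 7, 18)
    return tz.utcoffset(probe) == timedelta(0) and tz.tzname(probe) == "UTC"


utc = timezone.utc
print(utc == "UTC")  # False: a tzinfo object never equals a plain string
print(is_utc(utc))   # True
print(is_utc(timezone(timedelta(hours=-5))))  # False
```

Checking the name and the offset together avoids treating a zone that merely has a zero offset at the probe date as UTC.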

@jreback
Contributor

jreback commented Jul 18, 2014

@alorenzo175 ok, gr8!

hmm maybe should document that somewhere .....

@dbew
Contributor

dbew commented Jul 21, 2014

That looks good to me and the tests are sensible.

@jreback
Contributor

jreback commented Jul 21, 2014

@alorenzo175 this looks good. Just need a release note in v0.15.0 (reference the original issue in the bug fix section), and then I think it's good to go

@alorenzo175
Contributor Author

@jreback should be good to go now

@jreback
Contributor

jreback commented Jul 22, 2014

perfect, now just squash to a single commit and then I can merge, see here: https://github.com/pydata/pandas/wiki/Using-Git

@alorenzo175
Contributor Author

squashed

jreback added a commit that referenced this pull request Jul 22, 2014
BUG: read_column did not preserve UTC tzinfo
jreback merged commit a0a25c3 into pandas-dev:master on Jul 22, 2014

@jreback
Contributor

jreback commented Jul 22, 2014

@alorenzo175 thanks for the fix!

alorenzo175 deleted the pytables_index_tzutc branch on July 22, 2014 15:26
Labels
Bug; IO HDF5 (read_hdf, HDFStore); Timezones (Timezone data dtype)
Development

Successfully merging this pull request may close these issues.

BUG: select_column not preserving a UTC timezone
3 participants