BUG Decode to UTF-8 the dtype string read from a hdf file #31756
minor nit on whatsnew otherwise lgtm
Fixes GH31750. The dtype value wasn't being decoded as `UTF-8` when reading a DataFrame from an HDF file. This was a problem when reading an HDF file that was created from Python 2 with a fixed format: the dtype was read as `b'datetime'` instead of `datetime`, which caused `HDFStore` to read the data as `int64` instead of coercing it to the correct `datetime64` dtype.
move doc to right file
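To illustrate the failure mode described above, here is a minimal sketch in plain Python. It is not the code from this PR: `decode_attr` and `needs_datetime_coercion` are hypothetical helpers, and the only assumption is that the stored attribute may arrive either as `bytes` (a file written from Python 2) or as `str`.

```python
def decode_attr(value, encoding="utf-8"):
    """Return value as str, decoding it first if it arrived as bytes.

    Hypothetical helper for illustration only.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


def needs_datetime_coercion(raw_dtype):
    """Decide whether stored values should be coerced to a datetime dtype.

    Hypothetical check: compares the decoded attribute to "datetime".
    """
    return decode_attr(raw_dtype) == "datetime"


raw = b"datetime"  # what a file written from Python 2 can yield
text = "datetime"  # what the reader expects

# In Python 3, bytes and str never compare equal, so a check against the
# undecoded value silently fails and the data is left as integers.
undecoded_matches = raw == text

# After decoding, both forms of the attribute are recognised.
decoded_matches = needs_datetime_coercion(raw)
```

With the undecoded value the comparison is false, so the datetime branch is skipped, which matches the `int64` result reported in the issue. Decoding first makes the same check succeed for both `bytes` and `str` input.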
lgtm @jreback
lgtm. thanks @pedroreys
note this might close some older issues, if you'd have a look. (comment on the issue if this is the case; we might need additional validation tests).
thanks
thanks folks
@jreback to get this fix backported to 1.0.2, do I need to open a second PR targeting that branch, or is it going to get picked up automatically by that backport bot?
we don’t backport regular bug fixes
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff