REGR: read_pickle fallback to encoding=latin_1 upon a UnicodeDecodeError #32055

Merged (4 commits) on Feb 21, 2020

Conversation

pedroreys
Contributor

When reading a pickle with MultiIndex columns that was created in py27,
`pickle_compat.load()` with `encoding=None` would throw a UnicodeDecodeError.
Now, `read_pickle` catches that exception and falls back to using `latin-1` explicitly.
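
For readers following along, here is a minimal sketch of the fallback pattern described above, using only the standard library `pickle` module. It is not the pandas implementation: the helper name `load_with_latin1_fallback` and the hand-built byte string are assumptions made for illustration, and the real change lives in `read_pickle` and `pickle_compat.load()`.

```python
import pickle

# Hand-built protocol-2 payload resembling what py27 writes for the
# 8-bit str "\xe9" (illustrative bytes, not taken from the PR's test data).
PY2_STYLE_PICKLE = b"\x80\x02U\x01\xe9q\x00."


def load_with_latin1_fallback(data: bytes):
    """Hypothetical helper: try the default decoding first, then latin-1."""
    try:
        # Default encoding is ASCII, which fails on non-ASCII py27 str bytes.
        return pickle.loads(data)
    except UnicodeDecodeError:
        # latin-1 maps every byte to a code point, so this retry cannot
        # fail with a UnicodeDecodeError.
        return pickle.loads(data, encoding="latin-1")


print(load_with_latin1_fallback(PY2_STYLE_PICKLE))
```

Because latin-1 accepts any byte sequence, the retry always decodes, but text that was originally stored as UTF-8 bytes in py27 may come back garbled rather than raising.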

Member

simonjayhawkins left a comment

Thanks @pedroreys generally looks good. couple of suggestions.

simonjayhawkins added the "IO Pickle" (read_pickle, to_pickle) and "Regression" (functionality that used to work in a prior pandas version) labels on Feb 18, 2020
simonjayhawkins added this to the 1.0.2 milestone on Feb 18, 2020
Member

WillAyd left a comment

Looks good; question on the whatsnew change

Member

simonjayhawkins left a comment

@pedroreys thanks for the changes. lgtm ex @WillAyd comment

simonjayhawkins changed the title from "BUG: read_pickle fallback to encoding=latin_1 upon a UnicodeDecodeError" to "REGR: read_pickle fallback to encoding=latin_1 upon a UnicodeDecodeError" on Feb 20, 2020
Clean up the code so that it has only a single catch for
UnicodeDecodeError
@jbrockmendel
Member

LGTM

@jorisvandenbossche
Member

Thanks @pedroreys !

simonjayhawkins pushed a commit that referenced this pull request Feb 21, 2020
roberthdevries pushed a commit to roberthdevries/pandas that referenced this pull request Mar 2, 2020
…ror (pandas-dev#32055)

Labels
IO Pickle (read_pickle, to_pickle), Regression (functionality that used to work in a prior pandas version)
Development

Successfully merging this pull request may close these issues.

Error in read_pickle when loading a DataFrame with MultiIndex columns from a pickle created in py27
5 participants