Skip to content

COMPAT: remove Categorical pickle compat with Pandas < 0.16 #27538

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Sep 19, 2019

Conversation

topper-123
Copy link
Contributor

@topper-123 topper-123 commented Jul 23, 2019

  • closes #xxxx
  • tests added / passed
  • passes black pandas
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry

Removes Categorical pickle compat with Pandas < 0.16.

I've added a categorical.pickle file to replace the deprecated file. It was create like this:

>>> cat = pd.Categorical(['a', 'b', 'c', 'd'])
>>> pickle.dump(cat, open(<filepath>, 'wb'))

@topper-123 topper-123 added Categorical Categorical Data Type IO Data IO issues that don't fit into a more specific label Compat pandas objects compatability with Numpy or Python functions labels Jul 23, 2019
@topper-123 topper-123 force-pushed the Categorical_pickle_compat branch from 8a8a650 to bcf842d Compare July 23, 2019 09:06
@topper-123 topper-123 closed this Jul 23, 2019
@jbrockmendel
Copy link
Member

why close? ive been looking forward to seeing these big chunks of pickle compat code removed

@topper-123
Copy link
Contributor Author

Sorry, I saw there were errors, and don't have time today to figure them out and didn't want to leave a failing PR open too long.

I'll open the PR again, when I figure out the problem.

@topper-123 topper-123 reopened this Aug 17, 2019
@topper-123 topper-123 force-pushed the Categorical_pickle_compat branch from bcf842d to 890a902 Compare August 17, 2019 20:16
@topper-123
Copy link
Contributor Author

Reopened and tests passed.

@topper-123 topper-123 force-pushed the Categorical_pickle_compat branch 2 times, most recently from 37e5881 to 68ef5a1 Compare August 17, 2019 21:13
@topper-123
Copy link
Contributor Author

@jbrockmendel , agree with these changes?

@@ -222,7 +222,7 @@ def test_read_expands_user_home_dir(
(pd.read_sas, "os", ("io", "sas", "data", "test1.sas7bdat")),
(pd.read_json, "os", ("io", "json", "data", "tsframe_v012.json")),
(pd.read_msgpack, "os", ("io", "msgpack", "data", "frame.mp")),
(pd.read_pickle, "os", ("io", "data", "categorical_0_14_1.pickle")),
(pd.read_pickle, "os", ("io", "data", "categorical.pickle")),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should there be a version number on this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that will probably be easier for future maintenance. Will add it.

@jbrockmendel
Copy link
Member

Small question, otherwise LGTM

@topper-123 topper-123 force-pushed the Categorical_pickle_compat branch from 68ef5a1 to f5f98b9 Compare August 19, 2019 19:19
@topper-123 topper-123 added this to the 1.0 milestone Aug 19, 2019
@topper-123
Copy link
Contributor Author

I’ve renamed the file.

@topper-123
Copy link
Contributor Author

@jbrockmendel ? Ok to pull in?

@jbrockmendel
Copy link
Member

pickle compat is way outside my area of expertise. @jorisvandenbossche?

else:
state["_ordered"] = False

# 0.21.0 CategoricalDtype change
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is there a 025 pickle
file included? this is covered by the legacy pickle dukes no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test test_read_fspath_all tests if reading custom file paths (CustomFS) returns an expected value. Seems reasonable that a pickle are included for that testing. Adding 0.25 in the name is not strictly necessary, but can possibly be handy info to have.

Probably this test could instead test if temp files are read correctly back in, but that would be for another PR, IMO.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we already have round trip categorical pickle tests

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't mean test round-tripping, but use roundtripping in this test, rather than keeping a file in /data.

Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. I don't have a problem with the inclusion of the 0.25 file but am indifferent if we want to do away with it as well

@@ -67,6 +67,8 @@ Removal of prior version deprecations/changes
- :meth:`pandas.Series.str.cat` does not accept list-likes *within* list-likes anymore (:issue:`27611`)
- Removed the previously deprecated :meth:`ExtensionArray._formatting_values`. Use :attr:`ExtensionArray._formatter` instead. (:issue:`23601`)
- Removed the previously deprecated ``IntervalIndex.from_intervals`` in favor of the :class:`IntervalIndex` constructor (:issue:`19263`)
- Ability to read pickles containing :class:`Categorical` instances created with pre-0.16 version of pandas has been removed (:issue:`27538`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The whatsnew for v0.25.0 already states that we dropped pickle support prior to 0.20.3 so this isn't necessary, though I guess doesn't hurt to include

Copy link
Contributor Author

@topper-123 topper-123 Aug 28, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO it is ok to include this, given that something won't work that previously did work...

@WillAyd
Copy link
Member

WillAyd commented Sep 17, 2019

@topper-123 can you merge master on this to be safe? If no other feedback in 2 days let's get this merged

@topper-123 topper-123 force-pushed the Categorical_pickle_compat branch from f5f98b9 to 79663fb Compare September 17, 2019 18:07
@topper-123
Copy link
Contributor Author

I've rebased.

@WillAyd WillAyd merged commit 3f8c0c4 into pandas-dev:master Sep 19, 2019
@WillAyd
Copy link
Member

WillAyd commented Sep 19, 2019

Thanks @topper-123

@jbrockmendel
Copy link
Member

There are other <0.16 pickle tests out there. Can we get rid of those too? tests.tseries.test_offsets.test_pickle_v0_15_2 is a blocker for cythonizing DateOffset.

@WillAyd
Copy link
Member

WillAyd commented Sep 19, 2019

I think so - we are only guaranteed backwards compatible to pandas 0.20.3 as of the 0.25 release

see #27082

@topper-123 topper-123 deleted the Categorical_pickle_compat branch September 19, 2019 19:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Categorical Categorical Data Type Compat pandas objects compatability with Numpy or Python functions IO Data IO issues that don't fit into a more specific label
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants