CLN: Remove pickle support pre-pandas 1.0 #57155

Closed
wants to merge 36 commits into from
Changes from 32 commits
Commits
109ae02
Remove old pickle compat
mroeschke Apr 13, 2023
0613b0e
more pickle cleanup
mroeschke Apr 13, 2023
ab12e05
Remove pre-1.0 pickle compat
mroeschke Apr 13, 2023
ae6a732
add note about compat version
mroeschke Apr 13, 2023
b40877c
Merge remote-tracking branch 'upstream/main' into cln/old_pickle_compat
mroeschke May 15, 2023
cdd1bc4
Merge remote-tracking branch 'upstream/main' into cln/old_pickle_compat
mroeschke May 24, 2023
c9bf160
Merge remote-tracking branch 'upstream/main' into cln/old_pickle_compat
mroeschke Jun 1, 2023
c405215
Merge remote-tracking branch 'upstream/main' into cln/old_pickle_compat
mroeschke Jun 20, 2023
14a6efd
Merge remote-tracking branch 'upstream/main' into cln/old_pickle_compat
mroeschke Jun 29, 2023
cea32bd
Merge remote-tracking branch 'upstream/main' into cln/old_pickle_compat
mroeschke Jul 7, 2023
f0e6d28
Merge remote-tracking branch 'upstream/main' into cln/old_pickle_compat
mroeschke Jul 24, 2023
515f3ef
Merge remote-tracking branch 'upstream/main' into cln/old_pickle_compat
mroeschke Jul 31, 2023
93d5d5f
Fix note
mroeschke Jul 31, 2023
2542697
Merge remote-tracking branch 'upstream/main' into cln/old_pickle_compat
mroeschke Aug 4, 2023
972f1c1
Merge remote-tracking branch 'upstream/main' into cln/old_pickle_compat
mroeschke Aug 10, 2023
ac357bc
Merge remote-tracking branch 'upstream/main' into cln/old_pickle_compat
mroeschke Aug 18, 2023
81c45ee
Merge remote-tracking branch 'upstream/main' into cln/old_pickle_compat
mroeschke Aug 25, 2023
71a4f7f
Merge remote-tracking branch 'upstream/main' into cln/old_pickle_compat
mroeschke Aug 28, 2023
fe250eb
Change compat function
mroeschke Aug 28, 2023
c01136c
Merge remote-tracking branch 'upstream/main' into cln/old_pickle_compat
mroeschke Sep 11, 2023
0bdf8d8
Merge remote-tracking branch 'upstream/main' into cln/old_pickle_compat
mroeschke Oct 12, 2023
3b0ac12
Merge remote-tracking branch 'upstream/main' into cln/old_pickle_compat
mroeschke Oct 19, 2023
ca52974
Merge remote-tracking branch 'upstream/main' into cln/old_pickle_compat
mroeschke Oct 30, 2023
6bf0d9c
Merge remote-tracking branch 'upstream/main' into cln/old_pickle_compat
mroeschke Nov 6, 2023
8a4b9fe
Merge remote-tracking branch 'upstream/main' into cln/old_pickle_compat
mroeschke Nov 17, 2023
dd98d35
Merge remote-tracking branch 'upstream/main' into cln/old_pickle_compat
mroeschke Nov 21, 2023
a29eb9d
Merge remote-tracking branch 'upstream/main' into cln/old_pickle_compat
mroeschke Dec 4, 2023
2f28a8c
Merge remote-tracking branch 'upstream/main' into cln/old_pickle_compat
mroeschke Dec 11, 2023
2f4c78c
Merge remote-tracking branch 'upstream/main' into cln/old_pickle_compat
mroeschke Dec 27, 2023
5670a89
Merge remote-tracking branch 'upstream/main' into cln/old_pickle_compat
mroeschke Jan 30, 2024
e583478
Add whatsnew
mroeschke Jan 30, 2024
c6415fd
Add number
mroeschke Jan 30, 2024
368562f
Merge remote-tracking branch 'upstream/main' into cln/old_pickle_compat
mroeschke Feb 1, 2024
49d0613
Remove old pickles
mroeschke Feb 1, 2024
1daac37
Merge remote-tracking branch 'upstream/main' into cln/old_pickle_compat
mroeschke Feb 1, 2024
05feac6
fix tests, remove 27 files
mroeschke Feb 1, 2024
2 changes: 1 addition & 1 deletion doc/source/whatsnew/v3.0.0.rst
Expand Up @@ -85,7 +85,7 @@ See :ref:`install.dependencies` and :ref:`install.optional_dependencies` for mor
Other API changes
^^^^^^^^^^^^^^^^^
- 3rd party ``py.path`` objects are no longer explicitly supported in IO methods. Use :py:class:`pathlib.Path` objects instead (:issue:`57091`)
-
- pickled objects from pandas version less than ``1.0.0`` are no longer supported (:issue:`57155`)

.. ---------------------------------------------------------------------------
.. _whatsnew_300.deprecations:
Expand Down
4 changes: 2 additions & 2 deletions pandas/_libs/tslibs/nattype.pyx
Expand Up @@ -76,7 +76,7 @@ cdef _nat_rdivide_op(self, other):
return NotImplemented


def __nat_unpickle(*args):
def _nat_unpickle(*args):
# return constant defined in the module
return c_NaT

Expand Down Expand Up @@ -360,7 +360,7 @@ class NaTType(_NaT):
return self.__reduce__()

def __reduce__(self):
return (__nat_unpickle, (None, ))
return (_nat_unpickle, (None, ))

def __rtruediv__(self, other):
return _nat_rdivide_op(self, other)
Expand Down
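The PR does not say why `__nat_unpickle` was renamed to `_nat_unpickle`. One general Python behavior that affects double-underscore names is still worth keeping in mind: inside a plain Python class body, an identifier that starts with two underscores is name-mangled, so a method that refers to a module-level `__helper` actually looks up `_ClassName__helper`. The sketch below shows this in pure Python only; Cython's handling may differ, and `Demo`, `__helper`, and `_helper` are hypothetical names, not pandas code.

```python
def __helper():
    # Module-level function with a double-underscore prefix.
    return "ok"


def _helper():
    # Same function body, single-underscore prefix.
    return "ok"


class Demo:
    def call_dunder(self):
        # Inside the class body this name is compiled as `_Demo__helper`,
        # which does not exist at module level.
        return __helper()

    def call_single(self):
        # Single-underscore names are not mangled.
        return _helper()


demo = Demo()
single_result = demo.call_single()

try:
    demo.call_dunder()
    dunder_error = None
except NameError as err:
    dunder_error = str(err)
```

A single leading underscore keeps the "private" signal without triggering mangling, which makes the helper easier to reference from class bodies.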
195 changes: 44 additions & 151 deletions pandas/compat/pickle_compat.py
@@ -1,12 +1,12 @@
"""
Support pre-0.12 series pickle compatibility.
Pickle compatibility to pandas version 1.0
"""
from __future__ import annotations

import contextlib
import copy
import copyreg
import io
import pickle as pkl
import pickle
from typing import (
TYPE_CHECKING,
Any,
Expand All @@ -17,7 +17,6 @@
from pandas._libs.arrays import NDArrayBacked
from pandas._libs.tslibs import BaseOffset

from pandas import Index
from pandas.core.arrays import (
DatetimeArray,
PeriodArray,
Expand All @@ -29,111 +28,15 @@
from collections.abc import Generator


def load_reduce(self) -> None:
stack = self.stack
args = stack.pop()
func = stack[-1]

try:
stack[-1] = func(*args)
return
except TypeError as err:
# If we have a deprecated function,
# try to replace and try again.

msg = "_reconstruct: First argument must be a sub-type of ndarray"

if msg in str(err):
try:
cls = args[0]
stack[-1] = object.__new__(cls)
return
except TypeError:
pass
elif args and isinstance(args[0], type) and issubclass(args[0], BaseOffset):
# TypeError: object.__new__(Day) is not safe, use Day.__new__()
cls = args[0]
stack[-1] = cls.__new__(*args)
return
elif args and issubclass(args[0], PeriodArray):
cls = args[0]
stack[-1] = NDArrayBacked.__new__(*args)
return

raise


# If classes are moved, provide compat here.
_class_locations_map = {
("pandas.core.sparse.array", "SparseArray"): ("pandas.core.arrays", "SparseArray"),
# 15477
("pandas.core.base", "FrozenNDArray"): ("numpy", "ndarray"),
# Re-routing unpickle block logic to go through _unpickle_block instead
# for pandas <= 1.3.5
("pandas.core.internals.blocks", "new_block"): (
"pandas._libs.internals",
"_unpickle_block",
),
("pandas.core.indexes.frozen", "FrozenNDArray"): ("numpy", "ndarray"),
("pandas.core.base", "FrozenList"): ("pandas.core.indexes.frozen", "FrozenList"),
# 10890
("pandas.core.series", "TimeSeries"): ("pandas.core.series", "Series"),
("pandas.sparse.series", "SparseTimeSeries"): (
"pandas.core.sparse.series",
"SparseSeries",
),
# 12588, extensions moving
("pandas._sparse", "BlockIndex"): ("pandas._libs.sparse", "BlockIndex"),
("pandas.tslib", "Timestamp"): ("pandas._libs.tslib", "Timestamp"),
# 18543 moving period
("pandas._period", "Period"): ("pandas._libs.tslibs.period", "Period"),
("pandas._libs.period", "Period"): ("pandas._libs.tslibs.period", "Period"),
# 18014 moved __nat_unpickle from _libs.tslib-->_libs.tslibs.nattype
("pandas.tslib", "__nat_unpickle"): (
"pandas._libs.tslibs.nattype",
"__nat_unpickle",
),
("pandas._libs.tslib", "__nat_unpickle"): (
"pandas._libs.tslibs.nattype",
"__nat_unpickle",
),
# 15998 top-level dirs moving
("pandas.sparse.array", "SparseArray"): (
"pandas.core.arrays.sparse",
"SparseArray",
),
("pandas.indexes.base", "_new_Index"): ("pandas.core.indexes.base", "_new_Index"),
("pandas.indexes.base", "Index"): ("pandas.core.indexes.base", "Index"),
("pandas.indexes.numeric", "Int64Index"): (
"pandas.core.indexes.base",
"Index", # updated in 50775
),
("pandas.indexes.range", "RangeIndex"): ("pandas.core.indexes.range", "RangeIndex"),
("pandas.indexes.multi", "MultiIndex"): ("pandas.core.indexes.multi", "MultiIndex"),
("pandas.tseries.index", "_new_DatetimeIndex"): (
"pandas.core.indexes.datetimes",
"_new_DatetimeIndex",
),
("pandas.tseries.index", "DatetimeIndex"): (
"pandas.core.indexes.datetimes",
"DatetimeIndex",
),
("pandas.tseries.period", "PeriodIndex"): (
"pandas.core.indexes.period",
"PeriodIndex",
),
# 19269, arrays moving
("pandas.core.categorical", "Categorical"): ("pandas.core.arrays", "Categorical"),
# 19939, add timedeltaindex, float64index compat from 15998 move
("pandas.tseries.tdi", "TimedeltaIndex"): (
"pandas.core.indexes.timedeltas",
"TimedeltaIndex",
),
("pandas.indexes.numeric", "Float64Index"): (
"pandas.core.indexes.base",
"Index", # updated in 50775
),
# 50775, remove Int64Index, UInt64Index & Float64Index from codabase
# 50775, remove Int64Index, UInt64Index & Float64Index from codebase
("pandas.core.indexes.numeric", "Int64Index"): (
"pandas.core.indexes.base",
"Index",
Expand All @@ -153,30 +56,37 @@ def load_reduce(self) -> None:
}


# our Unpickler sub-class to override methods and some dispatcher
# functions for compat and uses a non-public class of the pickle module.

def load_reduce(self):
stack = self.stack
args = stack.pop()
func = stack[-1]

class Unpickler(pkl._Unpickler):
def find_class(self, module, name):
# override superclass
key = (module, name)
module, name = _class_locations_map.get(key, key)
return super().find_class(module, name)
try:
stack[-1] = func(*args)
return
except TypeError:
# If we have a deprecated function,
# try to replace and try again.

if args and isinstance(args[0], type) and issubclass(args[0], BaseOffset):
# TypeError: object.__new__(Day) is not safe, use Day.__new__()
cls = args[0]
stack[-1] = cls.__new__(*args)
return
elif args and issubclass(args[0], PeriodArray):
cls = args[0]
stack[-1] = NDArrayBacked.__new__(*args)
return

Unpickler.dispatch = copy.copy(Unpickler.dispatch)
Unpickler.dispatch[pkl.REDUCE[0]] = load_reduce
raise


def load_newobj(self) -> None:
args = self.stack.pop()
cls = self.stack[-1]

# compat
if issubclass(cls, Index):
obj = object.__new__(cls)
elif issubclass(cls, DatetimeArray) and not args:
if issubclass(cls, DatetimeArray) and not args:
arr = np.array([], dtype="M8[ns]")
obj = cls.__new__(cls, arr, arr.dtype)
elif issubclass(cls, TimedeltaArray) and not args:
Expand All @@ -190,50 +100,33 @@ def load_newobj(self) -> None:
self.stack[-1] = obj


Unpickler.dispatch[pkl.NEWOBJ[0]] = load_newobj


def load_newobj_ex(self) -> None:
kwargs = self.stack.pop()
args = self.stack.pop()
cls = self.stack.pop()

# compat
if issubclass(cls, Index):
obj = object.__new__(cls)
else:
obj = cls.__new__(cls, *args, **kwargs)
self.append(obj)

class Unpickler(pickle.Unpickler):
dispatch_table = copyreg.dispatch_table.copy()
dispatch_table[pickle.REDUCE[0]] = load_reduce
dispatch_table[pickle.NEWOBJ[0]] = load_newobj

try:
Unpickler.dispatch[pkl.NEWOBJ_EX[0]] = load_newobj_ex
except (AttributeError, KeyError):
pass
def find_class(self, module, name):
# override superclass
key = (module, name)
module, name = _class_locations_map.get(key, key)
return super().find_class(module, name)
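The `find_class` override above redirects old `(module, name)` pairs to their current locations before the class is imported. A minimal, standard-library-only sketch of the same idea follows. The relocation table `_RELOCATED` and the class `RemappingUnpickler` are hypothetical, and mapping `OrderedDict` to `dict` is chosen purely so the example runs without pandas.

```python
import collections
import io
import pickle

# Hypothetical relocation table: (old_module, old_name) -> (new_module, new_name).
_RELOCATED = {
    ("collections", "OrderedDict"): ("builtins", "dict"),
}


class RemappingUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Look up the pickled location; fall back to the original pair.
        key = (module, name)
        module, name = _RELOCATED.get(key, key)
        return super().find_class(module, name)


# The pickle stream refers to collections.OrderedDict by name.
payload = pickle.dumps(collections.OrderedDict(a=1, b=2))

# On load, the reference is redirected and a plain dict is built instead.
restored = RemappingUnpickler(io.BytesIO(payload)).load()
```

Because `find_class` is a documented extension point of `pickle.Unpickler`, this kind of remapping does not need access to the unpickler's internal opcode dispatch.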


def load(fh, encoding: str | None = None, is_verbose: bool = False) -> Any:
def load(fh, encoding: str | None = None) -> Any:
"""
Load a pickle, with a provided encoding,

Parameters
----------
fh : a filelike object
encoding : an optional encoding
is_verbose : show exception output
"""
try:
fh.seek(0)
if encoding is not None:
up = Unpickler(fh, encoding=encoding)
else:
up = Unpickler(fh)
# "Unpickler" has no attribute "is_verbose" [attr-defined]
up.is_verbose = is_verbose # type: ignore[attr-defined]

return up.load()
except (ValueError, TypeError):
raise
fh.seek(0)
if encoding is not None:
up = Unpickler(fh, encoding=encoding)
else:
up = Unpickler(fh)
return up.load()


def loads(
Expand All @@ -257,9 +150,9 @@ def patch_pickle() -> Generator[None, None, None]:
"""
Temporarily patch pickle to use our unpickler.
"""
orig_loads = pkl.loads
orig_loads = pickle.loads
try:
setattr(pkl, "loads", loads)
setattr(pickle, "loads", loads)
yield
finally:
setattr(pkl, "loads", orig_loads)
setattr(pickle, "loads", orig_loads)
17 changes: 0 additions & 17 deletions pandas/core/arrays/sparse/array.py
Expand Up @@ -1375,23 +1375,6 @@ def _where(self, mask, value):
result = type(self)._from_sequence(naive_implementation, dtype=dtype)
return result

# ------------------------------------------------------------------------
# IO
# ------------------------------------------------------------------------
def __setstate__(self, state) -> None:
"""Necessary for making this object picklable"""
if isinstance(state, tuple):
# Compat for pandas < 0.24.0
nd_state, (fill_value, sp_index) = state
sparse_values = np.array([])
sparse_values.__setstate__(nd_state)

self._sparse_values = sparse_values
self._sparse_index = sp_index
self._dtype = SparseDtype(sparse_values.dtype, fill_value)
else:
self.__dict__.update(state)
Review comment (Member):
Don't we still need this else branch? Or are we totally dropping support for this?


def nonzero(self) -> tuple[npt.NDArray[np.int32]]:
if self.fill_value == 0:
return (self.sp_index.indices,)
Expand Down
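The removed `__setstate__` accepted two state layouts: a legacy tuple and the current attribute dict. The sketch below shows that dual-format pattern on a toy class. `Box` and its attributes are hypothetical, and the tuple layout is simplified compared with the `SparseArray` code.

```python
class Box:
    def __setstate__(self, state):
        if isinstance(state, tuple):
            # Legacy layout (hypothetical): (values, (fill,)).
            values, (fill,) = state
            self.values = values
            self.fill = fill
        else:
            # Current layout: a plain attribute dict.
            self.__dict__.update(state)


# Call __setstate__ directly, as the unpickler would after creating
# the instance without running __init__.
legacy = Box.__new__(Box)
legacy.__setstate__(([1, 2], (0,)))

modern = Box.__new__(Box)
modern.__setstate__({"values": [1, 2], "fill": 0})
```

On the reviewer's question about the `else` branch: per the Python `pickle` documentation, when a class defines no `__setstate__` at all, a dict state is assigned to the instance's `__dict__` by default. Whether that default applies to `SparseArray` depends on its base classes, which this diff does not show.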
5 changes: 0 additions & 5 deletions pandas/core/internals/managers.py
Expand Up @@ -1933,11 +1933,6 @@ def unpickle_block(values, mgr_locs, ndim: int) -> Block:
else:
raise NotImplementedError("pre-0.14.1 pickles are no longer supported")

self._post_setstate()

def _post_setstate(self) -> None:
pass

@cache_readonly
def _block(self) -> Block:
return self.blocks[0]
Expand Down
2 changes: 1 addition & 1 deletion pandas/io/pickle.py
Expand Up @@ -156,7 +156,7 @@ def read_pickle(

Notes
-----
read_pickle is only guaranteed to be backwards compatible to pandas 0.20.3
read_pickle is only guaranteed to be backwards compatible to pandas 1.0.0
provided the object was serialized with to_pickle.

Examples
Expand Down