BUG: pickle.load() does not work with pandas1.4.2 files if pandas 2.1.4 is installed #56825

Open
3 tasks done
tgutzler opened this issue Jan 11, 2024 · 3 comments
Labels
Bug; IO Pickle (read_pickle, to_pickle); Needs Triage (issue that has not been reviewed by a pandas team member)

Comments

tgutzler commented Jan 11, 2024

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

# conda activate pandas2.1.4-env
import pickle
with open('orig.pkl', 'rb') as f:
    d = pickle.load(f)
# Raises:
#   File "miniconda3/envs/terratest/lib/python3.9/site-packages/pandas/core/internals/blocks.py", line 2400, in new_block
#     return klass(values, ndim=ndim, placement=placement, refs=refs)
# TypeError: Argument 'placement' has incorrect type (expected pandas._libs.internals.BlockPlacement, got slice)

# conda activate pandas1.4.2-env
import pickle
with open('orig.pkl', 'rb') as f:
    d = pickle.load(f)
# Re-dump the unchanged object with pandas 1.4.2
with open('mod.pkl', 'wb') as f:
    pickle.dump(d, f)

# conda activate pandas2.1.4-env
import pickle
with open('mod.pkl', 'rb') as f:
    d = pickle.load(f)
# no error

Issue Description

Hi, I have a pickle file (protocol version 4; I am not sure which pandas version created it). When I try to open it with pandas 2.1.4, I get this error:
File "[...]/miniconda3/envs/terratest/lib/python3.9/site-packages/pandas/core/internals/blocks.py", line 2400, in new_block
return klass(values, ndim=ndim, placement=placement, refs=refs)
TypeError: Argument 'placement' has incorrect type (expected pandas._libs.internals.BlockPlacement, got slice)

When I load the file with pandas 1.4.2 and dump it again without changing anything, the re-dumped file loads fine with pandas 2.1.4.
This is all with Python 3.9.18.
How can I get to the bottom of this? I don't want to attach the file publicly, but I am happy to send it privately to whoever investigates.
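
One way to start investigating without unpickling anything is to walk the opcode stream with the standard-library `pickletools` module and list the protocol version and the classes or functions the file refers to. The following is only a sketch: `summarize_pickle` is a hypothetical helper name, and the memo tracking is a simplification that only follows strings, which is enough to resolve the module and attribute names used by `STACK_GLOBAL` in protocol 4.

```python
import pickletools

STRING_OPS = {"SHORT_BINUNICODE", "BINUNICODE", "BINUNICODE8", "UNICODE"}
PUT_OPS = {"BINPUT", "LONG_BINPUT", "PUT"}
GET_OPS = {"BINGET", "LONG_BINGET", "GET"}


def summarize_pickle(data):
    """Return (protocol, sorted global names) for pickle bytes, without loading them.

    `protocol` is None when the stream has no PROTO opcode (protocols 0 and 1).
    """
    protocol = None
    names = set()
    memo = {}     # memo index -> string, or None for anything that is not a string
    recent = []   # strings most recently pushed or fetched from the memo
    top = None    # top of stack if it is a known string, else None
    for opcode, arg, _pos in pickletools.genops(data):
        op = opcode.name
        if op == "PROTO":
            protocol = arg
        elif op == "FRAME":
            pass  # framing does not touch the stack
        elif op in STRING_OPS:
            top = arg
            recent.append(arg)
        elif op == "MEMOIZE":
            memo[len(memo)] = top
        elif op in PUT_OPS:
            memo[arg] = top
        elif op in GET_OPS:
            top = memo.get(arg)
            recent.append(top)
        elif op == "GLOBAL":
            # Older protocols: the argument is "module name" in one string.
            names.add(arg.replace(" ", ".", 1))
            top = None
        elif op == "STACK_GLOBAL":
            # Protocol 4: module and qualified name are the last two strings.
            if len(recent) >= 2 and all(isinstance(s, str) for s in recent[-2:]):
                names.add(f"{recent[-2]}.{recent[-1]}")
            top = None
        else:
            top = None
    return protocol, sorted(names)
```

Reading the bytes of `orig.pkl` and `mod.pkl` and comparing the two summaries may show which referenced pandas internals differ between the file that fails and the re-dumped file that loads.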

Expected Behavior

orig.pkl should load in the pandas 2.1.4 environment.

Installed Versions

pandas2.1.4-env:

miniconda3/envs/terratest/lib/python3.9/site-packages/_distutils_hack/__init__.py:33: UserWarning: Setuptools is replacing distutils.
warnings.warn("Setuptools is replacing distutils.")

INSTALLED VERSIONS

commit : a671b5a
python : 3.9.18.final.0
python-bits : 64
OS : Linux
OS-release : 5.15.133.1-microsoft-standard-WSL2
Version : #1 SMP Thu Oct 5 21:02:42 UTC 2023
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : C.UTF-8
LOCALE : en_US.UTF-8

pandas : 2.1.4
numpy : 1.24.2
pytz : 2023.3.post1
dateutil : 2.8.2
setuptools : 68.0.0
pip : 23.2.1
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : None
IPython : None
pandas_datareader : None
bs4 : None
bottleneck : None
dataframe-api-compat: None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : 3.7.3
numba : None
numexpr : None
odfpy : None
openpyxl : 3.1.2
pandas_gbq : None
pyarrow : None
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : 1.10.1
sqlalchemy : 2.0.6
tables : None
tabulate : None
xarray : None
xlrd : None
zstandard : None
tzdata : 2023.3
qtpy : None
pyqt5 : None

pandas1.4.2-env:
pd.show_versions()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "miniconda3/envs/joa/lib/python3.9/site-packages/pandas/util/_print_versions.py", line 109, in show_versions
deps = _get_dependency_info()
File "miniconda3/envs/joa/lib/python3.9/site-packages/pandas/util/_print_versions.py", line 88, in _get_dependency_info
mod = import_optional_dependency(modname, errors="ignore")
File "miniconda3/envs/joa/lib/python3.9/site-packages/pandas/compat/_optional.py", line 138, in import_optional_dependency
module = importlib.import_module(name)
File "miniconda3/envs/joa/lib/python3.9/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 850, in exec_module
File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
File "miniconda3/envs/joa/lib/python3.9/site-packages/setuptools/__init__.py", line 7, in <module>
import _distutils_hack.override # noqa: F401
File "miniconda3/envs/joa/lib/python3.9/site-packages/_distutils_hack/override.py", line 1, in <module>
__import__('_distutils_hack').do_override()
File "miniconda3/envs/joa/lib/python3.9/site-packages/_distutils_hack/__init__.py", line 77, in do_override
ensure_local_distutils()
File "miniconda3/envs/joa/lib/python3.9/site-packages/_distutils_hack/__init__.py", line 64, in ensure_local_distutils
assert '_distutils' in core.__file__, core.__file__
AssertionError: miniconda3/envs/joa/lib/python3.9/distutils/core.py

but alternatively:
$ pip freeze
brotlipy==0.7.0
certifi==2020.6.20
cffi==1.16.0
click==8.1.3
colorama==0.4.4
cycler==0.11.0
fonttools==4.33.3
kiwisolver==1.4.2
matplotlib==3.5.1
numpy==1.22.3
packaging==21.3
pandas==1.4.2
Pillow==9.1.0
pycparser==2.21
pyparsing==3.0.8
PySide2==5.15.2
python-dateutil==2.8.2
pytz==2022.1
ruamel.yaml==0.15.87
scipy==1.8.0
seaborn==0.11.2
shiboken2==5.15.2
six==1.16.0
wincertstore==0.2

tgutzler added the labels Bug and Needs Triage (issue that has not been reviewed by a pandas team member) on Jan 11, 2024
tgutzler changed the title from "BUG:" to "BUG: pickle.load() does not work with pandas1.4.2 files if pandas 2.1.4 is installed" on Jan 11, 2024
simonjayhawkins added the label IO Pickle (read_pickle, to_pickle) on Feb 6, 2024
@Huang1720

Same problem here. I used pd.read_pickle() instead, and it works well.

JulPRL commented Feb 23, 2024

Thanks for the tips. pd.read_pickle() works very well :)
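
The workaround reported above can be wrapped in a small loader that tries the plain `pickle.load()` first and only falls back when it fails. This is a minimal sketch, assuming the failure surfaces as a `TypeError` as in the original report; `load_pickle_with_fallback` and its `fallback` parameter are hypothetical names, and pandas is imported lazily so the fast path does not depend on it.

```python
import pickle


def load_pickle_with_fallback(path, fallback=None):
    """Load `path` with pickle.load(); on TypeError, retry with `fallback(path)`.

    When `fallback` is None, pandas.read_pickle is used (assumes pandas is installed).
    """
    try:
        with open(path, "rb") as f:
            return pickle.load(f)
    except TypeError:
        if fallback is None:
            import pandas as pd  # imported only when the fallback is needed
            fallback = pd.read_pickle
        return fallback(path)
```

For the files in this report, `load_pickle_with_fallback('orig.pkl')` would be expected to take the fallback path under pandas 2.1.4, while `mod.pkl` would load on the first attempt.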

@vetsasai

pd.read_pickle() is very slow: it takes almost 3 times as long as a plain pickle.load() did with the old pandas 1.3.4.

I see this behaviour after upgrading to pandas 2.2.2 and loading a pickle that was saved with pandas 1.3.4.
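
To put a number on a slowdown like this, the two loaders can be timed on the same file. The sketch below uses only the standard library; `time_loader` and `plain_pickle_load` are hypothetical helper names, and the best time over several runs is used to reduce noise from disk caching.

```python
import pickle
import time


def plain_pickle_load(path):
    """Load a pickle file with the standard-library unpickler."""
    with open(path, "rb") as f:
        return pickle.load(f)


def time_loader(loader, path, repeats=3):
    """Return the best wall-clock time in seconds over `repeats` calls of loader(path)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        loader(path)
        best = min(best, time.perf_counter() - start)
    return best
```

Calling `time_loader(pd.read_pickle, path)` in the new environment and `time_loader(plain_pickle_load, path)` in the old one, on the same file, gives comparable figures. If the file is re-dumped once with the old pandas version, as described in the original report, `plain_pickle_load` can also be timed in the new environment.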

5 participants