BUG: Regression in 1.1.0. "invalid slicing for a 1-ndim ExtensionArray" #37631
Comments
@buhrmann Thanks for the report! I can confirm this is a regression in 1.1 (and on master) compared to 1.0.
There is something wrong with the Block layout of the Series that went through the pickle roundtrip:
The categorical block should not be 2D here.
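One way to look at the layout being discussed, as a rough sketch only: the helper below reads the private `_mgr` attribute (older releases exposed it as `_data`) and its `blocks`, which are pandas internals and may change without notice. The helper name `block_layout` and the sample values are hypothetical, not taken from the thread.

```python
import pickle

import pandas as pd


def block_layout(s):
    """Summarise the first internal block of a Series (relies on private internals)."""
    # Assumption: `_mgr` is the private block manager; fall back to the older `_data`.
    mgr = getattr(s, "_mgr", None)
    if mgr is None:
        mgr = s._data
    blk = mgr.blocks[0]
    return {
        "block_ndim": blk.ndim,            # dimensionality the block claims to have
        "values_ndim": blk.values.ndim,    # dimensionality of the underlying array
        "values_type": type(blk.values).__name__,
    }


# Illustrative single-row categorical Series.
s = pd.Series(["a", "b", "c"], dtype="category").iloc[:1]
roundtripped = pickle.loads(pickle.dumps(s))

layout_before = block_layout(s)
layout_after = block_layout(roundtripped)
print("before pickle:", layout_before)
print("after pickle: ", layout_after)
```

If the two summaries disagree on `block_ndim`, the block came back from the roundtrip with a different claimed dimensionality than the 1-D Categorical it wraps, which is the situation the comment describes.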
Assertion added in #32959. Could confirm with bisect if needed.
It looks like the incorrect …
Indeed, but back then we didn't have the AssertionError checking it, and things were (here) still working.
Code Sample, a copy-pastable example
Problem description
Not sure what changes in the serialization roundtrip through pickle, but it seems the copied Series can no longer be indexed with a Boolean mask, tripping up dropna() as a result. The following code more directly exposes the error:
The error happens e.g. when processing multiple Series in parallel (triggering serialization with pickle), and when a categorical Series has been filtered down to a single row. With another dtype, or with more than one row, the error is not triggered.
The regression must have been introduced in version 1.1.0, as in 1.0.5 the above code works as expected.
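The scenario described above can be sketched roughly as follows. This is a reconstruction from the description, not the reporter's original sample: the data values and the helper `try_ops` are hypothetical. It catches the AssertionError so that it runs on both affected and unaffected pandas versions.

```python
import pickle

import pandas as pd

# Hypothetical data: the values are illustrative, not taken from the report.
full = pd.Series(["a", "b", "c"], dtype="category")
single = full[full == "a"]  # categorical Series filtered down to a single row

# Round-trip through pickle, as happens when a Series is sent to a worker process.
restored = pickle.loads(pickle.dumps(single))


def try_ops(s):
    """Run dropna() and a Boolean-mask lookup; return each result or the error raised."""
    ops = {
        "dropna": lambda x: x.dropna(),
        "bool_mask": lambda x: x[[True]],  # one-element mask for the one-row Series
    }
    out = {}
    for name, op in ops.items():
        try:
            out[name] = op(s)
        except AssertionError as exc:  # the error type named in the issue title
            out[name] = exc
    return out


before = try_ops(single)
after = try_ops(restored)
for name in before:
    print(name, "| before:", type(before[name]).__name__,
          "| after:", type(after[name]).__name__)
```

On a release without the regression, both columns should show Series for both operations; an AssertionError in the "after" column would match the behaviour reported here.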
Expected Output
Behaviour of slicing, dropna etc. should be the same before and after pickling a Series, and independent of the number of rows.
Output of pd.show_versions()
INSTALLED VERSIONS
commit : 67a3d42
python : 3.7.6.final.0
python-bits : 64
OS : Darwin
OS-release : 19.6.0
Version : Darwin Kernel Version 19.6.0: Mon Aug 31 22:12:52 PDT 2020; root:xnu-6153.141.2~1/RELEASE_X86_64
machine : x86_64
processor : i386
byteorder : little
LC_ALL : None
LANG : None
LOCALE : None.UTF-8
pandas : 1.1.4
numpy : 1.18.1
pytz : 2019.3
dateutil : 2.8.0
pip : 20.0.2
setuptools : 46.1.1.post20200322
Cython : None
pytest : 5.4.1
hypothesis : None
sphinx : 2.4.4
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.5.0
html5lib : None
pymysql : 0.9.3
psycopg2 : 2.8.4 (dt dec pq3 ext lo64)
jinja2 : 2.11.1
IPython : 7.13.0
pandas_datareader: None
bs4 : 4.9.1
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : 3.2.1
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 1.0.0
pytables : None
pyxlsb : None
s3fs : None
scipy : 1.4.1
sqlalchemy : 1.3.15
tables : None
tabulate : None
xarray : None
xlrd : 1.2.0
xlwt : None
numba : 0.46.0