
BUG: reindex on empty Series raises ValueError: zero-size array to reduction operation #40235


Closed
2 of 3 tasks
TomGoBravo opened this issue Mar 4, 2021 · 0 comments · Fixed by #41358
Labels
Bug Needs Triage Issue that has not been reviewed by a pandas team member

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • (optional) I have confirmed this bug exists on the master branch of pandas.


Code Sample, a copy-pastable example

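import pandas as pd
# Build an empty two-level MultiIndex and an empty Series on top of it.
index = pd.MultiIndex.from_tuples([], names=["first", "second"])
s = pd.Series([], index=index, dtype=object)
print(f"s...\n{s}\n")
# Reindexing on a level of the empty index is the call that raises.
print(f"reindexed...\n{s.reindex(['a'], level='first')}")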

prints

s...
Series([], dtype: object)

and then raises.

Problem description

reindex raises ValueError: zero-size array to reduction operation maximum which has no identity. The error comes from numpy/core/_methods.py, reached when _join_level calls new_lev_codes.max() on an empty array. Traceback:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
 in 
      3 s = pd.Series(range(len(index)), index=index, dtype=object)
      4 print(f"s...\n{s}\n")
----> 5 print(f"reindexed...\n{s.reindex(['a'], level='first')}")

~/.pyenv/versions/3.7.9/envs/covid-data-model/lib/python3.7/site-packages/pandas/core/series.py in reindex(self, index, **kwargs)
4343 )
4344 def reindex(self, index=None, **kwargs):
-> 4345 return super().reindex(index=index, **kwargs)
4346
4347 def drop(

~/.pyenv/versions/3.7.9/envs/covid-data-model/lib/python3.7/site-packages/pandas/core/generic.py in reindex(self, *args, **kwargs)
4810 # perform the reindex on the axes
4811 return self._reindex_axes(
-> 4812 axes, level, limit, tolerance, method, fill_value, copy
4813 ).finalize(self, method="reindex")
4814

~/.pyenv/versions/3.7.9/envs/covid-data-model/lib/python3.7/site-packages/pandas/core/generic.py in _reindex_axes(self, axes, level, limit, tolerance, method, fill_value, copy)
4826 ax = self._get_axis(a)
4827 new_index, indexer = ax.reindex(
-> 4828 labels, level=level, limit=limit, tolerance=tolerance, method=method
4829 )
4830

~/.pyenv/versions/3.7.9/envs/covid-data-model/lib/python3.7/site-packages/pandas/core/indexes/multi.py in reindex(self, target, method, level, limit, tolerance)
2469 target = ensure_index(target)
2470 target, indexer, _ = self._join_level(
-> 2471 target, level, how="right", return_indexers=True, keep_order=False
2472 )
2473 else:

~/.pyenv/versions/3.7.9/envs/covid-data-model/lib/python3.7/site-packages/pandas/core/indexes/base.py in _join_level(self, other, level, how, return_indexers, keep_order)
3922 else: # tie out the order with other
3923 if level == 0: # outer most level, take the fast route
-> 3924 ngroups = 1 + new_lev_codes.max()
3925 left_indexer, counts = libalgos.groupsort_indexer(
3926 new_lev_codes, ngroups

~/.pyenv/versions/3.7.9/envs/covid-data-model/lib/python3.7/site-packages/numpy/core/_methods.py in _amax(a, axis, out, keepdims, initial, where)
37 def _amax(a, axis=None, out=None, keepdims=False,
38 initial=_NoValue, where=True):
---> 39 return umr_maximum(a, axis, None, out, keepdims, initial, where)
40
41 def _amin(a, axis=None, out=None, keepdims=False,

ValueError: zero-size array to reduction operation maximum which has no identity
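
The last two frames can be reproduced without pandas. The sketch below assumes new_lev_codes is an empty integer array at the point of failure (the variable name codes and its dtype are stand-ins, not taken from the pandas source). It also shows that passing initial to max() gives the reduction an identity. This only illustrates the failure mode; it is not necessarily how the fix in #41358 was implemented.

```python
import numpy as np

# Stand-in for new_lev_codes when the index has no rows (dtype is an assumption).
codes = np.array([], dtype=np.intp)

error_message = None
try:
    # Mirrors `ngroups = 1 + new_lev_codes.max()` from the traceback.
    ngroups = 1 + codes.max()
except ValueError as exc:
    error_message = str(exc)

# With `initial`, an empty input reduces to -1, so the group count becomes 0.
safe_ngroups = 1 + codes.max(initial=-1)
```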

Expected Output

I think it makes sense for s.reindex(['a'], level='first') to return an empty Series, as it does when s is not empty:

index = pd.MultiIndex.from_tuples([('bar', 'one'),('foo', 'two')], names=["first", "second"])
s = pd.Series(range(len(index)), index=index, dtype=object)
print(f"s...\n{s}\n")
print(f"reindexed...\n{s.reindex(['a'], level='first')}")

prints

s...
first  second
bar    one       0
foo    two       1
dtype: object

reindexed...
Series([], dtype: object)
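
One way to get this behaviour on an affected version is to skip the level reindex when the Series is already empty. A minimal sketch, where reindex_level_or_empty is a hypothetical helper name and not part of pandas:

```python
import pandas as pd


def reindex_level_or_empty(s, labels, level):
    """Hypothetical helper: avoid the failing code path for an empty Series."""
    if s.empty:
        # Nothing can match, so return an empty copy with the same index names.
        return s.copy()
    return s.reindex(labels, level=level)


index = pd.MultiIndex.from_tuples([], names=["first", "second"])
s = pd.Series([], index=index, dtype=object)
result = reindex_level_or_empty(s, ["a"], level="first")
```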

In 1.0.3 I could use df.loc[['label']] to get a slice without an exception when no data was found, but that started raising KeyError when I upgraded to 1.2.3. I'm now using reindex instead, per https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#reindexing
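
Another option that avoids both the KeyError and the reindex call is to select with a boolean mask built from the level values. This is only a sketch of the "give me matching rows or nothing" behaviour described above; select_level is a hypothetical helper, not a pandas API:

```python
import pandas as pd


def select_level(s, labels, level):
    """Hypothetical helper: keep rows whose `level` value is in `labels`."""
    mask = s.index.get_level_values(level).isin(labels)
    return s.loc[mask]


index = pd.MultiIndex.from_tuples(
    [("bar", "one"), ("foo", "two")], names=["first", "second"]
)
s = pd.Series(range(len(index)), index=index, dtype=object)

found = select_level(s, ["bar"], "first")    # one matching row
missing = select_level(s, ["a"], "first")    # no match, empty Series

empty_index = pd.MultiIndex.from_tuples([], names=["first", "second"])
empty = pd.Series([], index=empty_index, dtype=object)
still_empty = select_level(empty, ["a"], "first")
```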

Output of pd.show_versions()

INSTALLED VERSIONS

commit : f2c8480
python : 3.7.9.final.0
python-bits : 64
OS : Linux
OS-release : 4.19.0-11-amd64
Version : #1 SMP Debian 4.19.146-1 (2020-09-17)
machine : x86_64
processor :
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 1.2.3
numpy : 1.20.1
pytz : 2020.5
dateutil : 2.8.1
pip : 21.0.1
setuptools : 47.1.0
Cython : None
pytest : 6.2.1
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 2.11.2
IPython : 7.19.0
pandas_datareader: None
bs4 : None
bottleneck : None
fsspec : None
fastparquet : 0.5.0
gcsfs : None
matplotlib : 3.2.1
numexpr : 2.7.3
odfpy : None
openpyxl : 3.0.6
pandas_gbq : None
pyarrow : None
pyxlsb : None
s3fs : None
scipy : 1.6.1
sqlalchemy : 1.3.23
tables : None
tabulate : 0.8.7
xarray : 0.16.2
xlrd : 1.2.0
xlwt : None
numba : 0.52.0


2 participants