Non-intuitive behavior when multi-indexing a Series #35349

Closed
konstantinmiller opened this issue Jul 20, 2020 · 2 comments · Fixed by #39372
Labels: Bug, Error Reporting, Indexing, MultiIndex, Series

@konstantinmiller

This is most likely intended behavior rather than a bug, but I find it quite confusing.

Consider the following example:

import pandas as pd

idx = pd.IndexSlice
s = pd.Series(index=pd.MultiIndex.from_tuples([('A', '0'), ('A', '1'), ('B', '0')]),
              data=[21, 22, 23])
# Two indexers are passed here, although a Series has only one axis
s.loc[idx['A', :], :]

which gives

A  0    21
   1    22
B  0    23
dtype: int64

while intuitively, one would expect

A  0    21
   1    22
dtype: int64

I would expect the latter because that is what you get with a pd.DataFrame, where the second : refers to the column index. If you overlook that the object is a Series, you get a completely different result than you would with a DataFrame.
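
A minimal sketch of the contrast described above, assuming a pandas version where both selections are valid: a Series has a single axis, so the whole MultiIndex selection goes into one indexer, while a DataFrame takes a row indexer followed by a column indexer. The column name `value` is an arbitrary choice for illustration and is not part of the original example.

```python
import pandas as pd

idx = pd.IndexSlice
s = pd.Series(
    [21, 22, 23],
    index=pd.MultiIndex.from_tuples([("A", "0"), ("A", "1"), ("B", "0")]),
)

# Series: one axis, so a single indexer that covers both index levels.
from_series = s.loc[idx["A", :]]

# DataFrame: two axes, so a row indexer followed by a column indexer.
df = s.to_frame(name="value")  # "value" is an arbitrary column name (assumption)
from_frame = df.loc[idx["A", :], :]

print(from_series)
print(from_frame)
```

Both selections should contain only the rows whose first index level is A, that is, the values 21 and 22.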

@konstantinmiller konstantinmiller added the Bug and Needs Triage labels Jul 20, 2020
@simonjayhawkins
Member

Thanks @konstantinmiller for the report.

On master this now raises an exception instead of producing the unexpected behaviour. What version of pandas are you using?

>>> import pandas as pd
>>>
>>> pd.__version__
'1.1.0rc0'
>>>
>>> idx = pd.IndexSlice
>>> s = pd.Series(
...     index=pd.MultiIndex.from_tuples([("A", "0"), ("A", "1"), ("B", "0")]),
...     data=[21, 22, 23],
... )
>>> s
A  0    21
   1    22
B  0    23
dtype: int64
>>>
>>> s.loc[idx["A", :], :]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\simon\pandas\pandas\core\indexing.py", line 873, in __getitem__
    return self._getitem_tuple(key)
  File "C:\Users\simon\pandas\pandas\core\indexing.py", line 1044, in _getitem_tuple
    return self._getitem_lowerdim(tup)
  File "C:\Users\simon\pandas\pandas\core\indexing.py", line 766, in _getitem_lowerdim
    return self._getitem_nested_tuple(tup)
  File "C:\Users\simon\pandas\pandas\core\indexing.py", line 834, in _getitem_nested_tuple
    return self._getitem_axis(tup, axis=axis)
  File "C:\Users\simon\pandas\pandas\core\indexing.py", line 1103, in _getitem_axis
    locs = labels.get_locs(key)
  File "C:\Users\simon\pandas\pandas\core\indexes\multi.py", line 3108, in get_locs
    indexer = self._reorder_indexer(seq, indexer)
  File "C:\Users\simon\pandas\pandas\core\indexes\multi.py", line 3137, in _reorder_indexer
    k_codes = self.levels[i].get_indexer(k)
  File "C:\Users\simon\pandas\pandas\core\indexes\base.py", line 3000, in get_indexer
    indexer = self._engine.get_indexer(target._get_engine_target())
  File "pandas\_libs\index.pyx", line 254, in pandas._libs.index.IndexEngine.get_indexer
    return self.mapping.lookup(values)
  File "pandas\_libs\hashtable_class_helper.pxi", line 1724, in pandas._libs.hashtable.PyObjectHashTable.lookup
    hash(val)
TypeError: unhashable type: 'slice'
>>>

I would say that this is maybe still incorrect: I think the error should indicate that too many indexers were passed.
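
To illustrate the kind of message meant here, the following is a rough user-side sketch, not the change that was eventually made in pandas. The helper name `loc_checked` and the wording of its message are hypothetical; it only compares the number of indexers with the object's `ndim` before delegating to `.loc`.

```python
import pandas as pd


def loc_checked(obj, *indexers):
    """Hypothetical helper: allow at most one indexer per axis of obj."""
    if len(indexers) > obj.ndim:
        raise IndexError(
            f"Too many indexers: got {len(indexers)} for a "
            f"{obj.ndim}-dimensional {type(obj).__name__}"
        )
    # A single indexer is passed as-is; several are passed as a tuple.
    key = indexers[0] if len(indexers) == 1 else indexers
    return obj.loc[key]


idx = pd.IndexSlice
s = pd.Series(
    [21, 22, 23],
    index=pd.MultiIndex.from_tuples([("A", "0"), ("A", "1"), ("B", "0")]),
)

selected = loc_checked(s, idx["A", :])  # one indexer for the single axis

try:
    loc_checked(s, idx["A", :], slice(None))  # two indexers for a Series
    message = None
except IndexError as err:
    message = str(err)

print(selected)
print(message)
```

The check runs before pandas sees the key, so the message does not depend on which pandas version is installed.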

@simonjayhawkins simonjayhawkins added the Error Reporting, Indexing, MultiIndex, and Series labels and removed the Needs Triage label Jul 20, 2020
@konstantinmiller
Author

I'm using 1.0.5. Sorry for not checking the behavior on the master branch; I should have done that. So, great that it now raises an exception.

Of course, if possible, it would be helpful to have a more expressive error message that points more clearly to the actual error :)

@simonjayhawkins simonjayhawkins added this to the Contributions Welcome milestone Jul 31, 2020
@jreback jreback modified the milestones: Contributions Welcome, 1.3 Jan 24, 2021