BUG: Creating a MultiIndex using pd.Index and the name argument is now broken (worked in 1.2.5 with names) #44052

Open
3 tasks done
Dr-Irv opened this issue Oct 16, 2021 · 3 comments
Labels
Bug Index Related to the Index class or subclasses MultiIndex

Comments

@Dr-Irv
Contributor

Dr-Irv commented Oct 16, 2021

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the master branch of pandas.

Reproducible Example

In [1]: import pandas as pd

In [2]: pd.__version__
Out[2]: '1.4.0.dev0+905.gdace93d694'

In [3]: pd.Index([(1,0), (1,1)], names=["foo", "bar"])
<ipython-input-3-99fe0b033485>:1: FutureWarning: Passing keywords other than 'data', 'dtype', 'copy', 'name', 'tupleize_cols' is deprecated and will raise TypeError in a future version.  Use the specific Index subclass directly instead.
  pd.Index([(1,0), (1,1)], names=["foo", "bar"])
Out[3]:
MultiIndex([(1, 0),
            (1, 1)],
           names=['foo', 'bar'])

In [4]: pd.Index([(1,0), (1,1)], name=["foo", "bar"])
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-4-138c3fe61553> in <module>
----> 1 pd.Index([(1,0), (1,1)], name=["foo", "bar"])

c:\Code\pandas_dev\pandas\pandas\core\indexes\base.py in __new__(cls, data, dtype, copy, name, tupleize_cols, **kwargs)
    403         from pandas.core.indexes.range import RangeIndex
    404
--> 405         name = maybe_extract_name(name, data, cls)
    406
    407         if dtype is not None:

c:\Code\pandas_dev\pandas\pandas\core\indexes\base.py in maybe_extract_name(name, obj, cls)
   6876     # GH#29069
   6877     if not is_hashable(name):
-> 6878         raise TypeError(f"{cls.__name__}.name must be a hashable type")
   6879
   6880     return name

TypeError: Index.name must be a hashable type

Issue Description

In pandas 1.2.5, you could create a MultiIndex using pd.Index with the names argument:

In [1]: import pandas as pd

In [2]: pd.__version__
Out[2]: '1.2.5'

In [3]: pd.Index([(1,0), (1,1)], names=["foo", "bar"])
Out[3]:
MultiIndex([(1, 0),
            (1, 1)],
           names=['foo', 'bar'])

With names being removed as an argument in the future, the user then has to be explicit about using the pd.MultiIndex() constructor.

See discussion with @jreback starting here: #35292 (comment)
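For reference, a minimal sketch of the explicit route mentioned above. It uses `pd.MultiIndex.from_tuples` and `pd.MultiIndex.from_arrays`, both public constructors that take a `names` argument; the variable names are only illustrative.

```python
import pandas as pd

tuples = [(1, 0), (1, 1)]

# Build the MultiIndex explicitly instead of relying on pd.Index inference.
mi = pd.MultiIndex.from_tuples(tuples, names=["foo", "bar"])

# Equivalent construction from one array per level.
mi_from_arrays = pd.MultiIndex.from_arrays([[1, 1], [0, 1]], names=["foo", "bar"])

print(mi)
print(mi.equals(mi_from_arrays))
```

This avoids the deprecated `names=` keyword on `pd.Index` entirely, at the cost of the caller having to know up front that the data is tuple-like.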

Expected Behavior

Unclear. Should pd.Index() accept names? After all, you can already create a MultiIndex without names via:

In [5]: pd.Index([(1,0), (1,1)])
Out[5]:
MultiIndex([(1, 0),
            (1, 1)],
           )
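As a possible workaround under the current API, the names can be attached after the inferred MultiIndex is created. The sketch below assumes the tuple inference shown above and uses `set_names`; it also shows `tupleize_cols=False`, which keeps the tuples as plain objects in a flat Index.

```python
import pandas as pd

# pd.Index infers a MultiIndex from a list of tuples (tupleize_cols defaults to True).
idx = pd.Index([(1, 0), (1, 1)])

# Attach level names in a second step; set_names returns a new index.
named = idx.set_names(["foo", "bar"])
print(named)

# With tupleize_cols=False the result stays a flat, object-dtype Index of tuples.
flat = pd.Index([(1, 0), (1, 1)], tupleize_cols=False)
print(type(flat).__name__, flat.dtype)
```

This takes two steps where 1.2.5 needed one, which is the inconsistency the issue is about.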

Installed Versions

INSTALLED VERSIONS

commit : dace93d
python : 3.8.6.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.19041
machine : AMD64
processor : Intel64 Family 6 Model 158 Stepping 13, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : English_United States.1252

pandas : 1.4.0.dev0+905.gdace93d694
numpy : 1.21.2
pytz : 2021.1
dateutil : 2.8.2
pip : 21.2.4
setuptools : 49.6.0.post20210108
Cython : 0.29.24
pytest : 6.2.5
hypothesis : 6.19.0
sphinx : 4.1.2
blosc : None
feather : None
xlsxwriter : 3.0.1
lxml.etree : 4.6.3
html5lib : 1.1
pymysql : None
psycopg2 : None
jinja2 : 3.0.1
IPython : 7.27.0
pandas_datareader: None
bs4 : 4.10.0
bottleneck : 1.3.2
fsspec : 2021.05.0
fastparquet : 0.7.1
gcsfs : 2021.05.0
matplotlib : 3.4.2
numexpr : 2.7.3
odfpy : None
openpyxl : 3.0.7
pandas_gbq : None
pyarrow : 4.0.0
pyxlsb : 1.0.6
s3fs : 2021.05.0
scipy : 1.7.1
sqlalchemy : 1.4.23
tables : 3.6.1
tabulate : 0.8.9
xarray : 0.18.2
xlrd : 2.0.1
xlwt : 1.3.0
numba : 0.53.1

@Dr-Irv Dr-Irv added Bug MultiIndex Index Related to the Index class or subclasses Needs Triage Issue that has not been reviewed by a pandas team member labels Oct 16, 2021
@erfannariman
Member

erfannariman commented Oct 16, 2021

Could you elaborate what the bug is here exactly? TypeError: Index.name must be a hashable type makes sense, although it is confusing there is both a name and names argument (with the latter being removed in the future).
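To illustrate why the TypeError is raised: the traceback shows `maybe_extract_name` rejecting any `name` that fails `is_hashable`. A small sketch using the public `pandas.api.types.is_hashable` (assumed here to mirror the internal check) shows that a list fails while a tuple passes:

```python
import pandas as pd
from pandas.api.types import is_hashable

list_name = ["foo", "bar"]
tuple_name = ("foo", "bar")

print(is_hashable(list_name))   # False: lists are mutable and unhashable
print(is_hashable(tuple_name))  # True: a tuple of strings is hashable

# A hashable tuple is accepted as the single name of a flat Index.
flat = pd.Index([1, 2], name=tuple_name)
print(flat.name)
```

So `name=` always denotes one label for the whole index, which is why a list of per-level names is rejected.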

@Dr-Irv
Contributor Author

Dr-Irv commented Oct 17, 2021

"although it is confusing there is both a name and names argument (with the latter being removed in the future)."

That's the point. With 1.2.5 and earlier, you could create a MultiIndex using pd.Index() and names= to specify the names. Since the decision was made to deprecate names, there is no path forward for people to create a MultiIndex with names using pd.Index().

So we either do not deprecate names or we allow name to include multiple names when pd.Index() is used. We need to fix the inconsistency in the API.
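A rough sketch of what the second option could look like from the user's side. `make_index` is a hypothetical helper written for illustration, not a pandas function and not a proposal for the actual implementation; it assumes a non-tuple list-like `name` means one name per level.

```python
import pandas as pd
from pandas.api.types import is_list_like


def make_index(data, name=None):
    """Hypothetical wrapper: a list-like (non-tuple) name means per-level names."""
    # Tuples are hashable, so they are treated as a single name, as pd.Index does.
    if is_list_like(name) and not isinstance(name, tuple):
        return pd.MultiIndex.from_tuples(data, names=list(name))
    return pd.Index(data, name=name)


mi = make_index([(1, 0), (1, 1)], name=["foo", "bar"])
flat = make_index([1, 2], name="x")
print(mi)
print(flat)
```

The ambiguity this sketch has to resolve by convention (tuple versus list) hints at why overloading `name` is not an obviously clean fix.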

@jreback
Contributor

jreback commented Oct 17, 2021

this is really for an internal change - i think the external api is fine

@mroeschke mroeschke removed the Needs Triage Issue that has not been reviewed by a pandas team member label Oct 30, 2021