ENH: Allow index names to be included in itertuples() result #27407

Dr-Irv · 2019-07-15T22:29:25Z

Code Sample, a copy-pastable example if possible

In [1]: import pandas as pd

In [2]: df = pd.DataFrame({'x': [1, 2, 3, 4], 'y': [10, 20, 30, 40]},
   ...:                   index=pd.MultiIndex.from_product([['a', 'b'], ['c', 'd']],
   ...:                                                    names=['ab', 'cd']))
   ...: df
Out[2]:
       x   y
ab cd
a  c   1  10
   d   2  20
b  c   3  30
   d   4  40

In [3]: for it in df.itertuples():
   ...:     print(it)

Pandas(Index=('a', 'c'), x=1, y=10)
Pandas(Index=('a', 'd'), x=2, y=20)
Pandas(Index=('b', 'c'), x=3, y=30)
Pandas(Index=('b', 'd'), x=4, y=40)

Problem description

When iterating through a DataFrame, the names of the Index are lost.

It would be really convenient if, when a MultiIndex is used, the names of its levels were included in the result of itertuples().

I propose adding a keyword argument to itertuples() called nameIndex, with a default value of False to retain the current behavior; nameIndex=True would produce the output shown below.

Expected Output

Pandas(Index=Index(ab='a', cd='c'), x=1, y=10)
Pandas(Index=Index(ab='a', cd='d'), x=2, y=20)
Pandas(Index=Index(ab='b', cd='c'), x=3, y=30)
Pandas(Index=Index(ab='b', cd='d'), x=4, y=40)
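
Until something like nameIndex exists, the requested shape can be approximated in user code. The following is only a sketch: itertuples_named_index is a hypothetical helper name, not a pandas API, and it assumes the index level names are usable as namedtuple field names (anything unusable, such as None, is renamed positionally by rename=True).

```python
from collections import namedtuple

import pandas as pd


def itertuples_named_index(df, name="Pandas"):
    """Hypothetical helper (not part of pandas): like df.itertuples(),
    but the Index field is a namedtuple keyed by the index level names."""
    # rename=True swaps unusable names (e.g. None) for positional ones (_0, _1, ...).
    IndexTuple = namedtuple("Index", df.index.names, rename=True)
    Row = namedtuple(name, ["Index", *df.columns], rename=True)
    for idx, *values in df.itertuples(index=True, name=None):
        # A flat index yields scalars; a MultiIndex yields tuples.
        if not isinstance(idx, tuple):
            idx = (idx,)
        yield Row(IndexTuple(*idx), *values)


df = pd.DataFrame(
    {"x": [1, 2, 3, 4], "y": [10, 20, 30, 40]},
    index=pd.MultiIndex.from_product([["a", "b"], ["c", "d"]], names=["ab", "cd"]),
)

rows = list(itertuples_named_index(df))
first = rows[0]
# Index levels are now reachable by name as well as by position.
print(first.Index.ab, first.Index.cd, first.x, first.y)
```

Because the inner Index is still a tuple, code that unpacks or compares it positionally should keep working; only attribute access is added.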

Output of `pd.show_versions()`

INSTALLED VERSIONS

commit : b57d523
python : 3.7.3.final.0
python-bits : 64
OS : Windows
OS-release : 10
machine : AMD64
processor : Intel64 Family 6 Model 60 Stepping 3, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : None.None

pandas : 0.25.0rc0+62.gb57d523b3
numpy : 1.16.4
pytz : 2019.1
dateutil : 2.8.0
pip : 19.1.1
setuptools : 41.0.1
Cython : 0.29.11
pytest : 5.0.0
hypothesis : 4.23.6
sphinx : 1.8.5
blosc : None
feather : None
xlsxwriter : 1.1.8
lxml.etree : 4.3.4
html5lib : 1.0.1
pymysql : None
psycopg2 : None
jinja2 : 2.10.1
IPython : 7.6.1
pandas_datareader: None
bs4 : 4.7.1
bottleneck : 1.2.1
fastparquet : 0.3.0
gcsfs : None
lxml.etree : 4.3.4
matplotlib : 3.1.0
numexpr : 2.6.9
odfpy : None
openpyxl : 2.6.2
pandas_gbq : None
pyarrow : 0.11.1
pytables : None
s3fs : 0.2.1
scipy : 1.2.1
sqlalchemy : 1.3.5
tables : 3.5.2
xarray : 0.12.1
xlrd : 1.2.0
xlwt : 1.3.0
xlsxwriter : 1.1.8

akukuq · 2020-01-27T13:25:03Z

I would like to add that even when using a normal index (i.e. not a MultiIndex) it is possible to assign that index a name (df.index.name = "some_name"). In such cases it would make sense to use the assigned name in the namedtuples yielded by itertuples.

Expected Output

>>> df = pd.DataFrame({"foo": [0,1,2]})
>>> df.index.name = "bar"
>>> next(df.itertuples())
Pandas(bar=0, foo=0)  # currently returns Pandas(Index=0, foo=0)
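
For what it's worth, a similar result is already reachable by moving the named index into a regular column first. This is a workaround sketch rather than the proposed behavior: it copies the frame via reset_index(), and the index value becomes an ordinary field rather than a dedicated Index field.

```python
import pandas as pd

df = pd.DataFrame({"foo": [0, 1, 2]})
df.index.name = "bar"

# reset_index() turns the named index into a column called "bar";
# index=False then drops the new default RangeIndex from the tuples.
row = next(df.reset_index().itertuples(index=False))
print(row)  # Pandas(bar=0, foo=0)
```

The same call also flattens a MultiIndex with named levels, at the cost of losing the distinction between index levels and data columns.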

konstantinmiller · 2020-08-04T12:42:50Z

I would expect the same to work with iterrows():

import pandas as pd

df = pd.DataFrame(
    index=pd.MultiIndex(
        names=['ind1'],
        levels=[['a']],
        codes=[[0]]
    ),
    data={'C': [42]})

ind, row = next(df.iterrows())
row.C
ind.ind1

Since ind is a plain tuple rather than a named tuple, the last line raises AttributeError: 'tuple' object has no attribute 'ind1'
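
A possible stopgap is to wrap each index value in a namedtuple built from df.index.names. The sketch below uses a hypothetical helper name, iterrows_named_index, which is not part of pandas, and assumes the level names are usable as field names (unusable ones are renamed positionally).

```python
from collections import namedtuple

import pandas as pd


def iterrows_named_index(df):
    """Hypothetical helper (not part of pandas): like df.iterrows(),
    but each index value is a namedtuple keyed by the index level names."""
    IndexTuple = namedtuple("Index", df.index.names, rename=True)
    for idx, row in df.iterrows():
        # A flat index yields scalars; a MultiIndex yields tuples.
        if not isinstance(idx, tuple):
            idx = (idx,)
        yield IndexTuple(*idx), row


df = pd.DataFrame(
    index=pd.MultiIndex(names=["ind1"], levels=[["a"]], codes=[[0]]),
    data={"C": [42]},
)

ind, row = next(iterrows_named_index(df))
print(ind.ind1, row.C)
```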

Dr-Irv mentioned this issue Jul 30, 2019

API: Meta-issue for making consistent API's to refer to column names and index names #27652

Open

mroeschke added API Design Enhancement labels Nov 2, 2019

simonjayhawkins mentioned this issue Aug 4, 2020

Can't access multi-index names when iterating over rows #35538

Closed

mroeschke removed the API Design label Jul 10, 2021
