BUG: indexing with string in multi-index for Period gives KeyError #9892

Closed
jorisvandenbossche opened this issue Apr 14, 2015 · 6 comments · Fixed by #30646
Labels: good first issue, Needs Tests

Comments

@jorisvandenbossche
Member

From SO:

import pandas as pd
import itertools
a = pd.period_range('2013Q1','2013Q4', freq='Q')
i = (1111, 2222, 3333)
idx = pd.MultiIndex.from_tuples(list(itertools.product(a, i)),
                            names=('Periode', 'CVR'))
df = pd.DataFrame(index=idx,
              columns=('OMS', 'OMK','RES','DRIFT_IND','OEVRIG_IND','FIN_IND','VARE_UD','LOEN_UD','FIN_UD'))

Trying to select with a string gives a KeyError:

In [15]: df.loc[('2013Q1',1111),'OMS']
---------------------------------------------------------------------------
KeyError: ('2013Q1', 1111)

while it works with an explicit Period:

In [16]: df.loc[(pd.Period('2013Q1'),1111),'OMS']
Out[16]: nan

and it also works when not using multi-index notation (only accessing the first level): df.loc['2013Q1', 'OMS']

and this did work in 0.14.1.
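As a minimal sketch of the explicit-Period workaround described above, the string label can be converted to a Period before it is placed in the tuple key. The helper name `as_period` is hypothetical (not part of pandas), and the frame is shortened to two columns purely for illustration:

```python
import pandas as pd

# Rebuild a shortened version of the frame from the report.
periods = pd.period_range("2013Q1", "2013Q4", freq="Q")
idx = pd.MultiIndex.from_product([periods, (1111, 2222, 3333)],
                                 names=("Periode", "CVR"))
df = pd.DataFrame(index=idx, columns=["OMS", "OMK"])


def as_period(label, freq="Q"):
    """Hypothetical helper: turn a string label into a quarterly Period."""
    return pd.Period(label, freq=freq)


# Exact lookup with a Period key, so no string parsing is needed
# inside the MultiIndex.
value = df.loc[(as_period("2013Q1"), 1111), "OMS"]
print(value)
```

This does not address the regression itself; it only avoids relying on string parsing in the first level.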

@jorisvandenbossche added the Bug, Regression, Period, and MultiIndex labels Apr 14, 2015
@jorisvandenbossche added this to the 0.16.1 milestone Apr 14, 2015
@jorisvandenbossche
Member Author

Further, the string syntax also does not work for assigning (but this was also the case in 0.14.1), whereas for a DatetimeIndex both accessing and assigning with a string indexer work.
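For assignment, a sketch of the same workaround is to build the tuple key from an explicit Period instead of the string. This assumes the shortened two-column frame below, and the value 1.0 is arbitrary:

```python
import pandas as pd

periods = pd.period_range("2013Q1", "2013Q4", freq="Q")
idx = pd.MultiIndex.from_product([periods, (1111, 2222, 3333)],
                                 names=("Periode", "CVR"))
df = pd.DataFrame(index=idx, columns=["OMS", "OMK"])

# Assign through an explicit Period key instead of the string "2013Q1".
key = (pd.Period("2013Q1", freq="Q"), 1111)
df.loc[key, "OMS"] = 1.0

# Count the populated cells in the column; only the targeted one
# should be set.
filled = df["OMS"].notna().sum()
print(filled)
```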

@jreback
Contributor

jreback commented Apr 14, 2015

this is essentially a dupe of #3462 as the multi-index delegation is to the PeriodIndex which doesn't support this ATM.

@jorisvandenbossche
Member Author

@jreback Possibly, but this did work in 0.14.1, while the issue you linked to was already broken (or not implemented) then.

But of course it is possible that the underlying issue is the same.

@jreback
Contributor

jreback commented Apr 14, 2015

hmm, ok, Period does need a bit of attention

@jreback modified the milestones: 0.17.0, 0.16.1 Apr 29, 2015
@jreback modified the milestones: Next Major Release, 0.17.0 Aug 15, 2015
@mroeschke
Member

This works again in 0.19.1:

In [31]: df.loc[('2013Q1',1111),'OMS']
Out[31]: 
Periode  CVR 
2013Q1   1111    NaN
Name: OMS, dtype: object

However the result isn't the same as:

In [32]: df.loc[(pd.Period('2013Q1'),1111),'OMS']
Out[32]: nan

Should these return the same result?

@mroeschke added the good first issue and Needs Tests labels and removed the Bug, MultiIndex, Period, and Regression labels Oct 16, 2019
@simonjayhawkins modified the milestones: Contributions Welcome, 1.0 Jan 3, 2020
@jbrockmendel
Member

jbrockmendel commented Jun 29, 2021

@mroeschke was there discussion somewhere else about the expected result here? i think the scalar np.nan makes more sense than a Series
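Since the thread leaves open whether the string lookup should return a scalar or a one-row Series, a regression check could be sketched so that it accepts either shape. This is only an illustration, not the test that was merged; `lookup_values` is a hypothetical helper and the frame is shortened to two columns:

```python
import numpy as np
import pandas as pd


def lookup_values(result):
    """Hypothetical helper: view a scalar or a Series result as a 1-D array."""
    if isinstance(result, pd.Series):
        return result.to_numpy()
    return np.array([result], dtype=object)


periods = pd.period_range("2013Q1", "2013Q4", freq="Q")
idx = pd.MultiIndex.from_product([periods, (1111, 2222, 3333)],
                                 names=("Periode", "CVR"))
df = pd.DataFrame(index=idx, columns=["OMS", "OMK"])

# The same cell, addressed by string and by explicit Period.
by_string = df.loc[("2013Q1", 1111), "OMS"]
by_period = df.loc[(pd.Period("2013Q1", freq="Q"), 1111), "OMS"]

string_values = lookup_values(by_string)
period_values = lookup_values(by_period)

# Both lookups should resolve to exactly one missing value.
assert len(string_values) == len(period_values) == 1
assert pd.isna(string_values).all() and pd.isna(period_values).all()
```

A stricter test would additionally assert the return type once the expected result (scalar versus Series) is agreed on.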
