Series.reset_index(level_name, drop=True) accepts invalid name when index is flat #20925

Closed
toobaz opened this issue May 2, 2018 · 10 comments · Fixed by #21016
Labels
Error Reporting (incorrect or improved errors from pandas), good first issue, Indexing (related to indexing on series/frames, not to indexes themselves)

Comments

@toobaz
Member

toobaz commented May 2, 2018

Code Sample, a copy-pastable example if possible

In [2]: s = pd.Series(range(4))

In [3]: s.reset_index('wrong', drop=True)
Out[3]: 
0    0
1    1
2    2
3    3
dtype: int64

In [4]: s.reset_index('wrong')
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-4-a7b872dfe0cf> in <module>()
----> 1 s.reset_index('wrong')

~/nobackup/repo/pandas/pandas/core/series.py in reset_index(self, level, drop, name, inplace)
   1215         else:
   1216             df = self.to_frame(name)
-> 1217             return df.reset_index(level=level, drop=drop)
   1218 
   1219     def __unicode__(self):

~/nobackup/repo/pandas/pandas/core/frame.py in reset_index(self, level, drop, inplace, col_level, col_fill)
   4098             if not isinstance(level, (tuple, list)):
   4099                 level = [level]
-> 4100             level = [self.index._get_level_number(lev) for lev in level]
   4101             if isinstance(self.index, MultiIndex):
   4102                 if len(level) < self.index.nlevels:

~/nobackup/repo/pandas/pandas/core/frame.py in <listcomp>(.0)
   4098             if not isinstance(level, (tuple, list)):
   4099                 level = [level]
-> 4100             level = [self.index._get_level_number(lev) for lev in level]
   4101             if isinstance(self.index, MultiIndex):
   4102                 if len(level) < self.index.nlevels:

~/nobackup/repo/pandas/pandas/core/indexes/base.py in _get_level_number(self, level)
   1944 
   1945     def _get_level_number(self, level):
-> 1946         self._validate_index_level(level)
   1947         return 0
   1948 

~/nobackup/repo/pandas/pandas/core/indexes/base.py in _validate_index_level(self, level)
   1941         elif level != self.name:
   1942             raise KeyError('Level %s must be same as name (%s)' %
-> 1943                            (level, self.name))
   1944 
   1945     def _get_level_number(self, level):

KeyError: 'Level wrong must be same as name (None)'

In [5]: s = pd.Series(range(4), index=pd.MultiIndex.from_product([[1, 2]]*2))

In [6]: s.reset_index('wrong', drop=True)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
~/nobackup/repo/pandas/pandas/core/indexes/multi.py in _get_level_number(self, level)
    775                                  'level number' % level)
--> 776             level = self.names.index(level)
    777         except ValueError:

ValueError: 'wrong' is not in list

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
<ipython-input-6-e0d143f488d8> in <module>()
----> 1 s.reset_index('wrong', drop=True)

~/nobackup/repo/pandas/pandas/core/series.py in reset_index(self, level, drop, name, inplace)
   1199                 if not isinstance(level, (tuple, list)):
   1200                     level = [level]
-> 1201                 level = [self.index._get_level_number(lev) for lev in level]
   1202                 if len(level) < len(self.index.levels):
   1203                     new_index = self.index.droplevel(level)

~/nobackup/repo/pandas/pandas/core/series.py in <listcomp>(.0)
   1199                 if not isinstance(level, (tuple, list)):
   1200                     level = [level]
-> 1201                 level = [self.index._get_level_number(lev) for lev in level]
   1202                 if len(level) < len(self.index.levels):
   1203                     new_index = self.index.droplevel(level)

~/nobackup/repo/pandas/pandas/core/indexes/multi.py in _get_level_number(self, level)
    777         except ValueError:
    778             if not isinstance(level, int):
--> 779                 raise KeyError('Level %s not found' % str(level))
    780             elif level < 0:
    781                 level += self.nlevels

KeyError: 'Level wrong not found'

Problem description

The call in In [3] (flat index, invalid level name, drop=True) should raise too.

Expected Output

The same KeyError as in In [4].
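A minimal sketch of the check being asked for, assuming pandas is installed. `raises_key_error` is a hypothetical helper written for this illustration, not part of pandas. On the development version reported here, the flat-index `drop=True` case returns the series unchanged, so it would report `False`; the result for that case depends on the installed pandas version.

```python
import pandas as pd


def raises_key_error(series, level, **kwargs):
    """Hypothetical helper: True if reset_index rejects `level` with a KeyError."""
    try:
        series.reset_index(level, **kwargs)
    except KeyError:
        return True
    return False


# Flat (single-level) index and a two-level MultiIndex, as in the report.
flat = pd.Series(range(4))
multi = pd.Series(range(4), index=pd.MultiIndex.from_product([[1, 2]] * 2))

results = {
    "flat, drop=False": raises_key_error(flat, "wrong"),
    "flat, drop=True": raises_key_error(flat, "wrong", drop=True),
    "multi, drop=True": raises_key_error(multi, "wrong", drop=True),
}
for case, raised in results.items():
    print(f"{case}: KeyError raised = {raised}")
```

The expected behaviour described in this report is that all three cases raise.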

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.5.3.final.0
python-bits: 64
OS: Linux
OS-release: 4.9.0-6-amd64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: it_IT.UTF-8
LOCALE: it_IT.UTF-8

pandas: 0.23.0.dev0+825.ga3facfc55
pytest: 3.5.0
pip: 9.0.1
setuptools: 39.0.1
Cython: 0.25.2
numpy: 1.14.1
scipy: 0.19.0
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: 1.5.6
patsy: 0.5.0
dateutil: 2.7.0
pytz: 2017.2
blosc: None
bottleneck: 1.2.0dev
tables: 3.3.0
numexpr: 2.6.1
feather: 0.3.1
matplotlib: 2.0.0
openpyxl: 2.3.0
xlrd: 1.0.0
xlwt: 1.3.0
xlsxwriter: 0.9.6
lxml: 4.1.1
bs4: 4.5.3
html5lib: 0.999999999
sqlalchemy: 1.0.15
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: 0.2.1

@toobaz added the Indexing, Error Reporting, and good first issue labels May 2, 2018
@misupova

misupova commented May 3, 2018

Can I try to deal with this one as my first issue?

@toobaz
Member Author

toobaz commented May 3, 2018

> Can I try to deal with this one as my first issue?

Sure!

Notice the following should also raise:

In [2]: s = pd.Series(range(4), name='valid')

In [3]: s.reset_index(['valid', 'valid'], drop=True)
Out[3]: 
0    0
1    1
2    2
3    3
Name: valid, dtype: int64

... and bonus points if the following raises too (but leaving it aside for the moment is also fine):

In [2]: s = pd.Series(range(4))

In [3]: s.reset_index([None, None])
Out[3]: 
   index  0
0      0  0
1      1  1
2      2  2
3      3  3
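The validation these two cases call for can be sketched in plain Python. This is only an illustration under stated assumptions: `validate_flat_levels` is a hypothetical helper, not pandas code, and the choice of `ValueError` for repeated levels is an assumption rather than what any pandas version does. The idea is that a flat index has exactly one level, so any request naming more than one level must be repeating it.

```python
def validate_flat_levels(level, index_name=None):
    """Return the requested levels as a list, or raise if any is invalid.

    Hypothetical helper for a flat (single-level) index; not part of pandas.
    """
    levels = list(level) if isinstance(level, (tuple, list)) else [level]
    for lev in levels:
        if isinstance(lev, int):
            # A flat index has exactly one level: position 0 (or -1).
            if lev not in (0, -1):
                raise IndexError(f"Level {lev} is out of range for a flat index")
        elif lev != index_name:
            raise KeyError(f"Level {lev} must be same as name ({index_name})")
    if len(levels) > 1:
        # Every valid entry refers to the same single level, so this is a repeat.
        raise ValueError(f"Level requested more than once: {levels}")
    return levels


for request, name in [("idx", "idx"), (["idx", "idx"], "idx"), ([None, None], None)]:
    try:
        print(request, "->", validate_flat_levels(request, name))
    except (KeyError, ValueError) as err:
        print(request, "-> rejected:", type(err).__name__)
```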

@rahulgo

rahulgo commented May 4, 2018

Hi, I'm a beginner and would like to contribute as my first bug fix. I'm not quite certain I understand the bug. Could you please explain?

@toobaz
Member Author

toobaz commented May 4, 2018

@rahulgo this bug is probably already being worked on by @misupova: I suggest you pick another one (still marked as good first issue), possibly one you understand better.

@KalyanGokhale
Contributor

@misupova sorry, since this issue was open since some time I thought I'd give it a try.

@toobaz
Member Author

toobaz commented May 13, 2018

> this issue was open since some time

@KalyanGokhale an 11-day-old bug is open "since some time"?! Have you seen the list?!

@misupova I hope you weren't already working on this, and that you can find another "good first issue" that suits you.

@KalyanGokhale
Contributor

Apologies @toobaz @misupova. I'm really not sure what the protocol is in terms of wait times, or when you can start working on something. I will be more mindful about it in the future.

@misupova

@toobaz I started working on that, but what’s done is done :)

@KalyanGokhale
Contributor

@misupova sincere apologies once again.

@toobaz maybe I was too quick to jump on this one, but I wanted to understand for the future whether there is any protocol you follow (e.g. if x days have passed and the person who signed up to work on an issue has not provided any update, then it's fair game for someone else to pick it up). Is there a middle ground, so that while an issue is open someone who has a ready or nearly ready solution can also post it? Maybe there is another collaborative way which I have not considered. Just thinking out loud. Thanks!

@toobaz
Member Author

toobaz commented May 14, 2018

@KalyanGokhale no official protocol (we have so many bugs we don't typically fight over them ;-) ), and it might depend on how urgent the issue is, but I guess a rule of thumb could be "always feel free to ask if the person who signed up to work on an issue is actually/still working on it, and if s/he isn't, or doesn't reply within a week, then feel free to steal it". (And in general, when you start working on an issue, always mention it.)

4 participants