-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: .reset_index() should raise with an invalid level name (GH20925) #21016
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## master #21016 +/- ##
==========================================
+ Coverage 91.83% 91.83% +<.01%
==========================================
Files 153 153
Lines 49497 49498 +1
==========================================
+ Hits 45456 45458 +2
+ Misses 4041 4040 -1
Continue to review full report at Codecov.
|
Could you add a test that the correct error and error message is raised? Use |
Hello @KalyanGokhale! Thanks for updating the PR. Cheers ! There are no PEP8 issues in this Pull Request. 🍻 Comment last updated on May 18, 2018 at 01:27 Hours UTC |
@TomAugspurger have added the test - please tell if any further edits are needed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My test case was wrong... removing the set()
will maybe fix the correct one?
with tm.assert_raises_regex(KeyError, 'not found'): | ||
s.reset_index('wrong', drop=True) | ||
# Data for Test Case 4 | ||
s = pd.Series(range(4), name='valid') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This (and I mean: my testcase) doesn't make much sense, as we are giving a name to s
itself, not to its index. I should have written
s = pd.Series(range(4), index=pd.RangeIndex(4, name='valid'))
Notice that if you replace with the above, the following doesn't raise any more. But it might be a bug in MultiIndex.droplevel
, in which case OK to postpone.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will delete this particular test case now
pandas/core/series.py
Outdated
if not isinstance(level, (tuple, list)): | ||
level = [level] | ||
level = set(level) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why?!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
level = set(level)
is not required now. Did not remove it after combining two separate blocks post-testing. Will delete it.
pandas/core/series.py
Outdated
if len(level) < len(self.index.levels): | ||
new_index = self.index.droplevel(level) | ||
if isinstance(self.index, MultiIndex): | ||
if (len(level) < len(self.index.levels)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
len(self.index.levels)
can be replaced with self.index.nlevels
(and then I guess you'll be able to drop the MultiIndex
test above). But is this test actually needed? (It's not a rhetorical question)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the suggestion.
Yes replacing len(self.index.levels)
with self.index.nlevels
helps and MultiIndex
test can then be removed. However, have retained the MultiIndex
test to avoid any unforeseen outcomes - retaining it also produces the same outcomes (at least in the test cases checked).
len(level) < self.index.nlevels
test is needed, in case this is dropped (assuming MultiIndex
test is also dropped) then following errors are observed for these cases:
Test Data
arrays = [np.array(['bar', 'bar', 'baz', 'baz']), np.array(['one', 'two', 'one', 'two'])]
s = pd.Series(range(4), name='foo', index=pd.MultiIndex.from_arrays(arrays, names=['a', 'b']))
Test Case
s.reset_index(['a', 'b'], drop=True)
Outcome
ValueError: Must pass non-zero number of levels/labels
Test Data
s = pd.Series(range(4), index=pd.RangeIndex(4, name='valid'))
Test Cases
s.reset_index(['valid', 'valid'], drop=True)
s.reset_index(['valid'], drop=True)
s.reset_index('valid', drop=True)
Outcome for each of 3 test cases above
AttributeError: 'RangeIndex' object has no attribute 'droplevel'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
len(level) < self.index.nlevels test is needed
Right, hadn't considered that zero-levels MultiIndex
es can't exist.
The isinstance(self.index, MultiIndex)
doesn't make much sense but it's OK to leave it, as it guarantees that if pd.Series(range(4)).reset_index([], drop=True)
doesn't fail (pd.Index.droplevel()
doesn't exist). Could you add a test case for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
# https://github.com/pandas-dev/pandas/issues/20925 | ||
# Data for Test Case 1 and 2 | ||
s = pd.Series(range(4)) | ||
# Test Case 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
put blank lines between cases. Don't use a 'Test Case 1' as a lable, either remove of put what it is testing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Have introduced blank cases and, replaced labels with comments of what is being tested
def test_reset_index_drop_errmsg(): | ||
# https://github.com/pandas-dev/pandas/issues/20925 | ||
|
||
# Check KeyError raised for series where no 'level' name is defined |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
series -> series index
no 'level' name is defined -> passed level name is missing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
with tm.assert_raises_regex(KeyError, 'must be same as name'): | ||
s.reset_index('wrong') | ||
|
||
# Check KeyError raised for series where 'level' to be dropped is undefined |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
undefined -> missing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
I mistakenly wrote in an obsolete discussion, copying here:
Right, hadn't considered that zero-levels The After that and a couple of minor fixes to comments, I think we're ready. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes OK, just move to correct file
@@ -768,3 +768,24 @@ def test_head_tail(test_data): | |||
assert_series_equal(test_data.series.head(0), test_data.series[0:0]) | |||
assert_series_equal(test_data.series.tail(), test_data.series[-5:]) | |||
assert_series_equal(test_data.series.tail(0), test_data.series[0:0]) | |||
|
|||
|
|||
def test_reset_index_drop_errmsg(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just noticed this is in the wrong file... reset_index
is tested in pandas/tests/series/test_alter_axes.py
, not test_indexing.py
.
While you're at it, please replace # https://github.com/pandas-dev/pandas/issues/20925
with the standard # GH 20925
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks didn't know that the right file was test_alter_axes.py
Have moved the function there (now renamed as test_reset_index_drop_errors
based on @TomAugspurger 's comment) - also updated comment for # GH 20925
with tm.assert_raises_regex(KeyError, 'not found'): | ||
s.reset_index('wrong', drop=True) | ||
|
||
# Check that .reset_index([],drop=True) doesn't fail |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is fine but probably not in the right test: could you append it to test_reset_index_level
(in pandas/tests/series/test_alter_axes.py
)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Have moved it to function test_reset_index_level
outside the for
loop
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Release note can go in 0.23.1 (maybe have to merge in master for it to show up).
@@ -768,3 +768,24 @@ def test_head_tail(test_data): | |||
assert_series_equal(test_data.series.head(0), test_data.series[0:0]) | |||
assert_series_equal(test_data.series.tail(), test_data.series[-5:]) | |||
assert_series_equal(test_data.series.tail(0), test_data.series[0:0]) | |||
|
|||
|
|||
def test_reset_index_drop_errmsg(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
errmsg
-> errors
or error_message
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done. renamed def as test_reset_index_drop_errors
Looks all good to me, ready to merge after the whatsnew note is added. |
Updating to 0.23.0
Updating to 0.23.0
Updated v0.23.1.txt under Bug Fixes > Indexing as: |
doc/source/whatsnew/v0.23.1.txt
Outdated
@@ -59,7 +59,7 @@ Conversion | |||
Indexing | |||
^^^^^^^^ | |||
|
|||
- | |||
- Bug in :meth:`Series.reset_index` where appropriate error was not raised with a non-named level (:issue:`20925`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get the "a non-named level". Wouldn't it be "an invalid level name"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. Done.
@KalyanGokhale merged, thanks! |
…das-dev#21016) closes pandas-dev#20925 (cherry picked from commit e033c06)
#20925 Raises appropriate error for Series.reset_index(level_name, drop=True) when index is flat and an invalid level is supplied
git diff upstream/master -u -- "*.py" | flake8 --diff