-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: set_levels set illegal levels. #14236
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG: set_levels set illegal levels. #14236
Conversation
Current coverage is 85.26% (diff: 100%)@@ master #14236 diff @@
==========================================
Files 140 140
Lines 50633 50634 +1
Methods 0 0
Messages 0 0
Branches 0 0
==========================================
+ Hits 43170 43172 +2
+ Misses 7463 7462 -1
Partials 0 0
|
89ba77f
to
fb16c65
Compare
set_levels
set illegal levels.@@ -116,12 +116,13 @@ def __new__(cls, levels=None, labels=None, sortorder=None, names=None, | |||
|
|||
return result | |||
|
|||
def _verify_integrity(self): | |||
def _verify_integrity(self, new_labels=None, new_levels=None): | |||
"""Raises ValueError if length of levels and labels don't match or any |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a Parameters
section
with assertRaisesRegexp(ValueError, "^On"): | ||
self.index.set_levels(['c'], level=0, inplace=True) | ||
assert_matching(self.index.levels, original_index.levels) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you test set_labels
(with invalid labels)
# NOTE: Currently does not check, among other things, that cached | ||
# nlevels matches nor that sortorder matches actually sortorder. | ||
labels, levels = self.labels, self.levels | ||
labels = new_labels or self.labels |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these need a check for list-like, otherwise the len
check will give an odd error message. use is_list_like
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also add a test which exercises this check
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a test for this, but didn't add a separate check within _verify_integrity
because the check for list-like is already present in set_levels
and set_labels
: https://github.com/pydata/pandas/blob/master/pandas/indexes/multi.py#L221 and https://github.com/pydata/pandas/blob/master/pandas/indexes/multi.py#L322.
@jreback the Travis OSX build keeps on failing on conda install. Could you maybe delete the cache and restart the build? Or any other ideas? I'm pretty sure the failure is not caused by these changes. |
49be0b7
to
2cb6d8c
Compare
|
||
with assertRaisesRegexp(TypeError, "^Levels"): | ||
self.index.set_levels('c', level=0, inplace=True) | ||
assert_matching(self.index.levels, original_index.levels) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add tests for inplace=False
as well (you can loop over the param I think)
minor comments, ping when green. |
lgtm. ping on green. |
@@ -154,7 +154,7 @@ def assert_matching(actual, expected): | |||
# as much as possible | |||
self.assertEqual(len(actual), len(expected)) | |||
for act, exp in zip(actual, expected): | |||
act = np.asarray(act) | |||
act = np.asarray(act, dtype=np.object_) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why did you need to change this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the expected is changed to np.object_
and if I don't change act
also the test fails because the expected and actual are not changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you show me an example? it is passing now, so unclear why you had to change it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just pushed up commit e0c8897 that undoes this change and fails with the error:
E AssertionError: numpy array are different
E
E Attribute "dtype" are different
E [left]: int8
E [right]: object
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
int8 is right
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But when I take out the coercion to np.object_
, the test test_set_levels
fails with
E AssertionError: numpy array are different
E
E Attribute "dtype" are different
E [left]: object
E [right]: |S4
This is in the test
# level changing [w/o mutation]
ind2 = self.index.set_levels(new_levels)
assert_matching(ind2.levels, new_levels)
assert_matching(self.index.levels, levels)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like maybe _ensure_index
is doing the conversion to np.object_
in that test case (https://github.com/pydata/pandas/blob/master/pandas/indexes/base.py#L3616 and https://github.com/pydata/pandas/blob/master/pandas/lib.pyx#L1029).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed this issue.
you need to rebase as just merged #14303 |
d636fdf
to
5e8af7c
Compare
# NOTE: Currently does not check, among other things, that cached | ||
# nlevels matches nor that sortorder matches actually sortorder. | ||
labels, levels = self.labels, self.levels | ||
labels = new_labels or self.labels |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can new_levels
or new_labels
ever be a NumPy array (e.g. user passes idx.set_levels(np.array['a', 'b'])
, or will they have been coerced to a list by now? If they're an array this will raise a ValueError. You could use labels = new_labels if new_labels is not None else self.labels
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, looks like it'll be a FrozenList
by this point. Never mind then.
@jreback I believe I have addressed all of the comments. |
- Bug in ``df.groupby`` where ``.median()`` returns arbitrary values if grouped dataframe contains empty bins (:issue:`13629`) | ||
- Bug in ``Index.copy()`` where ``name`` parameter was ignored (:issue:`14302`) | ||
- Bug in ``MultiIndex.set_levels`` where illegal level values were still set after raising an error (:issue:`13754`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move to 0.19.1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved.
Labels to check for validity. Defaults to current labels. | ||
new_levels : optional list | ||
Levels to check for validity. Defaults to current levels. | ||
""" | ||
# NOTE: Currently does not check, among other things, that cached |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a Raises
section
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
@@ -116,12 +116,22 @@ def __new__(cls, levels=None, labels=None, sortorder=None, names=None, | |||
|
|||
return result | |||
|
|||
def _verify_integrity(self): | |||
def _verify_integrity(self, new_labels=None, new_levels=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
call these labels, levels
. the new is confusing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
@@ -149,13 +149,16 @@ def test_set_levels(self): | |||
levels = self.index.levels | |||
new_levels = [[lev + 'a' for lev in level] for level in levels] | |||
|
|||
def assert_matching(actual, expected): | |||
def assert_matching(actual, expected, coerce_to_obj=True): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
call this check_dtype
and simply pass onto assert_numpy_array_equal
as the same parameter name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
1b0037a
to
b9b08fb
Compare
@jreback I think all comments are addressed now. |
thanks @bkandel lgtm. @jorisvandenbossche can be the first back-port test! |
`MultiIndex.set_levels`, when passed in illegal level values, raises an error. When `inplace=True`, though, the illegal level values are still accepted. This commit fixes that behavior by checking that the proposed level values are legal before setting them. added tests and docs. lint added tests. force build force build force build force build for osx
b9b08fb
to
39b7bde
Compare
@bkandel Thanks a lot! |
* commit 'v0.19.0-14-ga40e185': (37 commits) BUG: Bug in localizing an ambiguous timezone when a boolean is passed Convert readthedocs links for their .org -> .io migration for hosted projects (pandas-dev#14406) DOC: formatting in basics.rst BLD/CI: cython cache pxd files (pandas-dev#14363) BUG: set_levels set illegal levels. (pandas-dev#14236) DOC: add whitespace to v0.19.1 bug fix section change impl details slightly for pandas-dev#14292 BUG: Fix concat key name DOC: add 0.19.1 whatsnew file (pandas-dev#14366) DOC: to_csv warns regarding quoting behaviour for floats pandas-dev#14195 (pandas-dev#14228) DOC: fix formatting issue with msgpack table TST: pandas-dev#14345 fixes TestDatetimeIndexOps test_nat AssertionErrors on 32-bit docs: Remove old warning from dsintro.rst (pandas-dev#14365) DOC: minor v0.19.0 whatsnew corrections RLS: v0.19.0 DOC: update release notes DOC: Latest fixes for whatsnew file to_latex encoding follows the documentation (py2 ascii, py3 utf8) (pandas-dev#14329) DOC: fix some sphinx build issues (pandas-dev#14332) TST: fix period tests for numpy 1.9.3 (GH14183) (pandas-dev#14331) ...
`MultiIndex.set_levels`, when passed in illegal level values, raises an error. When `inplace=True`, though, the illegal level values are still accepted. This commit fixes that behavior by checking that the proposed level values are legal before setting them.
git diff upstream/master | flake8 --diff
MultiIndex.set_levels
, when given illegal level values, raises an error.When
inplace=True
, though, the illegal level values are still accepted. Thiscommit fixes that behavior by checking that the proposed level values are legal
before setting them.