BUG: incorrect set_labels in MI #19058

Merged
merged 6 commits into from
Jan 6, 2018
Changes from 3 commits
1 change: 1 addition & 0 deletions doc/source/whatsnew/v0.23.0.txt
@@ -325,6 +325,7 @@ Indexing
 - Bug in indexing non-scalar value from ``Series`` having non-unique ``Index`` will return value flattened (:issue:`17610`)
 - Bug in :func:`DatetimeIndex.insert` where inserting ``NaT`` into a timezone-aware index incorrectly raised (:issue:`16357`)
 - Bug in ``__setitem__`` when indexing a :class:`DataFrame` with a 2-d boolean ndarray (:issue:`18582`)
+- Bug in :func:`MultiIndex.set_labels` which would cause casting (and potentially clipping) of the new labels if the ``level`` argument is not 0 or a list like [0, 1, ... ] (:issue:`19057`)


 I/O
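The casting described in the new whatsnew entry can be illustrated without pandas. The sketch below is a rough model, not pandas code. It assumes that labels are stored in the narrowest signed integer width that fits the level's category count (an assumption about the internals, which this diff does not show), and it models the cast as two's-complement wraparound. The helper names `width_for` and `cast_to_width` are hypothetical.

```python
def width_for(n_categories):
    """Hypothetical helper: narrowest signed bit width for a category count."""
    for bits in (8, 16, 32, 64):
        if n_categories < 2 ** (bits - 1) - 1:
            return bits
    raise ValueError("too many categories")


def cast_to_width(values, bits):
    """Model a fixed-width signed cast as two's-complement wraparound."""
    half = 2 ** (bits - 1)
    return [((v + half) % (2 * half)) - half for v in values]


# Same shape as the PR's test: level 0 has 1 category, level 1 has 130.
new_labels = list(range(129, -1, -1))

# Sized against the wrong level (1 category): 8 bits, so large labels wrap.
wrong = cast_to_width(new_labels, width_for(1))

# Sized against the intended level (130 categories): 16 bits, values survive.
right = cast_to_width(new_labels, width_for(130))
```

Under these assumptions the first labels 129 and 128 do not fit in 8 bits and come out negative, while sizing against the 130-category level leaves every label unchanged.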
5 changes: 3 additions & 2 deletions pandas/core/indexes/multi.py
@@ -328,8 +328,9 @@ def _set_labels(self, labels, level=None, copy=False, validate=True,
         else:
             level = [self._get_level_number(l) for l in level]
             new_labels = list(self._labels)
-            for l, lev, lab in zip(level, self.levels, labels):
-                new_labels[l] = _ensure_frozen(
+            for lev_idx, lab in zip(level, labels):
+                lev = self.levels[lev_idx]
+                new_labels[lev_idx] = _ensure_frozen(
                     lab, lev, copy=copy)._shallow_copy()
             new_labels = FrozenList(new_labels)

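The change to the loop in `_set_labels` comes down to how `zip` pairs its arguments. A minimal pure-Python sketch, using plain lists in place of the real index objects (`pair_old` and `pair_new` are hypothetical names that mirror the two loop headers):

```python
def pair_old(level, levels, labels):
    # Old loop header: zip walks `levels` from position 0,
    # regardless of which level numbers were requested.
    return [(l, lev) for l, lev, lab in zip(level, levels, labels)]


def pair_new(level, levels, labels):
    # New loop header: the level is looked up by its requested number.
    return [(lev_idx, levels[lev_idx]) for lev_idx, lab in zip(level, labels)]


levels = [[0], list(range(130))]      # level 0: 1 category, level 1: 130
level = [1]                           # caller asks to relabel level 1 only
labels = [list(range(129, -1, -1))]   # one label array for that level

old = pair_old(level, levels, labels)
new = pair_new(level, levels, labels)

# Size of the level each loop associates with level number 1.
print(len(old[0][1]), len(new[0][1]))  # prints: 1 130
```

With `level=[1]` the old header pairs level number 1 with the first entry of `levels`, which matches the whatsnew wording that only `level=0` or a list like `[0, 1, ...]` happened to line up.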
22 changes: 22 additions & 0 deletions pandas/tests/indexes/test_multi.py
@@ -327,6 +327,28 @@ def assert_matching(actual, expected):
         assert_matching(ind2.labels, new_labels)
         assert_matching(self.index.labels, labels)

+    def test_set_label_distinctly_sized_levels(self):
+        # label changing for levels of different magnitude of categories
+        def assert_equal(actual, expected):
Review comment (Contributor): see below; this is not necessary.

+            assert len(actual) == len(expected)
+            for act, exp in zip(actual, expected):
+                act = np.asarray(act)
+                exp = np.asarray(exp)
+                tm.assert_numpy_array_equal(act, exp, check_dtype=True)
+
+        ind = pd.MultiIndex.from_tuples([(0, i) for i in range(130)])
+        new_labels = range(129, -1, -1)
+        ind_reference = pd.MultiIndex.from_tuples(
Review comment (Contributor): call this `expected`.

+            [(0, i) for i in new_labels])
+
+        # [w/o mutation]
+        ind2 = ind.set_labels(labels=new_labels, level=1)
+        assert_equal(ind2, ind_reference)
Review comment (Contributor): call these `result`; then you can simply do

    assert result.equals(expected)

For the inplace case you need to copy the index first.


+        # [w/ mutation]
+        ind.set_labels(labels=new_labels, level=1, inplace=True)
+        assert_equal(ind, ind_reference)
+
     def test_set_levels_labels_names_bad_input(self):
         levels, labels = self.index.levels, self.index.labels
         names = self.index.names
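The reviewer's suggestion (name the values `result` and `expected`, compare with `.equals`) might look roughly like the sketch below. It is not the test that was eventually merged. It covers only the non-mutating call, and it assumes a pandas installation. Because later pandas releases renamed `set_labels` to `set_codes`, the sketch looks the method up by name, and that lookup is an addition here, not something from this PR.

```python
import pandas as pd

# Level 0 has a single category, level 1 has 130.
ind = pd.MultiIndex.from_tuples([(0, i) for i in range(130)])
new_labels = list(range(129, -1, -1))
expected = pd.MultiIndex.from_tuples([(0, i) for i in new_labels])

# Assumption: use `set_codes` where it exists, else the older `set_labels`.
setter = getattr(ind, "set_codes", None) or getattr(ind, "set_labels")

# Non-mutating call: returns a new index and leaves `ind` untouched.
result = setter(new_labels, level=1)

assert result.equals(expected)
```

Comparing whole indexes with `.equals` removes the need for a local `assert_equal` helper, which is the point of the first review comment.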