Backport PR #31651: PERF: Cache MultiIndex.levels #31664

Merged
merged 2 commits into from
Feb 4, 2020
1 change: 1 addition & 0 deletions doc/source/whatsnew/v1.0.1.rst
@@ -27,6 +27,7 @@ Fixed regressions
 - Fixed regression in objTOJSON.c fix return-type warning (:issue:`31463`)
 - Fixed regression in :meth:`qcut` when passed a nullable integer. (:issue:`31389`)
 - Fixed regression in assigning to a :class:`Series` using a nullable integer dtype (:issue:`31446`)
+- Fixed performance regression when indexing a ``DataFrame`` or ``Series`` with a :class:`MultiIndex` for the index using a list of labels (:issue:`31648`)

 .. ---------------------------------------------------------------------------

29 changes: 19 additions & 10 deletions pandas/core/indexes/multi.py
@@ -620,16 +620,6 @@ def from_frame(cls, df, sortorder=None, names=None):

     # --------------------------------------------------------------------

-    @property
-    def levels(self):
-        result = [
-            x._shallow_copy(name=name) for x, name in zip(self._levels, self._names)
-        ]
-        for level in result:
-            # disallow midx.levels[0].name = "foo"
-            level._no_setting_name = True
-        return FrozenList(result)
-
     @property
     def _values(self):
         # We override here, since our parent uses _data, which we don't use.
@@ -659,6 +649,22 @@ def array(self):
             "'MultiIndex.to_numpy()' to get a NumPy array of tuples."
         )

+    # --------------------------------------------------------------------
+    # Levels Methods
+
+    @cache_readonly
+    def levels(self):
+        # Use cache_readonly to ensure that self.get_locs doesn't repeatedly
+        # create new IndexEngine
+        # https://github.com/pandas-dev/pandas/issues/31648
+        result = [
+            x._shallow_copy(name=name) for x, name in zip(self._levels, self._names)
+        ]
+        for level in result:
+            # disallow midx.levels[0].name = "foo"
+            level._no_setting_name = True
+        return FrozenList(result)
+
     def _set_levels(
         self, levels, level=None, copy=False, validate=True, verify_integrity=False
     ):
@@ -1227,6 +1233,9 @@ def _set_names(self, names, level=None, validate=True):
                     )
             self._names[lev] = name

+        # If .levels has been accessed, the names in our cache will be stale.
+        self._reset_cache()
+
     names = property(
         fset=_set_names, fget=_get_names, doc="""\nNames of levels in MultiIndex.\n"""
     )

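The caching pattern in this diff (compute `levels` once, then drop the cached value when the names change) can be sketched with the standard library alone. This is only an illustration under stated assumptions: `functools.cached_property` stands in for pandas' internal `cache_readonly`, and the `Labels` class, its `set_names` method, and the `builds` counter are hypothetical names, not pandas APIs.

```python
from functools import cached_property


class Labels:
    """Hypothetical stand-in for an index whose derived view is costly to build."""

    def __init__(self, levels, names):
        self._levels = list(levels)
        self._names = list(names)
        self.builds = 0  # counts how often the derived view is rebuilt

    @cached_property
    def levels(self):
        # Built on first access, then served from the instance __dict__.
        self.builds += 1
        return tuple(zip(self._names, self._levels))

    def set_names(self, names):
        self._names = list(names)
        # The cached view embeds the old names, so discard it
        # (analogous in spirit to the _reset_cache() call in the diff).
        self.__dict__.pop("levels", None)


idx = Labels([[1, 2], ["x", "y"]], ["a", "b"])
first = idx.levels
second = idx.levels        # cache hit: same object, no rebuild
idx.set_names(["p", "q"])
third = idx.levels         # rebuilt, now carrying the new names
```

Without the invalidation step in `set_names`, `third` would still carry the old names, which is the staleness the added comment in `_set_names` warns about.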
2 changes: 1 addition & 1 deletion pandas/tests/indexes/multi/test_get_set.py
@@ -159,7 +159,7 @@ def test_set_levels_codes_directly(idx):
     minor_codes = [(x + 1) % 1 for x in minor_codes]
     new_codes = [major_codes, minor_codes]

-    msg = "can't set attribute"
+    msg = "[Cc]an't set attribute"
     with pytest.raises(AttributeError, match=msg):
         idx.levels = new_levels
     with pytest.raises(AttributeError, match=msg):
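The effect of the loosened pattern can be checked with `re.search`, which is how `pytest.raises(..., match=...)` applies its pattern. A small sketch: the lowercase message is the one from the original test, while the capitalized message is an assumed variant used only for illustration, since the diff does not say where that wording comes from.

```python
import re

old_pattern = "can't set attribute"
new_pattern = "[Cc]an't set attribute"

# The lowercase message comes from the original test; the capitalized
# one is an assumed variant, included only to show what the class matches.
lower = "can't set attribute"
upper = "Can't set attribute"

matches_lower = re.search(new_pattern, lower) is not None
matches_upper = re.search(new_pattern, upper) is not None
old_matches_upper = re.search(old_pattern, upper) is not None
```

The character class `[Cc]` accepts either capitalization of the first letter, so the test passes regardless of which form the `AttributeError` message takes.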