DOC: Add DataFrame.index.levels #55437

Merged
merged 28 commits into from
Oct 12, 2023
Changes from 3 commits
50ff6c2
modified: pandas/core/indexes/multi.py
shiersansi Oct 7, 2023
f5b7e29
modified: pandas/core/indexes/multi.py
shiersansi Oct 7, 2023
2c8b861
modified: pandas/core/indexes/multi.py
shiersansi Oct 7, 2023
6241262
modified: pandas/core/indexes/multi.py
shiersansi Oct 8, 2023
aac99e1
modified: pandas/core/indexes/multi.py
shiersansi Oct 8, 2023
a3984e1
modified: pandas/core/indexes/multi.py
shiersansi Oct 8, 2023
c11f34a
modified: pandas/core/indexes/multi.py
shiersansi Oct 8, 2023
f2bf96a
modified: pandas/core/indexes/multi.py
shiersansi Oct 8, 2023
164df6f
modified: pandas/core/indexes/multi.py
shiersansi Oct 8, 2023
14be467
modified: pandas/core/indexes/multi.py
shiersansi Oct 8, 2023
f5085f8
modified: pandas/core/indexes/multi.py
shiersansi Oct 8, 2023
1bca09d
modified: pandas/core/indexes/multi.py
shiersansi Oct 9, 2023
ad66dfb
modified: pandas/core/indexes/multi.py
shiersansi Oct 9, 2023
9ea6ebb
modified: pandas/core/indexes/multi.py
shiersansi Oct 9, 2023
329e45c
modified: pandas/core/indexes/multi.py
shiersansi Oct 9, 2023
b871123
modified: ../pandas/core/indexes/multi.py
shiersansi Oct 11, 2023
ca13b9a
modified: ../pandas/core/indexes/multi.py
shiersansi Oct 11, 2023
4865b2a
modified: pandas/core/indexes/multi.py
shiersansi Oct 11, 2023
6eaa176
modified: pandas/core/indexes/multi.py
shiersansi Oct 12, 2023
ed8ec94
modified: pandas/core/indexes/multi.py
shiersansi Oct 12, 2023
39eacd2
modified: pandas/core/indexes/multi.py
shiersansi Oct 12, 2023
19298c4
modified: pandas/core/indexes/multi.py
shiersansi Oct 12, 2023
324b682
modified: pandas/core/indexes/multi.py
shiersansi Oct 12, 2023
1e1c873
modified: pandas/core/indexes/multi.py
shiersansi Oct 12, 2023
1d658e0
modified: pandas/core/indexes/multi.py
shiersansi Oct 12, 2023
efe5e0a
modified: pandas/core/indexes/multi.py
shiersansi Oct 12, 2023
596d55c
modified: pandas/core/indexes/multi.py
shiersansi Oct 12, 2023
786abee
Update pandas/core/indexes/multi.py
datapythonista Oct 12, 2023
41 changes: 38 additions & 3 deletions pandas/core/indexes/multi.py
@@ -840,9 +840,44 @@ def size(self) -> int:

@cache_readonly
def levels(self) -> FrozenList:
# Use cache_readonly to ensure that self.get_locs doesn't repeatedly
# create new IndexEngine
# https://github.com/pandas-dev/pandas/issues/31648
"""
Returns a tuple of Index objects representing the levels
Member

I think the short summary needs to fit on one line. I wouldn't go into technical details such as whether the return value is a tuple. The best short summary I can think of is `Levels of the MultiIndex.` Then, in the following paragraph, I would explain what levels are, what they mean conceptually, and how they can be seen in pandas.

of the MultiIndex.

Each level is an Index object containing unique values
from that level of the MultiIndex.

Returns
Member

In this case this is an attribute (`MultiIndex.levels`, not `MultiIndex.levels()`), so for users it doesn't return anything; it contains the value, and this section is not expected to exist. I'm not sure how we handle this for other attributes in the docs; they can be a good reference.
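
To illustrate the attribute-versus-method point, here is a minimal sketch. The index contents and the variable names (`midx`, `first_level`, `call_failed`) are made up for illustration; only `pd.MultiIndex.from_product` and the `levels` attribute come from pandas.

```python
import pandas as pd

# Hypothetical two-level index, used only for illustration.
midx = pd.MultiIndex.from_product(
    [["a", "b"], [1, 2]], names=["letter", "number"]
)

# `levels` is read like any other attribute: no parentheses.
levels = midx.levels
first_level = list(levels[0])   # unique values of the first level
second_level = list(levels[1])  # unique values of the second level

# Calling it like a method fails, because the attribute holds a
# list-like container rather than a function.
try:
    midx.levels()
    call_failed = False
except TypeError:
    call_failed = True
```

Since users only ever read the value, documenting it the way other attributes are documented, without a Returns section, matches how it is used.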

-------
levels : tuple of Index
Tuple of Index objects representing the levels
of the MultiIndex.

Notes
-----
The levels are returned in the order they appear in the MultiIndex.
If the MultiIndex is sliced, this method still returns the original
levels of the MultiIndex.
When using this method on a DataFrame index, it may show "extra"
values if the DataFrame has been sliced.
This is because the levels are determined based on the original
DataFrame index, not the sliced one.

Examples
--------
>>> idx = pd.MultiIndex.from_product([['John', 'Josh', 'Alex'],
Member

Can you reuse the index from above, please, so this is less verbose?

...list('abcde')], names=['Person', 'Letter'])
>>> large = pd.DataFrame(data=np.random.randn(15, 2),
Member

I think it's better to just call this `df` instead of `large`.

...index=idx, columns=['one', 'two'])
>>> small = large.loc[['Jo'==d[0:2] for d in
Member

Can you show the dataframe here first, so the user can better understand the data in the example? I think you'll want to avoid random data in the example, as it will make your life more difficult. Meaningful data, rather than random values, will also help users understand better. As an idea, you can use a MultiIndex of animals with class/animal (e.g. insect/ant, insect/spider, mammal/goat...) and a dataframe with just one column, for example number of legs (we use this example in other docs). The example is silly, but meaningful; any user can understand it very quickly and focus on the concepts rather than on understanding the data.

Then, as said, you can show the dataframe first, show the levels of the MultiIndex from it (i.e. `df.index.levels`), and then filter the data with `many_leg_animals = animals[animals.num_legs > 4]` to finally show that even though there are now no mammals in the data, the MultiIndex still contains all the levels.
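
A minimal sketch of the kind of example described in this comment. The data, the `class`/`animal` level names, and the helper variables are assumptions made for illustration (the spider is filed under a hypothetical `arachnid` class here), and this is not necessarily the example that ended up in the merged docstring. `MultiIndex.remove_unused_levels()` is an existing pandas method, added at the end to show how the unused values can be dropped.

```python
import pandas as pd

# Hypothetical data in the spirit of the suggestion: class/animal pairs
# with a single column holding the number of legs.
index = pd.MultiIndex.from_tuples(
    [
        ("insect", "ant"),
        ("arachnid", "spider"),
        ("mammal", "goat"),
        ("mammal", "human"),
    ],
    names=["class", "animal"],
)
animals = pd.DataFrame({"num_legs": [6, 8, 4, 2]}, index=index)

# Levels of the full index: the unique values of each level.
all_classes = list(animals.index.levels[0])

# Filter the rows; no mammals remain in the data.
many_leg_animals = animals[animals.num_legs > 4]
remaining_classes = set(many_leg_animals.index.get_level_values("class"))

# The levels of the filtered index still include "mammal".
classes_after_filter = list(many_leg_animals.index.levels[0])

# remove_unused_levels() returns a new MultiIndex whose levels only
# contain values that are actually used.
trimmed = many_leg_animals.index.remove_unused_levels()
classes_after_trim = list(trimmed.levels[0])
```

Comparing `classes_after_filter` with `remaining_classes` makes the "extra" values from the Notes section visible: the filtered rows contain no mammals, yet the level still lists them until the unused levels are removed.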

Contributor Author

[Screenshot: 2023-10-08 164751]

Thank you for your suggestion. Is the example part easier to understand now?

Contributor Author

I have a problem that I may need your help with. When I tested the code after I committed it, four tests failed, but I couldn't figure out exactly what went wrong from the error log provided.

...large.index.get_level_values('Person')]]

>>> print(large.index.levels)
FrozenList([['Alex', 'John', 'Josh'], ['a', 'b', 'c', 'd', 'e']])

>>> print(large.index.levels)
FrozenList([['Alex', 'John', 'Josh'], ['a', 'b', 'c', 'd', 'e']])
"""
result = [x._rename(name=name) for x, name in zip(self._levels, self._names)]
for level in result:
# disallow midx.levels[0].name = "foo"
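
The `# disallow midx.levels[0].name = "foo"` comment describes behavior that is visible from user code. A minimal sketch, assuming a pandas release in which assigning to the name of a level raises `RuntimeError`; the index contents and variable names are hypothetical, and `MultiIndex.set_names` is shown as the supported alternative.

```python
import pandas as pd

# Hypothetical two-level index, used only for illustration.
midx = pd.MultiIndex.from_product(
    [["a", "b"], [1, 2]], names=["letter", "number"]
)

# Each entry of `levels` carries the name of its level.
level_names = [level.name for level in midx.levels]

# Assigning to the name of a level is rejected
# (assumption: the error type is RuntimeError).
try:
    midx.levels[0].name = "foo"
    rename_rejected = False
except RuntimeError:
    rename_rejected = True

# Supported route: set_names returns a new MultiIndex and leaves
# the original untouched.
renamed = midx.set_names("foo", level=0)
renamed_names = list(renamed.names)
```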