DOC: Add DataFrame.index.levels #55437
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -840,6 +840,46 @@ def size(self) -> int: | |
|
||
@cache_readonly | ||
def levels(self) -> FrozenList: | ||
""" | ||
Levels of the MultiIndex. | ||
|
||
What are levels? | ||
Levels refer to the different hierarchical levels or layers in a MultiIndex. | ||
In a MultiIndex, each level represents a distinct dimension or category of | ||
the index. | ||
|
||
How can levels be seen in pandas? | ||
To access the levels, you can use the levels attribute of the MultiIndex, | ||
which returns a FrozenList of Index objects. Each Index object represents a | ||
level in the MultiIndex and contains the unique values found in that | ||
specific level. | ||
|
||
Notes | ||
----- | ||
If the MultiIndex is sliced, this property still returns the original | ||
levels of the MultiIndex. | ||
When using this method on a DataFrame index, it may show "extra". | ||
Review comment: Is this last sentence talking about the MultiIndex returning all original levels? I personally find the |
||
|
||
Examples | ||
-------- | ||
Example 1: | ||
Review comment: I think you can remove the |
||
>>> idx = pd.MultiIndex.from_product([(0, 1, 2), ('green', 'purple')], | ||
... names=['number', 'color']) | ||
>>> idx.levels | ||
FrozenList([[0, 1, 2], ['green', 'purple']]) | ||
|
||
Example 2: | ||
>>> idx = pd.MultiIndex.from_product([['John', 'Josh', 'Alex'], | ||
Review comment: Can you reuse the index from above, please, so this is less verbose? |
||
... list('abcde')], names=['Person', 'Letter']) | ||
>>> large = pd.DataFrame(data=np.random.randn(15, 2), | ||
Review comment: I think it's better to just call this |
||
... index=idx, columns=['one', 'two']) | ||
>>> small = large.loc[['Jo'==d[0:2] for d in | ||
Review comment: Can you show the dataframe here first, so the user can better understand the data of the example? I think you'll want to avoid using random data in the example, as it'll make your life more difficult. And if you use meaningful data instead of random values, that will also help users understand better. As an idea, you can use a MultiIndex of animals with class/animal (e.g. insect/ant, insect/spider, mammal/goat...) and use a dataframe with just one column, for example number of legs (we use this example in other docs). The example is silly, but meaningful, and any user can understand it very quickly and focus on the concepts, not on understanding the data. Then, as said, you can show the dataframe first, show the levels of the multiindex from it (i.e.
Reply: I have a problem that I may need your help with. When I tested the code after I committed it, four tests failed, but I couldn't figure out exactly what went wrong from the error log provided. |
||
... large.index.get_level_values('Person')]] | ||
>>> large.index.levels | ||
FrozenList([['Alex', 'John', 'Josh'], ['a', 'b', 'c', 'd', 'e']]) | ||
>>> small.index.levels | ||
FrozenList([['Alex', 'John', 'Josh'], ['a', 'b', 'c', 'd', 'e']]) | ||
""" | ||
# Use cache_readonly to ensure that self.get_locs doesn't repeatedly | ||
|
||
# create new IndexEngine | ||
# https://github.com/pandas-dev/pandas/issues/31648 | ||
|
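For readers of the docstring above, here is a minimal sketch of what `levels` contains, reusing the `number`/`color` index from the first example. The variable name `colors_per_row` is made up for illustration and is not part of the PR.

```python
import pandas as pd

idx = pd.MultiIndex.from_product(
    [(0, 1, 2), ("green", "purple")], names=["number", "color"]
)

# Each entry of .levels is an Index holding the unique values of one level.
for name, level in zip(idx.names, idx.levels):
    print(name, list(level))

# The label of every row comes from get_level_values, not from .levels.
colors_per_row = list(idx.get_level_values("color"))
print(colors_per_row)
```

`levels` holds one entry per level regardless of how many rows the index has, whereas `get_level_values` returns one label per row.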
Review comment: I'd prefer to remove the questions. In isolation I think they are fine, but since we don't use them in any other API documentation page, they feel a bit strange and inconsistent. I think it's easy to understand the explanations without the questions, so no big deal.
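The class/animal idea from the review thread could look roughly like the sketch below. It is only an illustration, not the example that was eventually merged: the data values and the names `legs`, `mammals`, and `trimmed` are assumptions. It shows that filtering a DataFrame keeps the original levels on its index, and that `MultiIndex.remove_unused_levels` returns an index with only the levels still in use.

```python
import pandas as pd

# Hypothetical data: animals grouped by class, with their number of legs.
index = pd.MultiIndex.from_tuples(
    [("insect", "ant"), ("insect", "bee"), ("mammal", "goat"), ("mammal", "human")],
    names=["class", "animal"],
)
legs = pd.DataFrame({"num_legs": [6, 6, 4, 2]}, index=index)

# Keep only the mammals with a boolean mask on the first level.
mammals = legs[legs.index.get_level_values("class") == "mammal"]

# The filtered index still carries the original levels.
print(mammals.index.levels)

# remove_unused_levels drops level values that no longer appear in the index.
trimmed = mammals.index.remove_unused_levels()
print(trimmed.levels)
```

Using fixed, meaningful data rather than `np.random.randn` keeps the output deterministic, which matters for doctests.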