Fix the docstring of xs in pandas/core/generic.py #22892 #23913

Merged
merged 6 commits into from
Dec 2, 2018
Changes from 4 commits
121 changes: 71 additions & 50 deletions pandas/core/generic.py
@@ -3270,71 +3270,92 @@ class max_speed

def xs(self, key, axis=0, level=None, drop_level=True):
"""
Returns a cross-section (row(s) or column(s)) from the
Series/DataFrame. Defaults to cross-section on the rows (axis=0).
Return cross-section from the Series/DataFrame.

Returns a cross-section (row(s) or column(s))
from the Series/DataFrame.
Defaults to cross-section on the rows (axis=0).
Member

This is a bit repetitive of the short summary. Can you try to explain what the method does in a simpler way? I don't think someone who hasn't used it could understand it just by reading the description.


Parameters
----------
key : object
Some label contained in the index, or partially in a MultiIndex
axis : int, default 0
Axis to retrieve cross-section on
key : label
Some label contained in the index, or partially in a MultiIndex.
Member

I think more than one level can be provided, right? Can you make that clear here?

Contributor Author

I'm not sure I understand this one. Do you want me to be more precise about the key?

Member

I think you can do `df.xs(key=('mammal', 'dog'))`, and if that's the case, the type `label` seems incorrect (or unclear). `label or tuple of label`, plus a description that explains what is expected, would be better.
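
To illustrate the point about tuple keys, here is a minimal sketch. It assumes the MultiIndex from the docstring example in this diff, trimmed to a single column; the variable names `by_label` and `by_tuple` are only for illustration.

```python
import pandas as pd

# Index values copied from the docstring example under review;
# only one data column is kept to reduce noise.
index = pd.MultiIndex.from_tuples(
    [
        ("mammal", "cat", 1),
        ("mammal", "dog", 1),
        ("sauropsida", "duck", 2),
        ("sauropsida", "turtle", 3),
    ],
    names=["class", "animal", "area"],
)
df = pd.DataFrame({"num_legs": [4, 4, 2, 4]}, index=index)

# A single label is matched against the first level only.
by_label = df.xs("mammal")

# A tuple of labels is matched against the first len(key) levels.
by_tuple = df.xs(("mammal", "dog"))
```

Both calls succeed, which suggests `label or tuple of label` describes the accepted `key` values better than `label` alone.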

axis : {0 or 'index', 1 or 'columns'}, default 0
Axis to retrieve cross-section on.
level : object, defaults to first n levels (n=1 or len(key))
In case of a key partially contained in a MultiIndex, indicate
which levels are used. Levels can be referred by label or position.
drop_level : boolean, default True
drop_level : bool, default True
If False, returns object with same levels as self.

Examples
--------
>>> df
   A  B  C
a  4  5  2
b  4  0  9
c  9  7  3
>>> df.xs('a')
A    4
B    5
C    2
Name: a
>>> df.xs('C', axis=1)
a    2
b    9
c    3
Name: C

>>> df
                    A  B  C  D
first second third
bar   one    1      4  1  8  9
      two    1      7  5  5  0
baz   one    1      6  6  8  0
      three  2      5  3  5  3
>>> df.xs(('baz', 'three'))
       A  B  C  D
third
2      5  3  5  3
>>> df.xs('one', level=1)
             A  B  C  D
first third
bar   1      4  1  8  9
baz   1      6  6  8  0
>>> df.xs(('baz', 2), level=[0, 'third'])
        A  B  C  D
second
three   5  3  5  3

Returns
-------
xs : Series or DataFrame
Series or DataFrame
Member

Can you add a short description of what is being returned?
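
As background for a Returns description, the following sketch shows how the result type depends on the key and on `drop_level`. The data is hypothetical and the variable names are only illustrative.

```python
import pandas as pd

# Hypothetical two-level frame, sorted so that lookups are unambiguous.
df = pd.DataFrame(
    {"num_legs": [2, 4, 4], "num_wings": [2, 0, 0]},
    index=pd.MultiIndex.from_tuples(
        [("bird", "duck"), ("mammal", "cat"), ("mammal", "dog")],
        names=["class", "animal"],
    ),
)

# A key covering every level selects one row and comes back as a Series.
one_row = df.xs(("mammal", "dog"))

# A partial key returns a DataFrame with the matched level removed.
group = df.xs("mammal")

# drop_level=False keeps the matched level in the result's index.
kept = df.xs("mammal", drop_level=False)
```

A description along the lines of "cross-section from the original object, with the selected index levels dropped unless `drop_level=False`" would cover all three cases; that wording is a suggestion, not text from the PR.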


See Also
--------
DataFrame.loc : Access a group of rows and columns
by label(s) or a boolean array.
Member

The right indentation is 4 spaces to the right of the D in `DataFrame.loc`. Also, in the previous line, can you continue the description until it's close to the maximum of 79 characters?

Same thing in the next item too.
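
One way to read the request above is sketched below. `xs_stub` is a hypothetical placeholder function, not part of pandas; it exists only to hold a docstring with the See Also layout being asked for (continuation lines indented 4 spaces under the item, first lines filled close to 79 characters).

```python
def xs_stub():
    """
    Hypothetical stub used only to illustrate See Also formatting.

    See Also
    --------
    DataFrame.loc : Access a group of rows and columns by label(s) or a
        boolean array.
    DataFrame.iloc : Purely integer-location based indexing for selection by
        position.
    """


# Length of the longest docstring line, to compare against the 79 limit.
longest = max(len(line) for line in xs_stub.__doc__.splitlines())
```

In the real method the docstring sits at a deeper indentation level, so the wrap points would shift accordingly.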

DataFrame.iloc : Purely integer-location based indexing
for selection by position.

Notes
-----
xs is only for getting, not setting values.
'xs' can not be used to set values.
Member

Suggested change
'xs' can not be used to set values.
`xs` can not be used to set values.

The quoting of `xs` should use backticks, not single quotes. Same in the next case.
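
For context on the Notes entry in this hunk, which says `xs` cannot set values and points to MultiIndex slicers as the more general tool, here is a minimal sketch. The frame is hypothetical; `pd.IndexSlice` is the standard pandas helper for building slicers.

```python
import pandas as pd

# Hypothetical frame; the index is sorted, which slicers require.
df = pd.DataFrame(
    {"num_legs": [2, 4, 4]},
    index=pd.MultiIndex.from_tuples(
        [("bird", "duck"), ("mammal", "cat"), ("mammal", "dog")],
        names=["class", "animal"],
    ),
)

idx = pd.IndexSlice

# Getting: selects the same rows as
# df.xs("dog", level="animal", drop_level=False).
dog_rows = df.loc[idx[:, "dog"], :]

# Setting: the part that xs cannot do.
df.loc[idx[:, "dog"], "num_legs"] = 3
```

The slicer form is more verbose for simple reads, which is presumably why `xs` remains useful as a getter.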


MultiIndex Slicers is a generic way to get/set values on any level or
levels. It is a superset of xs functionality, see
:ref:`MultiIndex Slicers <advanced.mi_slicers>`
levels. It is a superset of 'xs' functionality, see
:ref:`MultiIndex Slicers <advanced.mi_slicers>`.

Examples
--------
>>> df = pd.DataFrame({"num_legs": [4, 4, 2],
... "num_arms": [0, 0, 0],
... "num_wings": [0, 0, 2]},
... ["dog", "cat", "duck"])
>>> df
      num_legs  num_arms  num_wings
dog          4         0          0
cat          4         0          0
duck         2         0          2
>>> df.xs('dog')
num_legs     4
num_arms     0
num_wings    0
Name: dog, dtype: int64
>>> df.xs('num_wings', axis=1)
dog     0
cat     0
duck    2
Name: num_wings, dtype: int64
>>> d = {'num_legs': [4, 4, 2, 4],
... 'num_arms': [0, 0, 0, 0],
... 'num_wings': [0, 0, 2, 0],
... 'num_spec_seen': [9, 2, 7, 3],
... 'class': ['mammal', 'mammal', 'sauropsida', 'sauropsida'],
Member

I'd probably use mammal and bird (and replace turtle with a bird). The idea is to show what `xs` does in the simplest possible way, so I'd try to avoid anything that adds complexity, like a specialized concept such as sauropsida.

I'd also remove num_arms, and num_spec_seen, as they are not used, and they just add "noise" that makes focusing on what xs does more difficult.

... 'animal': ['cat', 'dog', 'duck', 'turtle'],
... 'area': [1, 1, 2, 3]}
Member

What if, instead of area (which from what I understand is arbitrary), we use something like "flies" and "walks"? You could use bat and penguin, so that the examples are crossed. I think this way the last example will be clearer.
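
A sketch of what the two suggestions above could look like together. The level name `locomotion` and the exact values are assumptions made for illustration; they are not taken from this diff.

```python
import pandas as pd

# Illustrative data: unused columns dropped, classes reduced to mammal
# and bird, and the arbitrary "area" level replaced by "locomotion".
d = {
    "num_legs": [4, 4, 2, 2],
    "num_wings": [0, 0, 2, 2],
    "class": ["mammal", "mammal", "mammal", "bird"],
    "animal": ["cat", "dog", "bat", "penguin"],
    "locomotion": ["walks", "walks", "flies", "walks"],
}
df = pd.DataFrame(data=d).set_index(["class", "animal", "locomotion"])

# Select on the first and last levels at once, mixing a level position
# with a level name, as the last docstring example does.
walking_birds = df.xs(("bird", "walks"), level=[0, "locomotion"])
```

Because a bat is a mammal that flies and a penguin is a bird that walks, the two levels carry independent information, so the multi-level selection is easier to follow than with an arbitrary integer.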

>>> df = pd.DataFrame(data=d)
>>> df = df.set_index(['class', 'animal', 'area'])
>>> df
                        num_legs  num_arms  num_wings  num_spec_seen
class      animal area
mammal     cat    1            4         0          0              9
           dog    1            4         0          0              2
sauropsida duck   2            2         0          2              7
           turtle 3            4         0          0              3
>>> df.xs(('mammal', 'dog'))
      num_legs  num_arms  num_wings  num_spec_seen
area
1            4         0          0              2
>>> df.xs('cat', level=1)
             num_legs  num_arms  num_wings  num_spec_seen
class  area
mammal 1            4         0          0              9
>>> df.xs(('sauropsida', 2), level=[0, 'area'])
        num_legs  num_arms  num_wings  num_spec_seen
animal
duck           2         0          2              7
"""
axis = self._get_axis_number(axis)
labels = self._get_axis(axis)