DOC: update the DataFrame.loc[] docstring #20229
Changes from 7 commits
2f359b9
1a93d2a
a3238d9
78f342c
c28a796
64c698b
0902b36
a23a8e9
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1413,7 +1413,8 @@ def _get_slice_axis(self, slice_obj, axis=None): | |
|
||
|
||
class _LocIndexer(_LocationIndexer): | ||
"""Purely label-location based indexer for selection by label. | ||
""" | ||
Access a group of rows and columns by label(s) or a boolean array. | ||
|
||
``.loc[]`` is primarily label based, but may also be used with a | ||
boolean array. | ||
|
@@ -1424,16 +1425,229 @@ class _LocIndexer(_LocationIndexer): | |
interpreted as a *label* of the index, and **never** as an | ||
integer position along the index). | ||
- A list or array of labels, e.g. ``['a', 'b', 'c']``. | ||
- A slice object with labels, e.g. ``'a':'f'`` (note that contrary | ||
to usual python slices, **both** the start and the stop are included!). | ||
- A boolean array. | ||
- A slice object with labels, e.g. ``'a':'f'``. | ||
|
||
.. warning:: Note that contrary to usual Python slices, **both** the start | ||
and the stop are included. | ||
|
||
- A boolean array of the same length as the axis being sliced, | ||
e.g. ``[True, False, True]``. | ||
- A ``callable`` function with one argument (the calling Series, DataFrame | ||
or Panel) and that returns valid output for indexing (one of the above) | ||
|
||
``.loc`` will raise a ``KeyError`` when the items are not found. | ||
|
||
See more at :ref:`Selection by Label <indexing.label>` | ||
|
||
See Also | ||
-------- | ||
DataFrame.at : Access a single value for a row/column label pair | ||
DataFrame.iloc : Access group of rows and columns by integer position(s) | ||
DataFrame.xs : Returns a cross-section (row(s) or column(s)) from the | ||
Series/DataFrame. | ||
Series.loc : Access group of values using labels | ||
|
||
Examples | ||
-------- | ||
**Getting values** | ||
|
||
>>> df = pd.DataFrame([[1, 2], [4, 5], [7, 8]], | ||
... index=['cobra', 'viper', 'sidewinder'], | ||
... columns=['max_speed', 'shield']) | ||
>>> df | ||
max_speed shield | ||
cobra 1 2 | ||
viper 4 5 | ||
sidewinder 7 8 | ||
|
||
Single label. Note this returns the row as a Series. | ||
|
||
>>> df.loc['viper'] | ||
max_speed 4 | ||
shield 5 | ||
Name: viper, dtype: int64 | ||
|
||
List of labels. Note using ``[[]]`` returns a DataFrame. | ||
|
||
Review comment: only a single blank line (below as well) |
||
>>> df.loc[['viper', 'sidewinder']] | ||
max_speed shield | ||
viper 4 5 | ||
sidewinder 7 8 | ||
|
||
Single label for row and column | ||
|
||
>>> df.loc['cobra', 'shield'] | ||
2 | ||
|
||
Slice with labels for row and single label for column. As mentioned | ||
above, note that both the start and stop of the slice are included. | ||
|
||
>>> df.loc['cobra':'viper', 'max_speed'] | ||
cobra 1 | ||
viper 4 | ||
Name: max_speed, dtype: int64 | ||
|
||
Boolean list with the same length as the row axis | ||
|
||
>>> df.loc[[False, False, True]] | ||
Review comment: Would be nice to have small bits of text breaking these up. Like "Indexing with a boolean array."
Review comment: this is not a very common thing to do (directly), the boolean indexing right below is MUCH more important. |
||
max_speed shield | ||
sidewinder 7 8 | ||
|
||
Conditional that returns a boolean Series | ||
|
||
>>> df.loc[df['shield'] > 6] | ||
max_speed shield | ||
sidewinder 7 8 | ||
|
||
Conditional that returns a boolean Series with column labels specified | ||
|
||
>>> df.loc[df['shield'] > 6, ['max_speed']] | ||
max_speed | ||
sidewinder 7 | ||
|
||
Callable that returns a boolean Series | ||
|
||
>>> df.loc[lambda df: df['shield'] == 8] | ||
max_speed shield | ||
sidewinder 7 8 | ||
|
||
**Setting values** | ||
|
||
Set value for all items matching the list of labels | ||
|
||
>>> df.loc[['viper', 'sidewinder'], ['shield']] = 50 | ||
>>> df | ||
max_speed shield | ||
cobra 1 2 | ||
viper 4 50 | ||
sidewinder 7 50 | ||
|
||
Set value for an entire row | ||
|
||
>>> df.loc['cobra'] = 10 | ||
>>> df | ||
max_speed shield | ||
cobra 10 10 | ||
viper 4 50 | ||
sidewinder 7 50 | ||
|
||
Set value for an entire column | ||
|
||
>>> df.loc[:, 'max_speed'] = 30 | ||
>>> df | ||
max_speed shield | ||
cobra 30 10 | ||
viper 30 50 | ||
sidewinder 30 50 | ||
|
||
Set value for rows matching a boolean condition | ||
|
||
>>> df.loc[df['shield'] > 35] = 0 | ||
>>> df | ||
max_speed shield | ||
cobra 30 10 | ||
viper 0 0 | ||
sidewinder 0 0 | ||
|
||
**Getting values on a DataFrame with an index that has integer labels** | ||
|
||
Another example using integers for the index | ||
|
||
>>> df = pd.DataFrame([[1, 2], [4, 5], [7, 8]], | ||
... index=[7, 8, 9], columns=['max_speed', 'shield']) | ||
>>> df | ||
max_speed shield | ||
7 1 2 | ||
8 4 5 | ||
9 7 8 | ||
|
||
Slice with integer labels for rows. As mentioned above, note that both | ||
the start and stop of the slice are included. | ||
|
||
>>> df.loc[7:9] | ||
max_speed shield | ||
7 1 2 | ||
8 4 5 | ||
9 7 8 | ||
|
||
**Getting values with a MultiIndex** | ||
|
||
A number of examples using a DataFrame with a MultiIndex | ||
|
||
>>> tuples = [ | ||
... ('cobra', 'mark i'), ('cobra', 'mark ii'), | ||
... ('sidewinder', 'mark i'), ('sidewinder', 'mark ii'), | ||
... ('viper', 'mark ii'), ('viper', 'mark iii') | ||
... ] | ||
>>> index = pd.MultiIndex.from_tuples(tuples) | ||
>>> values = [[12, 2], [0, 4], [10, 20], | ||
... [1, 4], [7, 1], [16, 36]] | ||
>>> df = pd.DataFrame(values, columns=['max_speed', 'shield'], index=index) | ||
>>> df | ||
max_speed shield | ||
cobra mark i 12 2 | ||
mark ii 0 4 | ||
sidewinder mark i 10 20 | ||
mark ii 1 4 | ||
viper mark ii 7 1 | ||
mark iii 16 36 | ||
|
||
Single label. Note this returns a DataFrame with a single index. | ||
|
||
>>> df.loc['cobra'] | ||
max_speed shield | ||
mark i 12 2 | ||
mark ii 0 4 | ||
|
||
Single index tuple. Note this returns a Series. | ||
|
||
>>> df.loc[('cobra', 'mark ii')] | ||
max_speed 0 | ||
shield 4 | ||
Name: (cobra, mark ii), dtype: int64 | ||
|
||
Single label for each index level, passed without tuple parentheses. | ||
Similar to passing in a tuple, this returns a Series. | ||
|
||
>>> df.loc['cobra', 'mark i'] | ||
max_speed 12 | ||
shield 2 | ||
Name: (cobra, mark i), dtype: int64 | ||
|
||
Single tuple. Note using ``[[]]`` returns a DataFrame. | ||
|
||
>>> df.loc[[('cobra', 'mark ii')]] | ||
max_speed shield | ||
cobra mark ii 0 4 | ||
|
||
Single tuple for the index with a single label for the column | ||
|
||
>>> df.loc[('cobra', 'mark i'), 'shield'] | ||
2 | ||
|
||
Slice from index tuple to single label | ||
|
||
>>> df.loc[('cobra', 'mark i'):'viper'] | ||
max_speed shield | ||
cobra mark i 12 2 | ||
mark ii 0 4 | ||
sidewinder mark i 10 20 | ||
mark ii 1 4 | ||
viper mark ii 7 1 | ||
mark iii 16 36 | ||
|
||
Slice from index tuple to index tuple | ||
|
||
>>> df.loc[('cobra', 'mark i'):('viper', 'mark ii')] | ||
max_speed shield | ||
cobra mark i 12 2 | ||
mark ii 0 4 | ||
sidewinder mark i 10 20 | ||
mark ii 1 4 | ||
viper mark ii 7 1 | ||
|
||
Raises | ||
------ | ||
KeyError | ||
If any items are not found. | ||
""" | ||
|
||
_valid_types = ("labels (MUST BE IN THE INDEX), slices of labels (BOTH " | ||
|
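Two behaviors the docstring stresses are that label slices include both endpoints and that missing labels raise ``KeyError``. A minimal sketch for checking them follows, reusing the docstring's example frames. The label `'python'` is a made-up missing key and the variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                  index=['cobra', 'viper', 'sidewinder'],
                  columns=['max_speed', 'shield'])

# Label slice: both 'cobra' and 'viper' are included.
by_label = df.loc['cobra':'viper']

# Positional slice for comparison: the stop position is excluded.
by_position = df.iloc[0:1]

# A label that is not in the index raises KeyError.
try:
    df.loc['python']
    missing_raises = False
except KeyError:
    missing_raises = True

# With an integer index, 7:9 is read as labels, so all three rows match.
df_int = pd.DataFrame([[1, 2], [4, 5], [7, 8]],
                      index=[7, 8, 9], columns=['max_speed', 'shield'])
by_int_label = df_int.loc[7:9]
```

Comparing `len(by_label)` with `len(by_position)` makes the inclusive-stop difference visible without relying on printed output.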
Review comment: I'd add `DataFrame.xs` too.
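To illustrate why `DataFrame.xs` is a natural cross-reference, here is a small sketch comparing it with `.loc` on the MultiIndex frame from the docstring examples. It is only an illustration of how the two relate, not part of the proposed docstring, and the variable names are hypothetical.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples([
    ('cobra', 'mark i'), ('cobra', 'mark ii'),
    ('sidewinder', 'mark i'), ('sidewinder', 'mark ii'),
    ('viper', 'mark ii'), ('viper', 'mark iii'),
])
df = pd.DataFrame([[12, 2], [0, 4], [10, 20], [1, 4], [7, 1], [16, 36]],
                  columns=['max_speed', 'shield'], index=index)

# Selecting on the outer level: .loc and .xs return the same rows.
via_loc = df.loc['cobra']
via_xs = df.xs('cobra')

# Selecting on the inner level is where .xs is more direct:
# level=1 matches 'mark ii' in the second index level and drops that level.
mark_ii = df.xs('mark ii', level=1)
```

Here `mark_ii` is indexed by the remaining outer level only, which is the cross-section behavior the `See Also` entry points to.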