Skip to content

DOC:Improve the docstring of DataFrame.iloc() #20228

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
Jul 7, 2018
118 changes: 115 additions & 3 deletions pandas/core/indexing.py
Original file line number Diff line number Diff line change
Expand Up @@ -1629,7 +1629,8 @@ def _getitem_axis(self, key, axis=None):


class _iLocIndexer(_LocationIndexer):
"""Purely integer-location based indexing for selection by position.
"""
Purely integer-location based indexing for selection by position.

``.iloc[]`` is primarily integer position based (from ``0`` to
``length-1`` of the axis), but may also be used with a boolean
Expand All @@ -1642,14 +1643,125 @@ class _iLocIndexer(_LocationIndexer):
- A slice object with ints, e.g. ``1:7``.
- A boolean array.
- A ``callable`` function with one argument (the calling Series, DataFrame
or Panel) and that returns valid output for indexing (one of the above)
or Panel) and that returns valid output for indexing (one of the above).
This is useful in method chains, when you don't have a reference to the
calling object, but would like to base your selection on some value.

``.iloc`` will raise ``IndexError`` if a requested indexer is
out-of-bounds, except *slice* indexers which allow out-of-bounds
indexing (this conforms with python/numpy *slice* semantics).

See more at :ref:`Selection by Position <indexing.integer>`
See more at ref:`Selection by Position <indexing.integer>`.

See Also
--------
DataFrame.iat : Fast integer location scalar accessor.
DataFrame.loc : Purely label-location based indexer for selection by label.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add Series.iloc

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added Series.iloc

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you look at other PRs for those accessors to have a consistent way to reference them? See eg https://github.com/pandas-dev/pandas/pull/20229/files, They use:

    DateFrame.at : Access a single value for a row/column label pair
    DateFrame.iloc : Access group of rows and columns by integer position(s)
    Series.loc : Access group of values using labels

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tuhinmahmud make sure to pull my changes before looking into this.

Series.iloc : Purely integer-location based indexing for
selection by position.

Examples
--------

>>> mydict = [{'a': 1, 'b': 2, 'c': 3, 'd': 4},
... {'a': 100, 'b': 200, 'c': 300, 'd': 400},
... {'a': 1000, 'b': 2000, 'c': 3000, 'd': 4000 }]
>>> df = pd.DataFrame(mydict)
>>> df
a b c d
0 1 2 3 4
1 100 200 300 400
2 1000 2000 3000 4000

**Indexing just the rows**

With a scalar integer.

>>> type(df.iloc[0])
<class 'pandas.core.series.Series'>
>>> df.iloc[0]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

blank lines between cases.

show additional examples, including selecting with .iloc[0] vs .iloc[[0]], and use a multi-axis selction .iloc[0, 0] and lists for the last, e.g. .iloc[[0, 1], [0, 1]]

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also add a sentence for each case explaining it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added sentences and put different types of example of .iloc in paragraph

a 1
b 2
c 3
d 4
Name: 0, dtype: int64

With a list of integers.

>>> df.iloc[[0]]
a b c d
0 1 2 3 4
>>> type(df.iloc[[0]])
<class 'pandas.core.frame.DataFrame'>

>>> df.iloc[[0, 1]]
a b c d
0 1 2 3 4
1 100 200 300 400

With a `slice` object.

>>> df.iloc[:3]
a b c d
0 1 2 3 4
1 100 200 300 400
2 1000 2000 3000 4000

With a boolean mask the same length as the index.

>>> df.iloc[[True, False, True]]
a b c d
0 1 2 3 4
2 1000 2000 3000 4000

With a callable, useful in method chains. The `x` passed
to the ``lambda`` is the DataFrame being sliced. This selects
the rows whose index label even.

>>> df.iloc[lambda x: x.index % 2 == 0]
a b c d
0 1 2 3 4
2 1000 2000 3000 4000

**Indexing both axes**

You can mix the indexer types for the index and columns. Use ``:`` to
select the entire axis.

With scalar integers.

>>> df.iloc[0, 1]
2

With lists of integers.

>>> df.iloc[[0, 2], [1, 3]]
b d
0 2 4
2 2000 4000

With `slice` objects.

>>> df.iloc[1:3, 0:3]
a b c
1 100 200 300
2 1000 2000 3000

With a boolean array whose length matches the columns.

>>> df.iloc[:, [True, False, True, False]]
a c
0 1 3
1 100 300
2 1000 3000

With a callable function that expects the Series or DataFrame.

>>> df.iloc[:, lambda df: [0, 2]]
a c
0 1 3
1 100 300
2 1000 3000
"""

_valid_types = ("integer, integer slice (START point is INCLUDED, END "
Expand Down