DOC: Improve the docstring of DataFrame.iloc() #20228

Merged: 12 commits merged on Jul 7, 2018

Contributor

@tuhinmahmud tuhinmahmud commented Mar 10, 2018

Checklist for the pandas documentation sprint (ignore this if you are doing
an unrelated PR):

  • [x] PR title is "DOC: update the docstring"
  • [x] The validation script passes: scripts/validate_docstrings.py <your-function-or-method>
  • [x] The PEP8 style check passes: git diff upstream/master -u -- "*.py" | flake8 --diff
  • [x] The html version looks good: python doc/make.py --single <your-function-or-method>
  • [x] It has been proofread on language by another sprint participant

Please include the output of the validation script below between the "```" ticks:

(pandas_dev) $ python scripts/validate_docstrings.py pandas.DataFrame.iloc

################################################################################
###################### Docstring (pandas.DataFrame.iloc)  ######################
################################################################################

Purely integer-location based indexing for selection by position.

``.iloc[]`` is primarily integer position based (from ``0`` to
``length-1`` of the axis), but may also be used with a boolean
array.

Allowed inputs are:

- An integer, e.g. ``5``.
- A list or array of integers, e.g. ``[4, 3, 0]``.
- A slice object with ints, e.g. ``1:7``.
- A boolean array.
- A ``callable`` function with one argument (the calling Series, DataFrame
  or Panel) and that returns valid output for indexing (one of the above)

``.iloc`` will raise ``IndexError`` if a requested indexer is
out-of-bounds, except *slice* indexers which allow out-of-bounds
indexing (this conforms with python/numpy *slice* semantics).

See Also
--------
DataFrame.ix : A primarily label-location based indexer, with integer position fallback.
DataFrame.loc : Fast integer location scalar accessor.

Examples
--------
>>> import pandas as pd
>>> mydict = [{'a': 1, 'b': 2, 'c': 3, 'd': 4},
...           {'a': 100,  'b': 200, 'c': 300, 'd': 400},
...           {'a': 1000,  'b': 2000,  'c': 3000,  'd': 4000 }]
>>> df = pd.DataFrame(mydict)
>>> print(df.head())
      a     b     c     d
0     1     2     3     4
1   100   200   300   400
2  1000  2000  3000  4000
>>> print(df.iloc[0])
a    1
b    2
c    3
d    4
Name: 0, dtype: int64
>>> print(df.iloc[0:2])
     a    b    c    d
0    1    2    3    4
1  100  200  300  400

ref:`Selection by Position <indexing.integer>`

################################################################################
################################## Validation ##################################
################################################################################

Errors found:
	No returns section found



If the validation script still gives errors, but you think there is a good reason
to deviate in this case (and there are certainly such cases), please state this
explicitly.

The "No returns section found" error is reported because this is a class, which does not need a Returns section.
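
As an illustration of the out-of-bounds rule stated in the docstring above, here is a minimal sketch. It reuses the example frame from the docstring; the variable names (`scalar_raised`, `list_raised`, `oob_slice`, `empty_slice`) are made up for this note and are not part of the PR, and the behaviour shown assumes a reasonably recent pandas.

```python
import pandas as pd

# Same three-row frame as in the docstring example.
mydict = [{'a': 1, 'b': 2, 'c': 3, 'd': 4},
          {'a': 100, 'b': 200, 'c': 300, 'd': 400},
          {'a': 1000, 'b': 2000, 'c': 3000, 'd': 4000}]
df = pd.DataFrame(mydict)

# A scalar position past the end of the axis raises IndexError.
try:
    df.iloc[10]
    scalar_raised = False
except IndexError:
    scalar_raised = True

# A list containing an out-of-bounds position raises IndexError too.
try:
    df.iloc[[0, 10]]
    list_raised = False
except IndexError:
    list_raised = True

# A slice is silently truncated, as with Python lists and NumPy arrays.
oob_slice = df.iloc[1:10]     # rows at positions 1 and 2
empty_slice = df.iloc[5:10]   # no rows, but no error either
```

The slice case is the only one that tolerates positions beyond the axis length.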

@pep8speaks

pep8speaks commented Mar 10, 2018

Hello @tuhinmahmud! Thanks for updating the PR.

Cheers! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on July 07, 2018 at 19:10 Hours UTC

0 1 2 3 4
1 100 200 300 400

ref:`Selection by Position <indexing.integer>`
Contributor

Could you use the "extended summary" section for this, right below the opening summary? I think the prose docs are the most important for indexing, and I don't want them to get lost.

Contributor Author

moved it to "extended summary"

See more at :ref:`Selection by Position <indexing.integer>`
See Also
--------
DataFrame.ix : A primarily label-location based indexer
Contributor

ix is deprecated, you can remove.

Contributor Author

done

See Also
--------
DataFrame.ix : A primarily label-location based indexer
DataFrame.loc : Fast integer location scalar accessor.
Contributor

I think you may want the " A primarily label-location based indexer" for the .loc description.

You may be thinking of DataFrame.iat for this description.
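
To make the distinction between the three accessors concrete, here is a small sketch. The frame and its string index (`'x'`, `'y'`, `'z'`) are invented for illustration so that labels and positions differ; they do not come from the PR.

```python
import pandas as pd

# Hypothetical frame with non-integer labels, so labels != positions.
df = pd.DataFrame({'a': [1, 100, 1000], 'b': [2, 200, 2000]},
                  index=['x', 'y', 'z'])

by_label = df.loc['y', 'b']      # .loc: selection by label
by_position = df.iloc[1, 1]      # .iloc: selection by integer position
fast_scalar = df.iat[1, 1]       # .iat: single value by integer position

# Label slices include the end label; positional slices exclude the stop.
label_slice = df.loc['x':'y']    # rows 'x' and 'y'
position_slice = df.iloc[0:1]    # row at position 0 only
```

All three scalar lookups return the same cell here; `.iat` only accepts a single row/column position pair, while `.iloc` also accepts lists, slices, masks, and callables.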

Contributor Author

Remove ix and update loc description:

  • Removed: DataFrame.ix : A primarily label-location based indexer
  • Removed: DataFrame.loc : Fast integer location scalar accessor.
  • Added: DataFrame.iat : Fast integer location scalar accessor.
  • Added: DataFrame.loc : Purely label-location based indexer for selection by label.

Contributor Author

I missed your comment about .loc and instead took the description from https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.loc.html

--------
>>> import pandas as pd
>>> mydict = [{'a': 1, 'b': 2, 'c': 3, 'd': 4},
... {'a': 100, 'b': 200, 'c': 300, 'd': 400},
Contributor

PEP8: single spaces.

Contributor Author

Replaced the double spaces with single spaces.

Name: 0, dtype: int64
>>> print(df.iloc[0:2])
a b c d
0 1 2 3 4
Contributor

formatting seems a bit off. Can you double check this?

Contributor Author

reran the command and updated.

@TomAugspurger added the Indexing label (Related to indexing on series/frames, not to indexes themselves) on Mar 10, 2018

c 3
d 4
Name: 0, dtype: int64
>>> print(df.iloc[0:2])
Contributor

don't need the prints here and above

Contributor Author

took off the prints

Contributor Author

@tuhinmahmud tuhinmahmud left a comment

ok

See Also
--------
DataFrame.iat : Fast integer location scalar accessor.
DataFrame.loc : Purely label-location based indexer for selection by label.
Contributor

add Series.iloc

Contributor Author

added Series.iloc

Member

Can you look at other PRs for those accessors to have a consistent way to reference them? See eg https://github.com/pandas-dev/pandas/pull/20229/files, They use:

    DataFrame.at : Access a single value for a row/column label pair
    DataFrame.iloc : Access group of rows and columns by integer position(s)
    Series.loc : Access group of values using labels

Contributor

@tuhinmahmud make sure to pull my changes before looking into this.


Examples
--------
>>> import pandas as pd
Contributor

don't need the pandas import

Contributor Author

removed pandas import

--------
>>> import pandas as pd
>>> mydict = [{'a': 1, 'b': 2, 'c': 3, 'd': 4},
... {'a': 100, 'b': 200, 'c': 300, 'd': 400},
Contributor

need to indent here

Contributor Author

updated the indentation

0 1 2 3 4
1 100 200 300 400
2 1000 2000 3000 4000
>>> df.iloc[0]
Contributor

blank lines between cases.

show additional examples, including selecting with .iloc[0] vs .iloc[[0]], and use a multi-axis selection .iloc[0, 0] and lists for the last, e.g. .iloc[[0, 1], [0, 1]]
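
A rough sketch of the cases requested here, using the frame from the PR's docstring. The variable names are made up for this note; the final docstring may word or order these differently.

```python
import pandas as pd

mydict = [{'a': 1, 'b': 2, 'c': 3, 'd': 4},
          {'a': 100, 'b': 200, 'c': 300, 'd': 400},
          {'a': 1000, 'b': 2000, 'c': 3000, 'd': 4000}]
df = pd.DataFrame(mydict)

# A scalar integer returns the row as a Series.
row_as_series = df.iloc[0]

# A one-element list returns a one-row DataFrame instead.
row_as_frame = df.iloc[[0]]

# Two scalars (row, column) return a single cell value.
cell = df.iloc[0, 0]

# Two lists select a rectangular block of rows and columns.
block = df.iloc[[0, 1], [0, 1]]
```

The difference between `.iloc[0]` and `.iloc[[0]]` is only in the return type: the same data comes back either as a Series or as a DataFrame.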

Contributor

also add a sentence for each case explaining it.

Contributor Author

Added explanatory sentences and grouped the different types of .iloc examples into separate paragraphs.

tuhinmahmud and others added 3 commits March 11, 2018 13:23
added 5 types of examples for iloc
 * Select using integer.
 * Select via index slicing.
 * Select using boolean array.
 * Select using callable function.
 * Multi index selection.

Updated indentation
removed import
@TomAugspurger
Contributor

@tuhinmahmud pushed an update to your examples to make the ordering a bit more logical. We now show each of the valid indexer values (scalar, sequence, slice, mask, callable) for just indexing the rows. Then we show the same for rows and columns.
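
The ordering described above can be sketched roughly as follows. This is an illustration built on the frame from the PR, with hypothetical variable names; it is not a copy of the merged docstring.

```python
import pandas as pd

mydict = [{'a': 1, 'b': 2, 'c': 3, 'd': 4},
          {'a': 100, 'b': 200, 'c': 300, 'd': 400},
          {'a': 1000, 'b': 2000, 'c': 3000, 'd': 4000}]
df = pd.DataFrame(mydict)

# Indexing just the rows, one indexer type at a time.
by_scalar = df.iloc[1]                               # Series for row 1
by_list = df.iloc[[0, 2]]                            # rows 0 and 2
by_slice = df.iloc[:2]                               # rows 0 and 1
by_mask = df.iloc[[True, False, True]]               # rows 0 and 2
by_callable = df.iloc[lambda x: x.index % 2 == 0]    # rows 0 and 2

# Indexing both axes: rows first, then columns.
cell = df.iloc[0, 1]                                 # single value
sub_lists = df.iloc[[0, 2], [1, 3]]                  # rows 0, 2; cols b, d
sub_slices = df.iloc[1:3, 0:3]                       # rows 1-2; cols a-c
sub_mask = df.iloc[:, [True, False, True, False]]    # cols a and c
sub_callable = df.iloc[:, lambda frame: [0, 2]]      # cols a and c
```

The callable receives the calling DataFrame and must return one of the other valid indexers, here a boolean array and a list of positions.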

@TomAugspurger
Contributor

[Screenshot: rendered pandas.DataFrame.iloc documentation page, pandas 0.23.0]

Contributor Author

@tuhinmahmud tuhinmahmud left a comment

made requested changes
i) Remove ix and update loc description
ii) removed pandas import
iii) single space issue

Contributor Author

@tuhinmahmud tuhinmahmud left a comment

Looks good. Do you know what the next step is to get it committed?

@codecov

codecov bot commented Mar 26, 2018

Codecov Report

❗ No coverage uploaded for pull request base (master@dcbf8b5).
The diff coverage is n/a.

@@            Coverage Diff            @@
##             master   #20228   +/-   ##
=========================================
  Coverage          ?   91.95%           
=========================================
  Files             ?      160           
  Lines             ?    49837           
  Branches          ?        0           
=========================================
  Hits              ?    45830           
  Misses            ?     4007           
  Partials          ?        0
Flag        Coverage Δ
#multiple   90.34% <ø> (?)
#single     42.08% <ø> (?)

Impacted Files            Coverage Δ
pandas/core/indexing.py   93.73% <ø> (ø)

@TomAugspurger
Contributor

Merged in master to restart the CI. They were complaining about a non-ASCII character, but that may have been on master when you made the PR, and not in your actual branch.

Ping if you notice that the tests all pass before we do.

@tuhinmahmud
Contributor Author

tuhinmahmud commented Apr 3, 2018

@TomAugspurger Hi, I am kind of lost as to what I can do to make it work. I see one of the checks, continuous-integration/travis-ci/pr, failing, and I am not sure what to make of it.

@TomAugspurger
Contributor

Sometimes travis jobs take much longer than others, and exceed Travis' time limit. I restarted that one.

@mroeschke
Member

mroeschke commented Jul 7, 2018

Looks like everything was addressed. Will merge on green since it looks like CI caught an issue before.

@mroeschke mroeschke added this to the 0.24.0 milestone Jul 7, 2018
@jreback jreback merged commit f49355d into pandas-dev:master Jul 7, 2018
Sup3rGeo pushed a commit to Sup3rGeo/pandas that referenced this pull request Oct 1, 2018
Labels
Docs Indexing Related to indexing on series/frames, not to indexes themselves