DOC: Improve the docstring of pd.Index.contains and closes PR #20211 #23100

TanyaaCJain · 2018-10-12T08:56:08Z

closes DOC: update the pd.Index.contains docstring #20211
tests added / passed
passes git diff upstream/master -u -- "*.py" | flake8 --diff

This PR is an update to PR #20211, improving the Index.contains docstring.

pep8speaks · 2018-10-12T08:56:11Z

Hello @Tanya-Jain! Thanks for submitting the PR.

There are no PEP8 issues in the file pandas/core/indexes/base.py !

datapythonista

Good work, I added some comments about minor things.

datapythonista · 2018-10-12T13:26:46Z

pandas/core/indexes/base.py

+
+        See Also
+        --------
+        CategoricalIndex : Returns a bool if the 1-dimensional, Categorical key


CategoricalIndex is the class, do you mean CategoricalIndex.contains here?

Yes, I mean this. Apparently, there isn't any docstring written for the same. Should I write something like
CategoricalIndex : Its :method: contains returns a bool if the key is in index. The key can be of the type same as this class, like the Index.contains
or should I still refer to CategoricalIndex.contains : Returns a bool if the key is in index . . .

The Index class is a parent of CategoricalIndex as well as all the other index classes like Float64Index... It could make sense to list CategoricalIndex.contains, Float64Index.contains... but I don't think it adds that much value in this case, and it adds noise to the docstring.

I think the best is just leave Index.isin, but it's up to you.

I agree as CategoricalIndex.contains behaves similarly to Index.contains. Let's just keep Index.isin.
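
For readers following the See Also discussion, here is a minimal sketch of the difference between a scalar membership test and `Index.isin`. It assumes a pandas installation and uses the `in` operator (`__contains__`) rather than `Index.contains`, because the `contains` method was deprecated and removed in later pandas releases. The variable names `has_a`, `has_z` and `mask` are illustrative only.

```python
import pandas as pd

# A simple index of string labels.
idx = pd.Index(list("abcd"))

# Scalar membership: one key in, one bool out, whether or not the key is present.
has_a = "a" in idx
has_z = "z" in idx

# Element-wise membership: a list-like of keys in, one boolean per index label out.
mask = idx.isin(["a", "z"])

print(has_a, has_z)
print(mask.tolist())
```

The scalar test answers "is this one key a label of the index?", while `isin` answers "which labels of the index appear in this collection?", which is why only `Index.isin` is worth cross-referencing.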

datapythonista · 2018-10-12T13:27:44Z

pandas/core/indexes/base.py

+
+        Examples
+        --------
+        >>> l1 = pd.Index([1, 2, (3, 4), 5])


By convention we use idx for variables storing an index in the examples (df for dataframes, and s for series). Can you use that?

Also, can you show the content of the index after you store it (i.e. >>> idx)

Yes, I can change the variable naming and show the content, it would be better.

datapythonista · 2018-10-12T13:28:29Z

pandas/core/indexes/base.py

+        >>> l1 = pd.Index([1, 2, (3, 4), 5])
+        >>> t = (3, 4)
+        >>> num1 = 1
+        >>> num2 = 6


I'd save a bit of space by using the values directly in the calls, instead of saving variables first (e.g. idx.contains(1)).

I will improve this.

datapythonista · 2018-10-12T13:37:08Z

pandas/core/indexes/base.py


        Parameters
        ----------
-        key : object
+        key : label
+            The key requested. Immutable-like, 1-dimensional.


While true, I think users will find this a bit confusing. I think it probably makes more sense to avoid saying that it is 1-dimensional, as that gives the impression that more than one key can be provided.

Personally, I think the users should look for what a label can be (immutable, 1-dimensional if a tuple...) in some other part of the documentation, and here focus more on what contains exactly expects (the key that will be searched in the index). But if you prefer to be specific, I'd start this by something like The key can be of the same type as the labels in index, so it needs to be immutable....

This is exactly what I tried to explain, thank you!
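
As a rough illustration of what "immutable" means for a key in practice, the sketch below assumes that the practical requirement is hashability. The helper `is_hashable_label` is hypothetical and is not part of pandas; it only uses the built-in `hash`.

```python
def is_hashable_label(key):
    """Hypothetical helper: report whether ``key`` is hashable.

    Assumption: a key can only be looked up as a single label if it is
    hashable, which is what "immutable-like" amounts to in practice.
    """
    try:
        hash(key)
    except TypeError:
        return False
    return True


# Scalars and tuples of hashable items qualify as a single key.
print(is_hashable_label(1))
print(is_hashable_label((3, 4)))

# Lists, and tuples that contain lists, do not.
print(is_hashable_label([3, 4]))
print(is_hashable_label((3, [4])))
```

This is also why a tuple such as `(3, 4)` is treated as one key rather than as two keys to search for.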

datapythonista · 2018-10-12T13:38:18Z

Any reason to close this? I guess it's been by mistake?

codecov · 2018-10-12T13:38:33Z

Codecov Report

Merging #23100 into master will decrease coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #23100      +/-   ##
==========================================
- Coverage    92.2%    92.2%   -0.01%     
==========================================
  Files         162      162              
  Lines       51701    51700       -1     
==========================================
- Hits        47671    47670       -1     
  Misses       4030     4030

Flag	Coverage Δ
#multiple	`90.6% <100%> (-0.01%)`	⬇️
#single	`43.02% <100%> (-0.01%)`	⬇️

Impacted Files	Coverage Δ
pandas/core/indexes/multi.py	`95.55% <100%> (ø)`	⬆️
pandas/core/indexes/category.py	`97.9% <100%> (ø)`	⬆️
pandas/core/indexes/period.py	`93.06% <100%> (ø)`	⬆️
pandas/core/indexes/datetimelike.py	`97.29% <100%> (ø)`	⬆️
pandas/core/indexes/base.py	`96.27% <100%> (-0.01%)`	⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 03134cb...93dcd9d.

TanyaaCJain · 2018-10-12T13:55:23Z

#23100 contained an additional commit for merging commits from the pandas-dev/pandas master.
To have only the commit for the file changes made in base.py, I closed #23100 and opened #23107

TanyaaCJain · 2018-10-12T14:15:31Z

I tried to get the problem fixed but ended up having 0 commits being shown as well as having this PR automatically closed.

- Description for key has more clarity on its behaviour.
- Removed :class: `CategoricalIndex` from the See Also section as the methods: CategoricalIndex.contains, Float64Index.contains and like, behave similarly to `Index.contains`. Hence, keeping only `Index.isin`
- Use `idx` for naming variables for objects of :class: `Index` instead of `l` for passing list-like key, to encourage pandas docstring standards and clarity.
- Directly call the values in :method: `Index.contains` to prevent additional memory usage.

…-update

datapythonista

It looks very good. Just a couple of small things, and it's ready to be merged.

datapythonista · 2018-10-13T07:43:18Z

pandas/core/indexes/base.py

@@ -2005,15 +2005,35 @@ def __contains__(self, key):
            return False

    _index_shared_docs['contains'] = """
-        return a boolean if this key is IN the index
+        Return a boolean if this key is in the index.


Minor thing, and I know this is the original description and not yours, but this sounds as if a boolean is returned only if the key is in the index, rather than a boolean always being returned to indicate whether the key is in the index. Do you mind rephrasing it to be more correct please?

Sure, I'll rephrase it in the next commit as a boolean indicator. Apparently, there are a few more docstrings with similar phrasing, like Series.is_unique, Series.is_monotonic, Series.is_monotonic_increasing, Series.is_monotonic_decreasing. These too can be corrected in the future if that seems appropriate.

datapythonista · 2018-10-13T07:44:59Z

pandas/core/indexes/base.py


        Parameters
        ----------
-        key : object
+        key : label
+            The key can be of the same type as the label of :class: `Index`,


I think the space after :class: is not needed. Did you check if this is rendered correctly in the html version? You should be able to generate this page with ./doc/make.py html --single=pandas.Index.contains

Thanks, it renders correctly after removing the space after :class:
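
The rendering problem comes from the whitespace between the role and the opening backtick. A small sketch of how such cases could be spotted before building the HTML follows; the function `find_broken_roles` and its regular expression are hypothetical and are not part of the pandas tooling or of Sphinx.

```python
import re

# A role name between colons, followed by whitespace, then an opening backtick.
# Assumption: role names consist of ASCII letters only, as in :class: or :meth:.
_BROKEN_ROLE = re.compile(r":([A-Za-z]+):\s+`")


def find_broken_roles(docstring):
    """Return the names of roles separated from their target by whitespace."""
    return _BROKEN_ROLE.findall(docstring)


before = "The key can be of the same type as the label of :class: `Index`,"
after = "The key can be of the same type as the label of :class:`Index`,"

print(find_broken_roles(before))
print(find_broken_roles(after))
```

The pattern only flags a role followed by whitespace, so a parameter line such as `key : label` is not reported.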

datapythonista · 2018-10-13T07:48:56Z

pandas/core/indexes/base.py

+        >>> idx
+        Index([1, 2, (3, 4), 5], dtype='object')
+        >>> idx.contains((3,4))
+        True


Small thing, but do you mind moving this example after the other two? I think it helps if we show the simple/easy cases first, and the more complex later.

Also, there is a missing space after the comma.

When you're done with the changes, it may be helpful to run ./scripts/validate_docstrings.py pandas.Index.contains to make sure there are no errors (I don't think there are).

And also flake8 --doctests pandas/core/indexes/base.py. Note that this will report all doctest problems in all the docstrings of the file. Just check that there is no problem in the lines of this docstring, 2008-2036.

I have made the said changes.

Validation passed with one warning about the missing extended description, which the review of the original PR suggested avoiding. Tested with the command ./scripts/validate_docstrings.py pandas.Index.contains

No doctest problems were reported for lines 2008-2036 when testing with the command flake8 --doctests pandas/core/indexes/base.py
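
The commands above operate on the pandas source tree. As a stdlib-only illustration of what it means for the `>>>` examples of a numpydoc-style docstring to be executed and checked, here is a minimal sketch using the standard `doctest` module. The function `contains` and the helper `run_examples` are hypothetical stand-ins, not the pandas implementation or its validation script.

```python
import doctest


def contains(labels, key):
    """
    Return a boolean indicating whether the provided key is in the labels.

    Parameters
    ----------
    labels : tuple
        Collection of labels to search (a stand-in for an index).
    key : label
        The key to check if it is present in the labels.

    Returns
    -------
    bool
        Whether the key search is in the labels.

    Examples
    --------
    >>> labels = ('a', 'b', 'c', 'd')
    >>> contains(labels, 'a')
    True
    >>> contains(labels, 'z')
    False
    """
    return key in labels


def run_examples(func):
    """Run the docstring examples of ``func`` and return (failed, attempted)."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    # Pass explicit globals so the examples can call the function by name.
    for test in finder.find(func, name=func.__name__,
                            globs={func.__name__: func}):
        runner.run(test)
    return runner.summarize(verbose=False)


results = run_examples(contains)
print(results.failed, results.attempted)
```

Each `>>>` line counts as one example, so a docstring that assigns a variable and then makes two calls is reported as three attempted examples.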

- Rephrased description as a boolean indicator
- Removed a space after :class: for correct rendering, checked with the command `./doc/make.py html --single=pandas.Index.contains`
- Moved the easier example cases first.
- Added the missing space after the comma, following PEP8
- Validation passed with one warning about the missing extended description. Tested with the command `./scripts/validate_docstrings.py pandas.Index.contains`
- No doctest problems were reported for lines 2008-2036 when testing with the command `flake8 --doctests pandas/core/indexes/base.py`

TanyaaCJain · 2018-10-23T23:37:57Z

The HTML page is generated correctly with ./doc/make.py html --single=pandas.Index.contains
Validation passed with one warning about the missing extended description, which was omitted as requested in the review of the PR DOC: update the pd.Index.contains docstring #20211. Tested with the command ./scripts/validate_docstrings.py pandas.Index.contains
No doctest problems were reported for lines 2008-2036 when testing with the command flake8 --doctests pandas/core/indexes/base.py

Validation Script:

################################################################################
###################### Docstring (pandas.Index.contains)  ######################
################################################################################

Return a boolean indicating whether this key is in the index.

Parameters
----------
key : label
    The key can be of the same type as the label of :class:`Index`,
    hence immutable-like and 1-dimensional if it is a tuple.

Returns
-------
bool
    Result indicating whether the key search is in the index.

See Also
--------
Index.isin : Returns an ndarray of boolean dtype indicating whether the
    list-like key is in the index.

Examples
--------
>>> idx = pd.Index([1, 2, (3, 4), 5])
>>> idx
Index([1, 2, (3, 4), 5], dtype='object')
>>> idx.contains(1)
True
>>> idx.contains(6)
False
>>> idx.contains((3, 4))
True

################################################################################
################################## Validation ##################################
################################################################################

Warnings found:
	No extended summary found
Docstring for "pandas.Index.contains" correct. :)

datapythonista

lgtm. Thanks @Tanya-Jain

datapythonista · 2018-11-20T14:48:31Z

@jreback if you don't mind reviewing and merging this one too

jreback · 2018-11-20T15:19:35Z

pandas/core/indexes/base.py

+
+        Examples
+        --------
+        >>> idx = pd.Index([1, 2, (3, 4), 5])


I would prefer an example with simple values like pd.Index(list('abcd')). This is a rather esoteric case.

datapythonista · 2018-12-02T00:43:04Z

@Tanya-Jain do you have time to address the comments of the last review and fix the conflicts?

datapythonista · 2018-12-07T12:40:50Z

@jreback addressed your comments, and made some other fixes to this docstring, if you can take a look.

jreback · 2018-12-07T14:17:05Z

thanks @Tanya-Jain and @datapythonista

…dev#20211 (pandas-dev#23100)

TanyaaCJain closed this Oct 12, 2018

datapythonista reviewed Oct 12, 2018

datapythonista added the Docs and Indexing (related to indexing on series/frames, not to indexes themselves) labels Oct 12, 2018

datapythonista reopened this Oct 12, 2018

datapythonista mentioned this pull request Oct 12, 2018

DOC: Improve the docstring of pd.Index.contains #23107

Closed

3 tasks

TanyaaCJain closed this Oct 12, 2018

TanyaaCJain force-pushed the pr-update branch from 1d83db4 to 12a0dc4 Compare October 12, 2018 14:05

DOC: Improve the docstring of pd.Index.contains

05fe477

TanyaaCJain reopened this Oct 12, 2018

datapythonista mentioned this pull request Oct 12, 2018

DOC: update the pd.Index.contains docstring #20211

Closed

9 tasks

TanyaaCJain added 2 commits October 13, 2018 10:43

Merge branch 'master' of https://github.com/pandas-dev/pandas into pr…

e176a5d

…-update

datapythonista reviewed Oct 13, 2018

TanyaaCJain added 2 commits October 24, 2018 04:27

Merge remote-tracking branch 'upstream/master' into pr-update

c54ac4f

datapythonista approved these changes Oct 24, 2018

jreback added this to the 0.24.0 milestone Nov 20, 2018

jreback requested changes Nov 20, 2018

jreback removed this from the 0.24.0 milestone Nov 27, 2018

datapythonista added 2 commits December 7, 2018 12:26

Merging from master

b9ba90a

Improving Index.contains example, and reusing docstring for __contains__

dedb267

Improving texts of the docstring

93dcd9d

datapythonista self-assigned this Dec 7, 2018

jreback added this to the 0.24.0 milestone Dec 7, 2018

jreback approved these changes Dec 7, 2018

jreback merged commit f492be6 into pandas-dev:master Dec 7, 2018

Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019

DOC: Improve the docstring of pd.Index.contains and closes PR pandas-…

1ecb1ba

…dev#20211 (pandas-dev#23100)

Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019

DOC: Improve the docstring of pd.Index.contains and closes PR pandas-…

7b174e5

…dev#20211 (pandas-dev#23100)
