DOC: Add missing docstrings #31047
Changes from 3 commits
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -1163,6 +1163,9 @@ def to_frame(self, index=True, name=None): | |||||
|
||||||
@property | ||||||
def name(self): | ||||||
""" | ||||||
Return Index or MultiIndex name. | ||||||
""" | ||||||
return self._name | ||||||
|
||||||
@name.setter | ||||||
|
@@ -1644,21 +1647,235 @@ def is_unique(self) -> bool: | |||||
|
||||||
@property | ||||||
def has_duplicates(self) -> bool: | ||||||
""" | ||||||
Check if the Index has duplicate values. | ||||||
|
||||||
Returns | ||||||
------- | ||||||
bool | ||||||
Whether or not the Index has duplicate values. | ||||||
|
||||||
Examples | ||||||
-------- | ||||||
>>> idx = pd.Index([1, 5, 7, 7]) | ||||||
>>> idx.has_duplicates | ||||||
True | ||||||
|
||||||
>>> idx = pd.Index([1, 5, 7]) | ||||||
>>> idx.has_duplicates | ||||||
False | ||||||
|
||||||
>>> idx = pd.Index(["Watermelon", "Orange", "Apple", | ||||||
... "Watermelon"]).astype("category") | ||||||
>>> idx.has_duplicates | ||||||
True | ||||||
|
||||||
>>> idx = pd.Index(["Orange", "Apple", | ||||||
... "Watermelon"]).astype("category") | ||||||
>>> idx.has_duplicates | ||||||
False | ||||||
""" | ||||||
return not self.is_unique | ||||||
|
||||||
def is_boolean(self) -> bool: | ||||||
""" | ||||||
Check if the Index only consists of booleans. | ||||||
|
||||||
Returns | ||||||
------- | ||||||
bool | ||||||
Whether or not the Index only consists of booleans. | ||||||
|
||||||
See Also | ||||||
-------- | ||||||
is_integer : Check if the Index only consists of integers. | ||||||
is_floating : Check if the Index is a floating type. | ||||||
is_numeric : Check if the Index only consists of numeric data. | ||||||
is_object : Check if the Index is of the object dtype. | ||||||
is_categorical : Check if the Index holds categorical data. | ||||||
is_interval : Check if the Index holds Interval objects. | ||||||
is_mixed : Check if the Index holds data with mixed data types. | ||||||
|
||||||
Examples | ||||||
-------- | ||||||
>>> idx = pd.Index([True, False, True]) | ||||||
>>> idx.is_boolean() | ||||||
True | ||||||
|
||||||
>>> idx = pd.Index(["True", "False", "True"]) | ||||||
>>> idx.is_boolean() | ||||||
False | ||||||
|
||||||
>>> idx = pd.Index([True, False, "True"]) | ||||||
>>> idx.is_boolean() | ||||||
False | ||||||
""" | ||||||
return self.inferred_type in ["boolean"] | ||||||
|
||||||
def is_integer(self) -> bool: | ||||||
""" | ||||||
Check if the Index only consists of integers. | ||||||
|
||||||
See Also | ||||||
-------- | ||||||
is_boolean : Check if the Index only consists of booleans. | ||||||
is_floating : Check if the Index is a floating type. | ||||||
is_numeric : Check if the Index only consists of numeric data. | ||||||
is_object : Check if the Index is of the object dtype. | ||||||
is_categorical : Check if the Index holds categorical data. | ||||||
is_interval : Check if the Index holds Interval objects. | ||||||
is_mixed : Check if the Index holds data with mixed data types. | ||||||
|
||||||
Returns | ||||||
Review comment: Nit: "Returns" section above the "See also" |
||||||
------- | ||||||
bool | ||||||
Whether or not the Index only consists of integers. | ||||||
|
||||||
Examples | ||||||
-------- | ||||||
>>> idx = pd.Index([1, 2, 3, 4]) | ||||||
>>> idx.is_integer() | ||||||
True | ||||||
|
||||||
>>> idx = pd.Index([1.0, 2.0, 3.0, 4.0]) | ||||||
>>> idx.is_integer() | ||||||
False | ||||||
|
||||||
>>> idx = pd.Index([1, 2, 3, 4.0]) | ||||||
Review comment: In practice, this is the same as above, as this will be parsed into a FloatIndex. So if we want a third example, I would maybe rather show strings (or just leave it out) |
||||||
>>> idx.is_integer() | ||||||
False | ||||||
""" | ||||||
return self.inferred_type in ["integer"] | ||||||
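The review comment on the third example can be checked directly. The following is a minimal sketch, assuming pandas and NumPy are installed; it relies only on the public `dtype` and `inferred_type` attributes rather than the `is_*` methods under review, and the variable names are illustrative.

```python
import numpy as np
import pandas as pd

# A list mixing ints and a float is coerced to float64 at construction time,
# so afterwards it cannot be told apart from an all-float Index.
mixed = pd.Index([1, 2, 3, 4.0])
floats = pd.Index([1.0, 2.0, 3.0, 4.0])

# Integers plus NaN are also stored as float64, since NaN is a float value.
with_nan = pd.Index([1, 2, 3, 4, np.nan])

# Only an all-integer list keeps an integer dtype.
ints = pd.Index([1, 2, 3, 4])

for name, idx in [("mixed", mixed), ("floats", floats),
                  ("with_nan", with_nan), ("ints", ints)]:
    print(name, idx.dtype, idx.inferred_type)

# The coerced Index compares equal to the all-float one.
print(mixed.equals(floats))
```

Because the coercion happens in the constructor, the three float cases exercise the same code path in the methods above; only the NaN case adds information for a reader.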
|
||||||
def is_floating(self) -> bool: | ||||||
""" | ||||||
Check if the Index is a floating type. | ||||||
|
||||||
The Index may consist of only floats, NaNs, or a mix of floats, | ||||||
integers, or NaNs. | ||||||
|
||||||
Returns | ||||||
------- | ||||||
bool | ||||||
Whether or not the Index only consists of floats, NaNs, or | ||||||
a mix of floats, integers, or NaNs. | ||||||
|
||||||
See Also | ||||||
-------- | ||||||
is_boolean : Check if the Index only consists of booleans. | ||||||
is_integer : Check if the Index only consists of integers. | ||||||
is_numeric : Check if the Index only consists of numeric data. | ||||||
is_object : Check if the Index is of the object dtype. | ||||||
is_categorical : Check if the Index holds categorical data. | ||||||
is_interval : Check if the Index holds Interval objects. | ||||||
is_mixed : Check if the Index holds data with mixed data types. | ||||||
|
||||||
Examples | ||||||
-------- | ||||||
>>> idx = pd.Index([1.0, 2.0, 3.0, 4.0]) | ||||||
>>> idx.is_floating() | ||||||
True | ||||||
|
||||||
>>> idx = pd.Index([1, 2, 3, 4.0]) | ||||||
>>> idx.is_floating() | ||||||
True | ||||||
|
||||||
>>> idx = pd.Index([1, 2, 3, 4.0, np.nan]) | ||||||
Review comment: Same comment here as above. I understand that, looking at the lists used to create the index, these look like different cases, but all of them are Float64Index objects. So for this case, I find it actually makes things more confusing (it would rather be an example to show in the main Index docstring to illustrate the constructor). Thoughts? Showing that it can contain NaN is of course useful.
Review comment: I see; I was thinking that by showing it, people who are not familiar with Float64Index objects will know what to expect, but I do agree that it's probably better as an example in the main Index docstring. Will modify it & leave the NaNs |
||||||
>>> idx.is_floating() | ||||||
True | ||||||
|
||||||
>>> idx = pd.Index([1, 2, 3, 4, np.nan]) | ||||||
>>> idx.is_floating() | ||||||
True | ||||||
|
||||||
>>> idx = pd.Index([1, 2, 3, 4]) | ||||||
>>> idx.is_integer() | ||||||
Review comment:
Suggested change
I think this is a typo |
||||||
False | ||||||
""" | ||||||
return self.inferred_type in ["floating", "mixed-integer-float", "integer-na"] | ||||||
|
||||||
def is_numeric(self) -> bool: | ||||||
""" | ||||||
|
||||||
Check if the Index only consists of numeric data. | ||||||
|
||||||
Returns | ||||||
------- | ||||||
bool | ||||||
Whether or not the Index only only consists of numeric | ||||||
data. | ||||||
Review comment:
Suggested change
|
||||||
|
||||||
See Also | ||||||
-------- | ||||||
is_boolean : Check if the Index only consists of booleans. | ||||||
is_integer : Check if the Index only consists of integers. | ||||||
is_floating : Check if the Index is a floating type. | ||||||
is_object : Check if the Index is of the object dtype. | ||||||
is_categorical : Check if the Index holds categorical data. | ||||||
is_interval : Check if the Index holds Interval objects. | ||||||
is_mixed : Check if the Index holds data with mixed data types. | ||||||
|
||||||
Examples | ||||||
-------- | ||||||
>>> idx = pd.Index([1.0, 2.0, 3.0, 4.0]) | ||||||
>>> idx.is_numeric() | ||||||
True | ||||||
|
||||||
>>> idx = pd.Index([1, 2, 3, 4.0]) | ||||||
>>> idx.is_numeric() | ||||||
True | ||||||
|
||||||
>>> idx = pd.Index([1, 2, 3, 4]) | ||||||
>>> idx.is_numeric() | ||||||
True | ||||||
|
||||||
>>> idx = pd.Index([1, 2, 3, 4.0, np.nan]) | ||||||
>>> idx.is_numeric() | ||||||
True | ||||||
|
||||||
>>> idx = pd.Index([1, 2, 3, 4.0, np.nan, "Apple"]) | ||||||
>>> idx.is_numeric() | ||||||
False | ||||||
""" | ||||||
return self.inferred_type in ["integer", "floating"] | ||||||
|
||||||
def is_object(self) -> bool: | ||||||
""" | ||||||
Check if the Index is of the object dtype. | ||||||
|
||||||
Returns | ||||||
------- | ||||||
bool | ||||||
Whether or not the Index is of the object dtype. | ||||||
|
||||||
See Also | ||||||
-------- | ||||||
is_boolean : Check if the Index only consists of booleans. | ||||||
is_integer : Check if the Index only consists of integers. | ||||||
is_floating : Check if the Index is a floating type. | ||||||
is_numeric : Check if the Index only consists of numeric data. | ||||||
is_categorical : Check if the Index holds categorical data. | ||||||
is_interval : Check if the Index holds Interval objects. | ||||||
is_mixed : Check if the Index holds data with mixed data types. | ||||||
|
||||||
Examples | ||||||
-------- | ||||||
>>> idx = pd.Index(["Apple", "Mango", "Watermelon"]) | ||||||
>>> idx.is_object() | ||||||
True | ||||||
|
||||||
>>> idx = pd.Index(["Apple", "Mango", 2.0]) | ||||||
>>> idx.is_object() | ||||||
True | ||||||
|
||||||
>>> idx = pd.Index(["Watermelon", "Orange", "Apple", | ||||||
... "Watermelon"]).astype("category") | ||||||
>>> idx.is_object() | ||||||
False | ||||||
|
||||||
>>> idx = pd.Index([1.0, 2.0, 3.0, 4.0]) | ||||||
>>> idx.is_object() | ||||||
False | ||||||
""" | ||||||
return is_object_dtype(self.dtype) | ||||||
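The `is_object` body above delegates to `is_object_dtype`, which is also exposed publicly as `pandas.api.types.is_object_dtype`. A minimal sketch of that check on a few of the docstring's cases, assuming pandas is installed (the variable names are illustrative):

```python
import pandas as pd
from pandas.api.types import is_object_dtype

# Strings mixed with a float cannot share a numeric dtype, so the Index
# falls back to the generic object dtype.
mixed = pd.Index(["Apple", "Mango", 2.0])

# An all-float Index uses float64.
floats = pd.Index([1.0, 2.0, 3.0, 4.0])

# Casting to "category" yields a categorical dtype, not object.
cats = pd.Index(["Orange", "Apple", "Watermelon"]).astype("category")

for name, idx in [("mixed", mixed), ("floats", floats), ("cats", cats)]:
    print(name, idx.dtype, is_object_dtype(idx.dtype))
```

The check looks only at the dtype, not at the individual values, which is why the categorical Index of strings is not reported as object.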
|
||||||
def is_categorical(self) -> bool: | ||||||
|
@@ -1667,12 +1884,19 @@ def is_categorical(self) -> bool: | |||||
|
||||||
Returns | ||||||
------- | ||||||
boolean | ||||||
bool | ||||||
True if the Index is categorical. | ||||||
|
||||||
See Also | ||||||
-------- | ||||||
CategoricalIndex : Index for categorical data. | ||||||
is_boolean : Check if the Index only consists of booleans. | ||||||
is_integer : Check if the Index only consists of integers. | ||||||
is_floating : Check if the Index is a floating type. | ||||||
is_numeric : Check if the Index only consists of numeric data. | ||||||
is_object : Check if the Index is of the object dtype. | ||||||
is_interval : Check if the Index holds Interval objects. | ||||||
is_mixed : Check if the Index holds data with mixed data types. | ||||||
|
||||||
Examples | ||||||
-------- | ||||||
|
@@ -1698,9 +1922,67 @@ def is_categorical(self) -> bool: | |||||
return self.inferred_type in ["categorical"] | ||||||
|
||||||
def is_interval(self) -> bool: | ||||||
""" | ||||||
Check if the Index holds Interval objects. | ||||||
|
||||||
Returns | ||||||
------- | ||||||
bool | ||||||
Whether or not the Index holds Interval objects. | ||||||
|
||||||
See Also | ||||||
-------- | ||||||
IntervalIndex : Index for Interval objects. | ||||||
is_boolean : Check if the Index only consists of booleans. | ||||||
is_integer : Check if the Index only consists of integers. | ||||||
is_floating : Check if the Index is a floating type. | ||||||
is_numeric : Check if the Index only consists of numeric data. | ||||||
is_object : Check if the Index is of the object dtype. | ||||||
is_categorical : Check if the Index holds categorical data. | ||||||
is_mixed : Check if the Index holds data with mixed data types. | ||||||
|
||||||
Examples | ||||||
-------- | ||||||
>>> idx = pd.Index([pd.Interval(left=0, right=5), | ||||||
... pd.Interval(left=5, right=10)]) | ||||||
>>> idx.is_interval() | ||||||
True | ||||||
|
||||||
>>> idx = pd.Index([1, 3, 5, 7]) | ||||||
>>> idx.is_interval() | ||||||
False | ||||||
""" | ||||||
return self.inferred_type in ["interval"] | ||||||
|
||||||
def is_mixed(self) -> bool: | ||||||
""" | ||||||
Check if the Index holds data with mixed data types. | ||||||
|
||||||
Returns | ||||||
------- | ||||||
bool | ||||||
Whether or not the Index holds data with mixed data types. | ||||||
|
||||||
See Also | ||||||
-------- | ||||||
is_boolean : Check if the Index only consists of booleans. | ||||||
is_integer : Check if the Index only consists of integers. | ||||||
is_floating : Check if the Index is a floating type. | ||||||
is_numeric : Check if the Index only consists of numeric data. | ||||||
is_object : Check if the Index is of the object dtype. | ||||||
is_categorical : Check if the Index holds categorical data. | ||||||
is_interval : Check if the Index holds Interval objects. | ||||||
|
||||||
Examples | ||||||
-------- | ||||||
>>> idx = pd.Index(['a', np.nan, 'b']) | ||||||
>>> idx.is_mixed() | ||||||
True | ||||||
|
||||||
>>> idx = pd.Index([1.0, 2.0, 3.0, 5.0]) | ||||||
>>> idx.is_mixed() | ||||||
False | ||||||
""" | ||||||
return self.inferred_type in ["mixed"] | ||||||
|
||||||
def holds_integer(self): | ||||||
|
@@ -1718,6 +2000,9 @@ def inferred_type(self): | |||||
|
||||||
@cache_readonly | ||||||
def is_all_dates(self) -> bool: | ||||||
""" | ||||||
Whether or not the index values only consist of dates. | ||||||
""" | ||||||
return is_datetime_array(ensure_object(self.values)) | ||||||
|
||||||
# -------------------------------------------------------------------- | ||||||
|
Review comment:
For a case like this, this might be a bit duplicative with the first line.
Do we (or the validation script) always require an explanation of the return type?
Review comment:
Yes, the validation script requires a description for return values. Related error code: 'RT03': 'Return value has no description'. I confirmed this by removing one of the explanations, leaving only the return type, and the error appears when I ran python3 scripts/validate_docstrings.py --errors=RT03
Review comment:
(this discussion is certainly not a blocker for this PR, to be clear)
@datapythonista what's your view on this? It's of course easiest to be consistent / have a clear rule in the validation. But personally, I find that it doesn't add any value in this specific case.
Review comment:
I agree it's probably a bit repetitive. I think it may still add value: even if most people can infer the output (what True and False mean) from the short summary and the name of the function, I guess beginners can appreciate having it explicit. It's sometimes difficult to know whether what is obvious to us is obvious to other people.
In any case, assuming it literally doesn't add any value, with all the work we've got with docstrings, I would just simply move forward, since there are so many other things that I think are more important and worth more of our time. I think this looks fine to me, even if the repetition is not ideal.
Review comment:
Duplicate content can also add noise, as you might need to read both to ensure you don't miss something.
Anyway, not a discussion to continue on this PR
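To make the RT03 rule from the thread above concrete, here is a rough, standard-library-only sketch of the kind of check involved. It is not the logic of scripts/validate_docstrings.py; the helper name `returns_missing_description` and its simplified parsing rules are assumptions made purely for illustration.

```python
import re


def returns_missing_description(docstring: str) -> bool:
    """Hypothetical helper: True if a numpydoc "Returns" section names a
    type but has no indented description line beneath it."""
    lines = [line.rstrip() for line in docstring.expandtabs().splitlines()]
    for i in range(len(lines) - 1):
        is_header = lines[i].strip() == "Returns"
        is_underline = re.fullmatch(r"-+", lines[i + 1].strip()) is not None
        if is_header and is_underline:
            base = len(lines[i]) - len(lines[i].lstrip())
            body = []
            for line in lines[i + 2:]:
                if not line.strip():
                    break  # simplification: a blank line ends the section
                body.append(line)
            # A description is any line indented deeper than the header.
            has_description = any(
                len(line) - len(line.lstrip()) > base for line in body
            )
            return not has_description
    return False  # no Returns section found, so nothing to flag here


with_description = '''
    Check if the Index only consists of booleans.

    Returns
    -------
    bool
        Whether or not the Index only consists of booleans.
    '''

type_only = '''
    Check if the Index only consists of booleans.

    Returns
    -------
    bool
    '''

print(returns_missing_description(with_description))
print(returns_missing_description(type_only))
```

A docstring that lists only the return type is the case the contributor reproduced by deleting one of the descriptions.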