DOC: Add missing docstrings #31047

Merged
282 changes: 281 additions & 1 deletion pandas/core/indexes/base.py
@@ -1163,6 +1163,9 @@ def to_frame(self, index=True, name=None):

@property
def name(self):
"""
Return Index or MultiIndex name.
"""
return self._name

@name.setter
@@ -1644,21 +1647,230 @@ def is_unique(self) -> bool:

@property
def has_duplicates(self) -> bool:
"""
Check if the Index has duplicate values.

Returns
-------
bool
Whether or not the Index has duplicate values.
Member:

For a case like this, this might be a bit duplicative with the first line.

Do we (or the validation script) always require an explanation of the return type?

Contributor Author:

Yes, the validation script requires a description for return values. Related error code: 'RT03': 'Return value has no description'. I confirmed this by removing one of the explanations, leaving only the return type, and the error appeared when I ran python3 scripts/validate_docstrings.py --errors=RT03

Member:

(this discussion is certainly not a blocker for this PR, to be clear)

@datapythonista what's your view on this? It's of course easiest to be consistent / have a clear rule in the validation. But personally, I find that it doesn't add any value in this specific case.

Member:

I agree it's probably a bit repetitive. I think it may add value: even if most people can infer the output (what True and False mean) from the short summary and the name of the function, I guess beginners can appreciate having it explicit. It's sometimes difficult to know whether what is obvious to us is obvious to other people.

In any case, assuming it literally doesn't add any value, with all the work we've got with docstrings, I would simply move forward, since there are so many other things that I think are more important and more worth our time. This looks fine to me, even if the repetition is not ideal.

Member:

Duplicate content can also add noise, as you might need to read both to ensure you don't miss something.

Anyway, not a discussion to continue on this PR


Examples
--------
>>> idx = pd.Index([1, 5, 7, 7])
>>> idx.has_duplicates
True

>>> idx = pd.Index([1, 5, 7])
>>> idx.has_duplicates
False

>>> idx = pd.Index(["Watermelon", "Orange", "Apple",
... "Watermelon"]).astype("category")
>>> idx.has_duplicates
True

>>> idx = pd.Index(["Orange", "Apple",
... "Watermelon"]).astype("category")
>>> idx.has_duplicates
False
"""
return not self.is_unique

def is_boolean(self) -> bool:
"""
Check if the Index only consists of booleans.

Returns
-------
bool
Whether or not the Index only consists of booleans.

See Also
--------
is_integer : Check if the Index only consists of integers.
is_floating : Check if the Index is a floating type.
is_numeric : Check if the Index only consists of numeric data.
is_object : Check if the Index is of the object dtype.
is_categorical : Check if the Index holds categorical data.
is_interval : Check if the Index holds Interval objects.
is_mixed : Check if the Index holds data with mixed data types.

Examples
--------
>>> idx = pd.Index([True, False, True])
>>> idx.is_boolean()
True

>>> idx = pd.Index(["True", "False", "True"])
>>> idx.is_boolean()
False

>>> idx = pd.Index([True, False, "True"])
>>> idx.is_boolean()
False
"""
return self.inferred_type in ["boolean"]

def is_integer(self) -> bool:
"""
Check if the Index only consists of integers.

Returns
-------
bool
Whether or not the Index only consists of integers.

See Also
--------
is_boolean : Check if the Index only consists of booleans.
is_floating : Check if the Index is a floating type.
is_numeric : Check if the Index only consists of numeric data.
is_object : Check if the Index is of the object dtype.
is_categorical : Check if the Index holds categorical data.
is_interval : Check if the Index holds Interval objects.
is_mixed : Check if the Index holds data with mixed data types.

Examples
--------
>>> idx = pd.Index([1, 2, 3, 4])
>>> idx.is_integer()
True

>>> idx = pd.Index([1.0, 2.0, 3.0, 4.0])
>>> idx.is_integer()
False

>>> idx = pd.Index(["Apple", "Mango", "Watermelon"])
>>> idx.is_integer()
False
"""
return self.inferred_type in ["integer"]

def is_floating(self) -> bool:
"""
Check if the Index is a floating type.

The Index may consist of only floats, NaNs, or a mix of floats,
integers, or NaNs.

Returns
-------
bool
Whether or not the Index only consists of floats, NaNs, or
a mix of floats, integers, or NaNs.

See Also
--------
is_boolean : Check if the Index only consists of booleans.
is_integer : Check if the Index only consists of integers.
is_numeric : Check if the Index only consists of numeric data.
is_object : Check if the Index is of the object dtype.
is_categorical : Check if the Index holds categorical data.
is_interval : Check if the Index holds Interval objects.
is_mixed : Check if the Index holds data with mixed data types.

Examples
--------
>>> idx = pd.Index([1.0, 2.0, 3.0, 4.0])
>>> idx.is_floating()
True

>>> idx = pd.Index([1.0, 2.0, np.nan, 4.0])
>>> idx.is_floating()
True

>>> idx = pd.Index([1, 2, 3, 4, np.nan])
>>> idx.is_floating()
True

>>> idx = pd.Index([1, 2, 3, 4])
>>> idx.is_floating()
False
"""
return self.inferred_type in ["floating", "mixed-integer-float", "integer-na"]

def is_numeric(self) -> bool:
"""
Check if the Index only consists of numeric data.

Returns
-------
bool
Whether or not the Index only consists of numeric data.

See Also
--------
is_boolean : Check if the Index only consists of booleans.
is_integer : Check if the Index only consists of integers.
is_floating : Check if the Index is a floating type.
is_object : Check if the Index is of the object dtype.
is_categorical : Check if the Index holds categorical data.
is_interval : Check if the Index holds Interval objects.
is_mixed : Check if the Index holds data with mixed data types.

Examples
--------
>>> idx = pd.Index([1.0, 2.0, 3.0, 4.0])
>>> idx.is_numeric()
True

>>> idx = pd.Index([1, 2, 3, 4.0])
>>> idx.is_numeric()
True

>>> idx = pd.Index([1, 2, 3, 4])
>>> idx.is_numeric()
True

>>> idx = pd.Index([1, 2, 3, 4.0, np.nan])
>>> idx.is_numeric()
True

>>> idx = pd.Index([1, 2, 3, 4.0, np.nan, "Apple"])
>>> idx.is_numeric()
False
"""
return self.inferred_type in ["integer", "floating"]

def is_object(self) -> bool:
"""
Check if the Index is of the object dtype.

Returns
-------
bool
Whether or not the Index is of the object dtype.

See Also
--------
is_boolean : Check if the Index only consists of booleans.
is_integer : Check if the Index only consists of integers.
is_floating : Check if the Index is a floating type.
is_numeric : Check if the Index only consists of numeric data.
is_categorical : Check if the Index holds categorical data.
is_interval : Check if the Index holds Interval objects.
is_mixed : Check if the Index holds data with mixed data types.

Examples
--------
>>> idx = pd.Index(["Apple", "Mango", "Watermelon"])
>>> idx.is_object()
True

>>> idx = pd.Index(["Apple", "Mango", 2.0])
>>> idx.is_object()
True

>>> idx = pd.Index(["Watermelon", "Orange", "Apple",
... "Watermelon"]).astype("category")
>>> idx.is_object()
False

>>> idx = pd.Index([1.0, 2.0, 3.0, 4.0])
>>> idx.is_object()
False
"""
return is_object_dtype(self.dtype)

def is_categorical(self) -> bool:
@@ -1667,12 +1879,19 @@ def is_categorical(self) -> bool:

Returns
-------
boolean
bool
True if the Index is categorical.

See Also
--------
CategoricalIndex : Index for categorical data.
is_boolean : Check if the Index only consists of booleans.
is_integer : Check if the Index only consists of integers.
is_floating : Check if the Index is a floating type.
is_numeric : Check if the Index only consists of numeric data.
is_object : Check if the Index is of the object dtype.
is_interval : Check if the Index holds Interval objects.
is_mixed : Check if the Index holds data with mixed data types.

Examples
--------
@@ -1698,9 +1917,67 @@ def is_categorical(self) -> bool:
return self.inferred_type in ["categorical"]

def is_interval(self) -> bool:
"""
Check if the Index holds Interval objects.

Returns
-------
bool
Whether or not the Index holds Interval objects.

See Also
--------
IntervalIndex : Index for Interval objects.
is_boolean : Check if the Index only consists of booleans.
is_integer : Check if the Index only consists of integers.
is_floating : Check if the Index is a floating type.
is_numeric : Check if the Index only consists of numeric data.
is_object : Check if the Index is of the object dtype.
is_categorical : Check if the Index holds categorical data.
is_mixed : Check if the Index holds data with mixed data types.

Examples
--------
>>> idx = pd.Index([pd.Interval(left=0, right=5),
... pd.Interval(left=5, right=10)])
>>> idx.is_interval()
True

>>> idx = pd.Index([1, 3, 5, 7])
>>> idx.is_interval()
False
"""
return self.inferred_type in ["interval"]

def is_mixed(self) -> bool:
"""
Check if the Index holds data with mixed data types.

Returns
-------
bool
Whether or not the Index holds data with mixed data types.

See Also
--------
is_boolean : Check if the Index only consists of booleans.
is_integer : Check if the Index only consists of integers.
is_floating : Check if the Index is a floating type.
is_numeric : Check if the Index only consists of numeric data.
is_object : Check if the Index is of the object dtype.
is_categorical : Check if the Index holds categorical data.
is_interval : Check if the Index holds Interval objects.

Examples
--------
>>> idx = pd.Index(['a', np.nan, 'b'])
>>> idx.is_mixed()
True

>>> idx = pd.Index([1.0, 2.0, 3.0, 5.0])
>>> idx.is_mixed()
False
"""
return self.inferred_type in ["mixed"]

def holds_integer(self):
@@ -1718,6 +1995,9 @@ def inferred_type(self):

@cache_readonly
def is_all_dates(self) -> bool:
"""
Whether or not the index values only consist of dates.
"""
return is_datetime_array(ensure_object(self.values))

# --------------------------------------------------------------------