CLN: @doc - base.py & indexing.py #31970

Merged
merged 21 commits into from
Mar 17, 2020
4 changes: 2 additions & 2 deletions pandas/core/arrays/categorical.py
@@ -15,6 +15,7 @@
Substitution,
cache_readonly,
deprecate_kwarg,
+ doc,
)
from pandas.util._validators import validate_bool_kwarg, validate_fillna_kwargs

@@ -1352,8 +1353,7 @@ def memory_usage(self, deep=False):
"""
return self._codes.nbytes + self.dtype.categories.memory_usage(deep=deep)

- @Substitution(klass="Categorical")
- @Appender(_shared_docs["searchsorted"])
+ @doc(_shared_docs["searchsorted"], klass="Categorical")
def searchsorted(self, value, side="left", sorter=None):
# searchsorted is very performance sensitive. By converting codes
# to same dtype as self.codes, we get much faster performance.
11 changes: 5 additions & 6 deletions pandas/core/base.py
@@ -13,7 +13,7 @@
from pandas.compat import PYPY
from pandas.compat.numpy import function as nv
from pandas.errors import AbstractMethodError
- from pandas.util._decorators import Appender, Substitution, cache_readonly, doc
+ from pandas.util._decorators import cache_readonly, doc
from pandas.util._validators import validate_bool_kwarg

from pandas.core.dtypes.cast import is_nested_object
@@ -1429,21 +1429,21 @@ def factorize(self, sort=False, na_sentinel=-1):
] = """
Find indices where elements should be inserted to maintain order.

- Find the indices into a sorted %(klass)s `self` such that, if the
+ Find the indices into a sorted {klass} `self` such that, if the
corresponding elements in `value` were inserted before the indices,
the order of `self` would be preserved.

.. note::

- The %(klass)s *must* be monotonically sorted, otherwise
+ The {klass} *must* be monotonically sorted, otherwise
wrong locations will likely be returned. Pandas does *not*
check this for you.

Parameters
----------
value : array_like
Values to insert into `self`.
- side : {'left', 'right'}, optional
+ side : {{'left', 'right'}}, optional
If 'left', the index of the first suitable location found is given.
If 'right', return the last such index. If there is no suitable
index, return either 0 or N (where N is the length of `self`).
@@ -1519,8 +1519,7 @@ def factorize(self, sort=False, na_sentinel=-1):
0 # wrong result, correct would be 1
"""

- @Substitution(klass="Index")
- @Appender(_shared_docs["searchsorted"])
+ @doc(_shared_docs["searchsorted"], klass="Index")
def searchsorted(self, value, side="left", sorter=None) -> np.ndarray:
return algorithms.searchsorted(self._values, value, side=side, sorter=sorter)
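
The move from `%(klass)s` to `{klass}` in this file is also why the literal braces on the `side` line are doubled. A minimal sketch of the two substitution styles follows; the templates are shortened stand-ins written for illustration, not the real shared docstring:

```python
# Shortened stand-in templates (assumptions for illustration only).
percent_template = (
    "Find the indices into a sorted %(klass)s.\n"
    "side : {'left', 'right'}, optional"
)
brace_template = (
    "Find the indices into a sorted {klass}.\n"
    "side : {{'left', 'right'}}, optional"
)

# Old style: %-formatting only looks at %(...)s fields, so braces are literal.
old_doc = percent_template % {"klass": "Index"}

# New style: str.format treats single braces as fields, so literal braces
# must be written as {{ and }}.
new_doc = brace_template.format(klass="Index")

print(new_doc)
```

Both styles render the same text here; only the way the template has to be written differs.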

6 changes: 3 additions & 3 deletions pandas/core/indexes/datetimelike.py
@@ -11,7 +11,7 @@
from pandas._typing import Label
from pandas.compat.numpy import function as nv
from pandas.errors import AbstractMethodError
- from pandas.util._decorators import Appender, cache_readonly
+ from pandas.util._decorators import Appender, cache_readonly, doc

from pandas.core.dtypes.common import (
ensure_int64,
@@ -31,7 +31,7 @@
from pandas.core import algorithms
from pandas.core.arrays import DatetimeArray, PeriodArray, TimedeltaArray
from pandas.core.arrays.datetimelike import DatetimeLikeArrayMixin
- from pandas.core.base import _shared_docs
+ from pandas.core.base import IndexOpsMixin
import pandas.core.indexes.base as ibase
from pandas.core.indexes.base import Index, _index_shared_docs
from pandas.core.indexes.extension import (
@@ -206,7 +206,7 @@ def take(self, indices, axis=0, allow_fill=True, fill_value=None, **kwargs):
self, indices, axis, allow_fill, fill_value, **kwargs
)

- @Appender(_shared_docs["searchsorted"])
+ @doc(IndexOpsMixin.searchsorted, klass="Datetime-like Index")
Member

Same comment - this is very confusing

Contributor Author

Hi @WillAyd, I agree with you and @simonjayhawkins here: it might be confusing. However, the original docstring is not inherited from the base class. The original code seems to obscure the problem because it does not explicitly indicate the source of the docstring. I tried to keep the behavior as it is while switching to @doc.

It looks like we all agree that this docstring template will confuse other developers, but do you feel it needs to be fixed in this PR? If so, what would you suggest? One option that comes to mind is to use the docstring from the base class and modify it to fit this case. I was trying to avoid that because it would change the original docstring relations, and I am not sure whether they were set up this way on purpose.

To be clear, I am very willing to make the additional change to resolve this confusion; I just don't know the best way to do it. One more thing: I have a comment related to this that you might also be interested in.

Member

Is there a more suitable location for the docstring then? Importing the IndexOpsMixin here is strange

Member

Personally I think this should be addressed in a follow up. It's here in IndexOpsMixin because that's where the _shared_docs was. I think this PR is already too complex to make the change here. And if we change _shared_docs before merging this, the conflict here will be quite annoying to fix. Does it make sense?

Contributor Author

HH-MWB, Mar 4, 2020

> Is there a more suitable location for the docstring then? Importing the IndexOpsMixin here is strange

I think this situation arises because these classes share the same interface but don't have a common ancestor.

If we had an interface declaration, I would say that putting the docstring there would be a good choice. However, Python, as a dynamic programming language, does not require interfaces to be declared.

I don't have a good solution in mind right now, but I would like to look for one and see if we can find a location that makes more sense.

def searchsorted(self, value, side="left", sorter=None):
if isinstance(value, str):
raise TypeError(
12 changes: 6 additions & 6 deletions pandas/core/indexing.py
@@ -5,7 +5,7 @@
from pandas._libs.indexing import _NDFrameIndexerBase
from pandas._libs.lib import item_from_zerodim
from pandas.errors import AbstractMethodError
- from pandas.util._decorators import Appender
+ from pandas.util._decorators import doc

from pandas.core.dtypes.common import (
is_integer,
@@ -847,7 +847,7 @@ def _getbool_axis(self, key, axis: int):
return self.obj._take_with_is_copy(inds, axis=axis)


- @Appender(IndexingMixin.loc.__doc__)
+ @doc(IndexingMixin.loc)
class _LocIndexer(_LocationIndexer):
_takeable: bool = False
_valid_types = (
@@ -859,7 +859,7 @@ class _LocIndexer(_LocationIndexer):
# -------------------------------------------------------------------
# Key Checks

- @Appender(_LocationIndexer._validate_key.__doc__)
+ @doc(_LocationIndexer._validate_key)
def _validate_key(self, key, axis: int):

# valid for a collection of labels (we check their presence later)
@@ -1289,7 +1289,7 @@ def _validate_read_indexer(
)


- @Appender(IndexingMixin.iloc.__doc__)
+ @doc(IndexingMixin.iloc)
class _iLocIndexer(_LocationIndexer):
_valid_types = (
"integer, integer slice (START point is INCLUDED, END "
@@ -1998,7 +1998,7 @@ def __setitem__(self, key, value):
self.obj._set_value(*key, value=value, takeable=self._takeable)


- @Appender(IndexingMixin.at.__doc__)
+ @doc(IndexingMixin.at)
class _AtIndexer(_ScalarAccessIndexer):
_takeable = False

@@ -2024,7 +2024,7 @@ def __getitem__(self, key):
return obj.index._get_values_for_loc(obj, loc, key)


- @Appender(IndexingMixin.iat.__doc__)
+ @doc(IndexingMixin.iat)
class _iAtIndexer(_ScalarAccessIndexer):
_takeable = True

3 changes: 1 addition & 2 deletions pandas/core/series.py
@@ -2443,8 +2443,7 @@ def __rmatmul__(self, other):
"""
return self.dot(np.transpose(other))

- @Substitution(klass="Series")
- @Appender(base._shared_docs["searchsorted"])
+ @doc(base.IndexOpsMixin.searchsorted, klass="Series")
def searchsorted(self, value, side="left", sorter=None):
return algorithms.searchsorted(self._values, value, side=side, sorter=sorter)

14 changes: 8 additions & 6 deletions pandas/tests/util/test_doc.py
@@ -14,13 +14,15 @@ def cumsum(whatever):

@doc(
cumsum,
- """
- Examples
- --------
+ dedent(
+     """
+     Examples
+     --------

- >>> cumavg([1, 2, 3])
- 2
- """,
+     >>> cumavg([1, 2, 3])
+     2
+     """
+ ),
method="cumavg",
operation="average",
)
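
The test now wraps the string template in `dedent(...)`, apparently because the reworked decorator no longer dedents string arguments itself. A small sketch of what `textwrap.dedent` does to an indented template like this one (the four-space indentation below is an assumption for illustration):

```python
from textwrap import dedent

# An indented template, as it would appear inside an indented call site.
indented = """
    Examples
    --------

    >>> cumavg([1, 2, 3])
    2
    """

# dedent strips the common leading whitespace and normalizes
# whitespace-only lines, leaving the text flush left.
flat = dedent(indented)
print(flat)
```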
39 changes: 26 additions & 13 deletions pandas/util/_decorators.py
@@ -250,9 +250,11 @@ def doc(*args: Union[str, Callable], **kwargs: str) -> Callable[[F], F]:
A decorator take docstring templates, concatenate them and perform string
substitution on it.

- This decorator is robust even if func.__doc__ is None. This decorator will
- add a variable "_docstr_template" to the wrapped function to save original
- docstring template for potential usage.
+ This decorator will add a variable "_docstring_components" to the wrapped
+ function to keep track the original docstring template for potential usage.
+ If it should be consider as a template, it will be saved as a string.
+ Otherwise, it will be saved as callable, and later user __doc__ and dedent
+ to get docstring.

Parameters
----------
@@ -268,17 +270,28 @@ def decorator(func: F) -> F:
def wrapper(*args, **kwargs) -> Callable:
return func(*args, **kwargs)

- templates = [func.__doc__ if func.__doc__ else ""]
+ # collecting docstring and docstring templates
+ docstring_components: List[Union[str, Callable]] = []
+ if func.__doc__:
+     docstring_components.append(dedent(func.__doc__))
+
  for arg in args:
-     if isinstance(arg, str):
-         templates.append(arg)
-     elif hasattr(arg, "_docstr_template"):
-         templates.append(arg._docstr_template)  # type: ignore
-     elif arg.__doc__:
-         templates.append(arg.__doc__)
-
- wrapper._docstr_template = "".join(dedent(t) for t in templates)  # type: ignore
- wrapper.__doc__ = wrapper._docstr_template.format(**kwargs)  # type: ignore
+     if hasattr(arg, "_docstring_components"):
+         docstring_components.extend(arg._docstring_components)  # type: ignore
Comment on lines +279 to +280
Member

Not sure if this could be a better fix for this one:

Suggested change
- if hasattr(arg, "_docstring_components"):
-     docstring_components.extend(arg._docstring_components)  # type: ignore
+ if hasattr(arg, "_docstring_components") and isinstance(arg._docstring_components, list):
+     docstring_components.extend(arg._docstring_components)

Contributor Author

HH-MWB, Mar 11, 2020

Hi @datapythonista. I would treat this as two parts: one is adding a type check for _docstring_components, and the other is removing # type: ignore.

Removing # type: ignore

After I made the change, I still get the errors below, so I am not sure we can remove it.

pandas/util/_decorators.py:280: error: Item "str" of "Union[str, Callable[..., Any]]" has no attribute "_docstring_components"
pandas/util/_decorators.py:280: error: Item "function" of "Union[str, Callable[..., Any]]" has no attribute "_docstring_components"

I guess my settings are not the same as the checks here. Even on the last passing build, I still get mypy errors locally. So the change might fix it, and I just can't tell.

Checking whether _docstring_components is a list

I think this is a very good point as a matter of defensive programming. But I am a little curious about how far we should go to protect it. Also, since this function is for internal use within pandas only, would it be better to let it fail? Then the developer would be notified immediately that there is a conflict.

To be clear, I would be happy to make this change if you (or anyone) feel it is needed. Because I see both pros and cons, I am not sure which option is better.

Member

I'm happy to merge this as is, was wondering if the suggested change would fix the problem.

We can surely go back to it in the future if needed, but let's merge this first, since this already adds a lot of value, and we can start replacing Appender by doc at a larger scale after this.
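
The trade-off discussed in this thread can be looked at in isolation. The sketch below is not pandas code: `ClashInt` and `ClashStr` are hypothetical objects that happen to define `_docstring_components` with a non-list value, and the two helper functions mirror only the branch under discussion.

```python
class ClashInt:
    _docstring_components = 42  # hypothetical name clash, not iterable


class ClashStr:
    _docstring_components = "oops"  # hypothetical name clash, iterable


def collect_unguarded(arg):
    # Mirrors the merged branch: trust the attribute if it exists.
    out = []
    if hasattr(arg, "_docstring_components"):
        out.extend(arg._docstring_components)
    return out


def collect_guarded(arg):
    # Mirrors the suggested change: also require the attribute to be a list.
    out = []
    if hasattr(arg, "_docstring_components") and isinstance(
        arg._docstring_components, list
    ):
        out.extend(arg._docstring_components)
    return out


try:
    collect_unguarded(ClashInt)
    failed_loudly = False
except TypeError:
    # list.extend rejects a non-iterable, so the conflict surfaces at once.
    failed_loudly = True

# A string is iterable, so the unguarded version consumes it character
# by character instead of raising.
chars = collect_unguarded(ClashStr)
```

Under these assumptions, the unguarded version fails immediately only when the clashing value is not iterable, while the guarded version skips the value in this branch in both cases.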

+     elif isinstance(arg, str) or arg.__doc__:
+         docstring_components.append(arg)
+
+ # formatting templates and concatenating docstring
+ wrapper.__doc__ = "".join(
+     [
+         arg.format(**kwargs)
+         if isinstance(arg, str)
+         else dedent(arg.__doc__ or "")
+         for arg in docstring_components
+     ]
+ )
+
+ wrapper._docstring_components = docstring_components  # type: ignore

return cast(F, wrapper)
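
To see how the pieces of the diff above fit together, here is a simplified, self-contained sketch of the reworked decorator with a small usage example. It follows the added lines shown in the diff, but it is a stand-in rather than the pandas implementation: the `functools.wraps` call, the loosened type hints, and the `cumsum`/`cumavg` docstrings are assumptions made for illustration.

```python
from functools import wraps
from textwrap import dedent
from typing import Callable, List, Union


def doc(*args: Union[str, Callable], **kwargs: str) -> Callable:
    """Simplified stand-in for the decorator shown in the diff."""

    def decorator(func: Callable) -> Callable:
        @wraps(func)  # assumption: the real wrapper is set up similarly
        def wrapper(*a, **kw):
            return func(*a, **kw)

        # The function's own docstring comes first, dedented and kept as a
        # string so that it is treated as a template.
        components: List[Union[str, Callable]] = []
        if func.__doc__:
            components.append(dedent(func.__doc__))

        for arg in args:
            if hasattr(arg, "_docstring_components"):
                # Reuse the unformatted pieces of an already decorated function.
                components.extend(arg._docstring_components)
            elif isinstance(arg, str) or arg.__doc__:
                components.append(arg)

        # Strings are templates and get formatted; callables contribute
        # their dedented __doc__ unchanged.
        wrapper.__doc__ = "".join(
            c.format(**kwargs) if isinstance(c, str) else dedent(c.__doc__ or "")
            for c in components
        )
        wrapper._docstring_components = components
        return wrapper

    return decorator


@doc(method="cumsum", operation="sum")
def cumsum(values):
    """
    Return the cumulative {operation}; this is the {method} method.
    """


@doc(cumsum, method="cumavg", operation="average")
def cumavg(values):
    pass


print(cumavg.__doc__)
```

Because `cumsum` keeps its unformatted template in `_docstring_components`, `cumavg` can reuse it with different substitution values.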
