Commit 143b011

CLN: Replace Appender and Substitution with simpler doc decorator (#31060)
1 parent 012a6a3 commit 143b011

9 files changed: +266 -145 lines changed

doc/source/development/contributing_docstring.rst

+11 -16
@@ -937,33 +937,31 @@ classes. This helps us keep docstrings consistent, while keeping things clear
 for the user reading. It comes at the cost of some complexity when writing.

 Each shared docstring will have a base template with variables, like
-``%(klass)s``. The variables filled in later on using the ``Substitution``
-decorator. Finally, docstrings can be appended to with the ``Appender``
-decorator.
+``{klass}``. The variables filled in later on using the ``doc`` decorator.
+Finally, docstrings can also be appended to with the ``doc`` decorator.

 In this example, we'll create a parent docstring normally (this is like
 ``pandas.core.generic.NDFrame``. Then we'll have two children (like
 ``pandas.core.series.Series`` and ``pandas.core.frame.DataFrame``). We'll
-substitute the children's class names in this docstring.
+substitute the class names in this docstring.

 .. code-block:: python

    class Parent:
+       @doc(klass="Parent")
        def my_function(self):
-           """Apply my function to %(klass)s."""
+           """Apply my function to {klass}."""
            ...


    class ChildA(Parent):
-       @Substitution(klass="ChildA")
-       @Appender(Parent.my_function.__doc__)
+       @doc(Parent.my_function, klass="ChildA")
        def my_function(self):
            ...


    class ChildB(Parent):
-       @Substitution(klass="ChildB")
-       @Appender(Parent.my_function.__doc__)
+       @doc(Parent.my_function, klass="ChildB")
        def my_function(self):
            ...

@@ -972,18 +970,16 @@ The resulting docstrings are
 .. code-block:: python

    >>> print(Parent.my_function.__doc__)
-   Apply my function to %(klass)s.
+   Apply my function to Parent.
    >>> print(ChildA.my_function.__doc__)
    Apply my function to ChildA.
    >>> print(ChildB.my_function.__doc__)
    Apply my function to ChildB.

-Notice two things:
+Notice:

 1. We "append" the parent docstring to the children docstrings, which are
    initially empty.
-2. Python decorators are applied inside out. So the order is Append then
-   Substitution, even though Substitution comes first in the file.

 Our files will often contain a module-level ``_shared_doc_kwargs`` with some
 common substitution values (things like ``klass``, ``axes``, etc).
@@ -992,14 +988,13 @@ You can substitute and append in one shot with something like

 .. code-block:: python

-   @Appender(template % _shared_doc_kwargs)
+   @doc(template, **_shared_doc_kwargs)
    def my_function(self):
        ...

 where ``template`` may come from a module-level ``_shared_docs`` dictionary
 mapping function names to docstrings. Wherever possible, we prefer using
-``Appender`` and ``Substitution``, since the docstring-writing processes is
-slightly closer to normal.
+``doc``, since the docstring-writing processes is slightly closer to normal.

 See ``pandas.core.generic.NDFrame.fillna`` for an example template, and
 ``pandas.core.series.Series.fillna`` and ``pandas.core.generic.frame.fillna``
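
To make the mechanics of the documentation change above more concrete, here is a minimal sketch of a ``doc``-style decorator. It is an illustration only, not the code in ``pandas.util._decorators``: the attribute name ``_doc_components`` is a hypothetical choice, and the sketch assumes every template uses ``str.format`` placeholders such as ``{klass}``.

```python
from textwrap import dedent


def doc(*templates, **params):
    """Simplified stand-in for a ``doc``-style decorator (illustrative only)."""

    def decorator(func):
        components = []
        # The function's own docstring, if any, comes first.
        if func.__doc__:
            components.append(dedent(func.__doc__))
        for template in templates:
            if hasattr(template, "_doc_components"):
                # A previously decorated function: reuse its unformatted templates.
                components.extend(template._doc_components)
            elif isinstance(template, str):
                components.append(template)
            elif template.__doc__:
                components.append(dedent(template.__doc__))
        # Keep the raw templates so other functions can build on them later
        # (hypothetical attribute name).
        func._doc_components = components
        func.__doc__ = "".join(c.format(**params) for c in components)
        return func

    return decorator


class Parent:
    @doc(klass="Parent")
    def my_function(self):
        """Apply my function to {klass}."""


class ChildA(Parent):
    @doc(Parent.my_function, klass="ChildA")
    def my_function(self):
        ...


# One-shot substitution from a plain string template and shared kwargs.
_shared_doc_kwargs = {"klass": "Standalone"}
template = "Apply my function to {klass}."


@doc(template, **_shared_doc_kwargs)
def my_function():
    ...


print(Parent.my_function.__doc__)
print(ChildA.my_function.__doc__)
print(my_function.__doc__)
```

Because the sketch stores the unformatted templates on the decorated function, a child can pass the parent function itself rather than ``Parent.my_function.__doc__``, which by that point has already been filled in with the parent's class name.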

pandas/core/accessor.py

+79 -82
@@ -7,7 +7,7 @@
 from typing import FrozenSet, Set
 import warnings

-from pandas.util._decorators import Appender
+from pandas.util._decorators import doc


 class DirNamesMixin:
@@ -193,122 +193,119 @@ def __get__(self, obj, cls):
         return accessor_obj


+@doc(klass="", others="")
 def _register_accessor(name, cls):
-    def decorator(accessor):
-        if hasattr(cls, name):
-            warnings.warn(
-                f"registration of accessor {repr(accessor)} under name "
-                f"{repr(name)} for type {repr(cls)} is overriding a preexisting"
-                f"attribute with the same name.",
-                UserWarning,
-                stacklevel=2,
-            )
-        setattr(cls, name, CachedAccessor(name, accessor))
-        cls._accessors.add(name)
-        return accessor
-
-    return decorator
+    """
+    Register a custom accessor on {klass} objects.

+    Parameters
+    ----------
+    name : str
+        Name under which the accessor should be registered. A warning is issued
+        if this name conflicts with a preexisting attribute.

-_doc = """
-Register a custom accessor on %(klass)s objects.
+    Returns
+    -------
+    callable
+        A class decorator.

-Parameters
-----------
-name : str
-    Name under which the accessor should be registered. A warning is issued
-    if this name conflicts with a preexisting attribute.
+    See Also
+    --------
+    {others}

-Returns
--------
-callable
-    A class decorator.
+    Notes
+    -----
+    When accessed, your accessor will be initialized with the pandas object
+    the user is interacting with. So the signature must be

-See Also
---------
-%(others)s
+    .. code-block:: python

-Notes
------
-When accessed, your accessor will be initialized with the pandas object
-the user is interacting with. So the signature must be
+        def __init__(self, pandas_object):  # noqa: E999
+            ...

-.. code-block:: python
+    For consistency with pandas methods, you should raise an ``AttributeError``
+    if the data passed to your accessor has an incorrect dtype.

-    def __init__(self, pandas_object):  # noqa: E999
-        ...
+    >>> pd.Series(['a', 'b']).dt
+    Traceback (most recent call last):
+    ...
+    AttributeError: Can only use .dt accessor with datetimelike values

-For consistency with pandas methods, you should raise an ``AttributeError``
-if the data passed to your accessor has an incorrect dtype.
+    Examples
+    --------

->>> pd.Series(['a', 'b']).dt
-Traceback (most recent call last):
-...
-AttributeError: Can only use .dt accessor with datetimelike values
+    In your library code::

-Examples
---------
+        import pandas as pd

-In your library code::
+        @pd.api.extensions.register_dataframe_accessor("geo")
+        class GeoAccessor:
+            def __init__(self, pandas_obj):
+                self._obj = pandas_obj

-    import pandas as pd
+            @property
+            def center(self):
+                # return the geographic center point of this DataFrame
+                lat = self._obj.latitude
+                lon = self._obj.longitude
+                return (float(lon.mean()), float(lat.mean()))

-    @pd.api.extensions.register_dataframe_accessor("geo")
-    class GeoAccessor:
-        def __init__(self, pandas_obj):
-            self._obj = pandas_obj
+            def plot(self):
+                # plot this array's data on a map, e.g., using Cartopy
+                pass

-        @property
-        def center(self):
-            # return the geographic center point of this DataFrame
-            lat = self._obj.latitude
-            lon = self._obj.longitude
-            return (float(lon.mean()), float(lat.mean()))
+    Back in an interactive IPython session:

-        def plot(self):
-            # plot this array's data on a map, e.g., using Cartopy
-            pass
+        >>> ds = pd.DataFrame({{'longitude': np.linspace(0, 10),
+        ...                    'latitude': np.linspace(0, 20)}})
+        >>> ds.geo.center
+        (5.0, 10.0)
+        >>> ds.geo.plot()
+        # plots data on a map
+    """

-Back in an interactive IPython session:
+    def decorator(accessor):
+        if hasattr(cls, name):
+            warnings.warn(
+                f"registration of accessor {repr(accessor)} under name "
+                f"{repr(name)} for type {repr(cls)} is overriding a preexisting"
+                f"attribute with the same name.",
+                UserWarning,
+                stacklevel=2,
+            )
+        setattr(cls, name, CachedAccessor(name, accessor))
+        cls._accessors.add(name)
+        return accessor

-    >>> ds = pd.DataFrame({'longitude': np.linspace(0, 10),
-    ...                    'latitude': np.linspace(0, 20)})
-    >>> ds.geo.center
-    (5.0, 10.0)
-    >>> ds.geo.plot()
-    # plots data on a map
-"""
+    return decorator


-@Appender(
-    _doc
-    % dict(
-        klass="DataFrame", others=("register_series_accessor, register_index_accessor")
-    )
+@doc(
+    _register_accessor,
+    klass="DataFrame",
+    others="register_series_accessor, register_index_accessor",
 )
 def register_dataframe_accessor(name):
     from pandas import DataFrame

     return _register_accessor(name, DataFrame)


-@Appender(
-    _doc
-    % dict(
-        klass="Series", others=("register_dataframe_accessor, register_index_accessor")
-    )
+@doc(
+    _register_accessor,
+    klass="Series",
+    others="register_dataframe_accessor, register_index_accessor",
 )
 def register_series_accessor(name):
     from pandas import Series

     return _register_accessor(name, Series)


-@Appender(
-    _doc
-    % dict(
-        klass="Index", others=("register_dataframe_accessor, register_series_accessor")
-    )
+@doc(
+    _register_accessor,
+    klass="Index",
+    others="register_dataframe_accessor, register_series_accessor",
 )
 def register_index_accessor(name):
     from pandas import Index
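
One detail in the hunk above is easy to miss: the ``DataFrame({...})`` example inside the docstring now uses doubled braces. With ``{klass}``-style placeholders the template is presumably rendered through ``str.format``, where single braces mark a replacement field and ``{{`` / ``}}`` produce literal braces. The sketch below shows the difference with a shortened, made-up template string; it does not use pandas.

```python
# Hypothetical template, shortened from the accessor docstring in the diff.
template = (
    "Register a custom accessor on {klass} objects.\n"
    ">>> ds = pd.DataFrame({{'longitude': [0, 10]}})\n"
)
# str.format fills {klass} and collapses the doubled braces to single ones.
rendered = template.format(klass="DataFrame")
print(rendered)

# Old-style %-formatting leaves braces untouched, so the pre-change
# docstring could keep single braces.
old_template = "on %(klass)s: pd.DataFrame({'longitude': [0, 10]})"
old_rendered = old_template % {"klass": "DataFrame"}
print(old_rendered)
```

Forgetting the doubling in a ``str.format`` template typically surfaces as a ``KeyError`` or ``ValueError`` at import time, since the decorator runs when the module is loaded.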

pandas/core/algorithms.py

+29 -34
@@ -11,7 +11,7 @@

 from pandas._libs import Timestamp, algos, hashtable as htable, lib
 from pandas._libs.tslib import iNaT
-from pandas.util._decorators import Appender, Substitution
+from pandas.util._decorators import doc

 from pandas.core.dtypes.cast import (
     construct_1d_object_array_from_listlike,
@@ -487,9 +487,32 @@ def _factorize_array(
     return codes, uniques


-_shared_docs[
-    "factorize"
-] = """
+@doc(
+    values=dedent(
+        """\
+    values : sequence
+        A 1-D sequence. Sequences that aren't pandas objects are
+        coerced to ndarrays before factorization.
+    """
+    ),
+    sort=dedent(
+        """\
+    sort : bool, default False
+        Sort `uniques` and shuffle `codes` to maintain the
+        relationship.
+    """
+    ),
+    size_hint=dedent(
+        """\
+    size_hint : int, optional
+        Hint to the hashtable sizer.
+    """
+    ),
+)
+def factorize(
+    values, sort: bool = False, na_sentinel: int = -1, size_hint: Optional[int] = None
+) -> Tuple[np.ndarray, Union[np.ndarray, ABCIndex]]:
+    """
     Encode the object as an enumerated type or categorical variable.

     This method is useful for obtaining a numeric representation of an
@@ -499,10 +522,10 @@ def _factorize_array(

     Parameters
     ----------
-    %(values)s%(sort)s
+    {values}{sort}
     na_sentinel : int, default -1
         Value to mark "not found".
-    %(size_hint)s\
+    {size_hint}\

     Returns
     -------
@@ -580,34 +603,6 @@ def _factorize_array(
     >>> uniques
     Index(['a', 'c'], dtype='object')
     """
-
-
-@Substitution(
-    values=dedent(
-        """\
-    values : sequence
-        A 1-D sequence. Sequences that aren't pandas objects are
-        coerced to ndarrays before factorization.
-    """
-    ),
-    sort=dedent(
-        """\
-    sort : bool, default False
-        Sort `uniques` and shuffle `codes` to maintain the
-        relationship.
-    """
-    ),
-    size_hint=dedent(
-        """\
-    size_hint : int, optional
-        Hint to the hashtable sizer.
-    """
-    ),
-)
-@Appender(_shared_docs["factorize"])
-def factorize(
-    values, sort: bool = False, na_sentinel: int = -1, size_hint: Optional[int] = None
-) -> Tuple[np.ndarray, Union[np.ndarray, ABCIndex]]:
     # Implementation notes: This method is responsible for 3 things
     # 1.) coercing data to array-like (ndarray, Index, extension array)
     # 2.) factorizing codes and uniques
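
The ``factorize`` hunks pass whole parameter descriptions as substitution values, each wrapped in ``dedent`` and ending with a newline so that ``{values}{sort}`` can sit on a single template line. The sketch below reproduces that pattern with plain ``str.format``; the template text is abbreviated and the variable names are chosen for illustration, so treat it as an approximation of the idea rather than the pandas docstring itself.

```python
from textwrap import dedent

# Each fragment is dedented and keeps its trailing newline, so fragments can
# be placed back to back in the template.
values = dedent(
    """\
    values : sequence
        A 1-D sequence.
    """
)
sort = dedent(
    """\
    sort : bool, default False
        Sort `uniques`.
    """
)

# Abbreviated, hypothetical template in the style of the diff.
template = (
    "Parameters\n"
    "----------\n"
    "{values}{sort}"
    "na_sentinel : int, default -1\n"
)
rendered = template.format(values=values, sort=sort)
print(rendered)
```

Passing an empty string for one of the fragments would drop that parameter from the rendered docstring without leaving a blank line behind, which is one plausible reason for keeping the newline inside the fragment rather than in the template.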
