Commit b7f5775

ENH/DOC: reimplement Series delegates/accessors using descriptors
This PR implements `Series.str`, `Series.dt` and `Series.cat` as descriptors instead of properties. This means that the API docs can refer to methods like `Series.str.lower` instead of `StringMethods.lower` and tab-completion like `Series.str.<tab>` also works, even on the base class. CC jorisvandenbossche jreback
1 parent 5fd1fbd commit b7f5775
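
The descriptor approach described above can be sketched without pandas. In this minimal sketch the descriptor mirrors the `AccessorProperty` class added in `pandas/core/base.py`, while `ToyStringMethods` and `ToySeries` are hypothetical stand-ins for `StringMethods` and `Series`, invented only for illustration.

```python
class AccessorProperty(object):
    """Descriptor exposing an accessor on the owner class and on its instances."""

    def __init__(self, accessor_cls, construct_accessor):
        self.accessor_cls = accessor_cls
        self.construct_accessor = construct_accessor
        self.__doc__ = accessor_cls.__doc__

    def __get__(self, instance, owner=None):
        if instance is None:
            # Class-level access (ToySeries.str) returns the accessor class itself,
            # so ToySeries.str.lower resolves without needing an instance.
            return self.accessor_cls
        return self.construct_accessor(instance)

    def __set__(self, instance, value):
        raise AttributeError("can't set attribute")

    def __delete__(self, instance):
        raise AttributeError("can't delete attribute")


class ToyStringMethods(object):
    """Hypothetical stand-in for StringMethods."""

    def __init__(self, values):
        self._values = values

    def lower(self):
        return [v.lower() for v in self._values]


class ToySeries(object):
    """Hypothetical stand-in for Series."""

    def __init__(self, values):
        self.values = list(values)

    def _make_str_accessor(self):
        return ToyStringMethods(self.values)

    # At class-body time _make_str_accessor is a plain function; the
    # descriptor calls it with the instance on each attribute access.
    str = AccessorProperty(ToyStringMethods, _make_str_accessor)


s = ToySeries(["A", "Bc"])
print(ToySeries.str is ToyStringMethods)  # True: class access yields the accessor class
print(s.str.lower())                      # ['a', 'bc']: instance access yields an accessor
print("lower" in dir(ToySeries.str))      # True: what tab-completion relies on
```

Because class-level access returns the accessor class, documentation tools and tab-completion can presumably resolve `ToySeries.str.lower` the same way they would resolve any attribute of a class.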

12 files changed: +189 -120 lines changed
doc/source/api.rst

+68 -74

@@ -449,114 +449,106 @@ Datetimelike Properties
 
 ``Series.dt`` can be used to access the values of the series as
 datetimelike and return several properties.
-Due to implementation details the methods show up here as methods of the
-``DatetimeProperties/PeriodProperties/TimedeltaProperties`` classes. These can be accessed like ``Series.dt.<property>``.
-
-.. currentmodule:: pandas.tseries.common
+These can be accessed like ``Series.dt.<property>``.
 
 **Datetime Properties**
 
 .. autosummary::
    :toctree: generated/
 
-   DatetimeProperties.date
-   DatetimeProperties.time
-   DatetimeProperties.year
-   DatetimeProperties.month
-   DatetimeProperties.day
-   DatetimeProperties.hour
-   DatetimeProperties.minute
-   DatetimeProperties.second
-   DatetimeProperties.microsecond
-   DatetimeProperties.nanosecond
-   DatetimeProperties.second
-   DatetimeProperties.weekofyear
-   DatetimeProperties.dayofweek
-   DatetimeProperties.weekday
-   DatetimeProperties.dayofyear
-   DatetimeProperties.quarter
-   DatetimeProperties.is_month_start
-   DatetimeProperties.is_month_end
-   DatetimeProperties.is_quarter_start
-   DatetimeProperties.is_quarter_end
-   DatetimeProperties.is_year_start
-   DatetimeProperties.is_year_end
+   Series.dt.date
+   Series.dt.time
+   Series.dt.year
+   Series.dt.month
+   Series.dt.day
+   Series.dt.hour
+   Series.dt.minute
+   Series.dt.second
+   Series.dt.microsecond
+   Series.dt.nanosecond
+   Series.dt.second
+   Series.dt.weekofyear
+   Series.dt.dayofweek
+   Series.dt.weekday
+   Series.dt.dayofyear
+   Series.dt.quarter
+   Series.dt.is_month_start
+   Series.dt.is_month_end
+   Series.dt.is_quarter_start
+   Series.dt.is_quarter_end
+   Series.dt.is_year_start
+   Series.dt.is_year_end
 
 **Datetime Methods**
 
 .. autosummary::
    :toctree: generated/
 
-   DatetimeProperties.to_period
-   DatetimeProperties.to_pydatetime
-   DatetimeProperties.tz_localize
-   DatetimeProperties.tz_convert
+   Series.dt.to_period
+   Series.dt.to_pydatetime
+   Series.dt.tz_localize
+   Series.dt.tz_convert
 
 **Timedelta Properties**
 
 .. autosummary::
    :toctree: generated/
 
-   TimedeltaProperties.days
-   TimedeltaProperties.seconds
-   TimedeltaProperties.microseconds
-   TimedeltaProperties.nanoseconds
-   TimedeltaProperties.components
+   Series.dt.days
+   Series.dt.seconds
+   Series.dt.microseconds
+   Series.dt.nanoseconds
+   Series.dt.components
 
 **Timedelta Methods**
 
 .. autosummary::
    :toctree: generated/
 
-   TimedeltaProperties.to_pytimedelta
+   Series.dt.to_pytimedelta
 
 String handling
 ~~~~~~~~~~~~~~~
 ``Series.str`` can be used to access the values of the series as
-strings and apply several methods to it. Due to implementation
-details the methods show up here as methods of the
-``StringMethods`` class. These can be acccessed like ``Series.str.<function/property>``.
-
-.. currentmodule:: pandas.core.strings
-
-.. autosummary::
-   :toctree: generated/
-
-   StringMethods.cat
-   StringMethods.center
-   StringMethods.contains
-   StringMethods.count
-   StringMethods.decode
-   StringMethods.encode
-   StringMethods.endswith
-   StringMethods.extract
-   StringMethods.findall
-   StringMethods.get
-   StringMethods.join
-   StringMethods.len
-   StringMethods.lower
-   StringMethods.lstrip
-   StringMethods.match
-   StringMethods.pad
-   StringMethods.repeat
-   StringMethods.replace
-   StringMethods.rstrip
-   StringMethods.slice
-   StringMethods.slice_replace
-   StringMethods.split
-   StringMethods.startswith
-   StringMethods.strip
-   StringMethods.title
-   StringMethods.upper
-   StringMethods.get_dummies
+strings and apply several methods to it. These can be acccessed like
+``Series.str.<function/property>``.
+
+.. autosummary::
+   :toctree: generated/
+
+   Series.str.cat
+   Series.str.center
+   Series.str.contains
+   Series.str.count
+   Series.str.decode
+   Series.str.encode
+   Series.str.endswith
+   Series.str.extract
+   Series.str.findall
+   Series.str.get
+   Series.str.join
+   Series.str.len
+   Series.str.lower
+   Series.str.lstrip
+   Series.str.match
+   Series.str.pad
+   Series.str.repeat
+   Series.str.replace
+   Series.str.rstrip
+   Series.str.slice
+   Series.str.slice_replace
+   Series.str.split
+   Series.str.startswith
+   Series.str.strip
+   Series.str.title
+   Series.str.upper
+   Series.str.get_dummies
 
 .. _api.categorical:
 
 Categorical
 ~~~~~~~~~~~
 
-.. currentmodule:: pandas.core.categorical
-
 If the Series is of dtype ``category``, ``Series.cat`` can be used to change the the categorical
 data. This accessor is similar to the ``Series.dt`` or ``Series.str`` and has the
 following usable methods and properties (all available as ``Series.cat.<method_or_property>``).
@@ -579,6 +571,8 @@ To create a Series of dtype ``category``, use ``cat = s.astype("category")``.
 The following two ``Categorical`` constructors are considered API but should only be used when
 adding ordering information or special categories is need at creation time of the categorical data:
 
+.. currentmodule:: pandas.core.categorical
+
 .. autosummary::
    :toctree: generated/
 

doc/source/reshaping.rst

+1 -1

@@ -478,7 +478,7 @@ This function is often used along with discretization functions like ``cut``:
 
    get_dummies(cut(values, bins))
 
-See also :func:`Series.str.get_dummies <pandas.core.strings.StringMethods.get_dummies>`.
+See also :func:`Series.str.get_dummies <pandas.Series.str.get_dummies>`.
 
 .. versionadded:: 0.15.0
 

doc/source/text.rst

+24 -24

@@ -204,27 +204,27 @@ Method Summary
     :header: "Method", "Description"
     :widths: 20, 80
 
-    :meth:`~core.strings.StringMethods.cat`,Concatenate strings
-    :meth:`~core.strings.StringMethods.split`,Split strings on delimiter
-    :meth:`~core.strings.StringMethods.get`,Index into each element (retrieve i-th element)
-    :meth:`~core.strings.StringMethods.join`,Join strings in each element of the Series with passed separator
-    :meth:`~core.strings.StringMethods.contains`,Return boolean array if each string contains pattern/regex
-    :meth:`~core.strings.StringMethods.replace`,Replace occurrences of pattern/regex with some other string
-    :meth:`~core.strings.StringMethods.repeat`,Duplicate values (``s.str.repeat(3)`` equivalent to ``x * 3``)
-    :meth:`~core.strings.StringMethods.pad`,"Add whitespace to left, right, or both sides of strings"
-    :meth:`~core.strings.StringMethods.center`,Equivalent to ``pad(side='both')``
-    :meth:`~core.strings.StringMethods.wrap`,Split long strings into lines with length less than a given width
-    :meth:`~core.strings.StringMethods.slice`,Slice each string in the Series
-    :meth:`~core.strings.StringMethods.slice_replace`,Replace slice in each string with passed value
-    :meth:`~core.strings.StringMethods.count`,Count occurrences of pattern
-    :meth:`~core.strings.StringMethods.startswith`,Equivalent to ``str.startswith(pat)`` for each element
-    :meth:`~core.strings.StringMethods.endswith`,Equivalent to ``str.endswith(pat)`` for each element
-    :meth:`~core.strings.StringMethods.findall`,Compute list of all occurrences of pattern/regex for each string
-    :meth:`~core.strings.StringMethods.match`,"Call ``re.match`` on each element, returning matched groups as list"
-    :meth:`~core.strings.StringMethods.extract`,"Call ``re.match`` on each element, as ``match`` does, but return matched groups as strings for convenience."
-    :meth:`~core.strings.StringMethods.len`,Compute string lengths
-    :meth:`~core.strings.StringMethods.strip`,Equivalent to ``str.strip``
-    :meth:`~core.strings.StringMethods.rstrip`,Equivalent to ``str.rstrip``
-    :meth:`~core.strings.StringMethods.lstrip`,Equivalent to ``str.lstrip``
-    :meth:`~core.strings.StringMethods.lower`,Equivalent to ``str.lower``
-    :meth:`~core.strings.StringMethods.upper`,Equivalent to ``str.upper``
+    :meth:`~Series.str.cat`,Concatenate strings
+    :meth:`~Series.str.split`,Split strings on delimiter
+    :meth:`~Series.str.get`,Index into each element (retrieve i-th element)
+    :meth:`~Series.str.join`,Join strings in each element of the Series with passed separator
+    :meth:`~Series.str.contains`,Return boolean array if each string contains pattern/regex
+    :meth:`~Series.str.replace`,Replace occurrences of pattern/regex with some other string
+    :meth:`~Series.str.repeat`,Duplicate values (``s.str.repeat(3)`` equivalent to ``x * 3``)
+    :meth:`~Series.str.pad`,"Add whitespace to left, right, or both sides of strings"
+    :meth:`~Series.str.center`,Equivalent to ``pad(side='both')``
+    :meth:`~Series.str.wrap`,Split long strings into lines with length less than a given width
+    :meth:`~Series.str.slice`,Slice each string in the Series
+    :meth:`~Series.str.slice_replace`,Replace slice in each string with passed value
+    :meth:`~Series.str.count`,Count occurrences of pattern
+    :meth:`~Series.str.startswith`,Equivalent to ``str.startswith(pat)`` for each element
+    :meth:`~Series.str.endswith`,Equivalent to ``str.endswith(pat)`` for each element
+    :meth:`~Series.str.findall`,Compute list of all occurrences of pattern/regex for each string
+    :meth:`~Series.str.match`,"Call ``re.match`` on each element, returning matched groups as list"
+    :meth:`~Series.str.extract`,"Call ``re.match`` on each element, as ``match`` does, but return matched groups as strings for convenience."
+    :meth:`~Series.str.len`,Compute string lengths
+    :meth:`~Series.str.strip`,Equivalent to ``str.strip``
+    :meth:`~Series.str.rstrip`,Equivalent to ``str.rstrip``
+    :meth:`~Series.str.lstrip`,Equivalent to ``str.lstrip``
+    :meth:`~Series.str.lower`,Equivalent to ``str.lower``
+    :meth:`~Series.str.upper`,Equivalent to ``str.upper``

doc/source/whatsnew/v0.16.0.txt

+2
@@ -107,6 +107,8 @@ Enhancements
 
 - ``Timedelta`` will now accept nanoseconds keyword in constructor (:issue:`9273`)
 
+- Added auto-complete for ``Series.str.<tab>``, ``Series.dt.<tab>`` and ``Series.cat.<tab>`` (:issue:`9322`)
+
 Performance
 ~~~~~~~~~~~
 

pandas/core/base.py

+22
@@ -166,6 +166,28 @@ def f(self, *args, **kwargs):
             if not hasattr(cls, name):
                 setattr(cls,name,f)
 
+
+class AccessorProperty(object):
+    """Descriptor for implementing accessor properties like Series.str
+    """
+    def __init__(self, accessor_cls, construct_accessor):
+        self.accessor_cls = accessor_cls
+        self.construct_accessor = construct_accessor
+        self.__doc__ = accessor_cls.__doc__
+
+    def __get__(self, instance, owner=None):
+        if instance is None:
+            # this ensures that Series.str.<method> is well defined
+            return self.accessor_cls
+        return self.construct_accessor(instance)
+
+    def __set__(self, instance, value):
+        raise AttributeError("can't set attribute")
+
+    def __delete__(self, instance):
+        raise AttributeError("can't delete attribute")
+
+
 class FrozenList(PandasObject, list):
 
     """

pandas/core/categorical.py

+1 -1

@@ -829,7 +829,7 @@ def searchsorted(self, v, side='left', sorter=None):
         array([3, 4]) # eggs before milk
         >>> x = pd.Categorical(['apple', 'bread', 'bread', 'cheese', 'milk', 'donuts' ])
         >>> x.searchsorted(['bread', 'eggs'], side='right', sorter=[0, 1, 2, 3, 5, 4])
-        array([3, 5]) # eggs after donuts, after switching milk and donuts
+        array([3, 5]) # eggs after donuts, after switching milk and donuts
         """
         if not self.ordered:
             raise ValueError("searchsorted requires an ordered Categorical.")

pandas/core/series.py

+18 -12

@@ -27,7 +27,10 @@
 from pandas.core.indexing import _check_bool_indexer, _maybe_convert_indices
 from pandas.core import generic, base
 from pandas.core.internals import SingleBlockManager
-from pandas.core.categorical import Categorical
+from pandas.core.categorical import Categorical, CategoricalAccessor
+from pandas.core.strings import StringMethods
+from pandas.tseries.common import (maybe_to_datetimelike,
+                                   CombinedDatetimelikeProperties)
 from pandas.tseries.index import DatetimeIndex
 from pandas.tseries.tdi import TimedeltaIndex
 from pandas.tseries.period import PeriodIndex, Period
@@ -2452,11 +2455,6 @@ def asof(self, where):
         new_values = com.take_1d(values, locs)
         return self._constructor(new_values, index=where).__finalize__(self)
 
-    @cache_readonly
-    def str(self):
-        from pandas.core.strings import StringMethods
-        return StringMethods(self)
-
     def to_timestamp(self, freq=None, how='start', copy=True):
         """
         Cast to datetimeindex of timestamps, at *beginning* of period
@@ -2502,27 +2500,35 @@ def to_period(self, freq=None, copy=True):
         return self._constructor(new_values,
                                  index=new_index).__finalize__(self)
 
+    #------------------------------------------------------------------------------
+    # string methods
+
+    def _make_str_accessor(self):
+        return StringMethods(self)
+
+    str = base.AccessorProperty(StringMethods, _make_str_accessor)
+
     #------------------------------------------------------------------------------
     # Datetimelike delegation methods
 
-    @cache_readonly
-    def dt(self):
-        from pandas.tseries.common import maybe_to_datetimelike
+    def _make_dt_accessor(self):
         try:
             return maybe_to_datetimelike(self)
         except (Exception):
             raise TypeError("Can only use .dt accessor with datetimelike values")
 
+    dt = base.AccessorProperty(CombinedDatetimelikeProperties, _make_dt_accessor)
+
     #------------------------------------------------------------------------------
     # Categorical methods
 
-    @cache_readonly
-    def cat(self):
-        from pandas.core.categorical import CategoricalAccessor
+    def _make_cat_accessor(self):
         if not com.is_categorical_dtype(self.dtype):
             raise TypeError("Can only use .cat accessor with a 'category' dtype")
         return CategoricalAccessor(self.values, self.index)
 
+    cat = base.AccessorProperty(CategoricalAccessor, _make_cat_accessor)
+
 Series._setup_axes(['index'], info_axis=0, stat_axis=0,
                    aliases={'rows': 0})
 Series._add_numeric_operations()
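
The `_make_dt_accessor` and `_make_cat_accessor` constructors validate the data before building an accessor, while class-level access skips that validation entirely. A rough sketch of that split, assuming a descriptor shaped like `AccessorProperty`; `ToyCatAccessor`, `ToySeries`, and the string-valued `dtype` attribute are hypothetical and only illustrate the pattern:

```python
class AccessorProperty(object):
    """Same shape as the descriptor in this commit, repeated so the sketch runs alone."""

    def __init__(self, accessor_cls, construct_accessor):
        self.accessor_cls = accessor_cls
        self.construct_accessor = construct_accessor

    def __get__(self, instance, owner=None):
        if instance is None:
            return self.accessor_cls
        return self.construct_accessor(instance)

    def __set__(self, instance, value):
        raise AttributeError("can't set attribute")


class ToyCatAccessor(object):
    """Hypothetical stand-in for CategoricalAccessor."""

    def __init__(self, values):
        self.categories = sorted(set(values))


class ToySeries(object):
    """Hypothetical stand-in for Series with a plain-string dtype."""

    def __init__(self, values, dtype):
        self.values = list(values)
        self.dtype = dtype

    def _make_cat_accessor(self):
        # Validation runs on every instance access, before the accessor is built.
        if self.dtype != "category":
            raise TypeError("Can only use .cat accessor with a 'category' dtype")
        return ToyCatAccessor(self.values)

    cat = AccessorProperty(ToyCatAccessor, _make_cat_accessor)


good = ToySeries(["b", "a", "b"], dtype="category")
bad = ToySeries([1, 2, 3], dtype="int64")

print(good.cat.categories)              # ['a', 'b']
print(ToySeries.cat is ToyCatAccessor)  # True: no instance, so no validation

try:
    bad.cat
except TypeError as exc:
    print(exc)  # Can only use .cat accessor with a 'category' dtype
```

Since the `@cache_readonly` decorators were dropped, each instance access in this sketch re-runs the constructor function and returns a fresh accessor object; whether any caching happens elsewhere is not visible in this diff.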
