DOC: Fixed examples in pandas/core/indexes/ #33208

Merged
Changes from 3 commits
8 changes: 4 additions & 4 deletions ci/code_checks.sh
@@ -279,6 +279,10 @@ if [[ -z "$CHECK" || "$CHECK" == "doctests" ]]; then
pytest -q --doctest-modules pandas/core/groupby/groupby.py -k"-cumcount -describe -pipe"
RET=$(($RET + $?)) ; echo $MSG "DONE"

MSG='Doctests indexes' ; echo $MSG
pytest -q --doctest-modules pandas/core/indexes/
RET=$(($RET + $?)) ; echo $MSG "DONE"

MSG='Doctests tools' ; echo $MSG
pytest -q --doctest-modules pandas/core/tools/
RET=$(($RET + $?)) ; echo $MSG "DONE"
@@ -287,10 +291,6 @@ if [[ -z "$CHECK" || "$CHECK" == "doctests" ]]; then
pytest -q --doctest-modules pandas/core/reshape/
RET=$(($RET + $?)) ; echo $MSG "DONE"

MSG='Doctests interval classes' ; echo $MSG
pytest -q --doctest-modules pandas/core/indexes/interval.py
RET=$(($RET + $?)) ; echo $MSG "DONE"

MSG='Doctests arrays'; echo $MSG
pytest -q --doctest-modules pandas/core/arrays/
RET=$(($RET + $?)) ; echo $MSG "DONE"
125 changes: 107 additions & 18 deletions pandas/core/indexes/accessors.py
@@ -129,9 +129,41 @@ class DatetimeProperties(Properties):

Examples
--------
>>> s.dt.hour
>>> s.dt.second
>>> s.dt.quarter
>>> seconds_series = pd.Series(pd.date_range("2000-01-01", periods=3, freq="s"))
>>> seconds_series
0 2000-01-01 00:00:00
1 2000-01-01 00:00:01
2 2000-01-01 00:00:02
dtype: datetime64[ns]
>>> seconds_series.dt.second
0 0
1 1
2 2
dtype: int64

>>> hours_series = pd.Series(pd.date_range("2000-01-01", periods=3, freq="h"))
>>> hours_series
0 2000-01-01 00:00:00
1 2000-01-01 01:00:00
2 2000-01-01 02:00:00
dtype: datetime64[ns]
>>> hours_series.dt.hour
0 0
1 1
2 2
dtype: int64

>>> quarters_series = pd.Series(pd.date_range("2000-01-01", periods=3, freq="q"))
>>> quarters_series
0 2000-03-31
1 2000-06-30
2 2000-09-30
dtype: datetime64[ns]
>>> quarters_series.dt.quarter
0 1
1 2
2 3
dtype: int64
Member

I wonder whether this needs to be in the class description.

with help(pd.core.indexes.accessors.DatetimeProperties)

under Data descriptors defined here:

we have

| dayofweek
| The day of the week with Monday=0, Sunday=6.
|
| Return the day of the week. It is assumed the week starts on
| Monday, which is denoted by 0 and ends on Sunday which is denoted
| by 6. This method is available on both Series with datetime
| values (using the dt accessor) or DatetimeIndex.
|

Returns
Series or Index
Containing integers indicating the day number.

|

See Also
Series.dt.dayofweek : Alias.
Series.dt.weekday : Alias.
Series.dt.day_name : Returns the name of the day of the week.

|

Examples
>>> s = pd.date_range('2016-12-31', '2017-01-08', freq='D').to_series()
>>> s.dt.dayofweek
2016-12-31 5
2017-01-01 6
2017-01-02 0
2017-01-03 1
2017-01-04 2
2017-01-05 3
2017-01-06 4
2017-01-07 5
2017-01-08 6
Freq: D, dtype: int64

and for hour, second and quarter we just have

| hour
| The hours of the datetime.
|
| second
| The seconds of the datetime.
|
| quarter
| The quarter of the date.

maybe these examples should be in the descriptor docstrings instead. wdyt?

Member Author

I believe we should have them in both places, because the class's docstring tells the user/developer that this class is responsible for accessing the fields, and shows some examples of doing so.

On top of that, we should also supply specific examples for each field accessor; I would love to do so in a follow-up.
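
A small sketch of the "examples in both places" idea discussed in this thread: a class docstring and a property (descriptor) docstring can each carry their own runnable example, and doctest can check both. `HourAccessor` and `run_docstring` are hypothetical names used only for illustration, not pandas code, and the sketch relies on the standard library alone.

```python
import doctest
import inspect


class HourAccessor:
    """
    Hypothetical stand-in for an accessor class; not pandas code.

    Examples
    --------
    >>> HourAccessor([3, 14]).hour
    [3, 14]
    """

    def __init__(self, hours):
        self._hours = list(hours)

    @property
    def hour(self):
        """
        The hours of the datetime.

        Examples
        --------
        >>> HourAccessor([0, 1, 2]).hour
        [0, 1, 2]
        """
        return self._hours


def run_docstring(obj, name):
    """Run the doctest examples in obj's docstring; return (failed, attempted)."""
    doc = inspect.getdoc(obj)
    # Each run gets a fresh globals dict that only knows the toy class.
    test = doctest.DocTestParser().get_doctest(
        doc, {"HourAccessor": HourAccessor}, name, None, 0
    )
    runner = doctest.DocTestRunner(verbose=False)
    # Discard any failure report; only the counts are of interest here.
    results = runner.run(test, out=lambda text: None)
    return results.failed, results.attempted


class_result = run_docstring(HourAccessor, "HourAccessor")
descriptor_result = run_docstring(HourAccessor.hour, "HourAccessor.hour")
print(class_result, descriptor_result)
```

Keeping a short overview example on the class and a focused one on each property means `help()` shows something useful at either level, at the cost of maintaining two sets of expected outputs.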


Returns a Series indexed like the original Series.
Raises TypeError if the Series does not contain datetimelike values.
@@ -200,13 +232,24 @@ class TimedeltaProperties(Properties):
"""
Accessor object for datetimelike properties of the Series values.

Examples
--------
>>> s.dt.hours
Member Author

Forgot to mention: I could not manage to give an example for s.dt.hours.

I tried to have:

import pandas as pd
hours_series = pd.Series(pd.timedelta_range(start="1 day", periods=3, freq="H"))

And then:

hours_series.dt.hours

and

 hours_series.dt.hour

but both give:

AttributeError: 'TimedeltaProperties' object has no attribute 'hour'
AttributeError: 'TimedeltaProperties' object has no attribute 'hours'

Member

Can you see if there is an issue for this?

>>> s = pd.Series(pd.timedelta_range(start="1 day", periods=3, freq="H"))
>>> s
0   1 days 00:00:00
1   1 days 01:00:00
2   1 days 02:00:00
dtype: timedelta64[ns]
>>>
>>> s.dt.components
   days  hours  minutes  seconds  milliseconds  microseconds  nanoseconds
0     1      0        0        0             0             0            0
1     1      1        0        0             0             0            0
2     1      2        0        0             0             0            0

Member Author

I could not find an issue for this, so I opened one here:

xref #33255
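
As a rough illustration of how an hours value can still be obtained when there is no `hours` attribute: the standard library's `datetime.timedelta` stores only days, seconds and microseconds, so the hours component has to be derived from `seconds`. The sketch below uses plain `timedelta` objects rather than pandas, `hours_component` is a hypothetical helper, and it is only an assumption here that `Series.dt.seconds` follows the same sub-day convention.

```python
from datetime import timedelta

# Same shape as the series in this thread: 1 day plus 0, 1 and 2 hours.
deltas = [timedelta(days=1, hours=h) for h in range(3)]


def hours_component(delta: timedelta) -> int:
    """Return the whole hours left over after removing full days."""
    # `seconds` excludes whole days, so it always stays below 86400.
    return delta.seconds // 3600


hours = [hours_component(d) for d in deltas]
has_hours_attr = hasattr(deltas[0], "hours")
print(hours, has_hours_attr)
```

In pandas itself, the `components` frame shown by the reviewer already exposes an `hours` column, so a helper like this would only matter if the series-level attribute is really missing.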

>>> s.dt.seconds

Returns a Series indexed like the original Series.
Raises TypeError if the Series does not contain datetimelike values.

Examples
--------
>>> seconds_series = pd.Series(
... pd.timedelta_range(start="1 second", periods=3, freq="S")
... )
>>> seconds_series
0 00:00:01
1 00:00:02
2 00:00:03
dtype: timedelta64[ns]
>>> seconds_series.dt.seconds
0 1
1 2
2 3
dtype: int64
"""

def to_pytimedelta(self) -> np.ndarray:
@@ -229,7 +272,7 @@ def to_pytimedelta(self) -> np.ndarray:

Examples
--------
>>> s = pd.Series(pd.to_timedelta(np.arange(5), unit='d'))
>>> s = pd.Series(pd.to_timedelta(np.arange(5), unit="d"))
>>> s
0 0 days
1 1 days
@@ -239,9 +282,9 @@ def to_pytimedelta(self) -> np.ndarray:
dtype: timedelta64[ns]

>>> s.dt.to_pytimedelta()
array([datetime.timedelta(0), datetime.timedelta(1),
datetime.timedelta(2), datetime.timedelta(3),
datetime.timedelta(4)], dtype=object)
array([datetime.timedelta(0), datetime.timedelta(days=1),
datetime.timedelta(days=2), datetime.timedelta(days=3),
datetime.timedelta(days=4)], dtype=object)
"""
return self._get_values().to_pytimedelta()

@@ -289,14 +332,60 @@ class PeriodProperties(Properties):
"""
Accessor object for datetimelike properties of the Series values.

Examples
--------
>>> s.dt.hour
>>> s.dt.second
>>> s.dt.quarter

Returns a Series indexed like the original Series.
Raises TypeError if the Series does not contain datetimelike values.

Examples
--------
>>> seconds_series = pd.Series(
... pd.period_range(
... start="2000-01-01 00:00:00", end="2000-01-01 00:00:03", freq="s"
... )
... )
>>> seconds_series
0 2000-01-01 00:00:00
1 2000-01-01 00:00:01
2 2000-01-01 00:00:02
3 2000-01-01 00:00:03
dtype: period[S]
>>> seconds_series.dt.second
0 0
1 1
2 2
3 3
dtype: int64

>>> hours_series = pd.Series(
... pd.period_range(start="2000-01-01 00:00", end="2000-01-01 03:00", freq="h")
... )
>>> hours_series
0 2000-01-01 00:00
1 2000-01-01 01:00
2 2000-01-01 02:00
3 2000-01-01 03:00
dtype: period[H]
>>> hours_series.dt.hour
0 0
1 1
2 2
3 3
dtype: int64

>>> quarters_series = pd.Series(
... pd.period_range(start="2000-01-01", end="2000-12-31", freq="Q-DEC")
... )
>>> quarters_series
0 2000Q1
1 2000Q2
2 2000Q3
3 2000Q4
dtype: period[Q-DEC]
>>> quarters_series.dt.quarter
0 1
1 2
2 3
3 4
dtype: int64
"""


35 changes: 17 additions & 18 deletions pandas/core/indexes/base.py
@@ -1837,7 +1837,7 @@ def is_object(self) -> bool:

>>> idx = pd.Index(["Watermelon", "Orange", "Apple",
... "Watermelon"]).astype("category")
>>> idx.object()
>>> idx.is_object()
False

>>> idx = pd.Index([1.0, 2.0, 3.0, 4.0])
@@ -2049,7 +2049,7 @@ def isna(self):
>>> idx
Float64Index([5.2, 6.0, nan], dtype='float64')
>>> idx.isna()
array([False, False, True], dtype=bool)
array([False, False, True])

Empty strings are not considered NA values. None is considered an NA
value.
@@ -2058,7 +2058,7 @@ def isna(self):
>>> idx
Index(['black', '', 'red', None], dtype='object')
>>> idx.isna()
array([False, False, False, True], dtype=bool)
array([False, False, False, True])

For datetimes, `NaT` (Not a Time) is considered as an NA value.

@@ -2068,7 +2068,7 @@ def isna(self):
DatetimeIndex(['1940-04-25', 'NaT', 'NaT', 'NaT'],
dtype='datetime64[ns]', freq=None)
>>> idx.isna()
array([False, True, True, True], dtype=bool)
array([False, True, True, True])
"""
return self._isnan

@@ -4786,8 +4786,9 @@ def isin(self, values, level=None):
... ['red', 'blue', 'green']],
... names=('number', 'color'))
>>> midx
MultiIndex(levels=[[1, 2, 3], ['blue', 'green', 'red']],
codes=[[0, 1, 2], [2, 0, 1]],
MultiIndex([(1, 'red'),
(2, 'blue'),
(3, 'green')],
names=['number', 'color'])

Check whether the strings in the 'color' level of the MultiIndex
@@ -4855,11 +4856,11 @@ def slice_indexer(self, start=None, end=None, step=None, kind=None):

>>> idx = pd.Index(list('abcd'))
>>> idx.slice_indexer(start='b', end='c')
slice(1, 3)
slice(1, 3, None)

>>> idx = pd.MultiIndex.from_arrays([list('abcd'), list('efgh')])
>>> idx.slice_indexer(start='b', end=('c', 'g'))
slice(1, 3)
slice(1, 3, None)
"""
start_slice, end_slice = self.slice_locs(start, end, step=step, kind=kind)

@@ -5430,11 +5431,10 @@ def ensure_index_from_sequences(sequences, names=None):

Examples
--------
>>> ensure_index_from_sequences([[1, 2, 3]], names=['name'])
>>> ensure_index_from_sequences([[1, 2, 3]], names=["name"])
Int64Index([1, 2, 3], dtype='int64', name='name')

>>> ensure_index_from_sequences([['a', 'a'], ['a', 'b']],
names=['L1', 'L2'])
>>> ensure_index_from_sequences([["a", "a"], ["a", "b"]], names=["L1", "L2"])
MultiIndex([('a', 'a'),
('a', 'b')],
names=['L1', 'L2'])
@@ -5467,6 +5467,10 @@ def ensure_index(index_like, copy=False):
-------
index : Index or MultiIndex

See Also
--------
ensure_index_from_sequences

Examples
--------
>>> ensure_index(['a', 'b'])
@@ -5477,13 +5481,8 @@ def ensure_index(index_like, copy=False):

>>> ensure_index([['a', 'a'], ['b', 'c']])
MultiIndex([('a', 'b'),
('a', 'c')],
dtype='object')
)

See Also
--------
ensure_index_from_sequences
('a', 'c')],
)
"""
if isinstance(index_like, Index):
if copy:
20 changes: 12 additions & 8 deletions pandas/core/indexes/category.py
@@ -138,21 +138,25 @@ class CategoricalIndex(ExtensionIndex, accessor.PandasDelegate):

Examples
--------
>>> pd.CategoricalIndex(['a', 'b', 'c', 'a', 'b', 'c'])
CategoricalIndex(['a', 'b', 'c', 'a', 'b', 'c'], categories=['a', 'b', 'c'], ordered=False, dtype='category') # noqa
>>> pd.CategoricalIndex(["a", "b", "c", "a", "b", "c"])
CategoricalIndex(['a', 'b', 'c', 'a', 'b', 'c'],
categories=['a', 'b', 'c'], ordered=False, dtype='category')

``CategoricalIndex`` can also be instantiated from a ``Categorical``:

>>> c = pd.Categorical(['a', 'b', 'c', 'a', 'b', 'c'])
>>> c = pd.Categorical(["a", "b", "c", "a", "b", "c"])
>>> pd.CategoricalIndex(c)
CategoricalIndex(['a', 'b', 'c', 'a', 'b', 'c'], categories=['a', 'b', 'c'], ordered=False, dtype='category') # noqa
CategoricalIndex(['a', 'b', 'c', 'a', 'b', 'c'],
categories=['a', 'b', 'c'], ordered=False, dtype='category')

Ordered ``CategoricalIndex`` can have a min and max value.

>>> ci = pd.CategoricalIndex(['a','b','c','a','b','c'], ordered=True,
... categories=['c', 'b', 'a'])
>>> ci = pd.CategoricalIndex(
... ["a", "b", "c", "a", "b", "c"], ordered=True, categories=["c", "b", "a"]
... )
>>> ci
CategoricalIndex(['a', 'b', 'c', 'a', 'b', 'c'], categories=['c', 'b', 'a'], ordered=True, dtype='category') # noqa
CategoricalIndex(['a', 'b', 'c', 'a', 'b', 'c'],
categories=['c', 'b', 'a'], ordered=True, dtype='category')
>>> ci.min()
'c'
"""
@@ -652,7 +656,7 @@ def map(self, mapper):
>>> idx = pd.CategoricalIndex(['a', 'b', 'c'])
>>> idx
CategoricalIndex(['a', 'b', 'c'], categories=['a', 'b', 'c'],
ordered=False, dtype='category')
ordered=False, dtype='category')
>>> idx.map(lambda x: x.upper())
CategoricalIndex(['A', 'B', 'C'], categories=['A', 'B', 'C'],
ordered=False, dtype='category')
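
Several expected outputs in this diff change only in their printed form (for example `slice(1, 3)` becoming `slice(1, 3, None)`, or `datetime.timedelta(1)` becoming `datetime.timedelta(days=1)`). A minimal sketch of why that matters once doctests are enforced: by default doctest compares the expected text with the actual repr character by character. The variable names below are illustrative, and the `timedelta` repr shown assumes Python 3.7 or later.

```python
import doctest
from datetime import timedelta

checker = doctest.OutputChecker()

# What the interpreter actually prints for a two-argument slice.
got = repr(slice(1, 3)) + "\n"

# The old expected output from the docstring does not match ...
old_matches = checker.check_output("slice(1, 3)\n", got, 0)
# ... while the updated expected output does.
new_matches = checker.check_output("slice(1, 3, None)\n", got, 0)

# On Python 3.7 and later, timedelta reprs use keyword arguments.
delta_repr = repr(timedelta(days=1))

print(old_matches, new_matches, delta_repr)
```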