Commit 1c5bcec

mroeschke authored and Pingviinituutti committed
API: Remove CalendarDay (pandas-dev#24330)
1 parent b1a50e7 commit 1c5bcec

13 files changed, with 52 additions and 252 deletions

doc/source/timeseries.rst

+2 -23
@@ -408,7 +408,7 @@ In practice this becomes very cumbersome because we often need a very long
 index with a large number of timestamps. If we need timestamps on a regular
 frequency, we can use the :func:`date_range` and :func:`bdate_range` functions
 to create a ``DatetimeIndex``. The default frequency for ``date_range`` is a
-**day** while the default for ``bdate_range`` is a **business day**:
+**calendar day** while the default for ``bdate_range`` is a **business day**:
 
 .. ipython:: python
 
@@ -927,26 +927,6 @@ in the operation).
 
 .. _relativedelta documentation: https://dateutil.readthedocs.io/en/stable/relativedelta.html
 
-.. _timeseries.dayvscalendarday:
-
-Day vs. CalendarDay
-~~~~~~~~~~~~~~~~~~~
-
-:class:`Day` (``'D'``) is a timedelta-like offset that respects absolute time
-arithmetic and is an alias for 24 :class:`Hour`. This offset is the default
-argument to many pandas time related function like :func:`date_range` and :func:`timedelta_range`.
-
-:class:`CalendarDay` (``'CD'``) is a relativedelta-like offset that respects
-calendar time arithmetic. :class:`CalendarDay` is useful preserving calendar day
-semantics with date times with have day light savings transitions, i.e. :class:`CalendarDay`
-will preserve the hour before the day light savings transition.
-
-.. ipython:: python
-
-   ts = pd.Timestamp('2016-10-30 00:00:00', tz='Europe/Helsinki')
-   ts + pd.offsets.Day(1)
-   ts + pd.offsets.CalendarDay(1)
-
 
 Parametric Offsets
 ~~~~~~~~~~~~~~~~~~
@@ -1243,8 +1223,7 @@ frequencies. We will refer to these aliases as *offset aliases*.
 
     "B", "business day frequency"
     "C", "custom business day frequency"
-    "D", "day frequency"
-    "CD", "calendar day frequency"
+    "D", "calendar day frequency"
     "W", "weekly frequency"
     "M", "month end frequency"
     "SM", "semi-month end frequency (15th and end of month)"

doc/source/whatsnew/v0.24.0.rst

-42
@@ -591,48 +591,6 @@ that the dates have been converted to UTC
    pd.to_datetime(["2015-11-18 15:30:00+05:30",
                    "2015-11-18 16:30:00+06:30"], utc=True)
 
-.. _whatsnew_0240.api_breaking.calendarday:
-
-CalendarDay Offset
-^^^^^^^^^^^^^^^^^^
-
-:class:`Day` and associated frequency alias ``'D'`` were documented to represent
-a calendar day; however, arithmetic and operations with :class:`Day` sometimes
-respected absolute time instead (i.e. ``Day(n)`` and acted identically to ``Timedelta(days=n)``).
-
-*Previous Behavior*:
-
-.. code-block:: ipython
-
-
-    In [2]: ts = pd.Timestamp('2016-10-30 00:00:00', tz='Europe/Helsinki')
-
-    # Respects calendar arithmetic
-    In [3]: pd.date_range(start=ts, freq='D', periods=3)
-    Out[3]:
-    DatetimeIndex(['2016-10-30 00:00:00+03:00', '2016-10-31 00:00:00+02:00',
-                   '2016-11-01 00:00:00+02:00'],
-                  dtype='datetime64[ns, Europe/Helsinki]', freq='D')
-
-    # Respects absolute arithmetic
-    In [4]: ts + pd.tseries.frequencies.to_offset('D')
-    Out[4]: Timestamp('2016-10-30 23:00:00+0200', tz='Europe/Helsinki')
-
-*New Behavior*:
-
-:class:`CalendarDay` and associated frequency alias ``'CD'`` are now available
-and respect calendar day arithmetic while :class:`Day` and frequency alias ``'D'``
-will now respect absolute time (:issue:`22274`, :issue:`20596`, :issue:`16980`, :issue:`8774`)
-See the :ref:`documentation here <timeseries.dayvscalendarday>` for more information.
-
-Addition with :class:`CalendarDay` across a daylight savings time transition:
-
-.. ipython:: python
-
-   ts = pd.Timestamp('2016-10-30 00:00:00', tz='Europe/Helsinki')
-   ts + pd.offsets.Day(1)
-   ts + pd.offsets.CalendarDay(1)
-
 .. _whatsnew_0240.api_breaking.period_end_time:
 
 Time values in ``dt.end_time`` and ``to_timestamp(how='end')``
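
The removed release note contrasts two kinds of day arithmetic across a daylight saving transition. The sketch below is not part of this commit. It reproduces the contrast with the same Helsinki timestamp, assuming `pd.Timedelta(days=1)` as a stand-in for fixed 24-hour arithmetic and `pd.DateOffset(days=1)` as a stand-in for calendar-day arithmetic now that `CalendarDay` is gone.

```python
import pandas as pd

# Local midnight on the day Helsinki leaves daylight saving time (a 25-hour day).
ts = pd.Timestamp("2016-10-30 00:00:00", tz="Europe/Helsinki")

# Absolute arithmetic: exactly 24 elapsed hours later.
absolute = ts + pd.Timedelta(days=1)   # 2016-10-30 23:00:00+02:00

# Calendar arithmetic: same wall-clock time on the next calendar day.
calendar = ts + pd.DateOffset(days=1)  # 2016-10-31 00:00:00+02:00

print(absolute)
print(calendar)
```

The two results differ by one hour because the local day that starts at `ts` is 25 hours long.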

pandas/core/arrays/datetimes.py

+14 -14
@@ -27,7 +27,7 @@
 import pandas.core.common as com
 
 from pandas.tseries.frequencies import get_period_alias, to_offset
-from pandas.tseries.offsets import Tick, generate_range
+from pandas.tseries.offsets import Day, Tick, generate_range
 
 _midnight = time(0, 0)
 
@@ -255,7 +255,8 @@ def _from_sequence(cls, data, dtype=None, copy=False,
 
     @classmethod
     def _generate_range(cls, start, end, periods, freq, tz=None,
-                        normalize=False, ambiguous='raise', closed=None):
+                        normalize=False, ambiguous='raise',
+                        nonexistent='raise', closed=None):
 
         periods = dtl.validate_periods(periods)
         if freq is None and any(x is None for x in [periods, start, end]):
@@ -285,7 +286,7 @@ def _generate_range(cls, start, end, periods, freq, tz=None,
         start, end, _normalized = _maybe_normalize_endpoints(start, end,
                                                              normalize)
 
-        tz, _ = _infer_tz_from_endpoints(start, end, tz)
+        tz = _infer_tz_from_endpoints(start, end, tz)
 
         if tz is not None:
             # Localize the start and end arguments
@@ -295,22 +296,22 @@ def _generate_range(cls, start, end, periods, freq, tz=None,
             end = _maybe_localize_point(
                 end, getattr(end, 'tz', None), end, freq, tz
             )
-        if start and end:
-            # Make sure start and end have the same tz
-            start = _maybe_localize_point(
-                start, start.tz, end.tz, freq, tz
-            )
-            end = _maybe_localize_point(
-                end, end.tz, start.tz, freq, tz
-            )
         if freq is not None:
+            # We break Day arithmetic (fixed 24 hour) here and opt for
+            # Day to mean calendar day (23/24/25 hour). Therefore, strip
+            # tz info from start and day to avoid DST arithmetic
+            if isinstance(freq, Day):
+                if start is not None:
+                    start = start.tz_localize(None)
+                if end is not None:
+                    end = end.tz_localize(None)
             # TODO: consider re-implementing _cached_range; GH#17914
             index = _generate_regular_range(cls, start, end, periods, freq)
 
             if tz is not None and index.tz is None:
                 arr = conversion.tz_localize_to_utc(
                     index.asi8,
-                    tz, ambiguous=ambiguous)
+                    tz, ambiguous=ambiguous, nonexistent=nonexistent)
 
                 index = cls(arr)
 
@@ -1878,7 +1879,6 @@ def _infer_tz_from_endpoints(start, end, tz):
     Returns
     -------
     tz : tzinfo or None
-    inferred_tz : tzinfo or None
 
     Raises
     ------
@@ -1901,7 +1901,7 @@ def _infer_tz_from_endpoints(start, end, tz):
     elif inferred_tz is not None:
         tz = inferred_tz
 
-    return tz, inferred_tz
+    return tz
 
 
 def _maybe_normalize_endpoints(start, end, normalize):
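
The `_generate_range` change above strips the timezone from the endpoints when the frequency is a `Day`, builds the range on naive timestamps, and localizes the result afterwards. The sketch below imitates those three steps with public pandas calls only. It is an illustration under that reading of the diff, not the internal code path, and the timezone and dates are borrowed from the test suite rather than from this file.

```python
import pandas as pd

tz = "America/New_York"
start = pd.Timestamp("2012-11-02", tz=tz)  # DST ends on 2012-11-04

# Step 1: drop the timezone so daily steps are taken in wall-clock time.
naive_start = start.tz_localize(None)

# Step 2: build the daily range on naive timestamps (no DST involved).
naive_index = pd.date_range(naive_start, periods=10, freq="D")

# Step 3: attach the timezone again; every entry stays at local midnight.
localized = naive_index.tz_localize(tz)

# For comparison: let date_range handle the tz-aware start directly.
direct = pd.date_range(start, periods=10, freq="D")

print(list(localized.hour))
```

Under these assumptions `direct` and `localized` hold the same timestamps, which is the calendar-day behaviour the commit assigns to `'D'`.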

pandas/core/resample.py

+13 -8
@@ -1403,7 +1403,9 @@ def _get_time_bins(self, ax):
                                      start=first,
                                      end=last,
                                      tz=tz,
-                                     name=ax.name)
+                                     name=ax.name,
+                                     ambiguous='infer',
+                                     nonexistent='shift')
 
         # GH 15549
         # In edge case of tz-aware resapmling binner last index can be
@@ -1607,7 +1609,7 @@ def _get_timestamp_range_edges(first, last, offset, closed='left', base=0):
     Adjust the `first` Timestamp to the preceeding Timestamp that resides on
     the provided offset. Adjust the `last` Timestamp to the following
     Timestamp that resides on the provided offset. Input Timestamps that
-    already reside on the offset will be adjusted depeding on the type of
+    already reside on the offset will be adjusted depending on the type of
     offset and the `closed` parameter.
 
     Parameters
@@ -1627,18 +1629,21 @@ def _get_timestamp_range_edges(first, last, offset, closed='left', base=0):
     -------
     A tuple of length 2, containing the adjusted pd.Timestamp objects.
     """
-    if not all(isinstance(obj, pd.Timestamp) for obj in [first, last]):
-        raise TypeError("'first' and 'last' must be instances of type "
-                        "Timestamp")
-
     if isinstance(offset, Tick):
         is_day = isinstance(offset, Day)
         day_nanos = delta_to_nanoseconds(timedelta(1))
 
         # #1165 and #24127
         if (is_day and not offset.nanos % day_nanos) or not is_day:
-            return _adjust_dates_anchored(first, last, offset,
-                                          closed=closed, base=base)
+            first, last = _adjust_dates_anchored(first, last, offset,
+                                                 closed=closed, base=base)
+            if is_day and first.tz is not None:
+                # _adjust_dates_anchored assumes 'D' means 24H, but first/last
+                # might contain a DST transition (23H, 24H, or 25H).
+                # Ensure first/last snap to midnight.
+                first = first.normalize()
+                last = last.normalize()
+            return first, last
 
     else:
         first = first.normalize()
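
The comment added to `_get_timestamp_range_edges` notes that a tz-aware day can last 23, 24, or 25 hours, so the bin edges are snapped to local midnight. The following sketch shows the visible effect on a daily resample. The hourly data and the Paris timezone are assumptions chosen for illustration; they mirror the 2013-10-27 transition used in `test_resample_dst_anchor`.

```python
import pandas as pd

# 73 hourly points starting at local midnight on 2013-10-26 in Paris.
# The index is built in UTC first so that it has no DST ambiguity itself.
utc_hours = pd.date_range("2013-10-25 22:00", periods=73,
                          freq=pd.Timedelta(hours=1), tz="UTC")
series = pd.Series(1, index=utc_hours.tz_convert("Europe/Paris"))

# Daily bins: each label should sit at local midnight.
daily_counts = series.resample("D").count()
bin_hours = list(daily_counts.index.hour)

print(daily_counts)
```

The middle bin covers 2013-10-27, the day the clocks go back, so it should contain 25 hourly points while its neighbours contain 24.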

pandas/tests/indexes/datetimes/test_date_range.py

+4 -13
@@ -359,18 +359,18 @@ def test_range_tz_pytz(self):
          Timestamp(datetime(2013, 11, 6), tz='US/Eastern')]
     ])
     def test_range_tz_dst_straddle_pytz(self, start, end):
-        dr = date_range(start, end, freq='CD')
+        dr = date_range(start, end, freq='D')
         assert dr[0] == start
         assert dr[-1] == end
         assert np.all(dr.hour == 0)
 
-        dr = date_range(start, end, freq='CD', tz='US/Eastern')
+        dr = date_range(start, end, freq='D', tz='US/Eastern')
         assert dr[0] == start
         assert dr[-1] == end
         assert np.all(dr.hour == 0)
 
         dr = date_range(start.replace(tzinfo=None), end.replace(
-            tzinfo=None), freq='CD', tz='US/Eastern')
+            tzinfo=None), freq='D', tz='US/Eastern')
         assert dr[0] == start
         assert dr[-1] == end
         assert np.all(dr.hour == 0)
@@ -604,14 +604,6 @@ def test_mismatching_tz_raises_err(self, start, end):
         with pytest.raises(TypeError):
             pd.date_range(start, end, freq=BDay())
 
-    def test_CalendarDay_range_with_dst_crossing(self):
-        # GH 20596
-        result = date_range('2018-10-23', '2018-11-06', freq='7CD',
-                            tz='Europe/Paris')
-        expected = date_range('2018-10-23', '2018-11-06',
-                              freq=pd.DateOffset(days=7), tz='Europe/Paris')
-        tm.assert_index_equal(result, expected)
-
 
 class TestBusinessDateRange(object):
 
@@ -766,8 +758,7 @@ def test_cdaterange_weekmask_and_holidays(self):
                         holidays=['2013-05-01'])
 
     @pytest.mark.parametrize('freq', [freq for freq in prefix_mapping
-                                      if freq.startswith('C')
-                                      and freq != 'CD']) # CalendarDay
+                                      if freq.startswith('C')])
     def test_all_custom_freq(self, freq):
         # should not raise
         bdate_range(START, END, freq=freq, weekmask='Mon Wed Fri',
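
The removed `test_CalendarDay_range_with_dst_crossing` compared a `'7CD'` range against one built with `pd.DateOffset(days=7)`. The sketch below reuses the dates from that test and swaps `'7CD'` for `'7D'`. The expectation that both ranges match is an assumption based on the `Day` handling this commit adds to `_generate_range`, not a test that ships with the commit.

```python
import pandas as pd

# Weekly steps across the end of DST in Paris (2018-10-28).
weekly_offset = pd.date_range("2018-10-23", "2018-11-06",
                              freq=pd.DateOffset(days=7), tz="Europe/Paris")

# The same range requested with the plain 'D' alias.
weekly_d = pd.date_range("2018-10-23", "2018-11-06",
                         freq="7D", tz="Europe/Paris")

print(weekly_offset)
print(weekly_d)
```

Both ranges should contain three timestamps at local midnight, with the UTC offset changing from +02:00 to +01:00 after the transition.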

pandas/tests/indexes/datetimes/test_timezones.py

+2 -2
@@ -436,7 +436,7 @@ def test_dti_tz_localize_utc_conversion(self, tz):
 
     @pytest.mark.parametrize('idx', [
         date_range(start='2014-01-01', end='2014-12-31', freq='M'),
-        date_range(start='2014-01-01', end='2014-12-31', freq='CD'),
+        date_range(start='2014-01-01', end='2014-12-31', freq='D'),
         date_range(start='2014-01-01', end='2014-03-01', freq='H'),
         date_range(start='2014-08-01', end='2014-10-31', freq='T')
     ])
@@ -1072,7 +1072,7 @@ def test_date_range_span_dst_transition(self, tzstr):
 
         dr = date_range('2012-11-02', periods=10, tz=tzstr)
         result = dr.hour
-        expected = Index([0, 0, 0, 23, 23, 23, 23, 23, 23, 23])
+        expected = Index([0] * 10)
         tm.assert_index_equal(result, expected)
 
     @pytest.mark.parametrize('tzstr', ['US/Eastern', 'dateutil/US/Eastern'])

pandas/tests/indexes/timedeltas/test_timedelta_range.py

-4
@@ -49,10 +49,6 @@ def test_timedelta_range(self):
         result = df.loc['0s':, :]
         tm.assert_frame_equal(expected, result)
 
-        with pytest.raises(ValueError):
-            # GH 22274: CalendarDay is a relative time measurement
-            timedelta_range('1day', freq='CD', periods=2)
-
     @pytest.mark.parametrize('periods, freq', [
         (3, '2D'), (5, 'D'), (6, '19H12T'), (7, '16H'), (9, '12H')])
     def test_linspace_behavior(self, periods, freq):

pandas/tests/resample/test_datetime_index.py

+4 -4
@@ -1279,7 +1279,7 @@ def test_resample_dst_anchor(self):
         # 5172
         dti = DatetimeIndex([datetime(2012, 11, 4, 23)], tz='US/Eastern')
         df = DataFrame([5], index=dti)
-        assert_frame_equal(df.resample(rule='CD').sum(),
+        assert_frame_equal(df.resample(rule='D').sum(),
                            DataFrame([5], index=df.index.normalize()))
         df.resample(rule='MS').sum()
         assert_frame_equal(
@@ -1333,14 +1333,14 @@ def test_resample_dst_anchor(self):
 
         df_daily = df['10/26/2013':'10/29/2013']
         assert_frame_equal(
-            df_daily.resample("CD").agg({"a": "min", "b": "max", "c": "count"})
+            df_daily.resample("D").agg({"a": "min", "b": "max", "c": "count"})
             [["a", "b", "c"]],
             DataFrame({"a": [1248, 1296, 1346, 1394],
                        "b": [1295, 1345, 1393, 1441],
                        "c": [48, 50, 48, 48]},
                       index=date_range('10/26/2013', '10/29/2013',
-                                       freq='CD', tz='Europe/Paris')),
-            'CD Frequency')
+                                       freq='D', tz='Europe/Paris')),
+            'D Frequency')
 
     def test_downsample_across_dst(self):
         # GH 8531

pandas/tests/resample/test_period_index.py

+4 -3
@@ -289,10 +289,11 @@ def test_resample_nonexistent_time_bin_edge(self):
         index = date_range(start='2017-10-10', end='2017-10-20', freq='1H')
         index = index.tz_localize('UTC').tz_convert('America/Sao_Paulo')
         df = DataFrame(data=list(range(len(index))), index=index)
-        result = df.groupby(pd.Grouper(freq='1D'))
+        result = df.groupby(pd.Grouper(freq='1D')).count()
         expected = date_range(start='2017-10-09', end='2017-10-20', freq='D',
-                              tz="America/Sao_Paulo")
-        tm.assert_index_equal(result.count().index, expected)
+                              tz="America/Sao_Paulo", nonexistent='shift',
+                              closed='left')
+        tm.assert_index_equal(result.index, expected)
 
     def test_resample_ambiguous_time_bin_edge(self):
         # GH 10117
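
The updated test passes `nonexistent='shift'` because local midnight on 2017-10-15 does not exist in Sao Paulo, so a daily bin edge that lands there has to be moved. The sketch below shows that kind of shift on a single timestamp with `Timestamp.tz_localize`. It uses the `'shift_forward'` spelling found in released pandas versions, which is an assumption relative to the `'shift'` value that appears in this diff.

```python
import pandas as pd

# Local midnight on 2017-10-15 did not exist in Sao Paulo: clocks jumped
# from 00:00 straight to 01:00 when daylight saving time began.
naive_midnight = pd.Timestamp("2017-10-15 00:00:00")

# Move the nonexistent wall time forward to the first valid instant.
shifted = naive_midnight.tz_localize("America/Sao_Paulo",
                                     nonexistent="shift_forward")

print(shifted)  # 2017-10-15 01:00:00-02:00
```

Without a `nonexistent` argument the same call raises an error, which is why the resample binner in this commit passes the option explicitly.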

pandas/tests/series/test_timezones.py

+1 -1
@@ -343,7 +343,7 @@ def test_getitem_pydatetime_tz(self, tzstr):
 
     def test_series_truncate_datetimeindex_tz(self):
         # GH 9243
-        idx = date_range('4/1/2005', '4/30/2005', freq='CD', tz='US/Pacific')
+        idx = date_range('4/1/2005', '4/30/2005', freq='D', tz='US/Pacific')
         s = Series(range(len(idx)), index=idx)
         result = s.truncate(datetime(2005, 4, 2), datetime(2005, 4, 4))
         expected = Series([1, 2, 3], index=idx[1:4])
