Commit d98e6f0

Merge branch 'main' into test_numpy_complex2
2 parents bc96021 + 0021d24 commit d98e6f0

58 files changed: 615 additions, 301 deletions

doc/source/user_guide/timeseries.rst

+8-8
@@ -461,7 +461,7 @@ of those specified will not be generated:

 .. ipython:: python

-   pd.date_range(start, end, freq="BM")
+   pd.date_range(start, end, freq="BME")

    pd.date_range(start, end, freq="W")

@@ -557,7 +557,7 @@ intelligent functionality like selection, slicing, etc.

 .. ipython:: python

-   rng = pd.date_range(start, end, freq="BM")
+   rng = pd.date_range(start, end, freq="BME")
    ts = pd.Series(np.random.randn(len(rng)), index=rng)
    ts.index
    ts[:5].index
@@ -884,9 +884,9 @@ into ``freq`` keyword arguments. The available date offsets and associated frequ
     :class:`~pandas.tseries.offsets.LastWeekOfMonth`, ``'LWOM'``, "the x-th day of the last week of each month"
     :class:`~pandas.tseries.offsets.MonthEnd`, ``'ME'``, "calendar month end"
     :class:`~pandas.tseries.offsets.MonthBegin`, ``'MS'``, "calendar month begin"
-    :class:`~pandas.tseries.offsets.BMonthEnd` or :class:`~pandas.tseries.offsets.BusinessMonthEnd`, ``'BM'``, "business month end"
+    :class:`~pandas.tseries.offsets.BMonthEnd` or :class:`~pandas.tseries.offsets.BusinessMonthEnd`, ``'BME'``, "business month end"
     :class:`~pandas.tseries.offsets.BMonthBegin` or :class:`~pandas.tseries.offsets.BusinessMonthBegin`, ``'BMS'``, "business month begin"
-    :class:`~pandas.tseries.offsets.CBMonthEnd` or :class:`~pandas.tseries.offsets.CustomBusinessMonthEnd`, ``'CBM'``, "custom business month end"
+    :class:`~pandas.tseries.offsets.CBMonthEnd` or :class:`~pandas.tseries.offsets.CustomBusinessMonthEnd`, ``'CBME'``, "custom business month end"
     :class:`~pandas.tseries.offsets.CBMonthBegin` or :class:`~pandas.tseries.offsets.CustomBusinessMonthBegin`, ``'CBMS'``, "custom business month begin"
     :class:`~pandas.tseries.offsets.SemiMonthEnd`, ``'SM'``, "15th (or other day_of_month) and calendar month end"
     :class:`~pandas.tseries.offsets.SemiMonthBegin`, ``'SMS'``, "15th (or other day_of_month) and calendar month begin"
@@ -1248,8 +1248,8 @@ frequencies. We will refer to these aliases as *offset aliases*.
     "W", "weekly frequency"
     "ME", "month end frequency"
     "SM", "semi-month end frequency (15th and end of month)"
-    "BM", "business month end frequency"
-    "CBM", "custom business month end frequency"
+    "BME", "business month end frequency"
+    "CBME", "custom business month end frequency"
     "MS", "month start frequency"
     "SMS", "semi-month start frequency (1st and 15th)"
     "BMS", "business month start frequency"
@@ -1586,7 +1586,7 @@ rather than changing the alignment of the data and the index:

    ts.shift(5, freq="D")
    ts.shift(5, freq=pd.offsets.BDay())
-   ts.shift(5, freq="BM")
+   ts.shift(5, freq="BME")

 Note that with when ``freq`` is specified, the leading entry is no longer NaN
 because the data is not being realigned.
@@ -1692,7 +1692,7 @@ the end of the interval.
 .. warning::

     The default values for ``label`` and ``closed`` is '**left**' for all
-    frequency offsets except for 'ME', 'Y', 'Q', 'BM', 'BY', 'BQ', and 'W'
+    frequency offsets except for 'ME', 'Y', 'Q', 'BME', 'BY', 'BQ', and 'W'
     which all have a default of 'right'.

     This might unintendedly lead to looking ahead, where the value for a later
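
The renamed ``'BME'`` alias still means "business month end", that is, the last weekday of each calendar month. The following is a minimal standard-library sketch of that rule, not pandas' implementation; the helper name `business_month_end` is hypothetical, and the sketch assumes a Monday-to-Friday business week with no holidays.

```python
import calendar
from datetime import date, timedelta


def business_month_end(year: int, month: int) -> date:
    """Return the last Monday-to-Friday date of the given month.

    Hypothetical helper for illustration; holidays are ignored.
    """
    # monthrange returns (weekday of the 1st, number of days in the month).
    last_day = calendar.monthrange(year, month)[1]
    day = date(year, month, last_day)
    # weekday() is 5 for Saturday and 6 for Sunday; step back to Friday.
    while day.weekday() >= 5:
        day -= timedelta(days=1)
    return day


print(business_month_end(2022, 11))  # 2022-11-30, a Wednesday
print(business_month_end(2022, 7))   # 2022-07-29, since July 31 is a Sunday
```

The November result agrees with the `BMonthEnd().rollforward` docstring shown in the `offsets.pyx` part of this commit.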

doc/source/whatsnew/v2.2.0.rst

+13-4
@@ -234,6 +234,7 @@ For example:
 Other Deprecations
 ^^^^^^^^^^^^^^^^^^
 - Changed :meth:`Timedelta.resolution_string` to return ``h``, ``min``, ``s``, ``ms``, ``us``, and ``ns`` instead of ``H``, ``T``, ``S``, ``L``, ``U``, and ``N``, for compatibility with respective deprecations in frequency aliases (:issue:`52536`)
+- Deprecated :meth:`Index.format`, use ``index.astype(str)`` or ``index.map(formatter)`` instead (:issue:`55413`)
 - Deprecated allowing non-keyword arguments in :meth:`DataFrame.to_clipboard`. (:issue:`54229`)
 - Deprecated allowing non-keyword arguments in :meth:`DataFrame.to_csv` except ``path_or_buf``. (:issue:`54229`)
 - Deprecated allowing non-keyword arguments in :meth:`DataFrame.to_dict`. (:issue:`54229`)
@@ -252,7 +253,11 @@ Other Deprecations
 - Deprecated downcasting behavior in :meth:`Series.where`, :meth:`DataFrame.where`, :meth:`Series.mask`, :meth:`DataFrame.mask`, :meth:`Series.clip`, :meth:`DataFrame.clip`; in a future version these will not infer object-dtype columns to non-object dtype, or all-round floats to integer dtype. Call ``result.infer_objects(copy=False)`` on the result for object inference, or explicitly cast floats to ints. To opt in to the future version, use ``pd.set_option("future.no_silent_downcasting", True)`` (:issue:`53656`)
 - Deprecated including the groups in computations when using :meth:`DataFrameGroupBy.apply` and :meth:`DataFrameGroupBy.resample`; pass ``include_groups=False`` to exclude the groups (:issue:`7155`)
 - Deprecated not passing a tuple to :class:`DataFrameGroupBy.get_group` or :class:`SeriesGroupBy.get_group` when grouping by a length-1 list-like (:issue:`25971`)
-- Deprecated string ``A`` denoting frequency in :class:`YearEnd` and strings ``A-DEC``, ``A-JAN``, etc. denoting annual frequencies with various fiscal year ends (:issue:`52536`)
+- Deprecated string ``AS`` denoting frequency in :class:`YearBegin` and strings ``AS-DEC``, ``AS-JAN``, etc. denoting annual frequencies with various fiscal year starts (:issue:`54275`)
+- Deprecated string ``A`` denoting frequency in :class:`YearEnd` and strings ``A-DEC``, ``A-JAN``, etc. denoting annual frequencies with various fiscal year ends (:issue:`54275`)
+- Deprecated string ``BAS`` denoting frequency in :class:`BYearBegin` and strings ``BAS-DEC``, ``BAS-JAN``, etc. denoting annual frequencies with various fiscal year starts (:issue:`54275`)
+- Deprecated string ``BA`` denoting frequency in :class:`BYearEnd` and strings ``BA-DEC``, ``BA-JAN``, etc. denoting annual frequencies with various fiscal year ends (:issue:`54275`)
+- Deprecated strings ``BM``, and ``CBM`` denoting frequencies in :class:`BusinessMonthEnd`, :class:`CustomBusinessMonthEnd` (:issue:`52064`)
 - Deprecated strings ``H``, ``BH``, and ``CBH`` denoting frequencies in :class:`Hour`, :class:`BusinessHour`, :class:`CustomBusinessHour` (:issue:`52536`)
 - Deprecated strings ``H``, ``S``, ``U``, and ``N`` denoting units in :func:`to_timedelta` (:issue:`52536`)
 - Deprecated strings ``H``, ``T``, ``S``, ``L``, ``U``, and ``N`` denoting units in :class:`Timedelta` (:issue:`52536`)
@@ -261,6 +266,7 @@ Other Deprecations
 - Deprecated the extension test classes ``BaseNoReduceTests``, ``BaseBooleanReduceTests``, and ``BaseNumericReduceTests``, use ``BaseReduceTests`` instead (:issue:`54663`)
 - Deprecated the option ``mode.data_manager`` and the ``ArrayManager``; only the ``BlockManager`` will be available in future versions (:issue:`55043`)
 - Deprecating downcasting the results of :meth:`DataFrame.fillna`, :meth:`Series.fillna`, :meth:`DataFrame.ffill`, :meth:`Series.ffill`, :meth:`DataFrame.bfill`, :meth:`Series.bfill` in object-dtype cases. To opt in to the future version, use ``pd.set_option("future.no_silent_downcasting", True)`` (:issue:`54261`)
+-

 .. ---------------------------------------------------------------------------
 .. _whatsnew_220.performance:
@@ -285,17 +291,20 @@ Bug fixes
 Categorical
 ^^^^^^^^^^^
 - :meth:`Categorical.isin` raising ``InvalidIndexError`` for categorical containing overlapping :class:`Interval` values (:issue:`34974`)
+- Bug in :meth:`CategoricalDtype.__eq__` returning false for unordered categorical data with mixed types (:issue:`55468`)
 -

 Datetimelike
 ^^^^^^^^^^^^
 - Bug in :meth:`DatetimeIndex.union` returning object dtype for tz-aware indexes with the same timezone but different units (:issue:`55238`)
--
+- Bug in :meth:`Tick.delta` with very large ticks raising ``OverflowError`` instead of ``OutOfBoundsTimedelta`` (:issue:`55503`)
+- Bug in addition or subtraction of very large :class:`Tick` objects with :class:`Timestamp` or :class:`Timedelta` objects raising ``OverflowError`` instead of ``OutOfBoundsTimedelta`` (:issue:`55503`)
+

 Timedelta
 ^^^^^^^^^
+- Bug in :class:`Timedelta` construction raising ``OverflowError`` instead of ``OutOfBoundsTimedelta`` (:issue:`55503`)
 - Bug in rendering (``__repr__``) of :class:`TimedeltaIndex` and :class:`Series` with timedelta64 values with non-nanosecond resolution entries that are all multiples of 24 hours failing to use the compact representation used in the nanosecond cases (:issue:`55405`)
--

 Timezones
 ^^^^^^^^^
@@ -352,7 +361,7 @@ I/O

 Period
 ^^^^^^
--
+- Bug in :class:`Period` addition silently wrapping around instead of raising ``OverflowError`` (:issue:`55503`)
 -

 Plotting
Plotting

pandas/_libs/tslibs/dtypes.pyx

+3-1
@@ -188,7 +188,7 @@ cdef dict _abbrev_to_attrnames = {v: k for k, v in attrname_to_abbrevs.items()}
 OFFSET_TO_PERIOD_FREQSTR: dict = {
     "WEEKDAY": "D",
     "EOM": "M",
-    "BM": "M",
+    "BME": "M",
     "BQS": "Q",
     "QS": "Q",
     "BQ": "Q",
@@ -280,6 +280,8 @@ DEPR_ABBREVS: dict[str, str]= {
     "BAS-SEP": "BYS-SEP",
     "BAS-OCT": "BYS-OCT",
     "BAS-NOV": "BYS-NOV",
+    "BM": "BME",
+    "CBM": "CBME",
     "H": "h",
     "BH": "bh",
     "CBH": "cbh",
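
The new `DEPR_ABBREVS` entries map each deprecated alias to its replacement. As a rough illustration of how such a table can drive a deprecation path, here is a minimal pure-Python sketch. It is an assumption-laden stand-in, not the pandas lookup code: the table is a small subset copied from the diff, and `resolve_alias` is a hypothetical helper name.

```python
import warnings

# Subset of the deprecated-alias pairs visible in the diff above.
DEPRECATED_ALIASES = {
    "BM": "BME",
    "CBM": "CBME",
    "H": "h",
    "BH": "bh",
    "CBH": "cbh",
}


def resolve_alias(alias: str) -> str:
    """Translate a deprecated alias, warning the caller (hypothetical helper)."""
    replacement = DEPRECATED_ALIASES.get(alias)
    if replacement is None:
        # Not deprecated: pass the alias through unchanged.
        return alias
    warnings.warn(
        f"'{alias}' is deprecated, please use '{replacement}' instead.",
        FutureWarning,
        stacklevel=2,
    )
    return replacement
```

Old spellings keep working for a deprecation period while callers are steered toward `'BME'` and `'CBME'`.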

pandas/_libs/tslibs/offsets.pyx

+11-6
@@ -961,7 +961,12 @@ cdef class Tick(SingleConstructorOffset):

     @property
     def delta(self):
-        return self.n * Timedelta(self._nanos_inc)
+        try:
+            return self.n * Timedelta(self._nanos_inc)
+        except OverflowError as err:
+            # GH#55503 as_unit will raise a more useful OutOfBoundsTimedelta
+            Timedelta(self).as_unit("ns")
+            raise AssertionError("This should not be reached.")

     @property
     def nanos(self) -> int64_t:
@@ -2930,7 +2935,7 @@ cdef class BusinessMonthEnd(MonthOffset):
     >>> pd.offsets.BMonthEnd().rollforward(ts)
     Timestamp('2022-11-30 00:00:00')
     """
-    _prefix = "BM"
+    _prefix = "BME"
     _day_opt = "business_end"


@@ -4460,10 +4465,10 @@ cdef class CustomBusinessMonthEnd(_CustomBusinessMonth):
     >>> freq = pd.offsets.CustomBusinessMonthEnd(calendar=bdc)
     >>> pd.date_range(dt.datetime(2022, 7, 10), dt.datetime(2022, 11, 10), freq=freq)
     DatetimeIndex(['2022-07-29', '2022-08-31', '2022-09-29', '2022-10-28'],
-                   dtype='datetime64[ns]', freq='CBM')
+                   dtype='datetime64[ns]', freq='CBME')
     """

-    _prefix = "CBM"
+    _prefix = "CBME"


 cdef class CustomBusinessMonthBegin(_CustomBusinessMonth):
@@ -4546,12 +4551,12 @@ prefix_mapping = {
         BYearEnd,  # 'BY'
         BusinessDay,  # 'B'
         BusinessMonthBegin,  # 'BMS'
-        BusinessMonthEnd,  # 'BM'
+        BusinessMonthEnd,  # 'BME'
         BQuarterEnd,  # 'BQ'
         BQuarterBegin,  # 'BQS'
         BusinessHour,  # 'bh'
         CustomBusinessDay,  # 'C'
-        CustomBusinessMonthEnd,  # 'CBM'
+        CustomBusinessMonthEnd,  # 'CBME'
         CustomBusinessMonthBegin,  # 'CBMS'
         CustomBusinessHour,  # 'cbh'
         MonthEnd,  # 'ME'

pandas/_libs/tslibs/period.pyx

+4-3
@@ -1814,7 +1814,7 @@ cdef class _Period(PeriodMixin):

     def _add_timedeltalike_scalar(self, other) -> "Period":
         cdef:
-            int64_t inc
+            int64_t inc, ordinal

         if not self._dtype._is_tick_like():
             raise IncompatibleFrequency("Input cannot be converted to "
@@ -1832,8 +1832,8 @@ cdef class _Period(PeriodMixin):
         except ValueError as err:
             raise IncompatibleFrequency("Input cannot be converted to "
                                         f"Period(freq={self.freqstr})") from err
-        # TODO: overflow-check here
-        ordinal = self.ordinal + inc
+        with cython.overflowcheck(True):
+            ordinal = self.ordinal + inc
         return Period(ordinal=ordinal, freq=self.freq)

     def _add_offset(self, other) -> "Period":
@@ -1846,6 +1846,7 @@ cdef class _Period(PeriodMixin):
         ordinal = self.ordinal + other.n
         return Period(ordinal=ordinal, freq=self.freq)

+    @cython.overflowcheck(True)
     def __add__(self, other):
         if not is_period_object(self):
             # cython semantics; this is analogous to a call to __radd__
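
The `cython.overflowcheck(True)` directive makes C integer arithmetic raise `OverflowError` rather than wrap around silently, which is the bug the `Period` whatsnew entry describes. The pure-Python sketch below emulates both behaviours for a signed 64-bit addition. The function names are hypothetical and the code only models the semantics; it is not how Cython implements the check.

```python
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1


def wrapping_add_int64(a: int, b: int) -> int:
    """Emulate unchecked two's-complement int64 addition (silent wrap-around)."""
    return (a + b + 2**63) % 2**64 - 2**63


def checked_add_int64(a: int, b: int) -> int:
    """Emulate overflow-checked int64 addition: raise instead of wrapping."""
    result = a + b  # Python ints are unbounded, so this never wraps
    if not INT64_MIN <= result <= INT64_MAX:
        raise OverflowError(f"{a} + {b} does not fit in a signed 64-bit integer")
    return result


# Unchecked: the largest ordinal plus one wraps to the smallest ordinal.
print(wrapping_add_int64(INT64_MAX, 1) == INT64_MIN)  # True

# Checked: the same addition raises.
try:
    checked_add_int64(INT64_MAX, 1)
except OverflowError as err:
    print("OverflowError:", err)
```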

pandas/_libs/tslibs/timedeltas.pxd

+1
@@ -4,6 +4,7 @@ from numpy cimport int64_t
 from .np_datetime cimport NPY_DATETIMEUNIT


+cpdef int64_t get_unit_for_round(freq, NPY_DATETIMEUNIT creso) except? -1
 # Exposed for tslib, not intended for outside use.
 cpdef int64_t delta_to_nanoseconds(
     delta, NPY_DATETIMEUNIT reso=*, bint round_ok=*

pandas/_libs/tslibs/timedeltas.pyi

+2
@@ -68,6 +68,8 @@ UnitChoices: TypeAlias = Literal[
 ]
 _S = TypeVar("_S", bound=timedelta)

+def get_unit_for_round(freq, creso: int) -> int: ...
+def disallow_ambiguous_unit(unit: str | None) -> None: ...
 def ints_to_pytimedelta(
     arr: npt.NDArray[np.timedelta64],
     box: bool = ...,

pandas/_libs/tslibs/timedeltas.pyx

+37-16
@@ -827,6 +827,14 @@ def _binary_op_method_timedeltalike(op, name):
 # ----------------------------------------------------------------------
 # Timedelta Construction

+cpdef disallow_ambiguous_unit(unit):
+    if unit in {"Y", "y", "M"}:
+        raise ValueError(
+            "Units 'M', 'Y', and 'y' are no longer supported, as they do not "
+            "represent unambiguous timedelta values durations."
+        )
+
+
 cdef int64_t parse_iso_format_string(str ts) except? -1:
     """
     Extracts and cleanses the appropriate values from a match object with
@@ -1784,7 +1792,7 @@ class Timedelta(_Timedelta):
                 )

             # GH43764, convert any input to nanoseconds first and then
-            # create the timestamp. This ensures that any potential
+            # create the timedelta. This ensures that any potential
             # nanosecond contributions from kwargs parsed as floats
             # are taken into consideration.
             seconds = int((
@@ -1797,17 +1805,25 @@ class Timedelta(_Timedelta):
                 ) * 1_000_000_000
             )

-            value = np.timedelta64(
-                int(kwargs.get("nanoseconds", 0))
-                + int(kwargs.get("microseconds", 0) * 1_000)
-                + int(kwargs.get("milliseconds", 0) * 1_000_000)
-                + seconds
-            )
-        if unit in {"Y", "y", "M"}:
-            raise ValueError(
-                "Units 'M', 'Y', and 'y' are no longer supported, as they do not "
-                "represent unambiguous timedelta values durations."
-            )
+            ns = kwargs.get("nanoseconds", 0)
+            us = kwargs.get("microseconds", 0)
+            ms = kwargs.get("milliseconds", 0)
+            try:
+                value = np.timedelta64(
+                    int(ns)
+                    + int(us * 1_000)
+                    + int(ms * 1_000_000)
+                    + seconds
+                )
+            except OverflowError as err:
+                # GH#55503
+                msg = (
+                    f"seconds={seconds}, milliseconds={ms}, "
+                    f"microseconds={us}, nanoseconds={ns}"
+                )
+                raise OutOfBoundsTimedelta(msg) from err
+
+        disallow_ambiguous_unit(unit)

         # GH 30543 if pd.Timedelta already passed, return it
         # check that only value is passed
@@ -1920,10 +1936,7 @@ class Timedelta(_Timedelta):
             int64_t result, unit
             ndarray[int64_t] arr

-        from pandas._libs.tslibs.offsets import to_offset
-
-        to_offset(freq).nanos  # raises on non-fixed freq
-        unit = delta_to_nanoseconds(to_offset(freq), self._creso)
+        unit = get_unit_for_round(freq, self._creso)

         arr = np.array([self._value], dtype="i8")
         try:
@@ -2280,3 +2293,11 @@ cdef bint _should_cast_to_timedelta(object obj):
     return (
         is_any_td_scalar(obj) or obj is None or obj is NaT or isinstance(obj, str)
     )
+
+
+cpdef int64_t get_unit_for_round(freq, NPY_DATETIMEUNIT creso) except? -1:
+    from pandas._libs.tslibs.offsets import to_offset
+
+    freq = to_offset(freq)
+    freq.nanos  # raises on non-fixed freq
+    return delta_to_nanoseconds(freq, creso)
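
`get_unit_for_round` consolidates logic that `Timedelta`, `Timestamp` and the datetime-like arrays each repeated: turn a fixed frequency into an integer count of the object's resolution units. A rough pure-Python sketch of that conversion follows. It assumes the frequency is already given as a nanosecond count and that conversion to a coarser resolution truncates; `unit_for_round` and `NANOS_PER_UNIT` are hypothetical names, not pandas API.

```python
# Nanoseconds per resolution unit, for the resolutions used in this sketch.
NANOS_PER_UNIT = {"s": 1_000_000_000, "ms": 1_000_000, "us": 1_000, "ns": 1}


def unit_for_round(freq_nanos: int, reso: str) -> int:
    """Express a fixed frequency (in nanoseconds) in units of ``reso``.

    Hypothetical stand-in: truncation is assumed, so a frequency finer
    than the resolution yields 0.
    """
    if reso not in NANOS_PER_UNIT:
        raise ValueError(f"unsupported resolution: {reso!r}")
    return freq_nanos // NANOS_PER_UNIT[reso]


print(unit_for_round(60 * 1_000_000_000, "s"))  # 60: one minute in seconds
print(unit_for_round(1_000_000, "s"))           # 0: 1 ms is finer than 1 s
```

A zero result corresponds to the `nanos == 0` branches in the callers, where a sub-resolution frequency leaves the value unchanged.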

pandas/_libs/tslibs/timestamps.pyx

+2-5
@@ -107,7 +107,7 @@ from pandas._libs.tslibs.np_datetime import (
 from pandas._libs.tslibs.offsets cimport to_offset
 from pandas._libs.tslibs.timedeltas cimport (
     _Timedelta,
-    delta_to_nanoseconds,
+    get_unit_for_round,
     is_any_td_scalar,
 )

@@ -1896,17 +1896,14 @@ class Timestamp(_Timestamp):
             int64_t nanos

         freq = to_offset(freq, is_period=False)
-        freq.nanos  # raises on non-fixed freq
-        nanos = delta_to_nanoseconds(freq, self._creso)
+        nanos = get_unit_for_round(freq, self._creso)
         if nanos == 0:
             if freq.nanos == 0:
                 raise ValueError("Division by zero in rounding")

             # e.g. self.unit == "s" and sub-second freq
             return self

-        # TODO: problem if nanos==0
-
         if self.tz is not None:
             value = self.tz_localize(None)._value
         else:

pandas/_libs/tslibs/tzconversion.pyx

+6-2
Original file line numberDiff line numberDiff line change
@@ -425,7 +425,11 @@ timedelta-like}
     return result.base  # .base to get underlying ndarray


-cdef Py_ssize_t bisect_right_i8(int64_t *data, int64_t val, Py_ssize_t n):
+cdef Py_ssize_t bisect_right_i8(
+    const int64_t *data,
+    int64_t val,
+    Py_ssize_t n
+) noexcept:
     # Caller is responsible for checking n > 0
     # This looks very similar to local_search_right in the ndarray.searchsorted
     # implementation.
@@ -463,7 +467,7 @@ cdef str _render_tstamp(int64_t val, NPY_DATETIMEUNIT creso):

 cdef _get_utc_bounds(
     ndarray[int64_t] vals,
-    int64_t* tdata,
+    const int64_t* tdata,
     Py_ssize_t ntrans,
     const int64_t[::1] deltas,
     NPY_DATETIMEUNIT creso,

pandas/core/arrays/datetimelike.py

+2-4
@@ -36,7 +36,6 @@
     Timedelta,
     Timestamp,
     astype_overflowsafe,
-    delta_to_nanoseconds,
     get_unit_from_dtype,
     iNaT,
     ints_to_pydatetime,
@@ -49,6 +48,7 @@
     round_nsint64,
 )
 from pandas._libs.tslibs.np_datetime import compare_mismatched_resolutions
+from pandas._libs.tslibs.timedeltas import get_unit_for_round
 from pandas._libs.tslibs.timestamps import integer_op_not_supported
 from pandas._typing import (
     ArrayLike,
@@ -2129,9 +2129,7 @@ def _round(self, freq, mode, ambiguous, nonexistent):

         values = self.view("i8")
         values = cast(np.ndarray, values)
-        offset = to_offset(freq)
-        offset.nanos  # raises on non-fixed frequencies
-        nanos = delta_to_nanoseconds(offset, self._creso)
+        nanos = get_unit_for_round(freq, self._creso)
         if nanos == 0:
             # GH 52761
             return self.copy()
