Commit 2683ab5

Merge remote-tracking branch 'upstream/main' into bisect
2 parents a4a2f7a + f74a186 commit 2683ab5

36 files changed, +572 -231 lines changed

.github/workflows/macos-windows.yml

+1 -1
@@ -23,7 +23,7 @@ jobs:
     defaults:
       run:
         shell: bash -el {0}
-    timeout-minutes: 90
+    timeout-minutes: 120
     strategy:
       matrix:
         os: [macos-latest, windows-latest]

doc/redirects.csv

+1
@@ -761,6 +761,7 @@ generated/pandas.IntervalIndex.mid,../reference/api/pandas.IntervalIndex.mid
 generated/pandas.IntervalIndex.overlaps,../reference/api/pandas.IntervalIndex.overlaps
 generated/pandas.IntervalIndex.right,../reference/api/pandas.IntervalIndex.right
 generated/pandas.IntervalIndex.set_closed,../reference/api/pandas.IntervalIndex.set_closed
+generated/pandas.IntervalIndex.set_inclusive,../reference/api/pandas.IntervalIndex.set_inclusive
 generated/pandas.IntervalIndex.to_tuples,../reference/api/pandas.IntervalIndex.to_tuples
 generated/pandas.IntervalIndex.values,../reference/api/pandas.IntervalIndex.values
 generated/pandas.Interval.left,../reference/api/pandas.Interval.left

doc/source/reference/arrays.rst

+1
@@ -352,6 +352,7 @@ A collection of intervals may be stored in an :class:`arrays.IntervalArray`.
    arrays.IntervalArray.contains
    arrays.IntervalArray.overlaps
    arrays.IntervalArray.set_closed
+   arrays.IntervalArray.set_inclusive
    arrays.IntervalArray.to_tuples

doc/source/reference/indexing.rst

+1
@@ -251,6 +251,7 @@ IntervalIndex components
    IntervalIndex.get_loc
    IntervalIndex.get_indexer
    IntervalIndex.set_closed
+   IntervalIndex.set_inclusive
    IntervalIndex.contains
    IntervalIndex.overlaps
    IntervalIndex.to_tuples

doc/source/user_guide/groupby.rst

+3 -3
@@ -839,10 +839,10 @@ Alternatively, the built-in methods could be used to produce the same outputs.

 .. ipython:: python

-   max = ts.groupby(lambda x: x.year).transform("max")
-   min = ts.groupby(lambda x: x.year).transform("min")
+   max_ts = ts.groupby(lambda x: x.year).transform("max")
+   min_ts = ts.groupby(lambda x: x.year).transform("min")

-   max - min
+   max_ts - min_ts

 Another common data transform is to replace missing data with the group mean.
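
The rename in this hunk (``max``/``min`` to ``max_ts``/``min_ts``) plausibly exists to stop the documentation example from rebinding Python's built-in functions. The commit itself does not state the motivation, so treat that as an assumption. The following minimal sketch uses plain lists instead of pandas objects, and the function names are hypothetical. It shows what goes wrong when a built-in name is reused for a value:

```python
def spread_shadowed(data):
    # Rebinding ``max`` makes it a local variable holding a number,
    # so the later call ``max(data)`` no longer reaches the built-in.
    max = sorted(data)[-1]
    try:
        return max(data) - min(data)
    except TypeError as exc:
        return f"TypeError: {exc}"


def spread_renamed(data):
    # Distinct names (mirroring max_ts / min_ts) leave the built-ins intact.
    max_value = max(data)
    min_value = min(data)
    return max_value - min_value


values = [3, 1, 4]
shadowed_result = spread_shadowed(values)
renamed_result = spread_renamed(values)
```

In a long-running documentation session the rebinding would persist into every later code block, which is why a distinct variable name is the safer choice.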

doc/source/whatsnew/v1.5.0.rst

+4 -3
@@ -277,7 +277,7 @@ Other enhancements
 - Allow reading compressed SAS files with :func:`read_sas` (e.g., ``.sas7bdat.gz`` files)
 - :meth:`DatetimeIndex.astype` now supports casting timezone-naive indexes to ``datetime64[s]``, ``datetime64[ms]``, and ``datetime64[us]``, and timezone-aware indexes to the corresponding ``datetime64[unit, tzname]`` dtypes (:issue:`47579`)
 - :class:`Series` reducers (e.g. ``min``, ``max``, ``sum``, ``mean``) will now successfully operate when the dtype is numeric and ``numeric_only=True`` is provided; previously this would raise a ``NotImplementedError`` (:issue:`47500`)
--
+- :meth:`RangeIndex.union` now can return a :class:`RangeIndex` instead of a :class:`Int64Index` if the resulting values are equally spaced (:issue:`47557`, :issue:`43885`)

 .. ---------------------------------------------------------------------------
 .. _whatsnew_150.notable_bug_fixes:
@@ -762,7 +762,7 @@ Other Deprecations
 - Deprecated the ``closed`` argument in :class:`IntervalIndex` in favor of ``inclusive`` argument; In a future version passing ``closed`` will raise (:issue:`40245`)
 - Deprecated the ``closed`` argument in :class:`IntervalDtype` in favor of ``inclusive`` argument; In a future version passing ``closed`` will raise (:issue:`40245`)
 - Deprecated the ``closed`` argument in :class:`.IntervalArray` in favor of ``inclusive`` argument; In a future version passing ``closed`` will raise (:issue:`40245`)
-- Deprecated the ``closed`` argument in :class:`IntervalTree` in favor of ``inclusive`` argument; In a future version passing ``closed`` will raise (:issue:`40245`)
+- Deprecated :meth:`.IntervalArray.set_closed` and :meth:`.IntervalIndex.set_closed` in favor of ``set_inclusive``; In a future version ``set_closed`` will get removed (:issue:`40245`)
 - Deprecated the ``closed`` argument in :class:`ArrowInterval` in favor of ``inclusive`` argument; In a future version passing ``closed`` will raise (:issue:`40245`)
 - Deprecated allowing ``unit="M"`` or ``unit="Y"`` in :class:`Timestamp` constructor with a non-round float value (:issue:`47267`)
 - Deprecated the ``display.column_space`` global configuration option (:issue:`7576`)
@@ -885,6 +885,7 @@ Indexing
 - Bug when setting a value too large for a :class:`Series` dtype failing to coerce to a common type (:issue:`26049`, :issue:`32878`)
 - Bug in :meth:`loc.__setitem__` treating ``range`` keys as positional instead of label-based (:issue:`45479`)
 - Bug in :meth:`Series.__setitem__` when setting ``boolean`` dtype values containing ``NA`` incorrectly raising instead of casting to ``boolean`` dtype (:issue:`45462`)
+- Bug in :meth:`Series.loc` raising with boolean indexer containing ``NA`` when :class:`Index` did not match (:issue:`46551`)
 - Bug in :meth:`Series.__setitem__` where setting :attr:`NA` into a numeric-dtype :class:`Series` would incorrectly upcast to object-dtype rather than treating the value as ``np.nan`` (:issue:`44199`)
 - Bug in :meth:`DataFrame.loc` when setting values to a column and right hand side is a dictionary (:issue:`47216`)
 - Bug in :meth:`DataFrame.loc` when setting a :class:`DataFrame` not aligning index in some cases (:issue:`47578`)
@@ -1008,7 +1009,7 @@ Reshaping
 - Bug in :func:`concat` with identical key leads to error when indexing :class:`MultiIndex` (:issue:`46519`)
 - Bug in :meth:`DataFrame.join` with a list when using suffixes to join DataFrames with duplicate column names (:issue:`46396`)
 - Bug in :meth:`DataFrame.pivot_table` with ``sort=False`` results in sorted index (:issue:`17041`)
--
+- Bug in :meth:`concat` when ``axis=1`` and ``sort=False`` where the resulting Index was a :class:`Int64Index` instead of a :class:`RangeIndex` (:issue:`46675`)

 Sparse
 ^^^^^^

pandas/_config/localization.py

+2 -1
@@ -39,7 +39,8 @@ def set_locale(
     particular locale, without globally setting the locale. This probably isn't
     thread-safe.
     """
-    current_locale = locale.getlocale()
+    # getlocale is not always compliant with setlocale, use setlocale. GH#46595
+    current_locale = locale.setlocale(lc_var)

     try:
         locale.setlocale(lc_var, new_locale)
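
The hunk above switches the "remember the current locale" step from ``locale.getlocale()`` to the query form ``locale.setlocale(lc_var)``, which returns the current setting as a string that can be passed straight back to ``setlocale``. The sketch below illustrates that save-and-restore pattern with the standard library only. ``temporary_locale`` is a hypothetical helper written for this illustration, not the pandas ``set_locale`` implementation, and it assumes the ``"C"`` locale is available (it is on standard platforms).

```python
import locale
from contextlib import contextmanager


@contextmanager
def temporary_locale(new_locale, lc_var=locale.LC_ALL):
    # Calling setlocale with only the category queries the current
    # setting without changing it; the returned string round-trips.
    saved = locale.setlocale(lc_var)
    try:
        locale.setlocale(lc_var, new_locale)
        # Yield whatever the C library reports as now active.
        yield locale.setlocale(lc_var)
    finally:
        # Restore exactly the string that was reported earlier.
        locale.setlocale(lc_var, saved)


before = locale.setlocale(locale.LC_ALL)
with temporary_locale("C") as active:
    inside = active
after = locale.setlocale(locale.LC_ALL)
```

As in the docstring shown in the hunk, this pattern changes process-wide state for the duration of the block and is probably not thread-safe.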

pandas/_libs/interval.pyi

+8 -8
@@ -12,7 +12,7 @@ import numpy.typing as npt

 from pandas._libs import lib
 from pandas._typing import (
-    IntervalClosedType,
+    IntervalInclusiveType,
     Timedelta,
     Timestamp,
 )
@@ -56,25 +56,25 @@ class IntervalMixin:

 def _warning_interval(
     inclusive, closed
-) -> tuple[IntervalClosedType, lib.NoDefault]: ...
+) -> tuple[IntervalInclusiveType, lib.NoDefault]: ...

 class Interval(IntervalMixin, Generic[_OrderableT]):
     @property
     def left(self: Interval[_OrderableT]) -> _OrderableT: ...
     @property
     def right(self: Interval[_OrderableT]) -> _OrderableT: ...
     @property
-    def inclusive(self) -> IntervalClosedType: ...
+    def inclusive(self) -> IntervalInclusiveType: ...
     @property
-    def closed(self) -> IntervalClosedType: ...
+    def closed(self) -> IntervalInclusiveType: ...
     mid: _MidDescriptor
     length: _LengthDescriptor
     def __init__(
         self,
         left: _OrderableT,
         right: _OrderableT,
-        inclusive: IntervalClosedType = ...,
-        closed: IntervalClosedType = ...,
+        inclusive: IntervalInclusiveType = ...,
+        closed: IntervalInclusiveType = ...,
     ) -> None: ...
     def __hash__(self) -> int: ...
     @overload
@@ -151,14 +151,14 @@ class Interval(IntervalMixin, Generic[_OrderableT]):

 def intervals_to_interval_bounds(
     intervals: np.ndarray, validate_closed: bool = ...
-) -> tuple[np.ndarray, np.ndarray, str]: ...
+) -> tuple[np.ndarray, np.ndarray, IntervalInclusiveType]: ...

 class IntervalTree(IntervalMixin):
     def __init__(
         self,
         left: np.ndarray,
         right: np.ndarray,
-        inclusive: IntervalClosedType = ...,
+        inclusive: IntervalInclusiveType = ...,
         leaf_size: int = ...,
     ) -> None: ...
     @property

pandas/_libs/intervaltree.pxi.in

+3 -12
@@ -8,8 +8,6 @@ import warnings
 from pandas._libs import lib
 from pandas._libs.algos import is_monotonic

-from pandas._libs.interval import _warning_interval
-
 ctypedef fused int_scalar_t:
     int64_t
     float64_t
@@ -42,18 +40,13 @@ cdef class IntervalTree(IntervalMixin):
         object _is_overlapping, _left_sorter, _right_sorter
         Py_ssize_t _na_count

-    def __init__(self, left, right, inclusive: str | None = None, closed: None | lib.NoDefault = lib.no_default, leaf_size=100):
+    def __init__(self, left, right, inclusive: str | None = None, leaf_size=100):
         """
         Parameters
         ----------
         left, right : np.ndarray[ndim=1]
             Left and right bounds for each interval. Assumed to contain no
             NaNs.
-        closed : {'left', 'right', 'both', 'neither'}, optional
-            Whether the intervals are closed on the left-side, right-side, both
-            or neither. Defaults to 'right'.
-
-            .. deprecated:: 1.5.0

         inclusive : {"both", "neither", "left", "right"}, optional
             Whether the intervals are closed on the left-side, right-side, both
@@ -66,8 +59,6 @@ cdef class IntervalTree(IntervalMixin):
             to brute-force search. Tune this parameter to optimize query
             performance.
         """
-        inclusive, closed = _warning_interval(inclusive, closed)
-
         if inclusive is None:
             inclusive = "right"

@@ -119,7 +110,7 @@ cdef class IntervalTree(IntervalMixin):
         if self._is_overlapping is not None:
             return self._is_overlapping

-        # <= when both sides closed since endpoints can overlap
+        # <= when inclusive on both sides since endpoints can overlap
         op = le if self.inclusive == 'both' else lt

         # overlap if start of current interval < end of previous interval
@@ -263,7 +254,7 @@ cdef class IntervalNode:


 # we need specialized nodes and leaves to optimize for different dtype and
-# closed values
+# inclusive values

 {{py:

pandas/_libs/tslibs/timedeltas.pyx

+64 -8
@@ -950,14 +950,18 @@ cdef _timedelta_from_value_and_reso(int64_t value, NPY_DATETIMEUNIT reso):
     cdef:
         _Timedelta td_base

+    # For millisecond and second resos, we cannot actually pass int(value) because
+    # many cases would fall outside of the pytimedelta implementation bounds.
+    # We pass 0 instead, and override seconds, microseconds, days.
+    # In principle we could pass 0 for ns and us too.
     if reso == NPY_FR_ns:
         td_base = _Timedelta.__new__(Timedelta, microseconds=int(value) // 1000)
     elif reso == NPY_DATETIMEUNIT.NPY_FR_us:
         td_base = _Timedelta.__new__(Timedelta, microseconds=int(value))
     elif reso == NPY_DATETIMEUNIT.NPY_FR_ms:
-        td_base = _Timedelta.__new__(Timedelta, milliseconds=int(value))
+        td_base = _Timedelta.__new__(Timedelta, milliseconds=0)
     elif reso == NPY_DATETIMEUNIT.NPY_FR_s:
-        td_base = _Timedelta.__new__(Timedelta, seconds=int(value))
+        td_base = _Timedelta.__new__(Timedelta, seconds=0)
     # Other resolutions are disabled but could potentially be implemented here:
     # elif reso == NPY_DATETIMEUNIT.NPY_FR_m:
     # td_base = _Timedelta.__new__(Timedelta, minutes=int(value))
@@ -977,6 +981,34 @@ cdef _timedelta_from_value_and_reso(int64_t value, NPY_DATETIMEUNIT reso):
     return td_base


+class MinMaxReso:
+    """
+    We need to define min/max/resolution on both the Timedelta _instance_
+    and Timedelta class. On an instance, these depend on the object's _reso.
+    On the class, we default to the values we would get with nanosecond _reso.
+    """
+    def __init__(self, name):
+        self._name = name
+
+    def __get__(self, obj, type=None):
+        if self._name == "min":
+            val = np.iinfo(np.int64).min + 1
+        elif self._name == "max":
+            val = np.iinfo(np.int64).max
+        else:
+            assert self._name == "resolution"
+            val = 1
+
+        if obj is None:
+            # i.e. this is on the class, default to nanos
+            return Timedelta(val)
+        else:
+            return Timedelta._from_value_and_reso(val, obj._reso)
+
+    def __set__(self, obj, value):
+        raise AttributeError(f"{self._name} is not settable.")
+
+
 # Similar to Timestamp/datetime, this is a construction requirement for
 # timedeltas that we need to do object instantiation in python. This will
 # serve as a C extension type that shadows the Python class, where we do any
@@ -990,6 +1022,36 @@ cdef class _Timedelta(timedelta):

     # higher than np.ndarray and np.matrix
     __array_priority__ = 100
+    min = MinMaxReso("min")
+    max = MinMaxReso("max")
+    resolution = MinMaxReso("resolution")
+
+    @property
+    def days(self) -> int: # TODO(cython3): make cdef property
+        # NB: using the python C-API PyDateTime_DELTA_GET_DAYS will fail
+        # (or be incorrect)
+        self._ensure_components()
+        return self._d
+
+    @property
+    def seconds(self) -> int: # TODO(cython3): make cdef property
+        # NB: using the python C-API PyDateTime_DELTA_GET_SECONDS will fail
+        # (or be incorrect)
+        self._ensure_components()
+        return self._h * 3600 + self._m * 60 + self._s
+
+    @property
+    def microseconds(self) -> int: # TODO(cython3): make cdef property
+        # NB: using the python C-API PyDateTime_DELTA_GET_MICROSECONDS will fail
+        # (or be incorrect)
+        self._ensure_components()
+        return self._ms * 1000 + self._us
+
+    def total_seconds(self) -> float:
+        """Total seconds in the duration."""
+        # We need to override bc we overrided days/seconds/microseconds
+        # TODO: add nanos/1e9?
+        return self.days * 24 * 3600 + self.seconds + self.microseconds / 1_000_000

     @property
     def freq(self) -> None:
@@ -1979,9 +2041,3 @@ cdef _broadcast_floordiv_td64(
         res = res.astype('f8')
         res[mask] = np.nan
     return res
-
-
-# resolution in ns
-Timedelta.min = Timedelta(np.iinfo(np.int64).min + 1)
-Timedelta.max = Timedelta(np.iinfo(np.int64).max)
-Timedelta.resolution = Timedelta(nanoseconds=1)
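
The ``MinMaxReso`` class added in this file relies on the descriptor protocol: ``__get__`` receives ``obj=None`` when the attribute is read from the class and the instance otherwise, so one attribute can give a class-level default and a per-instance value. The sketch below shows that mechanism in isolation. ``UnitAwareLimit`` and ``Duration`` are hypothetical names for this illustration and the returned strings stand in for real ``Timedelta`` objects.

```python
class UnitAwareLimit:
    """Hypothetical descriptor mirroring the shape of MinMaxReso."""

    def __init__(self, name):
        self._name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            # Accessed on the class: fall back to the nanosecond default.
            return f"{self._name}@ns"
        # Accessed on an instance: depend on that instance's unit.
        return f"{self._name}@{obj.unit}"

    def __set__(self, obj, value):
        # Defining __set__ makes this a data descriptor, so assignment
        # through an instance is rejected instead of shadowing it.
        raise AttributeError(f"{self._name} is not settable.")


class Duration:
    max = UnitAwareLimit("max")

    def __init__(self, unit):
        self.unit = unit


class_level = Duration.max
instance_level = Duration("ms").max

try:
    Duration("s").max = "anything"
    assignment_error = None
except AttributeError as exc:
    assignment_error = str(exc)
```

Note that ``__set__`` only guards assignment through an instance; rebinding the attribute on the class itself would replace the descriptor.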

pandas/_typing.py

+1 -1
@@ -314,7 +314,7 @@ def closed(self) -> bool:

 # Interval closed type
 IntervalLeftRight = Literal["left", "right"]
-IntervalClosedType = Union[IntervalLeftRight, Literal["both", "neither"]]
+IntervalInclusiveType = Union[IntervalLeftRight, Literal["both", "neither"]]

 # datetime and NaTType
 DatetimeNaTType = Union[datetime, "NaTType"]
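
The renamed alias is a ``Union`` of two ``Literal`` types, which static type checkers use to restrict the argument to four strings. As a rough illustration of what the alias encodes, the sketch below flattens it at runtime with ``typing.get_args``. The two aliases are copied from the hunk; ``allowed_values`` and ``validate_inclusive`` are hypothetical helpers and are not part of pandas.

```python
from typing import Literal, Union, get_args

IntervalLeftRight = Literal["left", "right"]
IntervalInclusiveType = Union[IntervalLeftRight, Literal["both", "neither"]]


def allowed_values(tp):
    # get_args on the Union yields the nested Literal types, and on a
    # Literal it yields the strings, so recurse until strings remain.
    values = []
    for arg in get_args(tp):
        if isinstance(arg, str):
            values.append(arg)
        else:
            values.extend(allowed_values(arg))
    return tuple(values)


def validate_inclusive(value):
    # Runtime counterpart of the static annotation.
    if value not in allowed_values(IntervalInclusiveType):
        raise ValueError(f"invalid inclusive value: {value!r}")
    return value


options = allowed_values(IntervalInclusiveType)
```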

pandas/core/arrays/arrow/_arrow_utils.py

+5 -4
@@ -6,6 +6,7 @@
 import numpy as np
 import pyarrow

+from pandas._typing import IntervalInclusiveType
 from pandas.errors import PerformanceWarning
 from pandas.util._decorators import deprecate_kwarg
 from pandas.util._exceptions import find_stack_level
@@ -107,11 +108,11 @@ def to_pandas_dtype(self):

 class ArrowIntervalType(pyarrow.ExtensionType):
     @deprecate_kwarg(old_arg_name="closed", new_arg_name="inclusive")
-    def __init__(self, subtype, inclusive: str) -> None:
+    def __init__(self, subtype, inclusive: IntervalInclusiveType) -> None:
         # attributes need to be set first before calling
         # super init (as that calls serialize)
         assert inclusive in VALID_CLOSED
-        self._closed = inclusive
+        self._closed: IntervalInclusiveType = inclusive
         if not isinstance(subtype, pyarrow.DataType):
             subtype = pyarrow.type_for_alias(str(subtype))
         self._subtype = subtype
@@ -124,11 +125,11 @@ def subtype(self):
         return self._subtype

     @property
-    def inclusive(self) -> str:
+    def inclusive(self) -> IntervalInclusiveType:
         return self._closed

     @property
-    def closed(self):
+    def closed(self) -> IntervalInclusiveType:
         warnings.warn(
             "Attribute `closed` is deprecated in favor of `inclusive`.",
             FutureWarning,
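
The displayed hunk ends partway through the ``warnings.warn`` call in the deprecated ``closed`` property. The general pattern it follows, a deprecated alias that warns and then returns the value of its replacement, can be sketched with the standard library alone. ``IntervalTypeSketch`` is a hypothetical stand-in class, and the fixed ``stacklevel=2`` is an assumption; the pandas module imports ``find_stack_level``, whose use in the cut-off part of the call is not visible here.

```python
import warnings


class IntervalTypeSketch:
    """Hypothetical stand-in; not the pyarrow extension type."""

    def __init__(self, inclusive):
        self._inclusive = inclusive

    @property
    def inclusive(self):
        return self._inclusive

    @property
    def closed(self):
        # Deprecated alias: warn, then return the same value.
        warnings.warn(
            "Attribute `closed` is deprecated in favor of `inclusive`.",
            FutureWarning,
            stacklevel=2,  # assumption: point the warning at the caller
        )
        return self._inclusive


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy_value = IntervalTypeSketch("right").closed
```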
