
Commit 9dfa18d

Merge branch 'main' into dev/depr/literal-str-read_xml
2 parents: 526c224 + fbf647d


54 files changed: +1299, -767 lines

ci/code_checks.sh

Lines changed: 0 additions & 11 deletions
@@ -105,17 +105,6 @@ if [[ -z "$CHECK" || "$CHECK" == "docstrings" ]]; then
         pandas.errors.UnsupportedFunctionCall \
         pandas.test \
         pandas.NaT \
-        pandas.SparseDtype \
-        pandas.DatetimeTZDtype.unit \
-        pandas.DatetimeTZDtype.tz \
-        pandas.PeriodDtype.freq \
-        pandas.IntervalDtype.subtype \
-        pandas_dtype \
-        pandas.api.types.is_bool \
-        pandas.api.types.is_complex \
-        pandas.api.types.is_float \
-        pandas.api.types.is_integer \
-        pandas.api.types.pandas_dtype \
         pandas.read_clipboard \
         pandas.ExcelFile \
         pandas.ExcelFile.parse \

doc/source/whatsnew/v2.0.3.rst

Lines changed: 1 addition & 0 deletions
@@ -14,6 +14,7 @@ including other versions of pandas.
 Fixed regressions
 ~~~~~~~~~~~~~~~~~
 - Fixed performance regression in merging on datetime-like columns (:issue:`53231`)
+- Fixed regression when :meth:`DataFrame.to_string` creates extra space for string dtypes (:issue:`52690`)
 - For external ExtensionArray implementations, restored the default use of ``_values_for_factorize`` for hashing arrays (:issue:`53475`)
 -

doc/source/whatsnew/v2.1.0.rst

Lines changed: 6 additions & 1 deletion
@@ -366,12 +366,14 @@ Datetimelike
 - :meth:`DatetimeIndex.map` with ``na_action="ignore"`` now works as expected. (:issue:`51644`)
 - Bug in :class:`DateOffset` which had inconsistent behavior when multiplying a :class:`DateOffset` object by a constant (:issue:`47953`)
 - Bug in :func:`date_range` when ``freq`` was a :class:`DateOffset` with ``nanoseconds`` (:issue:`46877`)
+- Bug in :meth:`DataFrame.to_sql` raising ``ValueError`` for pyarrow-backed date like dtypes (:issue:`53854`)
 - Bug in :meth:`Timestamp.date`, :meth:`Timestamp.isocalendar`, :meth:`Timestamp.timetuple`, and :meth:`Timestamp.toordinal` were returning incorrect results for inputs outside those supported by the Python standard library's datetime module (:issue:`53668`)
 - Bug in :meth:`Timestamp.round` with values close to the implementation bounds returning incorrect results instead of raising ``OutOfBoundsDatetime`` (:issue:`51494`)
 - Bug in :meth:`arrays.DatetimeArray.map` and :meth:`DatetimeIndex.map`, where the supplied callable operated array-wise instead of element-wise (:issue:`51977`)
 - Bug in constructing a :class:`Series` or :class:`DataFrame` from a datetime or timedelta scalar always inferring nanosecond resolution instead of inferring from the input (:issue:`52212`)
 - Bug in parsing datetime strings with weekday but no day e.g. "2023 Sept Thu" incorrectly raising ``AttributeError`` instead of ``ValueError`` (:issue:`52659`)
 
+
 Timedelta
 ^^^^^^^^^
 - :meth:`TimedeltaIndex.map` with ``na_action="ignore"`` now works as expected (:issue:`51644`)

@@ -495,6 +497,7 @@ Reshaping
 - Bug in :func:`crosstab` when ``dropna=False`` would not keep ``np.nan`` in the result (:issue:`10772`)
 - Bug in :func:`merge_asof` raising ``KeyError`` for extension dtypes (:issue:`52904`)
 - Bug in :func:`merge_asof` raising ``ValueError`` for data backed by read-only ndarrays (:issue:`53513`)
+- Bug in :func:`merge_asof` with ``left_index=True`` or ``right_index=True`` with mismatched index dtypes giving incorrect results in some cases instead of raising ``MergeError`` (:issue:`53870`)
 - Bug in :meth:`DataFrame.agg` and :meth:`Series.agg` on non-unique columns would return incorrect type when dist-like argument passed in (:issue:`51099`)
 - Bug in :meth:`DataFrame.combine_first` ignoring other's columns if ``other`` is empty (:issue:`53792`)
 - Bug in :meth:`DataFrame.idxmin` and :meth:`DataFrame.idxmax`, where the axis dtype would be lost for empty frames (:issue:`53265`)

@@ -503,6 +506,7 @@ Reshaping
 - Bug in :meth:`DataFrame.stack` sorting columns lexicographically (:issue:`53786`)
 - Bug in :meth:`DataFrame.transpose` inferring dtype for object column (:issue:`51546`)
 - Bug in :meth:`Series.combine_first` converting ``int64`` dtype to ``float64`` and losing precision on very large integers (:issue:`51764`)
+-
 
 Sparse
 ^^^^^^

@@ -537,11 +541,12 @@ Other
 - Bug in :func:`assert_almost_equal` now throwing assertion error for two unequal sets (:issue:`51727`)
 - Bug in :func:`assert_frame_equal` checks category dtypes even when asked not to check index type (:issue:`52126`)
 - Bug in :meth:`DataFrame.reindex` with a ``fill_value`` that should be inferred with a :class:`ExtensionDtype` incorrectly inferring ``object`` dtype (:issue:`52586`)
+- Bug in :meth:`DataFrame.shift` and :meth:`Series.shift` when passing both "freq" and "fill_value" silently ignoring "fill_value" instead of raising ``ValueError`` (:issue:`53832`)
+- Bug in :meth:`DataFrame.shift` with ``axis=1`` on a :class:`DataFrame` with a single :class:`ExtensionDtype` column giving incorrect results (:issue:`53832`)
 - Bug in :meth:`Series.align`, :meth:`DataFrame.align`, :meth:`Series.reindex`, :meth:`DataFrame.reindex`, :meth:`Series.interpolate`, :meth:`DataFrame.interpolate`, incorrectly failing to raise with method="asfreq" (:issue:`53620`)
 - Bug in :meth:`Series.map` when giving a callable to an empty series, the returned series had ``object`` dtype. It now keeps the original dtype (:issue:`52384`)
 - Bug in :meth:`Series.memory_usage` when ``deep=True`` throw an error with Series of objects and the returned value is incorrect, as it does not take into account GC corrections (:issue:`51858`)
 - Fixed incorrect ``__name__`` attribute of ``pandas._libs.json`` (:issue:`52898`)
--
 
 .. ***DO NOT USE THIS SECTION***
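
To make one of the entries above concrete, the :meth:`Series.map` fix listed under "Other" means that mapping a callable over an empty Series now preserves the original dtype instead of returning ``object``. A minimal doctest-style illustration, assuming pandas 2.1 or later:

>>> import pandas as pd
>>> pd.Series([], dtype="int64").map(lambda x: x)  # dtype is kept, no longer object
Series([], dtype: int64)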

pandas/_libs/lib.pyx

Lines changed: 32 additions & 0 deletions
@@ -1056,6 +1056,14 @@ def is_float(obj: object) -> bool:
     Returns
     -------
     bool
+
+    Examples
+    --------
+    >>> pd.api.types.is_float(1.0)
+    True
+
+    >>> pd.api.types.is_float(1)
+    False
     """
     return util.is_float_object(obj)
 

@@ -1067,6 +1075,14 @@ def is_integer(obj: object) -> bool:
     Returns
     -------
     bool
+
+    Examples
+    --------
+    >>> pd.api.types.is_integer(1)
+    True
+
+    >>> pd.api.types.is_integer(1.0)
+    False
     """
     return util.is_integer_object(obj)
 

@@ -1089,6 +1105,14 @@ def is_bool(obj: object) -> bool:
     Returns
     -------
     bool
+
+    Examples
+    --------
+    >>> pd.api.types.is_bool(True)
+    True
+
+    >>> pd.api.types.is_bool(1)
+    False
     """
     return util.is_bool_object(obj)
 

@@ -1100,6 +1124,14 @@ def is_complex(obj: object) -> bool:
     Returns
     -------
     bool
+
+    Examples
+    --------
+    >>> pd.api.types.is_complex(1 + 1j)
+    True
+
+    >>> pd.api.types.is_complex(1)
+    False
     """
     return util.is_complex_object(obj)
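
The added doctests use plain Python scalars; these predicates also return ``True`` for the corresponding NumPy scalar types, which is worth keeping in mind when reading them. A short illustrative sketch (not part of the diff):

>>> import numpy as np
>>> import pandas as pd
>>> pd.api.types.is_float(np.float64(1.5))
True
>>> pd.api.types.is_integer(np.int32(3))
True
>>> pd.api.types.is_bool(np.bool_(False))
True
>>> pd.api.types.is_complex(np.complex128(1 + 1j))
True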

pandas/_libs/tslibs/period.pyx

Lines changed: 1 addition & 1 deletion
@@ -2674,7 +2674,7 @@ class Period(_Period):
     freq : str, default None
         One of pandas period strings or corresponding objects. Accepted
         strings are listed in the
-        :ref:`offset alias section <timeseries.offset_aliases>` in the user docs.
+        :ref:`period alias section <timeseries.period_aliases>` in the user docs.
         If value is datetime, freq is required.
     ordinal : int, default None
         The period offset from the proleptic Gregorian epoch.
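
For context, the corrected cross-reference points readers at the period aliases (for example ``'D'``, ``'M'``, ``'Q'``) rather than the offset aliases. A quick illustration of passing such an alias as ``freq`` (a sketch, not taken from the diff):

>>> import pandas as pd
>>> pd.Period("2023-06", freq="M")
Period('2023-06', 'M')
>>> pd.Period("2023-06-15", freq="D")
Period('2023-06-15', 'D')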

pandas/core/algorithms.py

Lines changed: 4 additions & 1 deletion
@@ -450,6 +450,9 @@ def unique_with_mask(values, mask: npt.NDArray[np.bool_] | None = None):
 unique1d = unique
 
 
+_MINIMUM_COMP_ARR_LEN = 1_000_000
+
+
 def isin(comps: ListLike, values: ListLike) -> npt.NDArray[np.bool_]:
     """
     Compute the isin boolean array.

@@ -518,7 +521,7 @@ def isin(comps: ListLike, values: ListLike) -> npt.NDArray[np.bool_]:
     # Albeit hashmap has O(1) look-up (vs. O(logn) in sorted array),
     # in1d is faster for small sizes
     if (
-        len(comps_array) > 1_000_000
+        len(comps_array) > _MINIMUM_COMP_ARR_LEN
         and len(values) <= 26
         and comps_array.dtype != object
     ):
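
The new module-level constant only names the existing size cutoff used in the second hunk: when the probed array is large and the lookup set is small and not object dtype, ``isin`` prefers the sorted-array path (``np.isin``) over building a hash table. A rough sketch of that condition with illustrative data (the threshold value itself is unchanged):

>>> import numpy as np
>>> comps = np.arange(2_000_000)       # large array being probed
>>> values = np.array([3, 17, 42])     # small set of lookup values
>>> len(comps) > 1_000_000 and len(values) <= 26 and comps.dtype != object
True
>>> int(np.isin(comps, values).sum())  # the branch then defers to np.isin
3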

pandas/core/apply.py

Lines changed: 1 addition & 1 deletion
@@ -735,7 +735,7 @@ def apply(self) -> DataFrame | Series:
             with np.errstate(all="ignore"):
                 results = self.obj._mgr.apply("apply", func=self.func)
             # _constructor will retain self.index and self.columns
-            return self.obj._constructor(data=results)
+            return self.obj._constructor_from_mgr(results, axes=results.axes)
 
         # broadcasting
         if self.result_type == "broadcast":

pandas/core/arraylike.py

Lines changed: 1 addition & 1 deletion
@@ -349,7 +349,7 @@ def _reconstruct(result):
                 return result
             if isinstance(result, BlockManager):
                 # we went through BlockManager.apply e.g. np.sqrt
-                result = self._constructor(result, **reconstruct_kwargs, copy=False)
+                result = self._constructor_from_mgr(result, axes=result.axes)
             else:
                 # we converted an array, lost our axes
                 result = self._constructor(

pandas/core/arrays/datetimelike.py

Lines changed: 2 additions & 2 deletions
@@ -2235,7 +2235,7 @@ def interpolate(
         *,
         method,
         axis: int,
-        index: Index | None,
+        index: Index,
         limit,
         limit_direction,
         limit_area,

@@ -2255,7 +2255,7 @@ def interpolate(
         else:
             out_data = self._ndarray.copy()
 
-        missing.interpolate_array_2d(
+        missing.interpolate_2d_inplace(
             out_data,
             method=method,
             axis=axis,

pandas/core/arrays/numpy_.py

Lines changed: 38 additions & 3 deletions
@@ -1,6 +1,9 @@
 from __future__ import annotations
 
-from typing import TYPE_CHECKING
+from typing import (
+    TYPE_CHECKING,
+    Literal,
+)
 
 import numpy as np
 

@@ -32,6 +35,7 @@
 from pandas._typing import (
     AxisInt,
     Dtype,
+    FillnaOptions,
     NpDtype,
     Scalar,
     Self,

@@ -224,12 +228,42 @@ def _values_for_factorize(self) -> tuple[np.ndarray, float | None]:
         fv = np.nan
         return self._ndarray, fv
 
+    def pad_or_backfill(
+        self,
+        *,
+        method: FillnaOptions,
+        axis: int,
+        limit: int | None,
+        limit_area: Literal["inside", "outside"] | None = None,
+        copy: bool = True,
+    ) -> Self:
+        """
+        ffill or bfill
+        """
+        if copy:
+            out_data = self._ndarray.copy()
+        else:
+            out_data = self._ndarray
+
+        meth = missing.clean_fill_method(method)
+        missing.pad_or_backfill_inplace(
+            out_data,
+            method=meth,
+            axis=axis,
+            limit=limit,
+            limit_area=limit_area,
+        )
+
+        if not copy:
+            return self
+        return type(self)._simple_new(out_data, dtype=self.dtype)
+
     def interpolate(
         self,
         *,
         method,
         axis: int,
-        index: Index | None,
+        index: Index,
         limit,
         limit_direction,
         limit_area,

@@ -246,7 +280,8 @@ def interpolate(
         else:
             out_data = self._ndarray.copy()
 
-        missing.interpolate_array_2d(
+        # TODO: assert we have floating dtype?
+        missing.interpolate_2d_inplace(
             out_data,
             method=method,
             axis=axis,
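
The ``pad_or_backfill`` method added above is plumbing for the ``ffill``/``bfill`` paths of NumPy-backed arrays; the user-facing behaviour is unchanged. A small doctest-style sketch of the fill semantics via the public API (not the internal call path):

>>> import numpy as np
>>> import pandas as pd
>>> ser = pd.Series([1.0, np.nan, np.nan, 4.0])
>>> ser.ffill(limit=1)
0    1.0
1    1.0
2    NaN
3    4.0
dtype: float64
>>> ser.bfill()
0    1.0
1    4.0
2    4.0
3    4.0
dtype: float64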

pandas/core/arrays/sparse/array.py

Lines changed: 5 additions & 5 deletions
@@ -79,7 +79,7 @@
     check_array_indexer,
     unpack_tuple_and_ellipses,
 )
-from pandas.core.missing import interpolate_2d
+from pandas.core.missing import pad_or_backfill_inplace
 from pandas.core.nanops import check_below_min_count
 
 from pandas.io.formats import printing

@@ -764,11 +764,11 @@ def fillna(
                 stacklevel=find_stack_level(),
             )
             new_values = np.asarray(self)
-            # interpolate_2d modifies new_values inplace
-            # error: Argument "method" to "interpolate_2d" has incompatible type
-            # "Literal['backfill', 'bfill', 'ffill', 'pad']"; expected
+            # pad_or_backfill_inplace modifies new_values inplace
+            # error: Argument "method" to "pad_or_backfill_inplace" has incompatible
+            # type "Literal['backfill', 'bfill', 'ffill', 'pad']"; expected
             # "Literal['pad', 'backfill']"
-            interpolate_2d(
+            pad_or_backfill_inplace(
                 new_values, method=method, limit=limit  # type: ignore[arg-type]
             )
             return type(self)(new_values, fill_value=self.fill_value)

pandas/core/dtypes/common.py

Lines changed: 5 additions & 0 deletions
@@ -1603,6 +1603,11 @@ def pandas_dtype(dtype) -> DtypeObj:
     Raises
     ------
     TypeError if not a dtype
+
+    Examples
+    --------
+    >>> pd.api.types.pandas_dtype(int)
+    dtype('int64')
     """
     # short-circuit
     if isinstance(dtype, np.ndarray):
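
Beyond the ``int`` example added above, ``pandas_dtype`` also resolves string aliases, including pandas extension dtypes. Two additional illustrative calls (not part of the diff):

>>> import pandas as pd
>>> pd.api.types.pandas_dtype("float64")
dtype('float64')
>>> pd.api.types.pandas_dtype("datetime64[ns, UTC]")
datetime64[ns, UTC]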

pandas/core/dtypes/dtypes.py

Lines changed: 42 additions & 4 deletions
@@ -697,16 +697,17 @@ class DatetimeTZDtype(PandasExtensionDtype):
 
     Raises
     ------
-    pytz.UnknownTimeZoneError
+    ZoneInfoNotFoundError
         When the requested timezone cannot be found.
 
     Examples
     --------
-    >>> pd.DatetimeTZDtype(tz='UTC')
+    >>> from zoneinfo import ZoneInfo
+    >>> pd.DatetimeTZDtype(tz=ZoneInfo('UTC'))
     datetime64[ns, UTC]
 
-    >>> pd.DatetimeTZDtype(tz='dateutil/US/Central')
-    datetime64[ns, tzfile('/usr/share/zoneinfo/US/Central')]
+    >>> pd.DatetimeTZDtype(tz=ZoneInfo('Europe/Paris'))
+    datetime64[ns, Europe/Paris]
     """
 
     type: type[Timestamp] = Timestamp

@@ -772,13 +773,27 @@ def _creso(self) -> int:
     def unit(self) -> str_type:
         """
         The precision of the datetime data.
+
+        Examples
+        --------
+        >>> from zoneinfo import ZoneInfo
+        >>> dtype = pd.DatetimeTZDtype(tz=ZoneInfo('America/Los_Angeles'))
+        >>> dtype.unit
+        'ns'
         """
         return self._unit
 
     @property
     def tz(self) -> tzinfo:
         """
         The timezone.
+
+        Examples
+        --------
+        >>> from zoneinfo import ZoneInfo
+        >>> dtype = pd.DatetimeTZDtype(tz=ZoneInfo('America/Los_Angeles'))
+        >>> dtype.tz
+        zoneinfo.ZoneInfo(key='America/Los_Angeles')
         """
         return self._tz
 

@@ -967,6 +982,12 @@ def __reduce__(self):
     def freq(self):
         """
         The frequency object of this PeriodDtype.
+
+        Examples
+        --------
+        >>> dtype = pd.PeriodDtype(freq='D')
+        >>> dtype.freq
+        <Day>
         """
         return self._freq
 

@@ -1217,6 +1238,12 @@ def closed(self) -> IntervalClosedType:
     def subtype(self):
         """
         The dtype of the Interval bounds.
+
+        Examples
+        --------
+        >>> dtype = pd.IntervalDtype(subtype='int64', closed='both')
+        >>> dtype.subtype
+        dtype('int64')
         """
         return self._subtype
 

@@ -1565,6 +1592,17 @@ class SparseDtype(ExtensionDtype):
     Methods
     -------
     None
+
+    Examples
+    --------
+    >>> ser = pd.Series([1, 0, 0], dtype=pd.SparseDtype(dtype=int, fill_value=0))
+    >>> ser
+    0    1
+    1    0
+    2    0
+    dtype: Sparse[int64, 0]
+    >>> ser.sparse.density
+    0.3333333333333333
     """
 
     # We include `_is_na_fill_value` in the metadata to avoid hash collisions
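
As a usage note beyond the docstring examples added above, a ``DatetimeTZDtype`` compares equal to the dtype of matching tz-aware data, so it can be used directly in dtype checks. A small sketch using the string ``tz`` form, which remains accepted alongside ``ZoneInfo`` objects:

>>> import pandas as pd
>>> dtype = pd.DatetimeTZDtype(tz="UTC")
>>> ser = pd.Series(pd.date_range("2023-01-01", periods=2, tz="UTC"))
>>> ser.dtype == dtype
True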
