
Commit 87aa7ac

Merge remote-tracking branch 'upstream/main' into tst/np/2
2 parents 5431ab5 + c46fb76

File tree

26 files changed: +118 -57 lines


ci/code_checks.sh

-1 line

@@ -462,7 +462,6 @@ if [[ -z "$CHECK" || "$CHECK" == "docstrings" ]]; then
         -i "pandas.io.stata.StataReader.variable_labels RT03,SA01" \
         -i "pandas.io.stata.StataWriter.write_file SA01" \
         -i "pandas.json_normalize RT03,SA01" \
-        -i "pandas.merge_asof PR07,RT03" \
         -i "pandas.period_range RT03,SA01" \
         -i "pandas.plotting.andrews_curves RT03,SA01" \
         -i "pandas.plotting.lag_plot RT03,SA01" \

doc/source/user_guide/basics.rst

+1 -1

@@ -1606,7 +1606,7 @@ For instance:
 This method does not convert the row to a Series object; it merely
 returns the values inside a namedtuple. Therefore,
 :meth:`~DataFrame.itertuples` preserves the data type of the values
-and is generally faster as :meth:`~DataFrame.iterrows`.
+and is generally faster than :meth:`~DataFrame.iterrows`.

 .. note::

doc/source/user_guide/io.rst

+4 -4

@@ -3003,7 +3003,7 @@ However, if XPath does not reference node names such as default, ``/*``, then
 .. note::

   Since ``xpath`` identifies the parent of content to be parsed, only immediate
-  desendants which include child nodes or current attributes are parsed.
+  descendants which include child nodes or current attributes are parsed.
   Therefore, ``read_xml`` will not parse the text of grandchildren or other
   descendants and will not parse attributes of any descendant. To retrieve
   lower level content, adjust xpath to lower level. For example,

@@ -3535,7 +3535,7 @@ For example, to read in a ``MultiIndex`` index without names:
   df = pd.read_excel("path_to_file.xlsx", index_col=[0, 1])
   df

-If the index has level names, they will parsed as well, using the same
+If the index has level names, they will be parsed as well, using the same
 parameters.

 .. ipython:: python

@@ -5847,10 +5847,10 @@ You can check if a table exists using :func:`~pandas.io.sql.has_table`
 Schema support
 ''''''''''''''

-Reading from and writing to different schema's is supported through the ``schema``
+Reading from and writing to different schemas is supported through the ``schema``
 keyword in the :func:`~pandas.read_sql_table` and :func:`~pandas.DataFrame.to_sql`
 functions. Note however that this depends on the database flavor (sqlite does not
-have schema's). For example:
+have schemas). For example:

 .. code-block:: python
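The parent/descendant rule touched up in the first hunk can be sketched with made-up XML and the stdlib ``etree`` parser: ``xpath`` selects the parent nodes, and only their immediate children become columns.

```python
from io import StringIO

import pandas as pd

xml = """<data>
  <row><shape>square</shape><sides>4</sides></row>
  <row><shape>circle</shape><sides>0</sides></row>
</data>"""

# xpath picks the parent <row> nodes; their immediate children become columns
df = pd.read_xml(StringIO(xml), xpath=".//row", parser="etree")
```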

doc/source/user_guide/missing_data.rst

+1 -1

@@ -319,7 +319,7 @@ Missing values propagate through arithmetic operations between pandas objects.

 The descriptive statistics and computational methods discussed in the
 :ref:`data structure overview <basics.stats>` (and listed :ref:`here
-<api.series.stats>` and :ref:`here <api.dataframe.stats>`) are all
+<api.series.stats>` and :ref:`here <api.dataframe.stats>`) all
 account for missing data.

 When summing data, NA values or empty data will be treated as zero.

doc/source/user_guide/options.rst

+1 -1

@@ -8,7 +8,7 @@ Options and settings

 Overview
 --------
-pandas has an options API configure and customize global behavior related to
+pandas has an options API to configure and customize global behavior related to
 :class:`DataFrame` display, data behavior and more.

 Options have a full "dotted-style", case-insensitive name (e.g. ``display.max_rows``).

doc/source/user_guide/timeseries.rst

+1 -1

@@ -1479,7 +1479,7 @@ or some other non-observed day. Defined observance rules are:
     "after_nearest_workday", "apply ``nearest_workday`` and then move to next workday after that day"
     "sunday_to_monday", "move Sunday to following Monday"
     "next_monday_or_tuesday", "move Saturday to Monday and Sunday/Monday to Tuesday"
-    "previous_friday", move Saturday and Sunday to previous Friday"
+    "previous_friday", "move Saturday and Sunday to previous Friday"
     "next_monday", "move Saturday and Sunday to following Monday"
     "weekend_to_monday", "same as ``next_monday``"
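The observance rules listed in this table are plain callables in ``pandas.tseries.holiday`` and can be applied directly; the dates below are illustrative.

```python
from datetime import datetime

from pandas.tseries.holiday import previous_friday, sunday_to_monday

# 2024-01-06 is a Saturday, so previous_friday rolls back to Friday 2024-01-05
obs = previous_friday(datetime(2024, 1, 6))

# 2024-01-07 is a Sunday, so sunday_to_monday moves it to Monday 2024-01-08
mon = sunday_to_monday(datetime(2024, 1, 7))
```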

doc/source/whatsnew/v3.0.0.rst

+3 -1

@@ -503,8 +503,8 @@ Timezones

 Numeric
 ^^^^^^^
+- Bug in :meth:`DataFrame.quantile` where the column type was not preserved when ``numeric_only=True`` with a list-like ``q`` produced an empty result (:issue:`59035`)
 - Bug in ``np.matmul`` with :class:`Index` inputs raising a ``TypeError`` (:issue:`57079`)
--

 Conversion
 ^^^^^^^^^^

@@ -546,6 +546,7 @@ I/O
 - Bug in :meth:`DataFrame.to_excel` when writing empty :class:`DataFrame` with :class:`MultiIndex` on both axes (:issue:`57696`)
 - Bug in :meth:`DataFrame.to_stata` when writing :class:`DataFrame` and ``byteorder=`big```. (:issue:`58969`)
 - Bug in :meth:`DataFrame.to_string` that raised ``StopIteration`` with nested DataFrames. (:issue:`16098`)
+- Bug in :meth:`HDFStore.get` was failing to save data of dtype datetime64[s] correctly (:issue:`59004`)
 - Bug in :meth:`read_csv` raising ``TypeError`` when ``index_col`` is specified and ``na_values`` is a dict containing the key ``None``. (:issue:`57547`)
 - Bug in :meth:`read_stata` raising ``KeyError`` when input file is stored in big-endian format and contains strL data. (:issue:`58638`)

@@ -608,6 +609,7 @@ Other
 - Bug in :meth:`DataFrame.where` where using a non-bool type array in the function would return a ``ValueError`` instead of a ``TypeError`` (:issue:`56330`)
 - Bug in :meth:`Index.sort_values` when passing a key function that turns values into tuples, e.g. ``key=natsort.natsort_key``, would raise ``TypeError`` (:issue:`56081`)
 - Bug in :meth:`Series.diff` allowing non-integer values for the ``periods`` argument. (:issue:`56607`)
+- Bug in :meth:`Series.dt` methods in :class:`ArrowDtype` that were returning incorrect values. (:issue:`57355`)
 - Bug in :meth:`Series.rank` that doesn't preserve missing values for nullable integers when ``na_option='keep'``. (:issue:`56976`)
 - Bug in :meth:`Series.replace` and :meth:`DataFrame.replace` inconsistently replacing matching instances when ``regex=True`` and missing values are present. (:issue:`56599`)
 - Bug in Dataframe Interchange Protocol implementation was returning incorrect results for data buffers' associated dtype, for string and datetime columns (:issue:`54781`)

pandas/core/arrays/arrow/array.py

+18 -17

@@ -18,7 +18,6 @@

 from pandas._libs import lib
 from pandas._libs.tslibs import (
-    NaT,
     Timedelta,
     Timestamp,
     timezones,

@@ -2612,17 +2611,19 @@ def _str_wrap(self, width: int, **kwargs) -> Self:
     @property
     def _dt_days(self) -> Self:
         return type(self)(
-            pa.array(self._to_timedeltaarray().days, from_pandas=True, type=pa.int32())
+            pa.array(
+                self._to_timedeltaarray().components.days,
+                from_pandas=True,
+                type=pa.int32(),
+            )
         )

     @property
     def _dt_hours(self) -> Self:
         return type(self)(
             pa.array(
-                [
-                    td.components.hours if td is not NaT else None
-                    for td in self._to_timedeltaarray()
-                ],
+                self._to_timedeltaarray().components.hours,
+                from_pandas=True,
                 type=pa.int32(),
             )
         )

@@ -2631,10 +2632,8 @@ def _dt_hours(self) -> Self:
     def _dt_minutes(self) -> Self:
         return type(self)(
             pa.array(
-                [
-                    td.components.minutes if td is not NaT else None
-                    for td in self._to_timedeltaarray()
-                ],
+                self._to_timedeltaarray().components.minutes,
+                from_pandas=True,
                 type=pa.int32(),
             )
         )

@@ -2643,18 +2642,18 @@ def _dt_minutes(self) -> Self:
     def _dt_seconds(self) -> Self:
         return type(self)(
             pa.array(
-                self._to_timedeltaarray().seconds, from_pandas=True, type=pa.int32()
+                self._to_timedeltaarray().components.seconds,
+                from_pandas=True,
+                type=pa.int32(),
             )
         )

     @property
     def _dt_milliseconds(self) -> Self:
         return type(self)(
             pa.array(
-                [
-                    td.components.milliseconds if td is not NaT else None
-                    for td in self._to_timedeltaarray()
-                ],
+                self._to_timedeltaarray().components.milliseconds,
+                from_pandas=True,
                 type=pa.int32(),
             )
         )

@@ -2663,7 +2662,7 @@ def _dt_milliseconds(self) -> Self:
     def _dt_microseconds(self) -> Self:
         return type(self)(
             pa.array(
-                self._to_timedeltaarray().microseconds,
+                self._to_timedeltaarray().components.microseconds,
                 from_pandas=True,
                 type=pa.int32(),
             )

@@ -2673,7 +2672,9 @@ def _dt_microseconds(self) -> Self:
     def _dt_nanoseconds(self) -> Self:
         return type(self)(
             pa.array(
-                self._to_timedeltaarray().nanoseconds, from_pandas=True, type=pa.int32()
+                self._to_timedeltaarray().components.nanoseconds,
+                from_pandas=True,
+                type=pa.int32(),
             )
         )
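The change above leans on the fact that a timedelta array exposes every unit field at once through ``.components``, so the per-element Python loop (and its explicit ``NaT`` check) is unnecessary. A small illustration on the public ``TimedeltaIndex``, with made-up values:

```python
import pandas as pd

tdi = pd.to_timedelta(["1 days 02:03:04", pd.NaT])

# .components is a DataFrame with one column per unit; NaT rows become NaN
comps = tdi.components
```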

pandas/core/frame.py

+1 -1

@@ -13078,7 +13078,7 @@ def quantile(

         if len(data.columns) == 0:
             # GH#23925 _get_numeric_data may have dropped all columns
-            cols = Index([], name=self.columns.name)
+            cols = self.columns[:0]

             dtype = np.float64
             if axis == 1:
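The fix works because slicing an ``Index`` with ``[:0]`` yields an empty index that keeps both the name and the dtype of the original, whereas rebuilding it with ``Index([], name=...)`` falls back to ``object`` dtype. A quick illustration with an invented index:

```python
import pandas as pd

cols = pd.Index([10, 20], name="fields")

empty = cols[:0]                        # empty, but keeps name and dtype
rebuilt = pd.Index([], name=cols.name)  # empty, dtype information lost (object)
```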

pandas/core/generic.py

+1 -2

@@ -158,7 +158,6 @@
     Index,
     MultiIndex,
     PeriodIndex,
-    RangeIndex,
     default_index,
     ensure_index,
 )

@@ -1852,7 +1851,7 @@ def _drop_labels_or_levels(self, keys, axis: AxisInt = 0):
         else:
             # Drop the last level of Index by replacing with
             # a RangeIndex
-            dropped.columns = RangeIndex(dropped.columns.size)
+            dropped.columns = default_index(dropped.columns.size)

         # Handle dropping index labels
         if labels_to_drop:
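``default_index(n)`` is pandas' internal helper for the default positional labels and behaves essentially like the public ``RangeIndex(n)``; for ``n == 0`` it also avoids the ``object``-dtype empty ``Index([])`` that the pre-commit code produced in several places. A sketch using only public classes (the internal helper itself is not part of the public API):

```python
import pandas as pd

rng = pd.RangeIndex(3)       # the default 0..n-1 labels, stored lazily
empty = pd.RangeIndex(0)     # an empty index that is still integer-dtyped
obj_empty = pd.Index([])     # by contrast, this defaults to object dtype
```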

pandas/core/groupby/groupby.py

+2 -3

@@ -128,7 +128,6 @@ class providing the base-class of operations.
 from pandas.core.indexes.api import (
     Index,
     MultiIndex,
-    RangeIndex,
     default_index,
 )
 from pandas.core.internals.blocks import ensure_block_shape

@@ -1264,7 +1263,7 @@ def _set_result_index_ordered(
         if self._grouper.has_dropped_na:
             # Add back in any missing rows due to dropna - index here is integral
             # with values referring to the row of the input so can use RangeIndex
-            result = result.reindex(RangeIndex(len(index)), axis=0)
+            result = result.reindex(default_index(len(index)), axis=0)
         result = result.set_axis(index, axis=0)

         return result

@@ -1334,7 +1333,7 @@ def _wrap_aggregated_output(
             # enforced in __init__
             result = self._insert_inaxis_grouper(result, qs=qs)
             result = result._consolidate()
-            result.index = RangeIndex(len(result))
+            result.index = default_index(len(result))

         else:
             index = self._grouper.result_index

pandas/core/groupby/grouper.py

+2 -1

@@ -34,6 +34,7 @@
 from pandas.core.indexes.api import (
     Index,
     MultiIndex,
+    default_index,
 )
 from pandas.core.series import Series

@@ -901,7 +902,7 @@ def is_in_obj(gpr) -> bool:
     if len(groupings) == 0 and len(obj):
         raise ValueError("No group keys passed!")
     if len(groupings) == 0:
-        groupings.append(Grouping(Index([], dtype="int"), np.array([], dtype=np.intp)))
+        groupings.append(Grouping(default_index(0), np.array([], dtype=np.intp)))

     # create the internals grouper
     grouper = ops.BaseGrouper(group_axis, groupings, sort=sort, dropna=dropna)

pandas/core/indexes/api.py

+1 -1

@@ -130,7 +130,7 @@ def _get_combined_index(
     # TODO: handle index names!
     indexes = _get_distinct_objs(indexes)
     if len(indexes) == 0:
-        index = Index([])
+        index: Index = default_index(0)
     elif len(indexes) == 1:
         index = indexes[0]
     elif intersect:

pandas/core/internals/managers.py

+1 -1

@@ -249,7 +249,7 @@ def blklocs(self) -> npt.NDArray[np.intp]:
     def make_empty(self, axes=None) -> Self:
         """return an empty BlockManager with the items axis of len 0"""
         if axes is None:
-            axes = [Index([])] + self.axes[1:]
+            axes = [default_index(0)] + self.axes[1:]

         # preserve dtype if possible
         if self.ndim == 1:

pandas/core/methods/selectn.py

+4 -3

@@ -29,6 +29,8 @@
 )
 from pandas.core.dtypes.dtypes import BaseMaskedDtype

+from pandas.core.indexes.api import default_index
+
 if TYPE_CHECKING:
     from pandas._typing import (
         DtypeObj,

@@ -38,6 +40,7 @@

     from pandas import (
         DataFrame,
+        Index,
         Series,
     )
 else:

@@ -199,8 +202,6 @@ def __init__(self, obj: DataFrame, n: int, keep: str, columns: IndexLabel) -> No
         self.columns = columns

     def compute(self, method: str) -> DataFrame:
-        from pandas.core.api import Index
-
         n = self.n
         frame = self.obj
         columns = self.columns

@@ -227,7 +228,7 @@ def get_indexer(current_indexer: Index, other_indexer: Index) -> Index:
         original_index = frame.index
         cur_frame = frame = frame.reset_index(drop=True)
         cur_n = n
-        indexer = Index([], dtype=np.int64)
+        indexer: Index = default_index(0)

         for i, column in enumerate(columns):
             # For each column we apply method to cur_frame[column].
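The ``compute`` method touched here backs the public ``DataFrame.nlargest``/``nsmallest``: columns are processed one at a time, accumulating an indexer of kept rows. Typical usage, with invented data:

```python
import pandas as pd

df = pd.DataFrame({"population": [30, 10, 20], "gdp": [3, 1, 2]})

# keep the two rows with the largest population; ties would fall through
# to the later columns in the list
top = df.nlargest(2, columns=["population", "gdp"])
```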

pandas/core/reshape/merge.py

+3

@@ -673,7 +673,9 @@ def merge_asof(
     Parameters
     ----------
     left : DataFrame or named Series
+        First pandas object to merge.
     right : DataFrame or named Series
+        Second pandas object to merge.
     on : label
         Field name to join on. Must be found in both DataFrames.
         The data MUST be ordered. Furthermore this must be a numeric column,

@@ -712,6 +714,7 @@ def merge_asof(
     Returns
     -------
     DataFrame
+        A DataFrame of the two merged objects.

     See Also
     --------
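Usage of the documented signature, with made-up data: in the default backward direction, each left row is matched with the last right row whose ``on`` value is less than or equal to the left's, and both inputs must be sorted on that key.

```python
import pandas as pd

left = pd.DataFrame({"t": [1, 5, 10], "v": ["a", "b", "c"]})
right = pd.DataFrame({"t": [2, 6], "w": [100, 200]})

# backward (default) asof join: match the most recent right.t <= left.t
out = pd.merge_asof(left, right, on="t")
```

The first left row (``t=1``) has no earlier right row, so its ``w`` is missing.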

pandas/core/reshape/reshape.py

+2 -2

@@ -42,7 +42,7 @@
 from pandas.core.indexes.api import (
     Index,
     MultiIndex,
-    RangeIndex,
+    default_index,
 )
 from pandas.core.reshape.concat import concat
 from pandas.core.series import Series

@@ -1047,7 +1047,7 @@ def stack_reshape(
         if data.ndim == 1:
             data.name = 0
         else:
-            data.columns = RangeIndex(len(data.columns))
+            data.columns = default_index(len(data.columns))
         buf.append(data)

     if len(buf) > 0 and not frame.empty:

pandas/io/html.py

+1 -1

@@ -1178,7 +1178,7 @@ def read_html(
     **after** `skiprows` is applied.

     This function will *always* return a list of :class:`DataFrame` *or*
-    it will fail, e.g., it will *not* return an empty list.
+    it will fail, i.e., it will *not* return an empty list.

     Examples
     --------
