
Commit 6575577

Merge branch 'master' into faster_masked_transpose
2 parents: 3bba6d3 + 4dd071e


43 files changed: +883, -221 lines (only part of the diff is shown below)

.github/workflows/comment-commands.yml

Lines changed: 9 additions & 9 deletions
@@ -15,21 +15,21 @@ jobs:
     concurrency:
       group: ${{ github.actor }}-issue-assign
     steps:
-      run: |
-        echo "Assigning issue ${{ github.event.issue.number }} to ${{ github.event.comment.user.login }}"
-        curl -H "Authorization: token ${{ secrets.GITHUB_TOKEN }}" -d '{"assignees": ["${{ github.event.comment.user.login }}"]}' https://api.github.com/repos/${{ github.repository }}/issues/${{ github.event.issue.number }}/assignees
+      - run: |
+          echo "Assigning issue ${{ github.event.issue.number }} to ${{ github.event.comment.user.login }}"
+          curl -H "Authorization: token ${{ secrets.GITHUB_TOKEN }}" -d '{"assignees": ["${{ github.event.comment.user.login }}"]}' https://api.github.com/repos/${{ github.repository }}/issues/${{ github.event.issue.number }}/assignees
   preview_docs:
     runs-on: ubuntu-22.04
     if: github.event.issue.pull_request && github.event.comment.body == '/preview'
     concurrency:
       group: ${{ github.actor }}-preview-docs
     steps:
-      run: |
-        if curl --output /dev/null --silent --head --fail "https://pandas.pydata.org/preview/${{ github.event.issue.number }}/"; then
-          curl -H "Authorization: token ${{ secrets.GITHUB_TOKEN }}" -d '{"body": "Website preview of this PR available at: https://pandas.pydata.org/preview/${{ github.event.issue.number }}/"}' https://api.github.com/repos/${{ github.repository }}/issues/${{ github.event.issue.number }}/comments
-        else
-          curl -H "Authorization: token ${{ secrets.GITHUB_TOKEN }}" -d '{"body": "No preview found for PR #${{ github.event.issue.number }}. Did the docs build complete?"}' https://api.github.com/repos/${{ github.repository }}/issues/${{ github.event.issue.number }}/comments
-        fi
+      - run: |
+          if curl --output /dev/null --silent --head --fail "https://pandas.pydata.org/preview/${{ github.event.issue.number }}/"; then
+            curl -H "Authorization: token ${{ secrets.GITHUB_TOKEN }}" -d '{"body": "Website preview of this PR available at: https://pandas.pydata.org/preview/${{ github.event.issue.number }}/"}' https://api.github.com/repos/${{ github.repository }}/issues/${{ github.event.issue.number }}/comments
+          else
+            curl -H "Authorization: token ${{ secrets.GITHUB_TOKEN }}" -d '{"body": "No preview found for PR #${{ github.event.issue.number }}. Did the docs build complete?"}' https://api.github.com/repos/${{ github.repository }}/issues/${{ github.event.issue.number }}/comments
+          fi
   asv_run:
     runs-on: ubuntu-22.04
     # TODO: Support more benchmarking options later, against different branches, against self, etc

ci/deps/actions-310.yaml

Lines changed: 1 addition & 1 deletion
@@ -32,7 +32,7 @@ dependencies:
   - gcsfs>=2022.05.0
   - jinja2>=3.1.2
   - lxml>=4.8.0
-  - matplotlib>=3.6.1, <3.7.0
+  - matplotlib>=3.6.1
   - numba>=0.55.2
   - numexpr>=2.8.0
   - odfpy>=1.4.1

ci/deps/actions-311.yaml

Lines changed: 1 addition & 1 deletion
@@ -32,7 +32,7 @@ dependencies:
   - gcsfs>=2022.05.0
   - jinja2>=3.1.2
   - lxml>=4.8.0
-  - matplotlib>=3.6.1, <3.7.0
+  - matplotlib>=3.6.1
   # - numba>=0.55.2 not compatible with 3.11
   - numexpr>=2.8.0
   - odfpy>=1.4.1

ci/deps/actions-38-downstream_compat.yaml

Lines changed: 1 addition & 1 deletion
@@ -33,7 +33,7 @@ dependencies:
   - gcsfs>=2022.05.0
   - jinja2>=3.1.2
   - lxml>=4.8.0
-  - matplotlib>=3.6.1, <3.7.0
+  - matplotlib>=3.6.1
   - numba>=0.55.2
   - numexpr>=2.8.0
   - odfpy>=1.4.1

ci/deps/actions-38.yaml

Lines changed: 1 addition & 1 deletion
@@ -32,7 +32,7 @@ dependencies:
   - gcsfs>=2022.05.0
   - jinja2>=3.1.2
   - lxml>=4.8.0
-  - matplotlib>=3.6.1, <3.7.0
+  - matplotlib>=3.6.1
   - numba>=0.55.2
   - numexpr>=2.8.0
   - odfpy>=1.4.1

ci/deps/actions-39.yaml

Lines changed: 1 addition & 1 deletion
@@ -32,7 +32,7 @@ dependencies:
   - gcsfs>=2022.05.0
   - jinja2>=3.1.2
   - lxml>=4.8.0
-  - matplotlib>=3.6.1, <3.7.0
+  - matplotlib>=3.6.1
   - numba>=0.55.2
   - numexpr>=2.8.0
   - odfpy>=1.4.1

ci/deps/circle-38-arm64.yaml

Lines changed: 1 addition & 1 deletion
@@ -32,7 +32,7 @@ dependencies:
   - gcsfs>=2022.05.0
   - jinja2>=3.1.2
   - lxml>=4.8.0
-  - matplotlib>=3.6.1, <3.7.0
+  - matplotlib>=3.6.1
   - numba>=0.55.2
   - numexpr>=2.8.0
   - odfpy>=1.4.1

doc/source/user_guide/io.rst

Lines changed: 4 additions & 0 deletions
@@ -5239,6 +5239,7 @@ See the `Full Documentation <https://github.com/wesm/feather>`__.
 Write to a feather file.
 
 .. ipython:: python
+   :okwarning:
 
    df.to_feather("example.feather")
@@ -5382,6 +5383,7 @@ Serializing a ``DataFrame`` to parquet may include the implicit index as one or
 more columns in the output file. Thus, this code:
 
 .. ipython:: python
+   :okwarning:
 
    df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
    df.to_parquet("test.parquet", engine="pyarrow")
@@ -5398,6 +5400,7 @@ If you want to omit a dataframe's indexes when writing, pass ``index=False`` to
 :func:`~pandas.DataFrame.to_parquet`:
 
 .. ipython:: python
+   :okwarning:
 
    df.to_parquet("test.parquet", index=False)
@@ -5420,6 +5423,7 @@ Partitioning Parquet files
 Parquet supports partitioning of data based on the values of one or more columns.
 
 .. ipython:: python
+   :okwarning:
 
    df = pd.DataFrame({"a": [0, 0, 1, 1], "b": [0, 1, 0, 1]})
    df.to_parquet(path="test", engine="pyarrow", partition_cols=["a"], compression=None)
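
For context, the doc snippets that gain :okwarning: above boil down to the following plain-Python calls (a sketch assuming pyarrow is installed; the file names are the placeholders used in io.rst):

    import pandas as pd

    # Feather and Parquet round trips, mirroring the io.rst examples above.
    df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
    df.to_feather("example.feather")
    df.to_parquet("test.parquet", engine="pyarrow")

    # Drop the index on write, as the surrounding section describes.
    df.to_parquet("test.parquet", index=False)

    # Partitioned write: one subdirectory per distinct value of column "a".
    part = pd.DataFrame({"a": [0, 0, 1, 1], "b": [0, 1, 0, 1]})
    part.to_parquet(path="test", engine="pyarrow", partition_cols=["a"], compression=None)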

doc/source/user_guide/scale.rst

Lines changed: 3 additions & 0 deletions
@@ -42,6 +42,7 @@ Suppose our raw dataset on disk has many columns::
 That can be generated by the following code snippet:
 
 .. ipython:: python
+   :okwarning:
 
    import pandas as pd
    import numpy as np
@@ -106,6 +107,7 @@ referred to as "low-cardinality" data). By using more efficient data types, you
 can store larger datasets in memory.
 
 .. ipython:: python
+   :okwarning:
 
    ts = make_timeseries(freq="30S", seed=0)
    ts.to_parquet("timeseries.parquet")
@@ -183,6 +185,7 @@ Suppose we have an even larger "logical dataset" on disk that's a directory of p
 files. Each file in the directory represents a different year of the entire dataset.
 
 .. ipython:: python
+   :okwarning:
 
    import pathlib
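
The section of scale.rst touched here is about fitting larger datasets in memory with more efficient dtypes; a minimal illustration of that idea (not the docs' make_timeseries helper, just a hedged sketch):

    import pandas as pd

    # A low-cardinality object column vs. its categorical equivalent.
    names = pd.Series(["Alice", "Bob", "Carol"] * 100_000)
    as_category = names.astype("category")

    print(names.memory_usage(deep=True))        # object dtype: large
    print(as_category.memory_usage(deep=True))  # category: much smaller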

doc/source/whatsnew/v0.19.0.rst

Lines changed: 1 addition & 0 deletions
@@ -905,6 +905,7 @@ As a consequence of this change, ``PeriodIndex`` no longer has an integer dtype:
 **New behavior**:
 
 .. ipython:: python
+   :okwarning:
 
    pi = pd.PeriodIndex(["2016-08-01"], freq="D")
    pi

doc/source/whatsnew/v2.0.1.rst

Lines changed: 6 additions & 0 deletions
@@ -19,6 +19,7 @@ Fixed regressions
 - Fixed regression in :meth:`DataFrame.sort_values` not resetting index when :class:`DataFrame` is already sorted and ``ignore_index=True`` (:issue:`52553`)
 - Fixed regression in :meth:`MultiIndex.isin` raising ``TypeError`` for ``Generator`` (:issue:`52568`)
 - Fixed regression in :meth:`Series.describe` showing ``RuntimeWarning`` for extension dtype :class:`Series` with one element (:issue:`52515`)
+- Fixed regression in :meth:`SeriesGroupBy.agg` failing when grouping with categorical data, multiple groupings, ``as_index=False``, and a list of aggregations (:issue:`52760`)
 
 .. ---------------------------------------------------------------------------
 .. _whatsnew_201.bug_fixes:
@@ -27,6 +28,8 @@ Bug fixes
 ~~~~~~~~~
 - Bug in :attr:`Series.dt.days` that would overflow ``int32`` number of days (:issue:`52391`)
 - Bug in :class:`arrays.DatetimeArray` constructor returning an incorrect unit when passed a non-nanosecond numpy datetime array (:issue:`52555`)
+- Bug in :class:`~arrays.ArrowExtensionArray` with duration dtype overflowing when constructed from data containing numpy ``NaT`` (:issue:`52843`)
+- Bug in :func:`Series.dt.round` when passing a ``freq`` of equal or higher resolution compared to the :class:`Series` would raise a ``ZeroDivisionError`` (:issue:`52761`)
 - Bug in :func:`Series.median` with :class:`ArrowDtype` returning an approximate median (:issue:`52679`)
 - Bug in :func:`api.interchange.from_dataframe` was unnecessarily raising on categorical dtypes (:issue:`49889`)
 - Bug in :func:`api.interchange.from_dataframe` was unnecessarily raising on large string dtypes (:issue:`52795`)
@@ -35,9 +38,11 @@ Bug fixes
 - Bug in :func:`to_datetime` and :func:`to_timedelta` when trying to convert numeric data with a :class:`ArrowDtype` (:issue:`52425`)
 - Bug in :func:`to_numeric` with ``errors='coerce'`` and ``dtype_backend='pyarrow'`` with :class:`ArrowDtype` data (:issue:`52588`)
 - Bug in :meth:`ArrowDtype.__from_arrow__` not respecting if dtype is explicitly given (:issue:`52533`)
+- Bug in :meth:`DataFrame.describe` not respecting ``ArrowDtype`` in ``include`` and ``exclude`` (:issue:`52570`)
 - Bug in :meth:`DataFrame.max` and related casting different :class:`Timestamp` resolutions always to nanoseconds (:issue:`52524`)
 - Bug in :meth:`Series.describe` not returning :class:`ArrowDtype` with ``pyarrow.float64`` type with numeric data (:issue:`52427`)
 - Bug in :meth:`Series.dt.tz_localize` incorrectly localizing timestamps with :class:`ArrowDtype` (:issue:`52677`)
+- Bug in arithmetic between ``np.datetime64`` and ``np.timedelta64`` ``NaT`` scalars with units always returning nanosecond resolution (:issue:`52295`)
 - Bug in logical and comparison operations between :class:`ArrowDtype` and numpy masked types (e.g. ``"boolean"``) (:issue:`52625`)
 - Fixed bug in :func:`merge` when merging with ``ArrowDtype`` one one and a NumPy dtype on the other side (:issue:`52406`)
 - Fixed segfault in :meth:`Series.to_numpy` with ``null[pyarrow]`` dtype (:issue:`52443`)
@@ -50,6 +55,7 @@ Other
 - :class:`DataFrame` created from empty dicts had :attr:`~DataFrame.columns` of dtype ``object``. It is now a :class:`RangeIndex` (:issue:`52404`)
 - :class:`Series` created from empty dicts had :attr:`~Series.index` of dtype ``object``. It is now a :class:`RangeIndex` (:issue:`52404`)
 - Implemented :meth:`Series.str.split` and :meth:`Series.str.rsplit` for :class:`ArrowDtype` with ``pyarrow.string`` (:issue:`52401`)
+- Implemented most ``str`` accessor methods for :class:`ArrowDtype` with ``pyarrow.string`` (:issue:`52401`)
 
 .. ---------------------------------------------------------------------------
 .. _whatsnew_201.contributors:
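
One of the entries added above notes that most str accessor methods are now implemented for ArrowDtype with pyarrow.string; a hedged sketch of what that enables (assumes pyarrow is installed):

    import pandas as pd
    import pyarrow as pa

    # Arrow-backed string Series; the str accessor methods are available on it.
    ser = pd.Series(["pandas", "pyarrow", None], dtype=pd.ArrowDtype(pa.string()))
    print(ser.str.upper())
    print(ser.str.len())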

doc/source/whatsnew/v2.1.0.rst

Lines changed: 3 additions & 0 deletions
@@ -87,6 +87,7 @@ Other enhancements
 - Added to the escape mode "latex-math" preserving without escaping all characters between "\(" and "\)" in formatter (:issue:`51903`)
 - Adding ``engine_kwargs`` parameter to :meth:`DataFrame.read_excel` (:issue:`52214`)
 - Classes that are useful for type-hinting have been added to the public API in the new submodule ``pandas.api.typing`` (:issue:`48577`)
+- Implemented :attr:`Series.dt.is_month_start`, :attr:`Series.dt.is_month_end`, :attr:`Series.dt.is_year_start`, :attr:`Series.dt.is_year_end`, :attr:`Series.dt.is_quarter_start`, :attr:`Series.dt.is_quarter_end`, :attr:`Series.dt.is_days_in_month`, :attr:`Series.dt.unit`, :meth:`Series.dt.is_normalize`, :meth:`Series.dt.day_name`, :meth:`Series.dt.month_name`, :meth:`Series.dt.tz_convert` for :class:`ArrowDtype` with ``pyarrow.timestamp`` (:issue:`52388`, :issue:`51718`)
 - Implemented ``__from_arrow__`` on :class:`DatetimeTZDtype`. (:issue:`52201`)
 - Implemented ``__pandas_priority__`` to allow custom types to take precedence over :class:`DataFrame`, :class:`Series`, :class:`Index`, or :class:`ExtensionArray` for arithmetic operations, :ref:`see the developer guide <extending.pandas_priority>` (:issue:`48347`)
 - Improve error message when having incompatible columns using :meth:`DataFrame.merge` (:issue:`51861`)
@@ -233,6 +234,8 @@ Deprecations
 - Deprecated :func:`is_datetime64tz_dtype`, check ``isinstance(dtype, pd.DatetimeTZDtype)`` instead (:issue:`52607`)
 - Deprecated :func:`is_int64_dtype`, check ``dtype == np.dtype(np.int64)`` instead (:issue:`52564`)
 - Deprecated :func:`is_interval_dtype`, check ``isinstance(dtype, pd.IntervalDtype)`` instead (:issue:`52607`)
+- Deprecated :func:`is_period_dtype`, check ``isinstance(dtype, pd.PeriodDtype)`` instead (:issue:`52642`)
+- Deprecated :func:`is_sparse`, check ``isinstance(dtype, pd.SparseDtype)`` instead (:issue:`52642`)
 - Deprecated :meth:`DataFrame.applymap`. Use the new :meth:`DataFrame.map` method instead (:issue:`52353`)
 - Deprecated :meth:`DataFrame.swapaxes` and :meth:`Series.swapaxes`, use :meth:`DataFrame.transpose` or :meth:`Series.transpose` instead (:issue:`51946`)
 - Deprecated ``freq`` parameter in :class:`PeriodArray` constructor, pass ``dtype`` instead (:issue:`52462`)
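
The deprecation entries added above name their replacements explicitly; in code, the migration is just an isinstance check (a short sketch of the substitutions the entries recommend):

    import pandas as pd

    period_dtype = pd.Series(pd.period_range("2023", periods=3, freq="M")).dtype
    sparse_dtype = pd.arrays.SparseArray([0, 0, 1]).dtype

    # Instead of the deprecated pd.api.types.is_period_dtype / pd.api.types.is_sparse:
    print(isinstance(period_dtype, pd.PeriodDtype))  # True
    print(isinstance(sparse_dtype, pd.SparseDtype))  # True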

environment.yml

Lines changed: 1 addition & 1 deletion
@@ -35,7 +35,7 @@ dependencies:
   - ipython
   - jinja2>=3.1.2
   - lxml>=4.8.0
-  - matplotlib>=3.6.1, <3.7.0
+  - matplotlib>=3.6.1
   - numba>=0.55.2
   - numexpr>=2.8.0  # pin for "Run checks on imported code" job
   - openpyxl<3.1.1, >=3.0.7

pandas/compat/numpy/__init__.py

Lines changed: 1 addition & 0 deletions
@@ -8,6 +8,7 @@
 _nlv = Version(_np_version)
 np_version_under1p22 = _nlv < Version("1.22")
 np_version_gte1p24 = _nlv >= Version("1.24")
+np_version_gte1p24p3 = _nlv >= Version("1.24.3")
 is_numpy_dev = _nlv.dev is not None
 _min_numpy_ver = "1.21.6"
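
Version gates like the new np_version_gte1p24p3 flag are typically consumed as skip conditions in the test suite; a hedged sketch of that usage pattern (the test itself is hypothetical, not part of this commit):

    import pytest

    from pandas.compat.numpy import np_version_gte1p24p3


    @pytest.mark.skipif(
        not np_version_gte1p24p3,
        reason="behavior depends on NumPy >= 1.24.3",
    )
    def test_requires_numpy_1_24_3():  # hypothetical example test
        ...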

pandas/conftest.py

Lines changed: 3 additions & 1 deletion
@@ -137,7 +137,10 @@ def pytest_collection_modifyitems(items, config) -> None:
     ignored_doctest_warnings = [
         ("is_int64_dtype", "is_int64_dtype is deprecated"),
         ("is_interval_dtype", "is_interval_dtype is deprecated"),
+        ("is_period_dtype", "is_period_dtype is deprecated"),
         ("is_datetime64tz_dtype", "is_datetime64tz_dtype is deprecated"),
+        ("is_categorical_dtype", "is_categorical_dtype is deprecated"),
+        ("is_sparse", "is_sparse is deprecated"),
         # Docstring divides by zero to show behavior difference
         ("missing.mask_zero_div_zero", "divide by zero encountered"),
         (
@@ -149,7 +152,6 @@ def pytest_collection_modifyitems(items, config) -> None:
             "(Series|DataFrame).bool is now deprecated and will be removed "
             "in future version of pandas",
         ),
-        ("is_categorical_dtype", "is_categorical_dtype is deprecated"),
     ]
 
     for item in items:
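
Each pair above couples part of a doctest item name with the warning message to suppress for it; a minimal sketch of how such pairs can be turned into filterwarnings markers (the helper below is hypothetical and may not match what conftest.py actually does):

    import pytest


    def ignore_doctest_warning(item, path: str, message: str) -> None:
        # Hypothetical helper: attach an ignore filter when the doctest's
        # name contains the given path fragment.
        if path in item.name:
            item.add_marker(pytest.mark.filterwarnings(f"ignore:{message}"))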
