Commit b3ae92d

Merge branch '2.1.x' of https://github.com/pandas-dev/pandas into 2.1.x
2 parents d1454db + bf91684 commit b3ae92d

79 files changed (+847, -366 lines)

.github/actions/build_pandas/action.yml

+2-2
Original file line number | Diff line number | Diff line change
@@ -25,8 +25,8 @@ runs:
2525
- name: Build Pandas
2626
run: |
2727
if [[ ${{ inputs.editable }} == "true" ]]; then
28-
pip install -e . --no-build-isolation -v
28+
pip install -e . --no-build-isolation -v --no-deps
2929
else
30-
pip install . --no-build-isolation -v
30+
pip install . --no-build-isolation -v --no-deps
3131
fi
3232
shell: bash -el {0}

.github/workflows/unit-tests.yml

+4-4
@@ -236,7 +236,7 @@ jobs:
236236
. ~/virtualenvs/pandas-dev/bin/activate
237237
python -m pip install --no-cache-dir -U pip wheel setuptools meson[ninja]==1.2.1 meson-python==0.13.1
238238
python -m pip install numpy --config-settings=setup-args="-Dallow-noblas=true"
239-
python -m pip install --no-cache-dir versioneer[toml] "cython<3.0.3" python-dateutil pytz pytest>=7.3.2 pytest-xdist>=2.2.0 pytest-asyncio>=0.17 hypothesis>=6.46.1
239+
python -m pip install --no-cache-dir versioneer[toml] cython python-dateutil pytz pytest>=7.3.2 pytest-xdist>=2.2.0 pytest-asyncio>=0.17 hypothesis>=6.46.1
240240
python -m pip install --no-cache-dir --no-build-isolation -e .
241241
python -m pip list --no-cache-dir
242242
export PANDAS_CI=1
@@ -274,7 +274,7 @@ jobs:
274274
/opt/python/cp311-cp311/bin/python -m venv ~/virtualenvs/pandas-dev
275275
. ~/virtualenvs/pandas-dev/bin/activate
276276
python -m pip install --no-cache-dir -U pip wheel setuptools meson-python==0.13.1 meson[ninja]==1.2.1
277-
python -m pip install --no-cache-dir versioneer[toml] "cython<3.0.3" numpy python-dateutil pytz pytest>=7.3.2 pytest-xdist>=2.2.0 pytest-asyncio>=0.17 hypothesis>=6.46.1
277+
python -m pip install --no-cache-dir versioneer[toml] cython numpy python-dateutil pytz pytest>=7.3.2 pytest-xdist>=2.2.0 pytest-asyncio>=0.17 hypothesis>=6.46.1
278278
python -m pip install --no-cache-dir --no-build-isolation -e .
279279
python -m pip list --no-cache-dir
280280
@@ -347,8 +347,8 @@ jobs:
347347
python -m pip install --upgrade pip setuptools wheel meson[ninja]==1.2.1 meson-python==0.13.1
348348
python -m pip install --pre --extra-index-url https://pypi.anaconda.org/scientific-python-nightly-wheels/simple numpy
349349
python -m pip install versioneer[toml]
350-
python -m pip install python-dateutil pytz tzdata "cython<3.0.3" hypothesis>=6.46.1 pytest>=7.3.2 pytest-xdist>=2.2.0 pytest-cov pytest-asyncio>=0.17
351-
python -m pip install -ve . --no-build-isolation --no-index
350+
python -m pip install python-dateutil pytz tzdata cython hypothesis>=6.46.1 pytest>=7.3.2 pytest-xdist>=2.2.0 pytest-cov pytest-asyncio>=0.17
351+
python -m pip install -ve . --no-build-isolation --no-index --no-deps
352352
python -m pip list
353353
354354
- name: Run Tests

ci/deps/actions-310.yaml

+2-2
@@ -6,7 +6,7 @@ dependencies:
66

77
# build dependencies
88
- versioneer[toml]
9-
- cython>=0.29.33, <3.0.3
9+
- cython>=0.29.33
1010
- meson[ninja]=1.2.1
1111
- meson-python=0.13.1
1212

@@ -20,7 +20,7 @@ dependencies:
2020

2121
# required dependencies
2222
- python-dateutil
23-
- numpy
23+
- numpy<2
2424
- pytz
2525

2626
# optional dependencies

ci/deps/actions-311-downstream_compat.yaml

+2-2
@@ -7,7 +7,7 @@ dependencies:
77

88
# build dependencies
99
- versioneer[toml]
10-
- cython>=0.29.33, <3.0.3
10+
- cython>=0.29.33
1111
- meson[ninja]=1.2.1
1212
- meson-python=0.13.1
1313

@@ -21,7 +21,7 @@ dependencies:
2121

2222
# required dependencies
2323
- python-dateutil
24-
- numpy
24+
- numpy<2
2525
- pytz
2626

2727
# optional dependencies

ci/deps/actions-311-numpydev.yaml

+1-2
@@ -8,7 +8,7 @@ dependencies:
88
- versioneer[toml]
99
- meson[ninja]=1.2.1
1010
- meson-python=0.13.1
11-
- cython>=0.29.33, <3.0.3
11+
- cython>=0.29.33
1212

1313
# test dependencies
1414
- pytest>=7.3.2
@@ -29,5 +29,4 @@ dependencies:
2929
- "--extra-index-url https://pypi.anaconda.org/scientific-python-nightly-wheels/simple"
3030
- "--pre"
3131
- "numpy"
32-
- "scipy"
3332
- "tzdata>=2022.1"

ci/deps/actions-311-pyarrownightly.yaml

+2-2
@@ -7,7 +7,7 @@ dependencies:
77
# build dependencies
88
- versioneer[toml]
99
- meson[ninja]=1.2.1
10-
- cython>=0.29.33, <3.0.3
10+
- cython>=0.29.33
1111
- meson-python=0.13.1
1212

1313
# test dependencies
@@ -19,7 +19,7 @@ dependencies:
1919

2020
# required dependencies
2121
- python-dateutil
22-
- numpy
22+
- numpy<2
2323
- pytz
2424
- pip
2525

ci/deps/actions-311.yaml

+2-2
@@ -6,7 +6,7 @@ dependencies:
66

77
# build dependencies
88
- versioneer[toml]
9-
- cython>=0.29.33, <3.0.3
9+
- cython>=0.29.33
1010
- meson[ninja]=1.2.1
1111
- meson-python=0.13.1
1212

@@ -20,7 +20,7 @@ dependencies:
2020

2121
# required dependencies
2222
- python-dateutil
23-
- numpy
23+
- numpy<2
2424
- pytz
2525

2626
# optional dependencies

ci/deps/actions-39-minimum_versions.yaml

+2-2
@@ -8,7 +8,7 @@ dependencies:
88

99
# build dependencies
1010
- versioneer[toml]
11-
- cython>=0.29.33, <3.0.3
11+
- cython>=0.29.33
1212
- meson[ninja]=1.2.1
1313
- meson-python=0.13.1
1414

@@ -22,7 +22,7 @@ dependencies:
2222

2323
# required dependencies
2424
- python-dateutil=2.8.2
25-
- numpy=1.22.4
25+
- numpy=1.22.4, <2
2626
- pytz=2020.1
2727

2828
# optional dependencies

ci/deps/actions-39.yaml

+2-2
@@ -6,7 +6,7 @@ dependencies:
66

77
# build dependencies
88
- versioneer[toml]
9-
- cython>=0.29.33, <3.0.3
9+
- cython>=0.29.33
1010
- meson[ninja]=1.2.1
1111
- meson-python=0.13.1
1212

@@ -20,7 +20,7 @@ dependencies:
2020

2121
# required dependencies
2222
- python-dateutil
23-
- numpy
23+
- numpy<2
2424
- pytz
2525

2626
# optional dependencies

ci/deps/actions-pypy-39.yaml

+2-2
@@ -9,7 +9,7 @@ dependencies:
99

1010
# build dependencies
1111
- versioneer[toml]
12-
- cython>=0.29.33, <3.0.3
12+
- cython>=0.29.33
1313
- meson[ninja]=1.2.1
1414
- meson-python=0.13.1
1515

@@ -21,7 +21,7 @@ dependencies:
2121
- hypothesis>=6.46.1
2222

2323
# required
24-
- numpy
24+
- numpy<2
2525
- python-dateutil
2626
- pytz
2727
- pip:

ci/deps/circle-310-arm64.yaml

+2-2
@@ -6,7 +6,7 @@ dependencies:
66

77
# build dependencies
88
- versioneer[toml]
9-
- cython>=0.29.33, <3.0.3
9+
- cython>=0.29.33
1010
- meson[ninja]=1.2.1
1111
- meson-python=0.13.1
1212

@@ -20,7 +20,7 @@ dependencies:
2020

2121
# required dependencies
2222
- python-dateutil
23-
- numpy
23+
- numpy<2
2424
- pytz
2525

2626
# optional dependencies

ci/run_tests.sh

+2-1
@@ -10,7 +10,8 @@ echo PYTHONHASHSEED=$PYTHONHASHSEED
1010

1111
COVERAGE="-s --cov=pandas --cov-report=xml --cov-append --cov-config=pyproject.toml"
1212

13-
PYTEST_CMD="MESONPY_EDITABLE_VERBOSE=1 PYTHONDEVMODE=1 PYTHONWARNDEFAULTENCODING=1 pytest -r fEs -n $PYTEST_WORKERS --dist=loadfile $TEST_ARGS $COVERAGE $PYTEST_TARGET"
13+
# TODO: Support NEP 50 and remove NPY_PROMOTION_STATE
14+
PYTEST_CMD="NPY_PROMOTION_STATE=legacy MESONPY_EDITABLE_VERBOSE=1 PYTHONDEVMODE=1 PYTHONWARNDEFAULTENCODING=1 pytest -r fEs -n $PYTEST_WORKERS --dist=loadfile $TEST_ARGS $COVERAGE $PYTEST_TARGET"
1415

1516
if [[ "$PATTERN" ]]; then
1617
PYTEST_CMD="$PYTEST_CMD -m \"$PATTERN\""
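
Note on the change above: prefixing the pytest command with `NPY_PROMOTION_STATE=legacy` sets the variable only for the test process, not for the surrounding shell. The sketch below mirrors that scoping from Python using only the standard library. It is an illustration under assumptions: the child command is a hypothetical stand-in for pytest, and the snippet does not show how NumPy itself interprets the variable.

```python
import os
import subprocess
import sys

# Build an environment for the child process only; the parent's
# os.environ is not modified (same scoping as `VAR=value cmd` in bash).
child_env = {**os.environ, "NPY_PROMOTION_STATE": "legacy"}

# Hypothetical stand-in for the pytest invocation: a child interpreter
# that reports the value it sees for the variable.
result = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ.get('NPY_PROMOTION_STATE'))"],
    env=child_env,
    capture_output=True,
    text=True,
    check=True,
)

child_value = result.stdout.strip()
print(child_value)
```

Keeping the variable scoped to the test command means it can be dropped in one place once the TODO about NEP 50 support is resolved.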

doc/source/user_guide/copy_on_write.rst

+75-55
@@ -7,8 +7,8 @@ Copy-on-Write (CoW)
77
*******************
88

99
Copy-on-Write was first introduced in version 1.5.0. Starting from version 2.0 most of the
10-
optimizations that become possible through CoW are implemented and supported. A complete list
11-
can be found at :ref:`Copy-on-Write optimizations <copy_on_write.optimizations>`.
10+
optimizations that become possible through CoW are implemented and supported. All possible
11+
optimizations are supported starting from pandas 2.1.
1212

1313
We expect that CoW will be enabled by default in version 3.0.
1414

@@ -154,66 +154,86 @@ With copy on write this can be done by using ``loc``.
154154
155155
df.loc[df["bar"] > 5, "foo"] = 100
156156
157+
Read-only NumPy arrays
158+
----------------------
159+
160+
Accessing the underlying NumPy array of a DataFrame will return a read-only array if the array
161+
shares data with the initial DataFrame:
162+
163+
The array is a copy if the initial DataFrame consists of more than one array:
164+
165+
166+
.. ipython:: python
167+
168+
df = pd.DataFrame({"a": [1, 2], "b": [1.5, 2.5]})
169+
df.to_numpy()
170+
171+
The array shares data with the DataFrame if the DataFrame consists of only one NumPy array:
172+
173+
.. ipython:: python
174+
175+
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
176+
df.to_numpy()
177+
178+
This array is read-only, which means that it can't be modified inplace:
179+
180+
.. ipython:: python
181+
:okexcept:
182+
183+
arr = df.to_numpy()
184+
arr[0, 0] = 100
185+
186+
The same holds true for a Series, since a Series always consists of a single array.
187+
188+
There are two potential solutions to this:
189+
190+
- Trigger a copy manually if you want to avoid updating DataFrames that share memory with your array.
191+
- Make the array writeable. This is a more performant solution but circumvents Copy-on-Write rules, so
192+
it should be used with caution.
193+
194+
.. ipython:: python
195+
196+
arr = df.to_numpy()
197+
arr.flags.writeable = True
198+
arr[0, 0] = 100
199+
arr
200+
201+
Patterns to avoid
202+
-----------------
203+
204+
No defensive copy will be performed if two objects share the same data while
205+
you are modifying one object inplace.
206+
207+
.. ipython:: python
208+
209+
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
210+
df2 = df.reset_index()
211+
df2.iloc[0, 0] = 100
212+
213+
This creates two objects that share data and thus the setitem operation will trigger a
214+
copy. This is not necessary if the initial object ``df`` isn't needed anymore.
215+
Simply reassigning to the same variable will invalidate the reference that is
216+
held by the object.
217+
218+
.. ipython:: python
219+
220+
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
221+
df = df.reset_index()
222+
df.iloc[0, 0] = 100
223+
224+
No copy is necessary in this example.
225+
Creating multiple references keeps unnecessary references alive
226+
and thus will hurt performance with Copy-on-Write.
227+
157228
.. _copy_on_write.optimizations:
158229

159230
Copy-on-Write optimizations
160231
---------------------------
161232

162233
A new lazy copy mechanism that defers the copy until the object in question is modified
163234
and only if this object shares data with another object. This mechanism was added to
164-
following methods:
165-
166-
- :meth:`DataFrame.reset_index` / :meth:`Series.reset_index`
167-
- :meth:`DataFrame.set_index`
168-
- :meth:`DataFrame.set_axis` / :meth:`Series.set_axis`
169-
- :meth:`DataFrame.set_flags` / :meth:`Series.set_flags`
170-
- :meth:`DataFrame.rename_axis` / :meth:`Series.rename_axis`
171-
- :meth:`DataFrame.reindex` / :meth:`Series.reindex`
172-
- :meth:`DataFrame.reindex_like` / :meth:`Series.reindex_like`
173-
- :meth:`DataFrame.assign`
174-
- :meth:`DataFrame.drop`
175-
- :meth:`DataFrame.dropna` / :meth:`Series.dropna`
176-
- :meth:`DataFrame.select_dtypes`
177-
- :meth:`DataFrame.align` / :meth:`Series.align`
178-
- :meth:`Series.to_frame`
179-
- :meth:`DataFrame.rename` / :meth:`Series.rename`
180-
- :meth:`DataFrame.add_prefix` / :meth:`Series.add_prefix`
181-
- :meth:`DataFrame.add_suffix` / :meth:`Series.add_suffix`
182-
- :meth:`DataFrame.drop_duplicates` / :meth:`Series.drop_duplicates`
183-
- :meth:`DataFrame.droplevel` / :meth:`Series.droplevel`
184-
- :meth:`DataFrame.reorder_levels` / :meth:`Series.reorder_levels`
185-
- :meth:`DataFrame.between_time` / :meth:`Series.between_time`
186-
- :meth:`DataFrame.filter` / :meth:`Series.filter`
187-
- :meth:`DataFrame.head` / :meth:`Series.head`
188-
- :meth:`DataFrame.tail` / :meth:`Series.tail`
189-
- :meth:`DataFrame.isetitem`
190-
- :meth:`DataFrame.pipe` / :meth:`Series.pipe`
191-
- :meth:`DataFrame.pop` / :meth:`Series.pop`
192-
- :meth:`DataFrame.replace` / :meth:`Series.replace`
193-
- :meth:`DataFrame.shift` / :meth:`Series.shift`
194-
- :meth:`DataFrame.sort_index` / :meth:`Series.sort_index`
195-
- :meth:`DataFrame.sort_values` / :meth:`Series.sort_values`
196-
- :meth:`DataFrame.squeeze` / :meth:`Series.squeeze`
197-
- :meth:`DataFrame.swapaxes`
198-
- :meth:`DataFrame.swaplevel` / :meth:`Series.swaplevel`
199-
- :meth:`DataFrame.take` / :meth:`Series.take`
200-
- :meth:`DataFrame.to_timestamp` / :meth:`Series.to_timestamp`
201-
- :meth:`DataFrame.to_period` / :meth:`Series.to_period`
202-
- :meth:`DataFrame.truncate`
203-
- :meth:`DataFrame.iterrows`
204-
- :meth:`DataFrame.tz_convert` / :meth:`Series.tz_localize`
205-
- :meth:`DataFrame.fillna` / :meth:`Series.fillna`
206-
- :meth:`DataFrame.interpolate` / :meth:`Series.interpolate`
207-
- :meth:`DataFrame.ffill` / :meth:`Series.ffill`
208-
- :meth:`DataFrame.bfill` / :meth:`Series.bfill`
209-
- :meth:`DataFrame.where` / :meth:`Series.where`
210-
- :meth:`DataFrame.infer_objects` / :meth:`Series.infer_objects`
211-
- :meth:`DataFrame.astype` / :meth:`Series.astype`
212-
- :meth:`DataFrame.convert_dtypes` / :meth:`Series.convert_dtypes`
213-
- :meth:`DataFrame.join`
214-
- :meth:`DataFrame.eval`
215-
- :func:`concat`
216-
- :func:`merge`
235+
methods that don't require a copy of the underlying data. Popular examples are :meth:`DataFrame.drop` for ``axis=1``
236+
and :meth:`DataFrame.rename`.
217237

218238
These methods return views when Copy-on-Write is enabled, which provides a significant
219239
performance improvement compared to the regular execution.
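
Note on the documentation added in this file: the new text lists "trigger a copy manually" as the first way to work with a read-only array but only demonstrates the `writeable` flag. The sketch below shows the copy route, plus the lazy-copy behaviour described for `DataFrame.rename`. It assumes pandas 2.1 or later with Copy-on-Write enabled through the `mode.copy_on_write` option; the variable names are illustrative and not part of the commit.

```python
import numpy as np
import pandas as pd

# Assumption: pandas >= 2.1. On versions where Copy-on-Write is always
# enabled, this call is redundant.
pd.set_option("mode.copy_on_write", True)

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# A single-dtype DataFrame can share memory with the array returned by
# to_numpy(); under Copy-on-Write that array is expected to be read-only.
arr = df.to_numpy()
try:
    arr[0, 0] = 100
    write_blocked = False
except ValueError:
    write_blocked = True

# First option from the docs: take an explicit copy before writing.
# The copy owns its data, so df is not affected.
safe = df.to_numpy().copy()
safe[0, 0] = 100

# Lazy copy: rename() does not need to copy the data up front. Writing to
# the result triggers the copy, so df keeps its original values.
renamed = df.rename(columns={"a": "x"})
shared_before_write = np.shares_memory(df["a"].to_numpy(), renamed["x"].to_numpy())
renamed.iloc[0, 0] = -1

print(write_blocked, shared_before_write, df.loc[0, "a"])
```

The explicit copy costs memory but keeps the Copy-on-Write guarantees intact, which is why the docs describe the `writeable` flag route as the one to use with caution.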

doc/source/whatsnew/index.rst

+1
@@ -16,6 +16,7 @@ Version 2.1
1616
.. toctree::
1717
:maxdepth: 2
1818

19+
v2.1.3
1920
v2.1.2
2021
v2.1.1
2122
v2.1.0

doc/source/whatsnew/v2.1.1.rst

+1-1
@@ -52,4 +52,4 @@ Other
5252
Contributors
5353
~~~~~~~~~~~~
5454

55-
.. contributors:: v2.1.0..v2.1.1|HEAD
55+
.. contributors:: v2.1.0..v2.1.1
