Commit 364b8a1

Merge branch 'main' into fix-ts-lineplot-padding
2 parents 4609baf + 9eb1553 commit 364b8a1

21 files changed

+192
-378
lines changed

ci/code_checks.sh

+1-1
@@ -155,7 +155,6 @@ if [[ -z "$CHECK" || "$CHECK" == "docstrings" ]]; then
         pandas.Period.ordinal\
         pandas.PeriodIndex.freq\
         pandas.PeriodIndex.qyear\
-        pandas.Series.dt\
         pandas.Series.dt.as_unit\
         pandas.Series.dt.freq\
         pandas.Series.dt.qyear\
@@ -437,6 +436,7 @@ if [[ -z "$CHECK" || "$CHECK" == "docstrings" ]]; then
         pandas.Series.cat.rename_categories\
         pandas.Series.cat.reorder_categories\
         pandas.Series.cat.set_categories\
+        pandas.Series.dt `# Accessors are implemented as classes, but we do not document the Parameters section` \
         pandas.Series.dt.as_unit\
         pandas.Series.dt.ceil\
         pandas.Series.dt.day_name\

doc/source/user_guide/copy_on_write.rst

+52-63
@@ -8,16 +8,12 @@ Copy-on-Write (CoW)
 
 .. note::
 
-    Copy-on-Write will become the default in pandas 3.0. We recommend
-    :ref:`turning it on now <copy_on_write_enabling>`
-    to benefit from all improvements.
+    Copy-on-Write is now the default with pandas 3.0.
 
 Copy-on-Write was first introduced in version 1.5.0. Starting from version 2.0 most of the
 optimizations that become possible through CoW are implemented and supported. All possible
 optimizations are supported starting from pandas 2.1.
 
-CoW will be enabled by default in version 3.0.
-
 CoW will lead to more predictable behavior since it is not possible to update more than
 one object with one statement, e.g. indexing operations or methods won't have side-effects. Additionally, through
 delaying copies as long as possible, the average performance and memory usage will improve.
@@ -29,21 +25,25 @@ pandas indexing behavior is tricky to understand. Some operations return views w
 other return copies. Depending on the result of the operation, mutating one object
 might accidentally mutate another:
 
-.. ipython:: python
+.. code-block:: ipython
 
-    df = pd.DataFrame({"foo": [1, 2, 3], "bar": [4, 5, 6]})
-    subset = df["foo"]
-    subset.iloc[0] = 100
-    df
+    In [1]: df = pd.DataFrame({"foo": [1, 2, 3], "bar": [4, 5, 6]})
+    In [2]: subset = df["foo"]
+    In [3]: subset.iloc[0] = 100
+    In [4]: df
+    Out[4]:
+       foo  bar
+    0  100    4
+    1    2    5
+    2    3    6
 
-Mutating ``subset``, e.g. updating its values, also updates ``df``. The exact behavior is
+
+Mutating ``subset``, e.g. updating its values, also updated ``df``. The exact behavior was
 hard to predict. Copy-on-Write solves accidentally modifying more than one object,
-it explicitly disallows this. With CoW enabled, ``df`` is unchanged:
+it explicitly disallows this. ``df`` is unchanged:
 
 .. ipython:: python
 
-    pd.options.mode.copy_on_write = True
-
     df = pd.DataFrame({"foo": [1, 2, 3], "bar": [4, 5, 6]})
     subset = df["foo"]
     subset.iloc[0] = 100
@@ -57,13 +57,13 @@ applications.
 Migrating to Copy-on-Write
 --------------------------
 
-Copy-on-Write will be the default and only mode in pandas 3.0. This means that users
+Copy-on-Write is the default and only mode in pandas 3.0. This means that users
 need to migrate their code to be compliant with CoW rules.
 
-The default mode in pandas will raise warnings for certain cases that will actively
+The default mode in pandas < 3.0 raises warnings for certain cases that will actively
 change behavior and thus change user intended behavior.
 
-We added another mode, e.g.
+pandas 2.2 has a warning mode
 
 .. code-block:: python
 
@@ -84,7 +84,6 @@ The following few items describe the user visible changes:
 
 **Accessing the underlying array of a pandas object will return a read-only view**
 
-
 .. ipython:: python
 
     ser = pd.Series([1, 2, 3])
@@ -101,16 +100,21 @@ for more details.
 
 **Only one pandas object is updated at once**
 
-The following code snippet updates both ``df`` and ``subset`` without CoW:
+The following code snippet updated both ``df`` and ``subset`` without CoW:
 
-.. ipython:: python
+.. code-block:: ipython
 
-    df = pd.DataFrame({"foo": [1, 2, 3], "bar": [4, 5, 6]})
-    subset = df["foo"]
-    subset.iloc[0] = 100
-    df
+    In [1]: df = pd.DataFrame({"foo": [1, 2, 3], "bar": [4, 5, 6]})
+    In [2]: subset = df["foo"]
+    In [3]: subset.iloc[0] = 100
+    In [4]: df
+    Out[4]:
+       foo  bar
+    0  100    4
+    1    2    5
+    2    3    6
 
-This won't be possible anymore with CoW, since the CoW rules explicitly forbid this.
+This is not possible anymore with CoW, since the CoW rules explicitly forbid this.
 This includes updating a single column as a :class:`Series` and relying on the change
 propagating back to the parent :class:`DataFrame`.
 This statement can be rewritten into a single statement with ``loc`` or ``iloc`` if
@@ -146,7 +150,7 @@ A different alternative would be to not use ``inplace``:
 
 **Constructors now copy NumPy arrays by default**
 
-The Series and DataFrame constructors will now copy NumPy array by default when not
+The Series and DataFrame constructors now copies a NumPy array by default when not
 otherwise specified. This was changed to avoid mutating a pandas object when the
 NumPy array is changed inplace outside of pandas. You can set ``copy=False`` to
 avoid this copy.
@@ -162,7 +166,7 @@ that shares data with another DataFrame or Series object inplace.
 This avoids side-effects when modifying values and hence, most methods can avoid
 actually copying the data and only trigger a copy when necessary.
 
-The following example will operate inplace with CoW:
+The following example will operate inplace:
 
 .. ipython:: python
 
@@ -207,15 +211,17 @@ listed in :ref:`Copy-on-Write optimizations <copy_on_write.optimizations>`.
 
 Previously, when operating on views, the view and the parent object was modified:
 
-.. ipython:: python
-
-    with pd.option_context("mode.copy_on_write", False):
-        df = pd.DataFrame({"foo": [1, 2, 3], "bar": [4, 5, 6]})
-        view = df[:]
-        df.iloc[0, 0] = 100
+.. code-block:: ipython
 
-    df
-    view
+    In [1]: df = pd.DataFrame({"foo": [1, 2, 3], "bar": [4, 5, 6]})
+    In [2]: subset = df["foo"]
+    In [3]: subset.iloc[0] = 100
+    In [4]: df
+    Out[4]:
+       foo  bar
+    0  100    4
+    1    2    5
+    2    3    6
 
 CoW triggers a copy when ``df`` is changed to avoid mutating ``view`` as well:
 
@@ -236,16 +242,19 @@ Chained Assignment
 Chained assignment references a technique where an object is updated through
 two subsequent indexing operations, e.g.
 
-.. ipython:: python
-    :okwarning:
+.. code-block:: ipython
 
-    with pd.option_context("mode.copy_on_write", False):
-        df = pd.DataFrame({"foo": [1, 2, 3], "bar": [4, 5, 6]})
-        df["foo"][df["bar"] > 5] = 100
-    df
+    In [1]: df = pd.DataFrame({"foo": [1, 2, 3], "bar": [4, 5, 6]})
+    In [2]: df["foo"][df["bar"] > 5] = 100
+    In [3]: df
+    Out[3]:
+       foo  bar
+    0  100    4
+    1    2    5
+    2    3    6
 
-The column ``foo`` is updated where the column ``bar`` is greater than 5.
-This violates the CoW principles though, because it would have to modify the
+The column ``foo`` was updated where the column ``bar`` is greater than 5.
+This violated the CoW principles though, because it would have to modify the
 view ``df["foo"]`` and ``df`` in one step. Hence, chained assignment will
 consistently never work and raise a ``ChainedAssignmentError`` warning
 with CoW enabled:
@@ -272,7 +281,6 @@ shares data with the initial DataFrame:
 
 The array is a copy if the initial DataFrame consists of more than one array:
 
-
 .. ipython:: python
 
     df = pd.DataFrame({"a": [1, 2], "b": [1.5, 2.5]})
@@ -347,22 +355,3 @@ and :meth:`DataFrame.rename`.
 
 These methods return views when Copy-on-Write is enabled, which provides a significant
 performance improvement compared to the regular execution.
-
-.. _copy_on_write_enabling:
-
-How to enable CoW
------------------
-
-Copy-on-Write can be enabled through the configuration option ``copy_on_write``. The option can
-be turned on __globally__ through either of the following:
-
-.. ipython:: python
-
-    pd.set_option("mode.copy_on_write", True)
-
-    pd.options.mode.copy_on_write = True
-
-.. ipython:: python
-    :suppress:
-
-    pd.options.mode.copy_on_write = False
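
For readers checking the behavior the updated guide describes, the following is a minimal sketch, not part of this commit. It assumes that ``mode.copy_on_write`` is either settable (pandas 2.x) or that CoW is already the default (pandas 3.0, as the changed note states). The ``try``/``except`` around ``pd.set_option`` and the names ``parent_values``, ``subset_values`` and ``arr_is_writeable`` are illustrative only.

```python
import warnings

import pandas as pd

# Assumption: on pandas 2.x this switches CoW on; on versions where CoW is
# already the default, the option may be a no-op, warn, or be missing.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    try:
        pd.set_option("mode.copy_on_write", True)
    except (AttributeError, KeyError):
        # pandas' OptionError derives from both; the option is unavailable here.
        pass

df = pd.DataFrame({"foo": [1, 2, 3], "bar": [4, 5, 6]})
subset = df["foo"]
subset.iloc[0] = 100  # under CoW this copies first, so df is left untouched

parent_values = df["foo"].tolist()
subset_values = subset.tolist()

# Under CoW the underlying array is exposed as a read-only view.
ser = pd.Series([1, 2, 3])
arr = ser.to_numpy()
arr_is_writeable = bool(arr.flags.writeable)

print(parent_values, subset_values, arr_is_writeable)
```

With CoW active, the parent column should keep its original values while ``subset`` carries the update, which is the contrast to the static ``code-block:: ipython`` snippets that now show the old behavior.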
