Commit 4b6a80e

DOC: reoder whatsnew enhancements (#56196)
1 parent 52ddb2e commit 4b6a80e

1 file changed, +75 -75 lines changed

doc/source/whatsnew/v2.2.0.rst

@@ -14,81 +14,6 @@ including other versions of pandas.
 Enhancements
 ~~~~~~~~~~~~
 
-.. _whatsnew_220.enhancements.calamine:
-
-Calamine engine for :func:`read_excel`
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-The ``calamine`` engine was added to :func:`read_excel`.
-It uses ``python-calamine``, which provides Python bindings for the Rust library `calamine <https://crates.io/crates/calamine>`__.
-This engine supports Excel files (``.xlsx``, ``.xlsm``, ``.xls``, ``.xlsb``) and OpenDocument spreadsheets (``.ods``) (:issue:`50395`).
-
-There are two advantages of this engine:
-
-1. Calamine is often faster than other engines, some benchmarks show results up to 5x faster than 'openpyxl', 20x - 'odf', 4x - 'pyxlsb', and 1.5x - 'xlrd'.
-   But, 'openpyxl' and 'pyxlsb' are faster in reading a few rows from large files because of lazy iteration over rows.
-2. Calamine supports the recognition of datetime in ``.xlsb`` files, unlike 'pyxlsb' which is the only other engine in pandas that can read ``.xlsb`` files.
-
-.. code-block:: python
-
-    pd.read_excel("path_to_file.xlsb", engine="calamine")
-
-
-For more, see :ref:`io.calamine` in the user guide on IO tools.
-
-.. _whatsnew_220.enhancements.struct_accessor:
-
-Series.struct accessor to with PyArrow structured data
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-The ``Series.struct`` accessor provides attributes and methods for processing
-data with ``struct[pyarrow]`` dtype Series. For example,
-:meth:`Series.struct.explode` converts PyArrow structured data to a pandas
-DataFrame. (:issue:`54938`)
-
-.. ipython:: python
-
-    import pyarrow as pa
-    series = pd.Series(
-        [
-            {"project": "pandas", "version": "2.2.0"},
-            {"project": "numpy", "version": "1.25.2"},
-            {"project": "pyarrow", "version": "13.0.0"},
-        ],
-        dtype=pd.ArrowDtype(
-            pa.struct([
-                ("project", pa.string()),
-                ("version", pa.string()),
-            ])
-        ),
-    )
-    series.struct.explode()
-
-.. _whatsnew_220.enhancements.list_accessor:
-
-Series.list accessor for PyArrow list data
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-The ``Series.list`` accessor provides attributes and methods for processing
-data with ``list[pyarrow]`` dtype Series. For example,
-:meth:`Series.list.__getitem__` allows indexing pyarrow lists in
-a Series. (:issue:`55323`)
-
-.. ipython:: python
-
-    import pyarrow as pa
-    series = pd.Series(
-        [
-            [1, 2, 3],
-            [4, 5],
-            [6],
-        ],
-        dtype=pd.ArrowDtype(
-            pa.list_(pa.int64())
-        ),
-    )
-    series.list[0]
-
 .. _whatsnew_220.enhancements.adbc_support:
 
 ADBC Driver support in to_sql and read_sql
@@ -180,6 +105,81 @@ For a full list of ADBC drivers and their development status, see the `ADBC Driv
 Implementation Status <https://arrow.apache.org/adbc/current/driver/status.html>`_
 documentation.
 
+.. _whatsnew_220.enhancements.struct_accessor:
+
+Series.struct accessor to with PyArrow structured data
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The ``Series.struct`` accessor provides attributes and methods for processing
+data with ``struct[pyarrow]`` dtype Series. For example,
+:meth:`Series.struct.explode` converts PyArrow structured data to a pandas
+DataFrame. (:issue:`54938`)
+
+.. ipython:: python
+
+    import pyarrow as pa
+    series = pd.Series(
+        [
+            {"project": "pandas", "version": "2.2.0"},
+            {"project": "numpy", "version": "1.25.2"},
+            {"project": "pyarrow", "version": "13.0.0"},
+        ],
+        dtype=pd.ArrowDtype(
+            pa.struct([
+                ("project", pa.string()),
+                ("version", pa.string()),
+            ])
+        ),
+    )
+    series.struct.explode()
+
+.. _whatsnew_220.enhancements.list_accessor:
+
+Series.list accessor for PyArrow list data
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The ``Series.list`` accessor provides attributes and methods for processing
+data with ``list[pyarrow]`` dtype Series. For example,
+:meth:`Series.list.__getitem__` allows indexing pyarrow lists in
+a Series. (:issue:`55323`)
+
+.. ipython:: python
+
+    import pyarrow as pa
+    series = pd.Series(
+        [
+            [1, 2, 3],
+            [4, 5],
+            [6],
+        ],
+        dtype=pd.ArrowDtype(
+            pa.list_(pa.int64())
+        ),
+    )
+    series.list[0]
+
+.. _whatsnew_220.enhancements.calamine:
+
+Calamine engine for :func:`read_excel`
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The ``calamine`` engine was added to :func:`read_excel`.
+It uses ``python-calamine``, which provides Python bindings for the Rust library `calamine <https://crates.io/crates/calamine>`__.
+This engine supports Excel files (``.xlsx``, ``.xlsm``, ``.xls``, ``.xlsb``) and OpenDocument spreadsheets (``.ods``) (:issue:`50395`).
+
+There are two advantages of this engine:
+
+1. Calamine is often faster than other engines, some benchmarks show results up to 5x faster than 'openpyxl', 20x - 'odf', 4x - 'pyxlsb', and 1.5x - 'xlrd'.
+   But, 'openpyxl' and 'pyxlsb' are faster in reading a few rows from large files because of lazy iteration over rows.
+2. Calamine supports the recognition of datetime in ``.xlsb`` files, unlike 'pyxlsb' which is the only other engine in pandas that can read ``.xlsb`` files.
+
+.. code-block:: python
+
+    pd.read_excel("path_to_file.xlsb", engine="calamine")
+
+
+For more, see :ref:`io.calamine` in the user guide on IO tools.
+
 .. _whatsnew_220.enhancements.other:
 
 Other enhancements
