doc/source/whatsnew/v2.2.0.rst (+75 -75)
@@ -14,81 +14,6 @@ including other versions of pandas.
 Enhancements
 ~~~~~~~~~~~~
 
-.. _whatsnew_220.enhancements.calamine:
-
-Calamine engine for :func:`read_excel`
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-The ``calamine`` engine was added to :func:`read_excel`.
-It uses ``python-calamine``, which provides Python bindings for the Rust library `calamine <https://crates.io/crates/calamine>`__.
-This engine supports Excel files (``.xlsx``, ``.xlsm``, ``.xls``, ``.xlsb``) and OpenDocument spreadsheets (``.ods``) (:issue:`50395`).
-
-There are two advantages of this engine:
-
-1. Calamine is often faster than other engines, some benchmarks show results up to 5x faster than 'openpyxl', 20x - 'odf', 4x - 'pyxlsb', and 1.5x - 'xlrd'.
-   But, 'openpyxl' and 'pyxlsb' are faster in reading a few rows from large files because of lazy iteration over rows.
-2. Calamine supports the recognition of datetime in ``.xlsb`` files, unlike 'pyxlsb' which is the only other engine in pandas that can read ``.xlsb`` files.
+The ``Series.struct`` accessor provides attributes and methods for processing
+data with ``struct[pyarrow]`` dtype Series. For example,
+:meth:`Series.struct.explode` converts PyArrow structured data to a pandas
+DataFrame. (:issue:`54938`)
+
+.. ipython:: python
+
+    import pyarrow as pa
+    series = pd.Series(
+        [
+            {"project": "pandas", "version": "2.2.0"},
+            {"project": "numpy", "version": "1.25.2"},
+            {"project": "pyarrow", "version": "13.0.0"},
+        ],
+        dtype=pd.ArrowDtype(
+            pa.struct([
+                ("project", pa.string()),
+                ("version", pa.string()),
+            ])
+        ),
+    )
+    series.struct.explode()
+
+.. _whatsnew_220.enhancements.list_accessor:
+
+Series.list accessor for PyArrow list data
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The ``Series.list`` accessor provides attributes and methods for processing
+data with ``list[pyarrow]`` dtype Series. For example,
+:meth:`Series.list.__getitem__` allows indexing pyarrow lists in
+a Series. (:issue:`55323`)
+
+.. ipython:: python
+
+    import pyarrow as pa
+    series = pd.Series(
+        [
+            [1, 2, 3],
+            [4, 5],
+            [6],
+        ],
+        dtype=pd.ArrowDtype(
+            pa.list_(pa.int64())
+        ),
+    )
+    series.list[0]
+
+.. _whatsnew_220.enhancements.calamine:
+
+Calamine engine for :func:`read_excel`
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The ``calamine`` engine was added to :func:`read_excel`.
+It uses ``python-calamine``, which provides Python bindings for the Rust library `calamine <https://crates.io/crates/calamine>`__.
+This engine supports Excel files (``.xlsx``, ``.xlsm``, ``.xls``, ``.xlsb``) and OpenDocument spreadsheets (``.ods``) (:issue:`50395`).
+
+There are two advantages of this engine:
+
+1. Calamine is often faster than other engines, some benchmarks show results up to 5x faster than 'openpyxl', 20x - 'odf', 4x - 'pyxlsb', and 1.5x - 'xlrd'.
+   But, 'openpyxl' and 'pyxlsb' are faster in reading a few rows from large files because of lazy iteration over rows.
+2. Calamine supports the recognition of datetime in ``.xlsb`` files, unlike 'pyxlsb' which is the only other engine in pandas that can read ``.xlsb`` files.
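
Selecting the engine is a matter of passing ``engine="calamine"`` to ``read_excel``. The sketch below is a hedged round trip, not part of the changed file: it assumes pandas 2.2 or later, writes a temporary ``.xlsx`` with ``openpyxl`` and reads it back through ``python-calamine``, and skips the round trip when either optional dependency is missing. The file name ``demo.xlsx`` and the ``HAVE_DEPS`` guard are illustrative.

```python
import importlib.util
import tempfile
from pathlib import Path

import pandas as pd

# Both optional dependencies are needed for this round trip:
# openpyxl to write the file, python-calamine to read it back.
HAVE_DEPS = all(
    importlib.util.find_spec(name) is not None
    for name in ("openpyxl", "python_calamine")
)

expected = pd.DataFrame(
    {"project": ["pandas", "numpy"], "version": ["2.2.0", "1.25.2"]}
)
result = None

if HAVE_DEPS:
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "demo.xlsx"
        expected.to_excel(path, index=False)
        # The only change from a default read is the engine argument.
        result = pd.read_excel(path, engine="calamine")
    print(result)
```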