Commit a614b7a

PERF: ArrowExtensionArray.__iter__ (#49825)
* faster ArrowExtensionArray.__iter__
* gh ref
1 parent 9691bba commit a614b7a

2 files changed: +14 -0 lines changed

doc/source/whatsnew/v2.0.0.rst

+1
@@ -600,6 +600,7 @@ Performance improvements
 - Performance improvement in :meth:`DataFrame.join` when joining on a subset of a :class:`MultiIndex` (:issue:`48611`)
 - Performance improvement for :meth:`MultiIndex.intersection` (:issue:`48604`)
 - Performance improvement in ``var`` for nullable dtypes (:issue:`48379`).
+- Performance improvement when iterating over a :class:`~arrays.ArrowExtensionArray` (:issue:`49825`).
 - Performance improvements to :func:`read_sas` (:issue:`47403`, :issue:`47405`, :issue:`47656`, :issue:`48502`)
 - Memory improvement in :meth:`RangeIndex.sort_values` (:issue:`48801`)
 - Performance improvement in :class:`DataFrameGroupBy` and :class:`SeriesGroupBy` when ``by`` is a categorical type and ``sort=False`` (:issue:`48976`)

pandas/core/arrays/arrow/array.py

+13
@@ -13,6 +13,7 @@
     ArrayLike,
     Dtype,
     FillnaOptions,
+    Iterator,
     PositionalIndexer,
     SortKind,
     TakeIndexer,
@@ -334,6 +335,18 @@ def __getitem__(self, item: PositionalIndexer):
             else:
                 return scalar
 
+    def __iter__(self) -> Iterator[Any]:
+        """
+        Iterate over elements of the array.
+        """
+        na_value = self._dtype.na_value
+        for value in self._data:
+            val = value.as_py()
+            if val is None:
+                yield na_value
+            else:
+                yield val
+
     def __arrow_array__(self, type=None):
         """Convert myself to a pyarrow ChunkedArray."""
         return self._data
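
The added `__iter__` walks the underlying Arrow data once, converts each scalar with `as_py()`, and yields the dtype's NA value wherever Arrow reports a null (`None`). The diff does not explain where the speedup comes from, so the following is only a minimal, standalone sketch of the iteration pattern. `FakeScalar`, `NA`, and `iter_with_na` are hypothetical stand-ins so the snippet runs without pandas or pyarrow; they are not part of either library.

```python
from typing import Any, Iterable, Iterator


class FakeScalar:
    """Hypothetical stand-in for an Arrow scalar: exposes only ``as_py()``."""

    def __init__(self, value: Any) -> None:
        self._value = value

    def as_py(self) -> Any:
        return self._value


class _NAType:
    """Hypothetical missing-value sentinel, playing the role of the dtype's na_value."""

    def __repr__(self) -> str:
        return "<NA>"


NA = _NAType()


def iter_with_na(data: Iterable[Any], na_value: Any) -> Iterator[Any]:
    """Yield Python values from ``data``, replacing nulls (None) with ``na_value``."""
    for value in data:
        val = value.as_py()
        if val is None:
            yield na_value
        else:
            yield val


# Assumed sample data: three scalars, the middle one null.
data = [FakeScalar(1), FakeScalar(None), FakeScalar(3)]
result = list(iter_with_na(data, NA))
print(result)  # prints [1, <NA>, 3]
```

The NA value is looked up once before the loop, mirroring the `na_value = self._dtype.na_value` line in the diff, so the per-element work is one `as_py()` call and one `is None` check.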
