CLN: ArrowStringArray._cmp_method - use ChunkedArray.to_numpy() #46417

lukemanley · 2022-03-18T14:10:30Z

Tests added and passed if fixing a bug or adding a new feature
All code checks passed.

Addresses minor TODO comment in ArrowStringArray._cmp_method.

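It also performs better: in the micro-benchmark below, ChunkedArray.to_numpy() is roughly 12x faster than ChunkedArray.to_pandas().values: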

import numpy as np
import pyarrow as pa

ca = pa.chunked_array([
    np.random.choice([0, 1], 1000),
    np.random.choice([0, 1], 1000),
], type=pa.bool_())

%timeit ca.to_pandas().values
45.9 µs ± 2.75 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

%timeit ca.to_numpy()
3.8 µs ± 273 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

ArrowStringArray._cmp_method TODO

445203f

lukemanley changed the title ~~ArrowStringArray._cmp_method - use ChunkedArray.to_numpy()~~ CLN: ArrowStringArray._cmp_method - use ChunkedArray.to_numpy() Mar 18, 2022

lukemanley added the Arrow (pyarrow functionality) label Mar 18, 2022

simonjayhawkins added the Performance (Memory or execution speed performance) label Mar 18, 2022

simonjayhawkins added this to the 1.5 milestone Mar 18, 2022

jreback merged commit d2478f5 into pandas-dev:main Mar 18, 2022

lukemanley deleted the arrowstringarray-todo branch March 20, 2022 23:18

yehoshuadimarsky pushed a commit to yehoshuadimarsky/pandas that referenced this pull request Jul 13, 2022

ArrowStringArray._cmp_method TODO (pandas-dev#46417)

a4f5f22
