PERF: Series.apply is slower on single element dict compared with multi elements dict #56942

Alexia-I · 2024-01-18T12:18:22Z

Pandas version checks

I have checked that this issue has not already been reported.
I have confirmed this issue exists on the latest version of pandas.
I have confirmed this issue exists on the main branch of pandas.

Reproducible Example

Hello, I noticed that calling Series.apply on a Series built from a single-element dictionary takes more time than on a Series built from a multi-element dictionary. I'm curious whether this is due to something I did wrong.

import pandas as pd
import timeit
import random
import string
import time

# Create dictionary inputs
single_pair_dict = {'a': 42}
# Create a dictionary containing 10 key-value pairs
multi_pair_dict = {letter: random.randint(1, 100) for letter in string.ascii_lowercase[:10]}

# Self-defined apply function
def complex_operation(x):
    return x * x - x + 42

n = 10000
# Time Series construction plus the apply() call, repeated n times
times = time.time()
for i in range(n):
    pd.Series(single_pair_dict).apply(complex_operation)
time_now_single = time.time() - times

timem = time.time()
for i in range(n):
    pd.Series(multi_pair_dict).apply(complex_operation)
time_now_multi = time.time() - timem

# Print the results
print(f"Time for apply() on Series with a single key-value pair: {time_now_single} seconds")
print(f"Time for apply() on Series with multiple key-value pairs: {time_now_multi} seconds")

Time for apply() on Series with a single key-value pair: 1.1293659210205078 seconds
Time for apply() on Series with multiple key-value pairs: 1.0656580924987793 seconds

Installed Versions

INSTALLED VERSIONS

commit : d4c8d82
python : 3.9.18.final.0
python-bits : 64
OS : Linux
OS-release : 5.15.0-88-generic
Version : #98~20.04.1-Ubuntu SMP Mon Oct 9 16:43:45 UTC 2023
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 2.2.0rc0
numpy : 1.26.3
pytz : 2023.3.post1
dateutil : 2.8.2
setuptools : 68.2.2
pip : 23.3.1
Cython : 3.0.8
pytest : None
hypothesis : None
...
zstandard : None
tzdata : 2023.4
qtpy : None
pyqt5 : None

Prior Performance

No response


rhshadrach · 2024-01-18T21:39:56Z

Thanks for the report. I can't reproduce this on main. Can you reliably reproduce the results you posted?

Time for apply() on Series with a single key-value pair: 0.777146577835083 seconds
Time for apply() on Series with multiple key-value pairs: 0.80838942527771 seconds

Also confirmed using timeit on just the apply operation:

import random
import string

import pandas as pd

# Create dictionary inputs
single_pair_dict = {'a': 42}
# Create a dictionary containing 10 key-value pairs
multi_pair_dict = {letter: random.randint(1, 100) for letter in string.ascii_lowercase[:10]}

# Self-defined apply function
def complex_operation(x):
    return x * x - x + 42

ser1 = pd.Series(single_pair_dict)
%timeit ser1.apply(complex_operation)
# 25 µs ± 94.4 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

ser2 = pd.Series(multi_pair_dict)
%timeit ser2.apply(complex_operation)
# 26.8 µs ± 93.7 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
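For anyone re-running this outside IPython, the following is a minimal sketch of a comparable measurement using the standard library's timeit.repeat. It builds each Series once, so only the apply call is timed, and it reports the spread across repeats so a difference of a few percent can be judged against run-to-run noise. The helper name time_apply and the number/repeat values are illustrative assumptions, not part of the original report.

```python
import random
import string
import timeit

import pandas as pd


def complex_operation(x):
    return x * x - x + 42


def time_apply(ser, number=1000, repeat=5):
    """Return per-call timings in seconds for ser.apply, one value per repeat.

    time_apply is a hypothetical helper for this sketch; number and repeat
    are arbitrary defaults chosen to keep the run short.
    """
    totals = timeit.repeat(
        lambda: ser.apply(complex_operation), number=number, repeat=repeat
    )
    return [total / number for total in totals]


# Build each Series once so construction cost is excluded from the timing
single = pd.Series({"a": 42})
multi = pd.Series(
    {letter: random.randint(1, 100) for letter in string.ascii_lowercase[:10]}
)

for name, ser in [("single", single), ("multi", multi)]:
    per_call = time_apply(ser)
    print(
        f"{name}: best {min(per_call) * 1e6:.1f} µs, "
        f"worst {max(per_call) * 1e6:.1f} µs per call"
    )
```

If the best-to-worst range of one case overlaps the other's, the two timings are probably not distinguishable on that machine.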

Alexia-I added the Needs Triage and Performance labels Jan 18, 2024

rhshadrach added the Needs Info and Apply labels and removed the Needs Triage label Jan 18, 2024

Alexia-I closed this as completed Jan 19, 2024
