PERF: Cython 3.0 regression with frame constructor #55931

Merged

@rhshadrach rhshadrach commented Nov 13, 2023

  • closes #xxxx (Replace xxxx with the GitHub issue number)
  • Tests added and passed if fixing a bug or adding a new feature
  • All code checks passed.
  • Added type annotations to new arguments/methods/functions.
  • Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

Handles the 5 frame_ctor.FromDicts regressions in #55179

There is only one use of lib.fast_multiget:

val = lib.fast_multiget(val, oindex._values, default=np.nan)

and the keys argument passed there (oindex._values) is guaranteed to always have object dtype:

if oindex is None:
    oindex = index.astype("O")
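
For readers unfamiliar with the call, the lookup pattern can be pictured roughly as below. This is a minimal pure-Python sketch, not the pandas Cython implementation: `multiget_sketch` is a hypothetical name, and the plain list of labels only stands in for the object-dtype index values.

```python
import math


def multiget_sketch(mapping, keys, default=math.nan):
    """Hypothetical stand-in for a multi-key dict lookup.

    One dict probe per key; keys missing from `mapping` yield `default`.
    This illustrates the pattern only and is not pandas' implementation.
    """
    return [mapping.get(key, default) for key in keys]


# One inner dict of a nested-dict input (illustrative data).
column = {"a": 1.0, "c": 3.0}
# Stands in for the object-dtype index values (assumption for illustration).
row_labels = ["a", "b", "c"]

values = multiget_sketch(column, row_labels)
```

Because the keys are arbitrary Python objects (strings here), each probe is a generic dict lookup, which is consistent with the keys always arriving as object dtype.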

ASVs on this branch:

| Change   | Before [b2d9ec17] <perf_cython3_frame_dict_constructor~1>   | After [951c5bfa] <perf_cython3_frame_dict_constructor>   |   Ratio | Benchmark (Parameter)                               |
|----------|-------------------------------------------------------------|----------------------------------------------------------|---------|-----------------------------------------------------|
| -        | 17.2±0.2ms                                                  | 15.3±0.3ms                                               |    0.89 | frame_ctor.FromDicts.time_nested_dict               |
| -        | 13.2±0.2ms                                                  | 11.7±0.3ms                                               |    0.89 | frame_ctor.FromDicts.time_nested_dict_index_columns |
| -        | 13.1±0.2ms                                                  | 11.3±0.3ms                                               |    0.86 | frame_ctor.FromDicts.time_nested_dict_index         |

SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE INCREASED.
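
As a sanity check on the table above, the Ratio column appears to be the After time divided by the Before time. The sketch below recomputes it from the rounded timings shown; the assumption is that rounding the quotient to two decimals matches what was reported, and `ratio` is a helper name introduced here for illustration.

```python
# Timings in ms copied from the ASV table: (before, after, reported ratio).
timings = {
    "time_nested_dict": (17.2, 15.3, 0.89),
    "time_nested_dict_index_columns": (13.2, 11.7, 0.89),
    "time_nested_dict_index": (13.1, 11.3, 0.86),
}


def ratio(before_ms, after_ms):
    """Return after/before rounded to two decimals (values < 1 mean faster)."""
    return round(after_ms / before_ms, 2)


# Recompute each ratio from the rounded table values.
recomputed = {name: ratio(b, a) for name, (b, a, _) in timings.items()}
```

A ratio of 0.86 to 0.89 corresponds to roughly an 11 to 14 percent reduction in runtime on these three benchmarks.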

@rhshadrach added the Performance (memory or execution speed performance), DataFrame (DataFrame data structure), and Constructors (Series/DataFrame/Index/pd.array constructors) labels Nov 13, 2023
@rhshadrach rhshadrach requested a review from WillAyd as a code owner November 13, 2023 00:00
@WillAyd WillAyd left a comment

Lgtm

@WillAyd WillAyd merged commit 8df36a2 into pandas-dev:main Nov 13, 2023
WillAyd commented Nov 13, 2023

Thanks @rhshadrach nice find

@rhshadrach rhshadrach deleted the perf_cython3_frame_dict_constructor branch November 13, 2023 13:55