PERF: Allow groupby transform with numba engine to be fully parallelizable #36240
How does this compare to non-parallel numba?
so a little better :)
can you add an asv which covers this (or do we have an existing one)?
pandas/core/groupby/generic.py
@@ -1362,13 +1358,23 @@ def _transform_general(
    @Appender(_transform_template)
    def transform(self, func, *args, engine=None, engine_kwargs=None, **kwargs):

        if maybe_use_numba(engine):
            if not callable(func):
maybe this check should actually be in _transform_with_numba to keep DRY (you have it above as well)
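A rough sketch of what that suggestion could look like, in plain Python rather than the actual pandas code. `transform_series` and `transform_frame` are hypothetical stand-ins for the two call sites, `maybe_use_numba` is reduced to a string comparison, the error message is illustrative only, and no numba JIT is involved. The point is just that the `callable` check lives in one place:

```python
from typing import Any, Callable, Dict, List, Optional


def maybe_use_numba(engine: Optional[str]) -> bool:
    # Simplified stand-in for the helper named in the diff.
    return engine == "numba"


def _transform_with_numba(values: List[Any], func: Any) -> List[Any]:
    # Single home for the check, so callers do not repeat it.
    if not callable(func):
        raise NotImplementedError(
            "illustrative message: the numba engine needs a callable"
        )
    # Plain Python loop; this sketch does not JIT-compile anything.
    return [func(v) for v in values]


def transform_series(
    values: List[Any], func: Any, engine: Optional[str] = None
) -> List[Any]:
    # Hypothetical Series-like entry point: no callable check here.
    if maybe_use_numba(engine):
        return _transform_with_numba(values, func)
    return [func(v) for v in values]


def transform_frame(
    columns: Dict[str, List[Any]], func: Any, engine: Optional[str] = None
) -> Dict[str, List[Any]]:
    # Hypothetical DataFrame-like entry point: also relies on the shared check.
    if maybe_use_numba(engine):
        return {
            name: _transform_with_numba(col, func)
            for name, col in columns.items()
        }
    return {name: [func(v) for v in col] for name, col in columns.items()}
```

With this shape, passing a string such as `"sum"` together with `engine="numba"` fails the same way from either entry point, and neither caller needs its own `if not callable(func)` branch.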
thanks @mroeschke
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff
New performance comparison with 10k groups