POC: Don't special case Python builtin and NumPy functions #53426

Closed
wants to merge 2 commits into from

rhshadrach
Member

@rhshadrach rhshadrach commented May 28, 2023

  • closes #xxxx (Replace xxxx with the GitHub issue number)
  • Tests added and passed if fixing a bug or adding a new feature
  • All code checks passed.
  • Added type annotations to new arguments/methods/functions.
  • Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

This PR is just to demonstrate the impact of #53425 on tests.

import numpy as np
import pandas as pd

np.random.seed(26)
size = 100000
df = pd.DataFrame(
    {
        "a": 0,
        "b": np.random.random(size),
    }
)
gb = df.groupby("a")

# This PR
print(gb.agg("sum").iloc[0, 0])
# 50150.538337372185
print(gb.agg(np.sum).iloc[0, 0])
# 50150.53833737219

# main
print(gb.agg("sum").iloc[0, 0])
# 50150.538337372185
print(gb.agg(np.sum).iloc[0, 0])
# 50150.538337372185

@rhshadrach rhshadrach added the Apply Apply, Aggregate, Transform, Map label May 28, 2023
builtins.max: np.maximum.reduce,
builtins.min: np.minimum.reduce,
}

_cython_table = {
Contributor

Can _cython_table also be removed?

Member Author

@rhshadrach rhshadrach May 29, 2023

We need to know whether the passed function is a NumPy function (maybe there is a better way to do this?) for backwards compatibility. This is because Series.apply treats these differently.
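
As a rough illustration of what "knowing whether the passed function is a NumPy function" could look like without a lookup table, here is a minimal sketch. The helper name `is_numpy_function` is hypothetical and is not part of pandas; it relies only on `np.ufunc` and the function's `__module__` attribute, and it is a heuristic rather than a replacement for `_cython_table`.

```python
import numpy as np


def is_numpy_function(func) -> bool:
    """Hypothetical helper: guess whether ``func`` comes from NumPy."""
    # ufuncs such as np.add are instances of np.ufunc.
    if isinstance(func, np.ufunc):
        return True
    # Plain NumPy functions such as np.sum report "numpy" (or a submodule)
    # as their module; some callables have no usable __module__ at all.
    module = getattr(func, "__module__", None) or ""
    return module == "numpy" or module.startswith("numpy.")


for name, func in [("np.sum", np.sum), ("np.add", np.add), ("builtins.sum", sum)]:
    print(name, is_numpy_function(func))
```

A heuristic like this can miss callables that wrap NumPy functions (for example lambdas or partials), which is one reason an explicit table may still be the safer choice for backwards compatibility.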


result = df.agg(np.std)
expected = Series({"A": 2.0, "B": 2.0}, dtype=float)
expected = Series({"A": expected_value, "B": expected_value}, dtype=float)
Contributor

The different ddof here will be important to communicate in a possible deprecation.
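
For context, the ddof difference comes from the two libraries' defaults: `np.std` uses `ddof=0`, while `Series.std` uses `ddof=1`. The small sketch below uses made-up values (not the data from the test above) to show how far apart the two results can be.

```python
import numpy as np
import pandas as pd

# Illustrative data only; not taken from the pandas test suite.
ser = pd.Series([1.0, 3.0, 5.0])

# pandas default: sample standard deviation (ddof=1).
pandas_std = ser.std()

# NumPy default on a plain ndarray: population standard deviation (ddof=0).
numpy_std = np.std(ser.to_numpy())

print(pandas_std)          # 2.0
print(numpy_std)           # roughly 1.633
print(ser.std(ddof=0))     # matches the NumPy default
```

Passing `ddof` explicitly on either side removes the ambiguity, which may be worth recommending in any deprecation message.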

result = df.agg(func, axis=axis)
tm.assert_frame_equal(result, expected)


Contributor

Eventually, these tests should probably check against the results of the NumPy functions?

Member Author

@rhshadrach rhshadrach May 29, 2023

Yea, I wasn't sure if this was just testing compatibility between e.g. "sum" and np.sum. It could be rewritten to test NumPy functions directly.
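
One possible shape for such a rewrite is sketched below. It is not the actual pandas test: the function name `check_numpy_reductions` and the data are made up, and it uses integer columns so that the comparison is exact whether pandas dispatches to its own method or calls the NumPy function directly.

```python
import warnings

import numpy as np
import pandas as pd


def check_numpy_reductions():
    """Hypothetical test: compare df.agg(func) with func applied per column."""
    df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})
    for func in (np.sum, np.min, np.max):
        with warnings.catch_warnings():
            # Some pandas versions emit a FutureWarning for NumPy callables.
            warnings.simplefilter("ignore", FutureWarning)
            result = df.agg(func)
        # Expected values come from NumPy itself, computed on raw ndarrays.
        expected = pd.Series({col: func(df[col].to_numpy()) for col in df.columns})
        pd.testing.assert_series_equal(result, expected)


check_numpy_reductions()
```

For floating-point data or for functions such as `np.std`, the expected values would differ between the two code paths, so a test like this would need to pick one behavior explicitly.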

With this test removed, we have 1308 tests (counting different parametrizations) that pass a NumPy function into agg/apply/transform (within pandas.core.apply). 306 of those are in tests.apply.

Contributor

@topper-123 topper-123 left a comment

Just a few comments because I read it anyway, though this is still in draft mode.

@github-actions
Contributor

This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this.

@github-actions github-actions bot added the Stale label Jun 29, 2023
@rhshadrach
Member Author

rhshadrach commented Jul 3, 2023

Closing in favor of #53974

@rhshadrach rhshadrach closed this Jul 3, 2023
@rhshadrach rhshadrach deleted the poc_replacing_funcs branch September 27, 2023 21:02
Labels
Apply Apply, Aggregate, Transform, Map Stale
2 participants