-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: groupby.transform doesn't raise on missing arguments #26840
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Makes sense. Would take a PR for sure if you’d like to try one.
…Sent from my iPhone
On Jun 13, 2019, at 9:17 PM, pilkibun ***@***.***> wrote:
For Example, fillna requires method kwarg to be specified. the frame method
raises an exception if it is missing, but transform silently returns an empty frame
In [22]: df=pd.DataFrame(dict(price=[10,10,20,20,30,30],color=[10,10,20,20,30,30],cost=(100,200,300,400,500,600)))
...: g=df.groupby(['price'])
...: try:
...: df.fillna()
...: except:
...: print("Raised: OK")
...: try:
...: g.fillna()
...: except:
...: print("Raised: OK")
...: else:
...: print("Raised: No, Failed")
...: try:
...: g.transform('fillna')
...: except:
...: print("Raised: OK")
...: else:
...: print("Raised: No, Failed")
Raised: OK
Raised: No, Failed
Raised: No, Failed
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub, or mute the thread.
|
Looks to work on master now. Could use a test
|
take |
Hey @mroeschke I want to try to take this on. This will be my first open source contribution, so I just want to make sure I do this right. I do not see any specific file devoted to testing It looks to me like a test would best be housed at I don't see anything already testing this condition, or a similar test that this could be added to. Based on that, it seems like it should be its own test case. My inclination is to write one test function that requires positional arguments, one that requires keyword arguments, and one that requires both. That way, the test is not coupled to the What is the thought on adding such test functions as fixtures in |
|
@mroeschke Sorry for length of the essay to follow The Issue Reported HereThe specific issue reported here has to do with using a built-in function referenced with a string. Those functions are defined in Crucially, the transform method takes totally different routes depending on whether the function it receives is a function or a string. You can see that on lines 1431-1439 in
If the input is a string like In my opinion, that is the best way for it to be designed. However, it means that any test that tests the use of string inputs into Tangentially Related: User Generated FunctionsThe situation is much stickier if a function is passed into Firstly, the method starts by defining two functions which wrap the original function. They are called
Possible Problem 1: Original Function Stripped of MetadataGoing into the It may be beneficial to rewrite these declarations as proper decorated functions. Something like the following (func is defined in the outer scope):
Possible Problem 2: Fast and Slow Paths Accept Different InputsAs you can see, the In the case This would be fine with simple operations that can be broadcast across either a DataFrame or a Series. However, I don't think it would support more complex operations. For example, if I wanted to index a given column for every group I transformed, I would be unable to do that since I would get a It may be beneficial to rewrite the Possible Problem 3: THE BIGGEST PROBLEM I FOUNDOnce The result from The problem is,
In other words: both paths are currently being run on every group. Every transformation calculation is being done twice. It's possible there was supposed to be logic to only run Back to the Original Question: TestingAside from all the issues mentioned above, we may also benefit from testing to verify that user generated functions raise errors properly, even if we don't verify the built-in "string" functions like Right now the
Ultimately, it seems to me like tests enforcing the behavior that I described above (if that is what we desire the behavior to be) could be pretty beneficial. Definitely more beneficial than I think writing tests for the "string" methods would be. Summary/TL;DRWe could write tests for built-in functions like Built-in functions and user-generated functions take completely different routes, and there are some possible problems I identified with the user-generated route:
Ultimately, tests to enforce the current behavior for user-generated functions could be useful. Possibly more useful than tests for the built-in functions. Thanks if you read that whole thing!! Let me know what you think. |
Problem 3 should no longer exist: #41598 |
For Example, fillna requires
method
kwarg to be specified. the frame methodraises an exception if it is missing, but transform silently returns an empty frame
The text was updated successfully, but these errors were encountered: