-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
API: Signature of UDF methods #40112
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Two thoughts
As many of these methods are very heavily used, I'm worried about the impact of having a FutureWarning for so many at once. I am wondering if a gradual transition may be more user friendly. |
For your list:
|
I'm now of the opinion these signatures should be:
Note the args / kwargs are not |
Maybe a more radical simplification of the signature. Instead of using func, args, kwargs, other_arg what if we used just: func, *, other_arg=value By dropping explicit My thinking is that this more dramatic change would simplify the API and reduce the risk of subtle bugs. In the current design, there’s always the potential for positional and keyword arguments to collide, possibly resulting in hard-to-debug issues. With this proposal, if there’s any mismatch or if a required argument isn’t provided correctly, the code will simply break immediately rather than silently introducing errors. That said, I remain cautious about whether this method will play nicely with JIT compilers or numpy ufuncs. It might require further investigation in those environments. Also, moving away from the standard Python idiom of How do you feel this over simplification will impact user experience? |
@simonjayhawkins - I find requiring users to use |
in #61128 (comment) @datapythonista wrote
Now while not wanting to block any progress, I felt that this was overlapping with the proposal here and that we should have agreement on your proposal here before we can consider adding missing args to other methods. My comment #40112 (comment) was more intended to restart this 4 year old discussion and not necessarily to not approve or agree other suggestions.
Now these two signatures are definitely clumsy IMO. if i have a function, say, def f(s, arg, kwarg=None):
... then to pass the required arg as an positional arg I would need to do s.agg(f,0,1) as I also need to pass the (unused) axis parameter of .agg as a positional arg to avoid whereas if i pass the arg for my function as a kwarg, I can simply write... s.agg(f,arg=1) and the required parameter for the function is passed and it is clearer and less error prone? So it makes me think that having signatures with Now, you originally proposed #40112 (comment)
to avoid the positional arguments after the keyword arguments. If we look at this differently and do not allow positional args to be passed to the function, your original proposal becomes... func=None, other_arg=value, **kwargs which is IMO better than you latter proposal of func, args, kwargs, other_arg sweeping up kwargs at the end seems neater and less of a change to existing methods. Now I appreciate that using **kwargs is not typing friendly and hence my proposal to remove that too! |
There are many signatures of pandas methods that take a UDF. If I've missed methods, please let me know and will update.
Below,
other_arg
is a placeholder for 1 or more keyword arguments, andvalue
is a placeholder for the default value.Signature:
func, other_arg=value, args=value, **kwargs
Signature:
func=None, other_arg=value, *args, **kwargs
Signature:
func, other_arg=value, *args, **kwargs
Signature:
func, *args, other_arg=Value, **kwargs
Signature:
func=None, *args, other_arg=Value, **kwargs
Signature:
func, *args, **kwargs
Signature:
arg, *args, **kwargs
The text was updated successfully, but these errors were encountered: