`DataFrame.apply` stub doesn't reflect default value of `axis` parameter #393

debonte · 2022-10-18T19:06:36Z

Describe the bug
If you call DataFrame.apply and pass arguments other than f without also passing axis, Pyright reports the following error:

No overloads for "apply" match the provided arguments

To Reproduce

import pandas as pd

def gethead(s, y):
    return s.head(y)

df = pd.DataFrame({"A": [1, 2, 3, 4, 5, 6], "B": [11, 12, 13, 14, 15, 16]})

df2 = df.apply(gethead, args=tuple([4]))  # error

Pylance gives the following error on the apply call:

No overloads for "apply" match the provided arguments
  Argument types: ((s: Unknown, y: Unknown) -> Unknown, tuple[int, ...])
  Pylance [reportGeneralTypeIssues](https://github.com/microsoft/pylance-release/blob/main/DIAGNOSTIC_SEVERITY_RULES.md#diagnostic-severity-rules)

Please complete the following information:

OS: Windows
OS Version: 10.0.25197 Build 25197
python version: 3.10.0
version of type checker: Pylance 2022.10.21
version of installed pandas-stubs: 1.5.0.220926

Additional context
Originally reported by @jonmooser at microsoft/pylance-release#3491

I originally thought the fix was to indicate in the stubs that axis has a default value (axis: AxisType = ...), but then the two overloads of apply overlap with each other. The correct fix looks like it requires more knowledge of the internals of apply, and of pandas in general, than I have.


Dr-Irv · 2022-10-19T16:50:43Z

The trick here is that the function gethead() has to be properly typed to return a Series, and if default arguments are used, then we should return a DataFrame when we know that the function returns a Series. See the description of the result_type argument which describes the default behavior.

So for your example to work, it would have to change, and we would have to add an overload for apply().
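A minimal sketch of what the changed example might look like, assuming the change meant here is annotating gethead() so that it is declared to return a Series. The annotations are an assumption, not something quoted in the thread, and whether a type checker accepts the call without axis still depends on the installed stubs. At runtime the call returns a DataFrame, because the function returns a Series for each column:

```python
import pandas as pd


def gethead(s: pd.Series, y: int) -> pd.Series:
    # Annotated so a type checker can see that a Series comes back.
    return s.head(y)


df = pd.DataFrame({"A": [1, 2, 3, 4, 5, 6], "B": [11, 12, 13, 14, 15, 16]})

# axis is omitted on purpose, so pandas uses its default (axis=0).
df2 = df.apply(gethead, args=(4,))
```

Here df2 holds the first four rows of each column.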

jonmooser · 2022-10-19T17:49:05Z

> See the description of the result_type argument which describes the default behavior.

Dr-Irv,
Could you clarify the significance of the result_type argument? The doc says that it only applies when axis=1. In my example, axis=0 (implicitly).

Dr-Irv · 2022-10-19T18:23:18Z

> Dr-Irv,
> Could you clarify the significance of the result_type argument? The doc says that it only applies when axis=1. In my example, axis=0 (implicitly).

I think (but am not 100% sure) that the comment about axis=1 means that the non-None values only take effect when axis=1 and do nothing when axis=0. But the default value, result_type=None, does change the type of the result regardless of the value of axis.

jonmooser · 2022-10-19T19:05:16Z

> I think (but am not 100% sure) that the comment about axis=1 means that the non-None values only take effect when axis=1 and do nothing when axis=0. But the default value, result_type=None, does change the type of the result regardless of the value of axis.

Based on some experimenting, that's correct. If the passed function returns a scalar, apply() will create a Series. If the function returns something list-like, apply() will create a DataFrame.
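The kind of experiment described above can be sketched as follows. The sample frame and the lambdas are illustrative assumptions, not the code that was actually run, and axis is left at its default of 0:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

# The function returns a scalar per column, so apply() builds a Series.
scalar_result = df.apply(lambda s: s.sum())

# The function returns a list per column (all of the same length),
# so apply() builds a DataFrame with one column per input column.
listlike_result = df.apply(lambda s: [s.min(), s.max()])

print(type(scalar_result).__name__)
print(type(listlike_result).__name__)
```

Neither call passes axis or result_type, so both results come from the default behavior discussed in this thread.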

But I think the underlying problem here may be a bit different. None of the overloads allows for an args param with no axis param.

Dr-Irv · 2022-10-19T19:09:09Z

> But I think the underlying problem here may be a bit different. None of the overloads allows for an args param with no axis param.

We need to have a new overload, where the return type is based on whether the function f is Callable[..., ScalarT] or Callable[..., list-like], where list-like is appropriately defined.
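To illustrate the idea of choosing the return type from the callable's return type, here is a rough sketch using a toy wrapper class. FrameWrapper, the Scalar alias, and the two overload signatures are all hypothetical and simplified; they are not the signatures in pandas-stubs, and the list-like case is narrowed to callables that return a Series:

```python
from typing import Any, Callable, Tuple, Union, overload

import pandas as pd

# Simplified stand-in for a scalar type; not the alias used by pandas-stubs.
Scalar = Union[int, float, str, bool]


class FrameWrapper:
    """Hypothetical wrapper, used only to illustrate the overload idea."""

    def __init__(self, df: pd.DataFrame) -> None:
        self._df = df

    # Callable returning a Series -> the result is a DataFrame.
    @overload
    def apply(
        self, f: Callable[..., pd.Series], args: Tuple[Any, ...] = ...
    ) -> pd.DataFrame: ...

    # Callable returning a scalar -> the result is a Series.
    @overload
    def apply(
        self, f: Callable[..., Scalar], args: Tuple[Any, ...] = ...
    ) -> pd.Series: ...

    def apply(self, f, args=()):
        # axis is never passed, so pandas uses its default (axis=0).
        return self._df.apply(f, args=args)


def gethead(s: pd.Series, y: int) -> pd.Series:
    return s.head(y)


def total(s: pd.Series) -> int:
    return int(s.sum())


w = FrameWrapper(pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]}))
as_frame = w.apply(gethead, args=(2,))  # matches the first overload
as_series = w.apply(total)              # matches the second overload
```

Both overloads accept args without axis, which is the combination that none of the existing overloads allowed according to the discussion above.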

* Series.value_counts returns Series[int].
* Series.apply callable result might not be hashable. It's possible for the callable in Series.apply to return something non-hashable like a list, but the result of apply should still be a Series.
* More detailed typing for DataFrame.apply. Whether it returns a Series or a DataFrame depends on the return type of the callable. In the case of the callable returning a scalar, the result is a Series unless the result_type is "broadcast".
* Add test for #393.

Dr-Irv · 2022-10-27T11:59:39Z

fixed in #401

debonte mentioned this issue Oct 18, 2022

Incorrect 'No overloads for "apply" match the provided arguments' on the pandas apply function microsoft/pylance-release#3491

Closed

danielroseman added a commit to danielroseman/pandas-stubs that referenced this issue Oct 27, 2022

Add test for pandas-dev#393.

b1d4b25

Dr-Irv closed this as completed Oct 27, 2022
