Skip to content

DOC: Dataframe.skew invalid function signature for "skipna" #34063

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
bf opened this issue May 8, 2020 · 10 comments · Fixed by #43980
Closed

DOC: Dataframe.skew invalid function signature for "skipna" #34063

bf opened this issue May 8, 2020 · 10 comments · Fixed by #43980
Labels
Docs Reduction Operations sum, mean, min, max, etc.
Milestone

Comments

@bf
Copy link

bf commented May 8, 2020

Location of the documentation

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.skew.html

Documentation problem

In the function signature it says skipna=None even though the parameter documentation underneath says skipna bool, default True.

Suggested fix for documentation

The skipna=None in function signature should be changed to skipna=True.

@bf bf added Docs Needs Triage Issue that has not been reviewed by a pandas team member labels May 8, 2020
@bf bf changed the title DOC: DOC: Dataframe.skew invalid function signature for "skipna" May 8, 2020
@MarcoGorelli
Copy link
Member

Thanks @bf - are you interested in opening a pull request?

@MarcoGorelli MarcoGorelli added good first issue and removed Needs Triage Issue that has not been reviewed by a pandas team member labels May 8, 2020
@bf
Copy link
Author

bf commented May 8, 2020

@MarcoGorelli I'd love to create a PR, but I have trouble finding the proper source file.

Could you help me find the root cause?

I have found

cls.skew = _make_stat_function(
which seems to indicate that the definition of _make_stat_function() in
def _make_stat_function(
erroneously sets skipna=None at
self, axis=None, skipna=None, level=None, numeric_only=None, **kwargs

But as _make_stat_function() is used all over the place in that file, we'd need to check every single function if the default skipna=True is really correct.

Do you agree with this process?

@MarcoGorelli
Copy link
Member

MarcoGorelli commented May 8, 2020

Thanks for taking a look

Have just had a look at the source code for _make_stat_function, and it says

        if skipna is None:
            skipna = True

TBH I don't understand why the default value is set to None, and then it's changed to True in every case.

@simonjayhawkins would it be OK to change stat_func's signature to have

skipna=True

and then remove the lines

        if skipna is None:
            skipna = True

so that in the documentation, the default value of True (from _num_doc) would match the default shown in the signature? I'm sorry if there's something obvious I'm missing here

@simonjayhawkins
Copy link
Member

@bf Thanks for the report.

the latest docs are at https://pandas.pydata.org/docs/dev/reference/api/pandas.DataFrame.skew.html

The type annotations in the function signature are now removed to avoid this type of conflict with the paramater descriptions, see #33312

@bf
Copy link
Author

bf commented May 8, 2020

You are right, the issue exists also for other doc pages which use _make_stat_function, e.g. https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.std.html

I imagine the if skipna is None: skipna = True is most likely done to enable some static type checking for the underlying functions, or to remove some weird testing error, as it makes sure all functions only receive a proper boolean parameter.

@bf
Copy link
Author

bf commented May 8, 2020

@simonjayhawkins seems like we posted at the same time. Thanks for the fix and for your great work.

@simonjayhawkins
Copy link
Member

I'll reopen, since even without the type annotations, the default is still shown as None

@MarcoGorelli
Copy link
Member

MarcoGorelli commented May 8, 2020

@simonjayhawkins isn't this still present even without the type annotations?
image

EDIT

sorry, commenting almost at the same again :)

@bf
Copy link
Author

bf commented May 8, 2020

Not sure if I need to create a new issue, but in the docs for Dataframe.mode the skipna parameter is suddenly called dropna. https://pandas.pydata.org/pandas-docs/dev/reference/api/pandas.DataFrame.mode.html

Maybe it makes sense to either call everything skipna or dropna.

@MarcoGorelli
Copy link
Member

@bf that would be a separate issue

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Docs Reduction Operations sum, mean, min, max, etc.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants