Skip to content

Fix SS03 issues in docstrings #30733

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
galuhsahid opened this issue Jan 6, 2020 · 13 comments
Closed

Fix SS03 issues in docstrings #30733

galuhsahid opened this issue Jan 6, 2020 · 13 comments

Comments

@galuhsahid
Copy link
Contributor

galuhsahid commented Jan 6, 2020

Fix the docstrings where summary does not end with a period.

Current errors found:

$ ./scripts/validate_docstrings.py --errors=SS03
None:None:SS03:pandas.tseries.offsets.DateOffset.normalize:Summary does not end with a period
None:None:SS03:pandas.tseries.offsets.BusinessDay.normalize:Summary does not end with a period
None:None:SS03:pandas.tseries.offsets.BusinessHour.normalize:Summary does not end with a period
None:None:SS03:pandas.tseries.offsets.CustomBusinessDay.normalize:Summary does not end with a period
None:None:SS03:pandas.tseries.offsets.CustomBusinessHour.normalize:Summary does not end with a period
None:None:SS03:pandas.tseries.offsets.MonthOffset.normalize:Summary does not end with a period
None:None:SS03:pandas.tseries.offsets.MonthEnd.normalize:Summary does not end with a period
None:None:SS03:pandas.tseries.offsets.MonthBegin.normalize:Summary does not end with a period
None:None:SS03:pandas.tseries.offsets.BusinessMonthEnd.normalize:Summary does not end with a period
None:None:SS03:pandas.tseries.offsets.BusinessMonthBegin.normalize:Summary does not end with a period
None:None:SS03:pandas.tseries.offsets.CustomBusinessMonthEnd.normalize:Summary does not end with a period
None:None:SS03:pandas.tseries.offsets.CustomBusinessMonthBegin.normalize:Summary does not end with a period
None:None:SS03:pandas.tseries.offsets.SemiMonthOffset.normalize:Summary does not end with a period
None:None:SS03:pandas.tseries.offsets.SemiMonthEnd.normalize:Summary does not end with a period
None:None:SS03:pandas.tseries.offsets.SemiMonthBegin.normalize:Summary does not end with a period
None:None:SS03:pandas.tseries.offsets.Week.normalize:Summary does not end with a period
None:None:SS03:pandas.tseries.offsets.WeekOfMonth.normalize:Summary does not end with a period
None:None:SS03:pandas.tseries.offsets.LastWeekOfMonth.normalize:Summary does not end with a period
None:None:SS03:pandas.tseries.offsets.QuarterOffset.normalize:Summary does not end with a period
None:None:SS03:pandas.tseries.offsets.BQuarterEnd.normalize:Summary does not end with a period
None:None:SS03:pandas.tseries.offsets.BQuarterBegin.normalize:Summary does not end with a period
None:None:SS03:pandas.tseries.offsets.QuarterEnd.normalize:Summary does not end with a period
None:None:SS03:pandas.tseries.offsets.QuarterBegin.normalize:Summary does not end with a period
None:None:SS03:pandas.tseries.offsets.YearOffset.normalize:Summary does not end with a period
None:None:SS03:pandas.tseries.offsets.BYearEnd.normalize:Summary does not end with a period
None:None:SS03:pandas.tseries.offsets.BYearBegin.normalize:Summary does not end with a period
None:None:SS03:pandas.tseries.offsets.YearEnd.normalize:Summary does not end with a period
None:None:SS03:pandas.tseries.offsets.YearBegin.normalize:Summary does not end with a period
None:None:SS03:pandas.tseries.offsets.FY5253.normalize:Summary does not end with a period
None:None:SS03:pandas.tseries.offsets.FY5253Quarter.normalize:Summary does not end with a period
None:None:SS03:pandas.tseries.offsets.Easter.normalize:Summary does not end with a period
None:None:SS03:pandas.tseries.offsets.Tick.normalize:Summary does not end with a period
None:None:SS03:pandas.tseries.offsets.Day.normalize:Summary does not end with a period
None:None:SS03:pandas.tseries.offsets.Hour.normalize:Summary does not end with a period
None:None:SS03:pandas.tseries.offsets.Minute.normalize:Summary does not end with a period
None:None:SS03:pandas.tseries.offsets.Second.normalize:Summary does not end with a period
None:None:SS03:pandas.tseries.offsets.Milli.normalize:Summary does not end with a period
None:None:SS03:pandas.tseries.offsets.Micro.normalize:Summary does not end with a period
None:None:SS03:pandas.tseries.offsets.Nano.normalize:Summary does not end with a period
None:None:SS03:pandas.tseries.offsets.BDay.normalize:Summary does not end with a period
None:None:SS03:pandas.tseries.offsets.BMonthEnd.normalize:Summary does not end with a period
None:None:SS03:pandas.tseries.offsets.BMonthBegin.normalize:Summary does not end with a period
None:None:SS03:pandas.tseries.offsets.CBMonthEnd.normalize:Summary does not end with a period
None:None:SS03:pandas.tseries.offsets.CBMonthBegin.normalize:Summary does not end with a period
None:None:SS03:pandas.tseries.offsets.CDay.normalize:Summary does not end with a period
None:None:SS03:pandas.Timestamp.isoweekday:Summary does not end with a period
None:None:SS03:pandas.Timestamp.weekday:Summary does not end with a period
None:None:SS03:pandas.DatetimeIndex.freqstr:Summary does not end with a period
None:None:SS03:pandas.PeriodIndex.freqstr:Summary does not end with a period
pandas/pandas/core/window/indexers.py:34:SS03:pandas.api.indexers.BaseIndexer:Summary does not end with a period
None:None:SS03:pandas.io.formats.style.Styler.loader:Summary does not end with a period
@galuhsahid
Copy link
Contributor Author

I'll be working on this

@datapythonista
Copy link
Member

I think all those are false positives. The docstring being analyzed is not in pandas, but a the one of a Python object. This should be already fixed in the script, but not in the pandas version, but in the version moved to numpydoc. Which we still need to integrate into pandas.

Feel free to have a look and double check.

@galuhsahid
Copy link
Contributor Author

I've randomly sampled some examples and yes seems like those are all false positives. Is the revised script ready to use? If so, where can one find it? Or shall we wait until it is integrated into pandas? Thank you!

@datapythonista
Copy link
Member

I'm on it, just opened the PR, #30746, hopefully it merged in few days, and we can start using the new validation. In the meantime, feel free to fetch my branch locally, and run it, so get the new list of errors.

@galuhsahid galuhsahid reopened this Jan 6, 2020
@galuhsahid
Copy link
Contributor Author

I fetched your branch locally & tried running the new script but I get the same list of errors as previously. I'm not sure if I'm missing something, though.

@datapythonista
Copy link
Member

There were some changes implemented to the script, that were supposed to help get the docstring of the correct object. I'll have a look and find that PR, and see if everything is working as expected. I guess it's not.

@datapythonista
Copy link
Member

I was checking, I think this is the PR that was expected to fix this: #29012

Which was added to numpydoc: https://github.com/numpy/numpydoc/blob/master/numpydoc/validate.py#L130

So, my branch shouldn't report those errors. So, I guess the unwraps function is failing in the same way as our custom function.

Do you want to have a look and see if you can find the problem and a fix for it? Thanks!

@galuhsahid
Copy link
Contributor Author

Sure, I will take a look at it later!

@galuhsahid
Copy link
Contributor Author

OK, so I just noticed that all of these false positives have None as their source code file and line number. And the validation script says:

            # In some cases the object is something complex like a cython
            # object that can't be easily introspected. An it's better to
            # return the source code file of the object as None, than crash

I'm guessing the errors returned here are too difficult to be introspected, so they are still reported but the source code file returns None instead. Is my understanding correct? If so do you think a fix is possible or should we just let it be?

@datapythonista
Copy link
Member

I think you're mixing two different things. That comment refers to knowing where is the docstring being written in the code (in which file, and in which line number). That in some cases it's impossible to know. Imagine something like:

def foo():
   pass
foo.__doc__ = open('/tmp/docstring').read()

And there could be even trickier things, that won't let us know where to find the docstring. This is why we return None for the file.

For those false positives, we don't care about where the docstring is (we care in the json, to help contributors find where they need to make the changes). What we care is about the content of the docstring, and this should be always possible to obtain with:

foo.__doc__

The only problem is that if foo is a property, a decorated function... we don't really want the docstring of foo, but the docstring of the object that foo is wrapping. So, we have the "magic" you saw, with the unwraps and the previous _to_original_callable to deal with this. But that doesn't seem to be working well. If I'm not wrong, those false positives are of objects that return a string, and the docstring we're finally validating is the one of the Python str object.

@galuhsahid
Copy link
Contributor Author

I see, noted. I assumed there missing source file and the false positives are always related, but now I understand that they aren't.

I checked again and some of the errors reported were indeed true positives (e.g. pandas.DatetimeIndex.freqstr). Will open a PR for these.

But there are some that I'm not sure whether they are false positives or not, e.g. all the errors that end with normalize such as pandas.tseries.offsets.CustomBusinessMonthBegin.normalize. I somehow can't find the related docstrings in pandas/tseries/offsets.py. I thought they were false positives but now I'm not so sure. If they are also true positives it means the script is reporting the correct errors.

@datapythonista
Copy link
Member

I think I answered via email, but the problem with normalize and most false positives are with the unwraps and how objects are being fetched.

I think the best would be to find a minimal example when instead of obtaining the right object, we validate something else. For example:

class MyClass:
    @property
    def my_property(self):
        """We should validate this."""
        return 'foo'  # we are validating the docstring of the Python type str instead

Then, add a unit test in numpydoc, and fix it. I think it's probably not easy, but I think it's doable.

@galuhsahid
Copy link
Contributor Author

Closing this issue since I've fixed SS03 errors in #30939; will open a separate issue regarding the validation script

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants