DOC: Fix docstring quotes in pandas.tseries #26982

Closed
datapythonista opened this issue Jun 21, 2019 · 4 comments · Fixed by #36606

Comments

@datapythonista
Member

In pandas we standardize the quotes in docstrings as follows:

def foo():
    """
    This is correct.
    """

But for historical reasons we still have many docstrings of this form:

def foo():
    """This is incorrect."""

We have a script that detects the incorrect cases. It reports the following errors in pandas.tseries:

$ ./scripts/validate_docstrings.py --errors=GL01,GL02 --prefix=pandas.tseries
pandas.tseries.offsets.DateOffset.normalize: Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)
pandas.tseries.offsets.DateOffset.normalize: Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)
pandas.tseries.offsets.BusinessDay.normalize: Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)
pandas.tseries.offsets.BusinessDay.normalize: Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)
pandas.tseries.offsets.BusinessHour.normalize: Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)
pandas.tseries.offsets.BusinessHour.normalize: Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)
pandas.tseries.offsets.CustomBusinessDay.normalize: Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)
pandas.tseries.offsets.CustomBusinessDay.normalize: Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)
pandas.tseries.offsets.CustomBusinessHour.normalize: Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)
pandas.tseries.offsets.CustomBusinessHour.normalize: Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)
pandas.tseries.offsets.MonthOffset.normalize: Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)
pandas.tseries.offsets.MonthOffset.normalize: Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)
pandas.tseries.offsets.MonthEnd.normalize: Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)
pandas.tseries.offsets.MonthEnd.normalize: Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)
pandas.tseries.offsets.MonthBegin.normalize: Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)
pandas.tseries.offsets.MonthBegin.normalize: Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)
pandas.tseries.offsets.BusinessMonthEnd.normalize: Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)
pandas.tseries.offsets.BusinessMonthEnd.normalize: Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)
pandas.tseries.offsets.BusinessMonthBegin.normalize: Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)
pandas.tseries.offsets.BusinessMonthBegin.normalize: Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)
pandas.tseries.offsets.CustomBusinessMonthEnd.normalize: Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)
pandas.tseries.offsets.CustomBusinessMonthEnd.normalize: Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)
pandas.tseries.offsets.CustomBusinessMonthBegin.normalize: Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)
pandas.tseries.offsets.CustomBusinessMonthBegin.normalize: Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)
pandas.tseries.offsets.SemiMonthOffset.normalize: Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)
pandas.tseries.offsets.SemiMonthOffset.normalize: Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)
pandas.tseries.offsets.SemiMonthEnd.normalize: Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)
pandas.tseries.offsets.SemiMonthEnd.normalize: Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)
pandas.tseries.offsets.SemiMonthBegin.normalize: Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)
pandas.tseries.offsets.SemiMonthBegin.normalize: Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)
pandas.tseries.offsets.Week.normalize: Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)
pandas.tseries.offsets.Week.normalize: Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)
pandas.tseries.offsets.WeekOfMonth.normalize: Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)
pandas.tseries.offsets.WeekOfMonth.normalize: Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)
pandas.tseries.offsets.LastWeekOfMonth.normalize: Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)
pandas.tseries.offsets.LastWeekOfMonth.normalize: Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)
pandas.tseries.offsets.QuarterOffset.normalize: Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)
pandas.tseries.offsets.QuarterOffset.normalize: Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)
pandas.tseries.offsets.BQuarterEnd.normalize: Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)
pandas.tseries.offsets.BQuarterEnd.normalize: Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)
pandas.tseries.offsets.BQuarterBegin.normalize: Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)
pandas.tseries.offsets.BQuarterBegin.normalize: Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)
pandas.tseries.offsets.QuarterEnd.normalize: Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)
pandas.tseries.offsets.QuarterEnd.normalize: Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)
pandas.tseries.offsets.QuarterBegin.normalize: Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)
pandas.tseries.offsets.QuarterBegin.normalize: Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)
pandas.tseries.offsets.YearOffset.normalize: Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)
pandas.tseries.offsets.YearOffset.normalize: Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)
pandas.tseries.offsets.BYearEnd.normalize: Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)
pandas.tseries.offsets.BYearEnd.normalize: Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)
pandas.tseries.offsets.BYearBegin.normalize: Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)
pandas.tseries.offsets.BYearBegin.normalize: Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)
pandas.tseries.offsets.YearEnd.normalize: Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)
pandas.tseries.offsets.YearEnd.normalize: Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)
pandas.tseries.offsets.YearBegin.normalize: Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)
pandas.tseries.offsets.YearBegin.normalize: Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)
pandas.tseries.offsets.FY5253.normalize: Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)
pandas.tseries.offsets.FY5253.normalize: Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)
pandas.tseries.offsets.FY5253Quarter.normalize: Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)
pandas.tseries.offsets.FY5253Quarter.normalize: Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)
pandas.tseries.offsets.Easter.normalize: Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)
pandas.tseries.offsets.Easter.normalize: Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)
pandas.tseries.offsets.Tick.normalize: Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)
pandas.tseries.offsets.Tick.normalize: Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)
pandas.tseries.offsets.Day.normalize: Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)
pandas.tseries.offsets.Day.normalize: Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)
pandas.tseries.offsets.Hour.normalize: Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)
pandas.tseries.offsets.Hour.normalize: Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)
pandas.tseries.offsets.Minute.normalize: Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)
pandas.tseries.offsets.Minute.normalize: Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)
pandas.tseries.offsets.Second.normalize: Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)
pandas.tseries.offsets.Second.normalize: Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)
pandas.tseries.offsets.Milli.normalize: Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)
pandas.tseries.offsets.Milli.normalize: Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)
pandas.tseries.offsets.Micro.normalize: Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)
pandas.tseries.offsets.Micro.normalize: Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)
pandas.tseries.offsets.Nano.normalize: Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)
pandas.tseries.offsets.Nano.normalize: Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)
pandas.tseries.offsets.BDay.normalize: Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)
pandas.tseries.offsets.BDay.normalize: Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)
pandas.tseries.offsets.BMonthEnd.normalize: Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)
pandas.tseries.offsets.BMonthEnd.normalize: Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)
pandas.tseries.offsets.BMonthBegin.normalize: Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)
pandas.tseries.offsets.BMonthBegin.normalize: Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)
pandas.tseries.offsets.CBMonthEnd.normalize: Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)
pandas.tseries.offsets.CBMonthEnd.normalize: Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)
pandas.tseries.offsets.CBMonthBegin.normalize: Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)
pandas.tseries.offsets.CBMonthBegin.normalize: Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)
pandas.tseries.offsets.CDay.normalize: Docstring text (summary) should start in the line immediately after the opening quotes (not in the same line, or leaving a blank line in between)
pandas.tseries.offsets.CDay.normalize: Closing quotes should be placed in the line after the last text in the docstring (do not close the quotes in the same line as the text, or leave a blank line between the last text and the quotes)

We should fix all of them, in preparation for validating in continuous integration that all docstrings in pandas follow our standard.
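
As a rough illustration of the two rules reported above, the sketch below checks a raw docstring for both conditions. The function name `check_docstring_quotes` is hypothetical and the logic is a simplification written for this example. It is not the code in scripts/validate_docstrings.py; only the error codes GL01 and GL02 are taken from the command shown in this issue.

```python
def check_docstring_quotes(doc):
    """
    Return the GL01/GL02-style problems found in a raw docstring.
    """
    lines = doc.split("\n")
    errors = []
    # Summary must start on the line right after the opening quotes:
    # the first line is empty and the second one carries text.
    if len(lines) < 2 or lines[0].strip() or not lines[1].strip():
        errors.append("GL01")
    # Closing quotes must sit on their own line right after the last text:
    # the final line is whitespace only and the one before it carries text.
    if len(lines) < 2 or lines[-1].strip() or not lines[-2].strip():
        errors.append("GL02")
    return errors


# Raw strings equivalent to the two forms shown in this issue.
correct = "\n    This is correct.\n    "
incorrect = "This is incorrect."

print(check_docstring_quotes(correct))
print(check_docstring_quotes(incorrect))
```

With this simplified check, the multi-line form reports no errors and the one-line form reports both codes.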

@Viktor-Demin

@datapythonista I checked this issue, and while the script does return warnings, I wasn't able to find incorrect docstring quotes in the offsets.py file.
Does the issue still exist? If so, please clarify.

For example:

class FY5253(DateOffset):
    """
    Describes 52-53 week fiscal year. This is also known as a 4-4-5 calendar.

    It is used by companies that desire that their
    fiscal year always end on the same day of the week.

    It is a method of managing accounting periods.
    It is a common calendar structure for some industries,
    such as retail, manufacturing and parking industry.

    For more information see:
    http://en.wikipedia.org/wiki/4-4-5_calendar

    The year may either:
    - end on the last X day of the Y month.
    - end on the last X day closest to the last day of the Y month.

    X is a specific day of the week.
    Y is a certain month of the year

    Parameters
    ----------
    n : int
    weekday : {0, 1, ..., 6}
        0: Mondays
        1: Tuesdays
        2: Wednesdays
        3: Thursdays
        4: Fridays
        5: Saturdays
        6: Sundays
    startingMonth : The month in which fiscal years end. {1, 2, ... 12}
    variation : str
        {"nearest", "last"} for "LastOfMonth" or "NearestEndMonth"
    """

or

class Easter(DateOffset):
    """
    DateOffset for the Easter holiday using logic defined in dateutil.

    Right now uses the revised method which is valid in years 1583-4099.
    """

@datapythonista
Member Author

@jbrockmendel, I think you've been working on offsets, and will have more context than I do. The normalize docstring is giving us some errors, and I can't find a good way to fix them.

I think this is the attribute: https://github.com/pandas-dev/pandas/blob/master/pandas/tseries/offsets.py#L208

In other cases, we fixed this by making the attribute a property, so we can specify the docstring. But in this case I'm getting an exception:

Traceback (most recent call last):
  File "./scripts/validate_docstrings.py", line 52, in <module>
    import pandas
  File "C:\msys64\home\UKC3153\src\pandas\pandas\__init__.py", line 44, in <module>
    from pandas.core.api import (
  File "C:\msys64\home\UKC3153\src\pandas\pandas\core\api.py", line 5, in <module>
    from pandas.core.arrays.integer import (
  File "C:\msys64\home\UKC3153\src\pandas\pandas\core\arrays\__init__.py", line 6, in <module>
    from .datetimes import DatetimeArray  # noqa
  File "C:\msys64\home\UKC3153\src\pandas\pandas\core\arrays\datetimes.py", line 29, in <module>
    from pandas.core.arrays import datetimelike as dtl
  File "C:\msys64\home\UKC3153\src\pandas\pandas\core\arrays\datetimelike.py", line 36, in <module>
    from pandas.tseries import frequencies
  File "C:\msys64\home\UKC3153\src\pandas\pandas\tseries\frequencies.py", line 26, in <module>
    from pandas.tseries.offsets import (
  File "C:\msys64\home\UKC3153\src\pandas\pandas\tseries\offsets.py", line 2422, in <module>
    def generate_range(start=None, end=None, periods=None, offset=BDay()):
  File "C:\msys64\home\UKC3153\src\pandas\pandas\tseries\offsets.py", line 476, in __init__
    BaseOffset.__init__(self, n, normalize)
  File "pandas\_libs\tslibs\offsets.pyx", line 325, in pandas._libs.tslibs.offsets._BaseOffset.__init__
    object.__setattr__(self, "normalize", normalize)
AttributeError: can't set attribute
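
The failure in this traceback can be reproduced without pandas. A minimal sketch, assuming a hypothetical class `Offset` that stands in for the real offset classes: once `normalize` is a property without a setter, the `object.__setattr__(self, "normalize", normalize)` call in `__init__` has nothing to write to and raises AttributeError. The exact error message depends on the Python version.

```python
class Offset:
    """
    Hypothetical stand-in for an offset class, not the pandas implementation.
    """

    def __init__(self, normalize=False):
        # Mirrors the call shown in the traceback.
        object.__setattr__(self, "normalize", normalize)

    @property
    def normalize(self):
        """
        Illustrative docstring for the normalize attribute.
        """
        return self._normalize


try:
    Offset()
    error = None
except AttributeError as exc:
    # The read-only property rejects the assignment made in __init__.
    error = exc

print(type(error).__name__)
```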

I didn't check in detail, but I thought this could be fixed with a setter. However, I get another exception:

Traceback (most recent call last):
  File "./scripts/validate_docstrings.py", line 52, in <module>
    import pandas
  File "C:\msys64\home\UKC3153\src\pandas\pandas\__init__.py", line 44, in <module>
    from pandas.core.api import (
  File "C:\msys64\home\UKC3153\src\pandas\pandas\core\api.py", line 5, in <module>
    from pandas.core.arrays.integer import (
  File "C:\msys64\home\UKC3153\src\pandas\pandas\core\arrays\__init__.py", line 6, in <module>
    from .datetimes import DatetimeArray  # noqa
  File "C:\msys64\home\UKC3153\src\pandas\pandas\core\arrays\datetimes.py", line 29, in <module>
    from pandas.core.arrays import datetimelike as dtl
  File "C:\msys64\home\UKC3153\src\pandas\pandas\core\arrays\datetimelike.py", line 36, in <module>
    from pandas.tseries import frequencies
  File "C:\msys64\home\UKC3153\src\pandas\pandas\tseries\frequencies.py", line 26, in <module>
    from pandas.tseries.offsets import (
  File "C:\msys64\home\UKC3153\src\pandas\pandas\tseries\offsets.py", line 2432, in <module>
    def generate_range(start=None, end=None, periods=None, offset=BDay()):
  File "C:\msys64\home\UKC3153\src\pandas\pandas\tseries\offsets.py", line 486, in __init__
    BaseOffset.__init__(self, n, normalize)
  File "pandas\_libs\tslibs\offsets.pyx", line 325, in pandas._libs.tslibs.offsets._BaseOffset.__init__
    object.__setattr__(self, "normalize", normalize)
  File "C:\msys64\home\UKC3153\src\pandas\pandas\tseries\offsets.py", line 219, in normalize
    self._normalize = value
  File "pandas\_libs\tslibs\offsets.pyx", line 329, in pandas._libs.tslibs.offsets._BaseOffset.__setattr__
    raise AttributeError("DateOffset objects are immutable.")
AttributeError: DateOffset objects are immutable.

I can't set the __doc__ attribute of normalize either. Any idea what we can do to specify a docstring for it? We're close to adding validation in the CI for some of the errors that normalize is giving, so it would be nice to fix it.

The two solutions I have in mind are:

  • Change the validation script, since the problem seems to be that it picks up the docstring of bool, because the value of normalize is False
  • Remove normalize from the public documentation, since it doesn't have documentation anyway

But I think we should be able to add a docstring and document normalize.
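
The first option rests on how `__doc__` is resolved for a plain class attribute. A small sketch, assuming a hypothetical class `Offset` whose `normalize` is just the value False: any tool that reads `__doc__` from the attribute gets the docstring of `bool`, and that docstring cannot be replaced on the value itself.

```python
class Offset:
    # Hypothetical stand-in: normalize is a plain class attribute.
    normalize = False


# The attribute is a bool instance, so its __doc__ comes from the bool type.
same_doc = Offset.normalize.__doc__ == bool.__doc__
print(same_doc)

# Instances of built-in types do not accept a new __doc__.
try:
    Offset.normalize.__doc__ = "Custom text."
    doc_is_writable = True
except (AttributeError, TypeError):
    doc_is_writable = False

print(doc_is_writable)
```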

@jbrockmendel
Member

There are a few things going on here.

A while back we needed to make DateOffset immutable. Long story short, we hacked it together by patching __setattr__ to just not allow attribute setting. So for, e.g., the call you have on line 219, self._normalize = value, you would need to use object.__setattr__(self, "_normalize", value) instead.
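
A minimal sketch of the pattern described here, assuming a hypothetical class `ImmutableOffset` (the real logic lives in pandas/_libs/tslibs/offsets.pyx and may differ): `__setattr__` is overridden to reject every assignment, and `object.__setattr__` is the escape hatch used during construction.

```python
class ImmutableOffset:
    """
    Hypothetical stand-in, not the pandas implementation.
    """

    def __init__(self, n=1, normalize=False):
        # Plain `self.n = n` would hit the override below and raise,
        # so the base-class implementation is called directly.
        object.__setattr__(self, "n", n)
        object.__setattr__(self, "normalize", normalize)

    def __setattr__(self, name, value):
        raise AttributeError("DateOffset objects are immutable.")


offset = ImmutableOffset(n=2)
try:
    offset.n = 5
    message = None
except AttributeError as exc:
    message = str(exc)

print(offset.n, message)
```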

That probably isn't what you want, because normalize is not a property, so _normalize wouldn't do anything. Making it a property is probably a reasonable solution to the docstring issue.
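
For illustration, a property backed by a private attribute could carry the docstring while the object stays immutable. This is a sketch under assumptions: `ImmutableOffset` and `_normalize` are illustrative names, and the docstring text is made up for the example.

```python
class ImmutableOffset:
    """
    Hypothetical stand-in, not the pandas implementation.
    """

    def __init__(self, normalize=False):
        # Store the value under a private name, bypassing the override.
        object.__setattr__(self, "_normalize", normalize)

    def __setattr__(self, name, value):
        raise AttributeError("DateOffset objects are immutable.")

    @property
    def normalize(self):
        """
        Illustrative docstring for the normalize attribute.
        """
        return self._normalize


offset = ImmutableOffset(normalize=True)
print(offset.normalize)
# The property object on the class exposes the docstring written above.
print(ImmutableOffset.normalize.__doc__)
```

Reading the docstring from the class now returns the property's own text instead of the docstring of `bool`.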

My only qualm there is the performance impact of property lookups, since we've gone way out of our way to optimize these objects. I don't know how big a performance impact this would actually be, so can't hurt to try.
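
One way to get a rough number is `timeit` from the standard library. The two classes below are hypothetical pure-Python stand-ins, so the result says nothing about the Cython-backed pandas objects. It only shows the shape of such a measurement.

```python
import timeit


class PlainAttr:
    def __init__(self):
        self.normalize = False


class PropertyAttr:
    def __init__(self):
        self._normalize = False

    @property
    def normalize(self):
        return self._normalize


plain = PlainAttr()
prop = PropertyAttr()

# Time the same number of lookups of each kind.
# Absolute values depend on the machine and the Python version.
plain_time = timeit.timeit(lambda: plain.normalize, number=200_000)
prop_time = timeit.timeit(lambda: prop.normalize, number=200_000)

print(f"plain attribute: {plain_time:.4f}s, property: {prop_time:.4f}s")
```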

@datapythonista
Member Author

Thanks for the info, that's very helpful. I don't see an easy solution here, so I guess we'll change the docstring validation script for now so that it doesn't fail.

jbrockmendel added a commit to jbrockmendel/pandas that referenced this issue Sep 24, 2020
@jbrockmendel jbrockmendel mentioned this issue Sep 24, 2020
6 tasks
@jreback jreback added this to the 1.2 milestone Sep 24, 2020
jbrockmendel added a commit that referenced this issue Sep 28, 2020
* docstring fixups, closes #26982

* update RangeIndex docstring, closes #22373

* CLN: misc

* CLN: update Makefil

* update nat docstrings to match

* revert controversial
kesmit13 pushed a commit to kesmit13/pandas that referenced this issue Nov 2, 2020
* docstring fixups, closes pandas-dev#26982

* update RangeIndex docstring, closes pandas-dev#22373

* CLN: misc

* CLN: update Makefil

* update nat docstrings to match

* revert controversial
4 participants