DOC: Fixes PR01, PR02 for Series.cat and Series.dt #58912

Closed
wants to merge 2 commits into from

@ghost ghost commented Jun 4, 2024

Ref #58504

pandas.Series.cat.add_categories PR01,PR02
pandas.Series.cat.as_ordered PR01
pandas.Series.cat.as_unordered PR01
pandas.Series.cat.remove_categories PR01,PR02
pandas.Series.cat.remove_unused_categories PR01
pandas.Series.cat.rename_categories PR01,PR02
pandas.Series.cat.reorder_categories PR01,PR02
pandas.Series.cat.set_categories PR01,PR02
pandas.Series.dt.as_unit PR01,PR02
pandas.Series.dt.ceil PR01,PR02
pandas.Series.dt.day_name PR01,PR02
pandas.Series.dt.floor PR01,PR02
pandas.Series.dt.month_name PR01,PR02
pandas.Series.dt.normalize PR01
pandas.Series.dt.round PR01,PR02
pandas.Series.dt.strftime PR01,PR02
pandas.Series.dt.to_period PR01,PR02
pandas.Series.dt.total_seconds PR01
pandas.Series.dt.tz_convert PR01,PR02
pandas.Series.dt.tz_localize PR01,PR02

While trying to fix PR01 and PR02 for Series.cat in #58750, I found that the current docstring validation raises these errors as false positives. They occur for methods that are reached indirectly through the Series accessors: instead of the actual method, the accessor exposes a generic wrapper f(self, *args, **kwargs) defined in pandas/core/accessor.py. When the docstring is validated, f's signature is used instead of the actual method's signature. PR01 is raised because the docstring does not document *args and **kwargs, which appear in f's signature, and PR02 is raised because the parameters documented in the docstring do not appear in f's signature. I found this happens for Series.cat and Series.dt.
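The mismatch described above can be reproduced without pandas. The following is a minimal sketch using stand-in classes: `Backend`, `Accessor`, and `make_delegate` are hypothetical names invented for illustration, not pandas' actual implementation. It only shows that a generic forwarding wrapper carries the real method's docstring while reporting its own signature.

```python
import inspect


class Backend:
    """Stand-in for the array class that implements the real method."""

    def add_categories(self, new_categories):
        """Documented with a single parameter, ``new_categories``."""
        return f"added {new_categories}"


def make_delegate(name):
    """Build a generic forwarding method, as a delegating accessor might."""

    def f(self, *args, **kwargs):
        return getattr(self._backend, name)(*args, **kwargs)

    f.__name__ = name
    # The docstring is copied over, but the signature is still f's own.
    f.__doc__ = getattr(Backend, name).__doc__
    return f


class Accessor:
    """Stand-in for an accessor such as Series.cat."""

    def __init__(self):
        self._backend = Backend()

    add_categories = make_delegate("add_categories")


wrapper_params = list(inspect.signature(Accessor.add_categories).parameters)
real_params = list(inspect.signature(Backend.add_categories).parameters)

print(wrapper_params)  # ['self', 'args', 'kwargs']
print(real_params)     # ['self', 'new_categories']
```

A validator that compares the docstring against `wrapper_params` would report `args` and `kwargs` as undocumented and `new_categories` as unknown, even though the docstring matches the real method.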

I added a check before validating the docstring to replace accessor names with the actual function's name. I'm sure there's a better way to implement this, but I'm not too knowledgeable on how these accessors work exactly. I couldn't figure out a way to get the delegated function from an accessor so I ended up with this crude solution using string literals.

Any suggestions or advice would be really helpful!

@ghost ghost requested a review from mroeschke as a code owner June 4, 2024 01:22
@Aloqeely (Member)

Thanks for the PR! Ideally we'd like a proper fix for this rather than hardcoding it.
@jbrockmendel I think you worked on this, do you have any ideas?

@jbrockmendel (Member)

Some of the array classes have attributes such as _bool_ops, _object_ops, and _field_ops that might be reusable?

@ghost (Author) commented Jun 29, 2024

Thank you for the response! I looked into it and found that DatetimeArray and TimedeltaArray have the _datetimelike_methods attribute that definitely could be used for this. There wasn't anything like that for TimelikeOps and DatelikeOps though, so I'm using dir() for now.

Does this look better? I also tried to make it so the module part of the name isn't hardcoded either.

# Move these imports to the top
from pandas.core.arrays.categorical import Categorical
from pandas.core.arrays.datetimelike import (
    TimelikeOps,
    DatelikeOps,
)
from pandas.core.arrays import (
    DatetimeArray,
    TimedeltaArray,
)

def _get_delegated_func_name(func_name: str) -> str:
    # Map an accessor name (Series.cat.* or Series.dt.*) to the dotted name of
    # the array method it delegates to; any other name is returned unchanged.
    method_name = func_name.rsplit(".", 1)[-1]

    if "Series.cat" in func_name:
        return ".".join([Categorical.__module__, Categorical.__name__, method_name])
    if "Series.dt" in func_name:
        # TimelikeOps and DatelikeOps have no list of delegated names, so use dir().
        if method_name in dir(TimelikeOps):
            return ".".join([TimelikeOps.__module__, TimelikeOps.__name__, method_name])
        if method_name in dir(DatelikeOps):
            return ".".join([DatelikeOps.__module__, DatelikeOps.__name__, method_name])
        # The array classes list their delegated methods and properties explicitly.
        if (
            method_name in DatetimeArray._datetimelike_methods
            or method_name in DatetimeArray._datetimelike_ops
        ):
            return ".".join([DatetimeArray.__module__, DatetimeArray.__name__, method_name])
        if (
            method_name in TimedeltaArray._datetimelike_methods
            or method_name in TimedeltaArray._datetimelike_ops
        ):
            return ".".join([TimedeltaArray.__module__, TimedeltaArray.__name__, method_name])

    # Not an accessor method, or no delegate found: validate under the original name.
    return func_name
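The helper returns a dotted name, so whatever consumes it still has to turn that string back into an object before its signature can be checked. A rough sketch of that step follows, assuming a hypothetical `load_object` function (not part of pandas or the validation script) and using a standard-library method as the target so the example runs without pandas.

```python
import importlib
import inspect


def load_object(dotted_name):
    """Import the longest module prefix of ``dotted_name``, then walk attributes."""
    parts = dotted_name.split(".")
    for split in range(len(parts), 0, -1):
        module_name = ".".join(parts[:split])
        try:
            obj = importlib.import_module(module_name)
        except ImportError:
            # Not a module: try a shorter prefix.
            continue
        # Everything after the module prefix is resolved as attributes.
        for attr in parts[split:]:
            obj = getattr(obj, attr)
        return obj
    raise ImportError(f"no importable module in {dotted_name!r}")


func = load_object("json.JSONEncoder.encode")
params = list(inspect.signature(func).parameters)
print(params)  # ['self', 'o']
```

With a name such as the ones produced for Series.cat or Series.dt methods, the same lookup would land on the array class's method, whose signature lists the real parameters rather than *args and **kwargs.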

@ghost ghost closed this Jul 24, 2024