Skip to content

DOC: update docstring validation script #19960

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
27 changes: 22 additions & 5 deletions scripts/validate_docstrings.py
Original file line number Diff line number Diff line change
Expand Up @@ -23,7 +23,7 @@
import inspect
import importlib
import doctest
import textwrap
import pydoc
try:
from io import StringIO
except ImportError:
Expand All @@ -39,6 +39,9 @@
from numpydoc.docscrape import NumpyDocString


PRIVATE_CLASSES = ['NDFrame', 'IndexOpsMixin']


def _to_original_callable(obj):
while True:
if inspect.isfunction(obj) or inspect.isclass(obj):
Expand Down Expand Up @@ -72,8 +75,7 @@ class Docstring:
def __init__(self, method_name, method_obj):
self.method_name = method_name
self.method_obj = method_obj
self.raw_doc = method_obj.__doc__ or ''
self.raw_doc = textwrap.dedent(self.raw_doc)
self.raw_doc = pydoc.getdoc(method_obj)
self.doc = NumpyDocString(self.raw_doc)

def __len__(self):
Expand Down Expand Up @@ -127,7 +129,12 @@ def doc_parameters(self):

@property
def signature_parameters(self):
if not inspect.isfunction(self.method_obj):
if not (inspect.isfunction(self.method_obj)
or inspect.isclass(self.method_obj)):
return tuple()
if (inspect.isclass(self.method_obj)
and self.method_name.split('.')[-1] in {'dt', 'str', 'cat'}):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it difficult to find the accessors that have been registered, instead of hardcoding them here? So we don't need to remember to change this when a new accessor is created.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was looking into the class (eg that I could check they inherit from a base Accessor class, but that's not consistent between them).
So not really sure there is a better way, but totally agree it is not ideal.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, there is a _accessors attribute. But not fully sure to what extent we can rely on that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is then that I need to get the class (eg Series in case of pandas.Series.dt), which is also difficult to introspect. So maybe the easiest is to leave it as is.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, it's quite useful as it is, we can improve it in the future if we find a way.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI: #19963

I think isinstance(object.__getattribute__(self, method_obj.__name__), CachedAccessor) might have done it, But having an official registry is better :)

# accessor classes have a signature, but don't want to show this
return tuple()
params = tuple(inspect.signature(self.method_obj).parameters.keys())
if params and params[0] in ('self', 'cls'):
Expand All @@ -149,7 +156,8 @@ def parameter_mismatches(self):
extra = set(doc_params) - set(signature_params)
if extra:
errs.append('Unknown parameters {!r}'.format(extra))
if not missing and not extra and signature_params != doc_params:
if (not missing and not extra and signature_params != doc_params
and not (not signature_params and not doc_params)):
errs.append('Wrong parameters order. ' +
'Actual: {!r}. '.format(signature_params) +
'Documented: {!r}'.format(doc_params))
Expand Down Expand Up @@ -180,6 +188,10 @@ def deprecated(self):
bool(pattern.search(self.summary)) or
bool(pattern.search(self.extended_summary)))

@property
def mentioned_private_classes(self):
return [klass for klass in PRIVATE_CLASSES if klass in self.raw_doc]

@property
def examples_errors(self):
flags = doctest.NORMALIZE_WHITESPACE | doctest.IGNORE_EXCEPTION_DETAIL
Expand Down Expand Up @@ -311,6 +323,11 @@ def validate_one(func_name):
for param_err in param_errs:
errs.append('\t{}'.format(param_err))

mentioned_errs = doc.mentioned_private_classes
if mentioned_errs:
errs.append('Private classes ({}) should not be mentioned in public '
'docstring.'.format(mentioned_errs))

examples_errs = ''
if not doc.examples:
errs.append('No examples section found')
Expand Down