Skip to content

ENH: Use flake8 to check for PEP8 violations in doctests #23399

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Nov 4, 2018
50 changes: 48 additions & 2 deletions scripts/validate_docstrings.py
Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,10 @@
import inspect
import importlib
import doctest
from contextlib import contextmanager

from flake8.api import legacy as flake8

try:
from io import StringIO
except ImportError:
Expand All @@ -40,7 +44,6 @@
from numpydoc.docscrape import NumpyDocString
from pandas.io.formats.printing import pprint_thing


PRIVATE_CLASSES = ['NDFrame', 'IndexOpsMixin']
DIRECTIVES = ['versionadded', 'versionchanged', 'deprecated']

Expand Down Expand Up @@ -331,6 +334,43 @@ def parameter_mismatches(self):

return errs

@property
def pep8_violations(self):
with self._file_representation() as filename:
style_guide = flake8.get_style_guide(doctests=True)
report = style_guide.input_file(filename=filename)
return report.get_statistics('')

@contextmanager
def _file_representation(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This name is not very descriptive. I think this is the examples as temporary file, so something like _examples_as_temp_file seems much clearer.

Not sure if it's worth having two separate methods, I'd generate the file in the method that validates pep8. But if we have two methods, I'd have this first, as this is being used by the other (pep8_violations).

"""
Creates a tmp file containing the function without the body

:returns filename of tmp file
"""
create_function = 'def {name}{signature}: # noqa: E501\n' \
' """{doc}"""\n' \
' pass\n'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As said, I have a preference to have only the examples code in the temporary file. Extracting the code from the examples is trivial, and once #23161 is merged we'll have a property in this class that returns the lines of code.

What you will have to do is before writing the code in the examples, write import numpy as np and same for pandas, so flake8 does not complain because of that.


tmp_dir = os.path.join(BASE_PATH, 'build', 'validate_docstring')
os.makedirs(tmp_dir, exist_ok=True)

filename = os.path.join(tmp_dir, self.name + '.py')
with open(filename, 'w') as f:
name = self.name.split('.')[-1]
sig = str(inspect.signature(self.obj))
lines = self.raw_doc.split("\n")
indented_lines = [' ' + line if line else ''
for line in lines[1:]]
doc = '\n'.join([lines[0], *indented_lines])

f.write(create_function.format(name=name, signature=sig, doc=doc))

yield filename

os.remove(filename)
os.rmdir(tmp_dir)

@property
def correct_parameters(self):
return not bool(self.parameter_mismatches)
Expand Down Expand Up @@ -446,7 +486,7 @@ def validate_one(func_name):
if doc.summary != doc.summary.lstrip():
errs.append('Summary contains heading whitespaces.')
elif (doc.is_function_or_method
and doc.summary.split(' ')[0][-1] == 's'):
and doc.summary.split(' ')[0][-1] == 's'):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it'd be better to not do unrelated changes

errs.append('Summary must start with infinitive verb, '
'not third person (e.g. use "Generate" instead of '
'"Generates")')
Expand Down Expand Up @@ -490,6 +530,12 @@ def validate_one(func_name):
for param_err in param_errs:
errs.append('\t{}'.format(param_err))

pep8_errs = doc.pep8_violations
if pep8_errs:
errs.append('Errors in doctest sections')
for pep8_err in pep8_errs:
errs.append('\t{}'.format(pep8_err))

if doc.is_function_or_method:
if not doc.returns and "return" in doc.method_source:
errs.append('No Returns section found')
Expand Down
25 changes: 25 additions & 0 deletions setup.cfg
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,31 @@ exclude =
doc/temp/*.py,
.eggs/*.py,
versioneer.py
.tox
.git

doctests = True
#TODO fix doctests
exclude_from_doctest =
./pandas/_libs
./pandas/api
./pandas/compat
./pandas/computation
./pandas/core
./pandas/errors
./pandas/io
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a number of these have been removed recently, can you edit

./pandas/plotting
./pandas/tests
./pandas/tools
./pandas/tseries
./pandas/types
./pandas/util

[flake8-rst]
ignore =
F821, # undefined name
W391, # blank line at end of file [Seems to be a bug (v0.4.1)]


[yapf]
based_on_style = pep8
Expand Down