Skip to content

CI: add pydocstyle to code_checks #32033

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 13 commits into from

Conversation

simonjayhawkins
Copy link
Member

@simonjayhawkins simonjayhawkins commented Feb 15, 2020

@simonjayhawkins simonjayhawkins added CI Continuous Integration Code Style Code style, linting, code_checks labels Feb 15, 2020
@jreback
Copy link
Contributor

jreback commented Feb 16, 2020

looks ok to me. @jorisvandenbossche

@jorisvandenbossche
Copy link
Member

jorisvandenbossche commented Feb 16, 2020

How does this contrast / overlap with the validations we already do with numpydoc/ the validate_docstrings script?

@simonjayhawkins
Copy link
Member Author

many differences so will only give the main points below. can go into more fine-grained detail if required.

pydocstyle is a static checker so is not able to check dynamically created docstrings.

validate_docstrings script is only checking public api. pydocstyle is checking internal docstrings too.

pydocstyle and validate_docstrings have some overlapping but also exclusive error codes so having both is beneficial.

pydocstyle has a flake8 plugin so can see errors highlighted in ide so a benefit for contributors.

not added here, but error code 'D417 | Missing argument descriptions in the docstring' would add a lot more than consistent style. It would ensure contributors properly document their functions/methods. A lot of work to enable this globally.

@WillAyd
Copy link
Member

WillAyd commented Feb 16, 2020

lgtm. I think we can start with this and refine over time if conflicts with numpydoc arise

@simonjayhawkins
Copy link
Member Author

lgtm. I think we can start with this and refine over time if conflicts with numpydoc arise

Ok. we could enable straight away and see how things pan out. so will take off draft status.

i suppose if issues do arise can always disable. The error codes included here are the simpler ones and it was not a lot of effort to make the changes across the codebase to comply.

we could then look to update the style guide once happy that having on the ci is smooth.

@simonjayhawkins simonjayhawkins marked this pull request as ready for review February 16, 2020 19:41
@simonjayhawkins simonjayhawkins changed the title POC: CI: add pydocstyle to code_checks CI: add pydocstyle to code_checks Feb 16, 2020
@jorisvandenbossche
Copy link
Member

error code 'D417 | Missing argument descriptions in the docstring' would add a lot more than consistent style

That's something that numpydoc checks as well?
And actually, as far as I can see, pydocstyle cannot do this for numpy-style docstrings?

@@ -820,7 +819,7 @@ def tz_convert(self, tz):
dtype = tz_to_dtype(tz)
return self._simple_new(self.asi8, dtype=dtype, freq=self.freq)

def tz_localize(self, tz, ambiguous="raise", nonexistent="raise"):
def tz_localize(self, tz, ambiguous="raise", nonexistent="raise"): # noqa
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the noqa needed here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

another continuation line issue.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we rather figure out what is wrong, or disable the indentation check?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably need a fix upstream

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like an issue has already been raised for this PyCQA/pydocstyle#437

@@ -538,7 +538,7 @@ def to_datetime(
infer_datetime_format=False,
origin="unix",
cache=True,
):
): # noqa: D207
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same question here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there looks to be a problem with the use of continuation. The next line is flagged as underindented (D207).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's black that requires this, so I would maybe not include the D207 check then, if it gives conflicts with black.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

disabling D207 on one or two functions is better than disabling on all, surely?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@simonjayhawkins
Copy link
Member Author

That's something that numpydoc checks as well?
And actually, as far as I can see, pydocstyle cannot do this for numpy-style docstrings?

can you elaborate?

the checks of the parameters aren't broken down into separate codes for the parameter violations and the checks are not so rigorous.

sample cli output of D417

$ pydocstyle pandas/io/formats/format.py  --select=D417
pandas/io/formats/format.py:968 in public method `to_html`:
        D417: Missing argument descriptions in the docstring (argument(s) buf, encoding are missing descriptions in 'to_html' docstring)
pandas/io/formats/format.py:1111 in public function `format_array`:
        D417: Missing argument descriptions in the docstring (argument(s) decimal, digits, float_format, formatter, justify, na_rep, space, values are missing descriptions in 'format_array' docstring)
pandas/io/formats/format.py:1507 in public function `format_percentiles`:
        D417: Missing argument descriptions in the docstring (argument(s) percentiles are missing descriptions in 'format_percentiles' docstring)
(pandas-dev) 

Copy link
Member

@datapythonista datapythonista left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@@ -316,6 +316,11 @@ fi
### DOCSTRINGS ###
if [[ -z "$CHECK" || "$CHECK" == "docstrings" ]]; then


MSG='Validate docstrings (D201, D202, D204, D207, D208, D209, D213, D300, D409, D411, D412, D414)' ; echo $MSG
pydocstyle pandas --select=D201,D202,D204,D207,D208,D209,D213,D300,D409,D411,D412,D414
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to have the codes in setup.cfg? I guess we could then call pydocstyle pandas locally and get this validation.

Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jorisvandenbossche
Copy link
Member

@simonjayhawkins Thanks for the explanation above #32033 (comment) So I suppose the main reason to use it in addition to numpydoc is that it checks all docstrings?

And actually, as far as I can see, pydocstyle cannot do this for numpy-style docstrings?

can you elaborate?

That was mentioned in one of the issues on numpydoc that talked about using pydocstyle, but it was from a while ago, so probably out of date


If this is going to fail CI in case of violation, we should add this to the contributing guidelines (similar to black and flake8)

Also, should we add it to pre-commit hooks? (can it run on only the changed files? Or how fast it is for the full code base?)

Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So looks good, but needs some docs IMO

@alimcmaster1
Copy link
Member

Maybe worth a note in the contributing guide?

@simonjayhawkins
Copy link
Member Author

I seem incapable of successfully building documentation locally on my Windows machine. closing to clear queue

@jorisvandenbossche
Copy link
Member

@simonjayhawkins for only adding some prose, there is not necessarily a need to build the docs locally. I can certainly review on technical rst aspects.

@simonjayhawkins
Copy link
Member Author

I think the ideal (but not most straightforward approach) is to update https://dev.pandas.io/docs/development/contributing_docstring.html.

I think that we need to avoid confusion between the public API functions/methods/classes that use sphinx to generate documentation for pandas users and docstrings for all other (internal) functions/method/classes that are for the benefit of pandas contributors.

A simpler approach for now could be to add a one-liner in the contributing guide instead.

@WillAyd
Copy link
Member

WillAyd commented Mar 17, 2020

Do we have a go forward on this? I still think OK and would be fine with doc updates as a follow on since this isn't drastically changing things.

Though not a strong preference either way - just don't want this to hang in limbo if it doesn't need to

@jorisvandenbossche
Copy link
Member

If it is going to fail CI, it should be documented in the contributing docs. It doesn't need to belong, just a short mention of it should be fine.

Also:

Also, should we add it to pre-commit hooks? (can it run on only the changed files? Or how fast it is for the full code base?)

@simonjayhawkins
Copy link
Member Author

If it is going to fail CI, it should be documented in the contributing docs.

#32393 was intended as precursor to this. Do we have the contributing docs available now? if so, can you provide a link, and i'll reopen #32393.

@simonjayhawkins
Copy link
Member Author

Also, should we add it to pre-commit hooks?

i don't have these (may require a bit of effort/knowledge to setup on Windows). so will need others to push the relevant patch for this.

@simonjayhawkins
Copy link
Member Author

Though not a strong preference either way - just don't want this to hang in limbo if it doesn't need to

no strong preference either. see #32033 (comment).

@simonjayhawkins
Copy link
Member Author

I think we need further discussion before we address #32033 (comment) and then update the guidance. I don't think this PR is that vehicle.

@simonjayhawkins
Copy link
Member Author

I think we need further discussion before we address #32033 (comment) and then update the guidance. I don't think this PR is that vehicle.

opened #32773, closing this to clear the queue for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI Continuous Integration Code Style Code style, linting, code_checks
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants