DOC: Validate in docstrings that numpy and pandas are not imported #23161

thoo · 2018-10-15T05:05:53Z

pep8speaks · 2018-10-15T05:05:55Z

Hello @thoo! Thanks for submitting the PR.

There are no PEP8 issues in the file scripts/tests/test_validate_docstrings.py !
There are no PEP8 issues in the file scripts/validate_docstrings.py !

codecov · 2018-10-15T14:45:46Z

Codecov Report

Merging #23161 into master will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master   #23161   +/-   ##
=======================================
  Coverage   92.23%   92.23%           
=======================================
  Files         161      161           
  Lines       51197    51197           
=======================================
  Hits        47220    47220           
  Misses       3977     3977

Flag	Coverage Δ
#multiple	`90.61% <ø> (ø)`	⬆️
#single	`42.27% <ø> (ø)`	⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d78bd7a...ce37c8a. Read the comment docs.

datapythonista

Thanks for the work, added some comments.

datapythonista · 2018-10-15T16:01:08Z

scripts/tests/test_validate_docstrings.py

@@ -502,6 +502,20 @@ def no_punctuation(self):
        return "Hello world!"


+class BadExamples(object):
+
+    def npPd_import(self):


this name is a bit cryptic, can you use something more clear.

datapythonista · 2018-10-15T16:01:28Z

scripts/tests/test_validate_docstrings.py

+
+    def npPd_import(self):
+        """
+        Provide example with numpy and pandas import


Can you finish this with a period, so it doesn't generate an error for it.

datapythonista · 2018-10-15T16:02:37Z

scripts/tests/test_validate_docstrings.py

+        Examples
+        --------
+        import numpy as np
+        import pandas as pd


The code in examples start by >>>

It'd be good to have tests for import numpy and import pandas (without the aliases), and have each of them separate. And also a case when something else is imported (e.g. import datetime)

Not sure if in the good docstrings in the tests we have examples somewhere. We should, to make sure the error is not reported when import numpy... is not present.

datapythonista · 2018-10-15T16:09:22Z

scripts/validate_docstrings.py

@@ -512,6 +512,12 @@ def validate_one(func_name):
        examples_errs = doc.examples_errors
        if examples_errs:
            errs.append('Examples do not pass tests')
+        if 'import numpy as np' in ' '.join(doc.examples):


I think it'd be good to use the module doctest to extract the content of the code in the examples. Meaning that something like this:

Examples -------- This example does not import pandas as pd >>> pd.Series(['pd is assumed to be imported as we want'])

should not generate an error.

And probably you want to include an example like this in the tests.

thoo · 2018-10-15T16:16:32Z

@datapythonista Thanks for the review. Really appreciate it. Will be back. :)

datapythonista

Is there any reason to have a space at the beginning of the error messages? I don't see it in the other messages, I'd keep consistent.

Also, the rstrip in the code seems unnecessary, and if we use that method for something else, removing the break lines can be a problem. If I'm not missing something, I'd keep the break lines.

Other than that, looks good.

thoo · 2018-10-16T07:53:17Z

@datapythonista Thanks for the help. It looks like one of Travis' tests failed last run. Is it possible conda update in Travis somehow breaks it? I am learning a lot. :):)

datapythonista

I'm checking from the phone, and is difficult to review in detail, but looks good.

Is the test with the good import being checked? If it is, lgtm.

datapythonista · 2018-10-16T07:59:21Z

Can't check the travis logs now, but if ut's failing when installing anaconda the only thing we can do is run the build again. For it, you usually update your beanch from master and push it. So, you don't change anything, but as the PR is updated travis runs again. In this case as you did changes is not needed.

TomAugspurger · 2018-10-16T11:27:55Z

The travis failure can be ignored.

I restarted the azure build. That failed with a pyarrow could not load DLL issue.

thoo · 2018-10-16T13:33:51Z

Thank you guys. Appreciate it!!!

datapythonista · 2018-10-27T04:07:04Z

@WillAyd can you take a look and merge if you're happy?

WillAyd · 2018-10-27T06:36:16Z

scripts/tests/test_validate_docstrings.py

@@ -502,6 +514,33 @@ def no_punctuation(self):
        return "Hello world!"


+class BadExamples(object):


I believe there is already an example that imports both of these:

pandas/scripts/tests/test_validate_docstrings.py

Line 334 in 5aff727

>>> import numpy as np

Any reason we don't leverage that instead of creating a new class?

…xamples * repo_org/master: (83 commits) DOC: Add docstring validations for "See Also" section (pandas-dev#23143) TST: Fix test assertion (pandas-dev#23357) BUG: Handle Period in combine (pandas-dev#23350) REF: SparseArray imports (pandas-dev#23329) CI: Migrate some CircleCI jobs to Azure (pandas-dev#22992) DOC: update the is_month_start/is_month_end docstring (pandas-dev#23051) Partialy fix issue pandas-dev#23334 - isort pandas/core/groupby directory (pandas-dev#23341) TST: Add base test for extensionarray setitem pandas-dev#23300 (pandas-dev#23304) API: Add sparse Acessor (pandas-dev#23183) PERF: speed up CategoricalIndex.get_loc (pandas-dev#23235) fix and test incorrect case in delta_to_nanoseconds (pandas-dev#23302) BUG: Handle Datetimelike data in DataFrame.combine (pandas-dev#23317) TST: re-enable gbq tests (pandas-dev#23303) Switched references of App veyor to azure pipelines in the contributing CI section (pandas-dev#23311) isort imports-io (pandas-dev#23332) DOC: Added a Multi Index example for the Series.sum method (pandas-dev#23279) REF: Make PeriodArray an ExtensionArray (pandas-dev#22862) DOC: Added Examples for Series max (pandas-dev#23298) API/ENH: tz_localize handling of nonexistent times: rename keyword + add shift option (pandas-dev#22644) BUG: Let MultiIndex.set_levels accept any iterable (pandas-dev#23273) (pandas-dev#23291) ...

thoo · 2018-10-27T15:07:11Z

@WillAyd Could you re-review this? I updated it with your suggestion.

…xamples * repo_org/master: DOC: Validate that See Also section items do not contain the pandas. prefix (pandas-dev#23145) API/TST: make hasnans always return python booleans (pandas-dev#23349)

thoo · 2018-10-27T19:48:59Z

@WillAyd I just resolve the conflicts with the base branch and it should be ready for re-review. Thanks.

datapythonista

looks good, added couple of last things, and I think we're ready to merge.

datapythonista · 2018-11-03T13:33:40Z

scripts/validate_docstrings.py

+    def examples_codes(self):
+        codes = doctest.DocTestParser().get_examples(self.raw_doc)
+        codes = [line.source for line in codes]
+        return ' '.join(codes)


Can you return a list with each line instead? We'll reuse this property to validate pep8 too, and returning a single string with all the code joined is not useful in that case.

Also, at least to me examples_codes is somehow misleading. Can you use examples_source_code instead?

datapythonista · 2018-11-03T13:35:16Z

scripts/validate_docstrings.py

+        if 'import numpy' in examples_codes:
+            errs.append('Examples should not have `import numpy` ')
+        if 'import pandas' in examples_codes:
+            errs.append('Examples should not have `import pandas` ')


I think the error could be more descriptive. pandas does not need to be imported in the examples, as it's assumed to be already imported as pd. I think something like that provides more useful information to devs facing this error.

datapythonista · 2018-11-03T13:35:47Z

scripts/tests/test_validate_docstrings.py

@@ -218,6 +218,18 @@ def mode(self, axis, numeric_only):
        """
        pass

+    def good_import(self):


Suggested change

def good_import(self):

def good_imports(self):

…xamples * repo_org/master: (66 commits) CLN: doc string (pandas-dev#23469) DOC: Add cookbook entry for triangular correlation matrix (GH22840) (pandas-dev#23032) add number of Errors, Warnings to scripts/validate_docstrings.py (pandas-dev#23150) BUG: Allow freq conversion from dt64 to period (pandas-dev#23460) ENH: Add FrozenList.union and .difference (pandas-dev#23394) REF: cython cleanup, typing, optimizations (pandas-dev#23464) strictness and checks for Timedelta _simple_new (pandas-dev#23433) Fixing flake8 problems new to flake8 3.6.0 (pandas-dev#23472) DOC: Updating the docstring of Series.dot (pandas-dev#22890) TST: Fixturize series/test_analytics.py (pandas-dev#22755) BUG/ENH: Handle NonexistentTimeError in date rounding (pandas-dev#23406) PERF: speed up concat on Series by making _get_axis_number() a classmethod (pandas-dev#23404) REF: Remove DatetimelikeArrayMixin._shallow_copy (pandas-dev#23430) REF: strictness/simplification in DatetimeArray/Index _simple_new (pandas-dev#23431) REF: cython cleanup, typing, optimizations (pandas-dev#23456) TST: tweak Hypothesis configuration and idioms (pandas-dev#23441) BUG: fix HDFStore.append with all empty strings error (GH12242) (pandas-dev#23435) TST: Skip 32bit failing IntervalTree tests (pandas-dev#23442) BUG: Deprecate nthreads argument (pandas-dev#23112) style: fix import format at pandas/core/reshape (pandas-dev#23387) ...

datapythonista

lgtm, thanks for the changes.

Merging on green, but I added some final ideas in case you want to apply them, but merging anyway.

datapythonista · 2018-11-03T18:41:15Z

scripts/validate_docstrings.py

+    def examples_source_code(self):
+        codes = doctest.DocTestParser().get_examples(self.raw_doc)
+        codes = [line.source for line in codes]
+        return codes


You can also return in the previous line.

May be lines is a better name? codes sounds to me like identifiers

datapythonista · 2018-11-03T18:42:30Z

scripts/validate_docstrings.py

+            errs.append("Numpy does not need to be imported in the examples, "
+                        "as it's assumed to be already imported as np")
+        if 'import pandas' in ' '.join(examples_source_code):
+            errs.append("Pandas does not need to be imported in the examples, "


I think we don't usually use capitalized pandas, even at the beginning of sentence. And in this case, as it's more the name of the module than the project, I'd use lower case also for numpy.

datapythonista · 2018-11-03T18:44:25Z

scripts/validate_docstrings.py

@@ -531,6 +537,13 @@ def validate_one(func_name):
        examples_errs = doc.examples_errors
        if examples_errs:
            errs.append('Examples do not pass tests')
+        examples_source_code = doc.examples_source_code


may be you can do the join here, so doesn't need to be repeated? also, is the lines end with \n I'd join by '', otherwise by '\n'. Doesn't change the result, but conceptually the content of the variable is something that makes more sense.

thoo · 2018-11-03T18:58:54Z

@datapythonista Thanks for the review. I update the files.

datapythonista

That's what I meant in the comment about naming lines and returning (your changes make sense, but just partially, based on my comment)

datapythonista · 2018-11-03T19:01:06Z

scripts/validate_docstrings.py

@@ -402,6 +402,12 @@ def examples_errors(self):
            error_msgs += f.getvalue()
        return error_msgs

+    @property
+    def examples_source_code(self):
+        codes = doctest.DocTestParser().get_examples(self.raw_doc)


Suggested change

codes = doctest.DocTestParser().get_examples(self.raw_doc)

lines = doctest.DocTestParser().get_examples(self.raw_doc)

datapythonista · 2018-11-03T19:01:27Z

scripts/validate_docstrings.py

+    @property
+    def examples_source_code(self):
+        codes = doctest.DocTestParser().get_examples(self.raw_doc)
+        lines = [line.source for line in codes]


Suggested change

lines = [line.source for line in codes]

return [line.source for line in lines]

datapythonista · 2018-11-03T19:01:43Z

scripts/validate_docstrings.py

+    def examples_source_code(self):
+        codes = doctest.DocTestParser().get_examples(self.raw_doc)
+        lines = [line.source for line in codes]
+        return lines


Suggested change

return lines

datapythonista · 2018-11-03T19:11:18Z

Thanks once more @thoo, will merge on green.

TomAugspurger · 2018-11-04T14:04:33Z

Thanks!

…andas-dev#23161)

Check numpy and pandas in Examples

8655fdd

thoo closed this Oct 15, 2018

thoo reopened this Oct 15, 2018

datapythonista reviewed Oct 15, 2018

View reviewed changes

datapythonista added Docs CI Continuous Integration labels Oct 15, 2018

datapythonista changed the title ~~Check numpy and pandas in Examples~~ DOC: Validate in docstrings that numpy and pandas are not imported Oct 15, 2018

Made changes as recommended

5c071ec

thoo force-pushed the check_np_pd_fromExamples branch from cf75147 to 5c071ec Compare October 16, 2018 07:40

datapythonista reviewed Oct 16, 2018

View reviewed changes

remove rstrip

5aff727

datapythonista approved these changes Oct 16, 2018

View reviewed changes

thoo closed this Oct 19, 2018

thoo reopened this Oct 19, 2018

WillAyd reviewed Oct 27, 2018

View reviewed changes

thoo added 2 commits October 27, 2018 09:48

remove BadExamples class

374c9ea

Merge remote-tracking branch 'repo_org/master' into check_np_pd_fromE…

d824414

…xamples * repo_org/master: DOC: Validate that See Also section items do not contain the pandas. prefix (pandas-dev#23145) API/TST: make hasnans always return python booleans (pandas-dev#23349)

datapythonista reviewed Nov 3, 2018

View reviewed changes

datapythonista mentioned this pull request Nov 3, 2018

ENH: Use flake8 to check for PEP8 violations in doctests #23399

Merged

2 tasks

thoo added 3 commits November 3, 2018 13:19

Change as recommended

3e512ac

Change double to single quote

a2011b2

datapythonista approved these changes Nov 3, 2018

View reviewed changes

thoo added 2 commits November 3, 2018 14:55

a few more changes

0243b91

a few more changes

d123b5c

datapythonista reviewed Nov 3, 2018

View reviewed changes

changes in return

ec78655

Retrigger Travis CI

ce37c8a

TomAugspurger merged commit 047242b into pandas-dev:master Nov 4, 2018

JustinZhengBC pushed a commit to JustinZhengBC/pandas that referenced this pull request Nov 14, 2018

DOC: Validate in docstrings that numpy and pandas are not imported (p…

cb51a02

…andas-dev#23161)

tm9k1 pushed a commit to tm9k1/pandas that referenced this pull request Nov 19, 2018

DOC: Validate in docstrings that numpy and pandas are not imported (p…

5f0db32

…andas-dev#23161)

thoo deleted the check_np_pd_fromExamples branch January 2, 2019 20:26

Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019

DOC: Validate in docstrings that numpy and pandas are not imported (p…

282d9eb

…andas-dev#23161)

Pingviinituutti pushed a commit to Pingviinituutti/pandas that referenced this pull request Feb 28, 2019

DOC: Validate in docstrings that numpy and pandas are not imported (p…

f45b6c8

…andas-dev#23161)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

DOC: Validate in docstrings that numpy and pandas are not imported #23161

DOC: Validate in docstrings that numpy and pandas are not imported #23161

thoo commented Oct 15, 2018

pep8speaks commented Oct 15, 2018

codecov bot commented Oct 15, 2018 •

edited

Loading

datapythonista left a comment

datapythonista Oct 15, 2018

datapythonista Oct 15, 2018

datapythonista Oct 15, 2018

datapythonista Oct 15, 2018

thoo commented Oct 15, 2018

datapythonista left a comment

thoo commented Oct 16, 2018

datapythonista left a comment

datapythonista commented Oct 16, 2018

TomAugspurger commented Oct 16, 2018

thoo commented Oct 16, 2018

datapythonista commented Oct 27, 2018

WillAyd Oct 27, 2018

thoo commented Oct 27, 2018

thoo commented Oct 27, 2018

datapythonista left a comment

datapythonista Nov 3, 2018

datapythonista Nov 3, 2018

datapythonista Nov 3, 2018

datapythonista left a comment

datapythonista Nov 3, 2018

datapythonista Nov 3, 2018

datapythonista Nov 3, 2018

thoo commented Nov 3, 2018

datapythonista left a comment

datapythonista Nov 3, 2018

datapythonista Nov 3, 2018

datapythonista Nov 3, 2018

datapythonista commented Nov 3, 2018

TomAugspurger commented Nov 4, 2018

		@@ -502,6 +514,33 @@ def no_punctuation(self):
		return "Hello world!"


		class BadExamples(object):

	codes = doctest.DocTestParser().get_examples(self.raw_doc)
	lines = doctest.DocTestParser().get_examples(self.raw_doc)

	lines = [line.source for line in codes]
	return [line.source for line in lines]

DOC: Validate in docstrings that numpy and pandas are not imported #23161

DOC: Validate in docstrings that numpy and pandas are not imported #23161

Conversation

thoo commented Oct 15, 2018

pep8speaks commented Oct 15, 2018

codecov bot commented Oct 15, 2018 • edited Loading

Codecov Report

datapythonista left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

thoo commented Oct 15, 2018

datapythonista left a comment

Choose a reason for hiding this comment

thoo commented Oct 16, 2018

datapythonista left a comment

Choose a reason for hiding this comment

datapythonista commented Oct 16, 2018

TomAugspurger commented Oct 16, 2018

thoo commented Oct 16, 2018

datapythonista commented Oct 27, 2018

Choose a reason for hiding this comment

thoo commented Oct 27, 2018

thoo commented Oct 27, 2018

datapythonista left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

datapythonista left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

thoo commented Nov 3, 2018

datapythonista left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

datapythonista commented Nov 3, 2018

TomAugspurger commented Nov 4, 2018

codecov bot commented Oct 15, 2018 •

edited

Loading