ENH: Added support for multiple functions+description in a See Also block #172

pvanmulbregt · 2018-04-12T22:31:17Z

Addresses gh-170. Extends the capabilities of See Also blocks: descriptions on lines that list multiple functions are no longer silently dropped.
The supported line forms are:

<FUNCNAME>
<FUNCNAME> SPACE* COLON SPACE+ <DESCRIPTION> SPACE*
<FUNCNAME> ( COMMA SPACE+ <FUNCNAME>)* SPACE*
<FUNCNAME> ( COMMA SPACE+ <FUNCNAME>)* SPACE* COLON SPACE+ <DESCRIPTION> SPACE*
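
As a rough illustration of the four line forms listed above, the sketch below parses one See Also line into a list of names and an optional description. The function name `parse_see_also_line` and the simplified name pattern are assumptions made for this example (roles and backticks are left out); it is not the code added by this PR.

```python
import re

# Simplified assumption: a plain function name only (no :role:`name` form).
_NAME = r"[a-zA-Z0-9_.-]+"
_LINE = re.compile(
    r"^\s*(?P<funcs>" + _NAME + r"(?:,\s+" + _NAME + r")*)"  # one or more names
    r"\s*(?::\s+(?P<desc>\S.*?))?\s*$"                       # optional ": description"
)


def parse_see_also_line(line):
    """Return (names, description) for one line; description may be None."""
    m = _LINE.match(line)
    if m is None:
        raise ValueError("not a See Also line: %r" % line)
    names = [n.strip() for n in m.group("funcs").split(",")]
    return names, m.group("desc")


print(parse_see_also_line("filtfilt"))
print(parse_see_also_line("lfiltic, lfilter_zi : Initial conditions."))
```

A line with several names and a description keeps the description here, which is the behaviour the PR description asks for.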

Empty <DESCRIPTION> elements are replaced with a Unicode ZERO-WIDTH SPACE " \u200B" in the output of docscrape.py. That is sufficient to convince the subsequent processing that the definition is present for the purposes of continuing the definition list. Further processing replaces the string " \u200B" with "", so that the zero-width space doesn't appear in the generated HTML.

Closes #28

pvanmulbregt · 2018-04-13T02:10:44Z

Under Py3.6 nosetests passes, but the latexpdf fails with Error: Unicode char \u8: not set up for use with LaTeX. [On my local machine the message is slightly different: Package inputenc Error: Unicode char (U+200B) (inputenc) not set up for use with LaTeX.]

Under Py 2.7, one nosetest test fails with UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 71: ordinal not in range(128), inside jinja2's Template.render(), presumably from encountering the UTF-8 for the ZERO WIDTH SPACE, '\xe2\x80\x8b'.

Implication: Inserting a Unicode character into the output, either as Unicode or UTF-8, breaks downstream processing.
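
The Python 2.7 failure quoted above can be reproduced in miniature on Python 3. This is only a sketch of the underlying encoding issue, not of the jinja2 code path: the UTF-8 form of U+200B is three non-ASCII bytes, so any implicit ASCII decode of it fails.

```python
ZWSP = "\u200b"  # ZERO WIDTH SPACE

utf8 = ZWSP.encode("utf-8")
print(utf8)  # b'\xe2\x80\x8b'

try:
    utf8.decode("ascii")
    decode_failed = False
except UnicodeDecodeError as exc:
    decode_failed = True
    print(exc.reason)  # ordinal not in range(128)
```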

@jnothman suggests using ".." instead and that seems to behave better.

ev-br · 2018-04-13T20:16:12Z

Also closes gh-28

pvanmulbregt · 2018-04-13T20:43:52Z

Summary:

Using ".." instead of a Unicode zero-width space allows multiple functions on a line, works under both Py2 and Py3, and keeps the latexpdf builds working.
If any line is missing the description field, then the rendering of the See Also block misaligns the subsequent description fields, the yellow div block is too short, and the last line in the See Also block may interfere with the next section. That appears to be a style-sheet issue, independent of this change.
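
A minimal sketch of the placeholder idea, assuming a hypothetical `render_see_also` helper and a simplified link format (this is not the `_str_see_also` method in docscrape.py): every term gets an indented body, and an entry without a description gets "..", which reStructuredText should treat as an empty comment.

```python
EMPTY_DESCRIPTION = ".."  # placeholder body for entries without a description


def render_see_also(items, indent=4):
    """Render (names, description) pairs as definition-list lines."""
    out = []
    for names, desc in items:
        out.append(", ".join("`%s`" % n for n in names))
        # Always emit an indented body so the definition list is not interrupted.
        out.append(" " * indent + (desc if desc else EMPTY_DESCRIPTION))
    return out


for line in render_see_also([(["func_a", "func_b"], "Does things."),
                             (["func_c"], None)]):
    print(line)
```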

larsoner

@pvanmulbregt can you rebase to get rid of the merge conflict?

larsoner · 2019-04-03T20:54:23Z

This seems like a useful extension/bugfix to me, and it sounds like using ".." instead of a Unicode character fixed the build problems. @jnothman what's the needs-decision point here? I think we just need to make sure that everything works properly, and have another set of eyes on the code.

I rebuilt MNE doc with this and things still looked okay. I also rebuilt SciPy and things looked okay. And when I modified this:

    See Also
    --------
    lfiltic : Construct initial conditions for `lfilter`.
    lfilter_zi : Compute initial state (steady state of step response) for
                 `lfilter`.
    filtfilt : A forward-backward filter, to obtain a filter with linear phase.
    ...

To be this:

    See Also
    --------
    lfiltic, lfilter_zi : Construct initial conditions or steady state, respectively, for `lfilter`.
    filtfilt : A forward-backward filter, to obtain a filter with linear phase.
    ...

It looked good.

jnothman

I've not really managed to review for correctness yet.

jnothman · 2019-04-04T07:27:56Z

numpydoc/tests/test_docscrape.py

@@ -331,13 +331,15 @@ def _strip_blank_lines(s):


 def line_by_line_compare(a, b):
+    empty_description = '..'
+    rgx = re.compile(r"^\s+" + re.escape(empty_description) + "$")


rgx is a terrible name for something that matches an empty description. But I'm also not sure why it's important to disregard these.

Renamed to empty_description_rgx.
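
For reference, a small sketch of what the pattern in the diff above matches: only a line that is indented and consists solely of the placeholder. The sample `lines` list is made up for illustration.

```python
import re

empty_description = ".."
empty_description_rgx = re.compile(
    r"^\s+" + re.escape(empty_description) + "$")

lines = ["`func_a`", "    ..", "`func_b`", "    Some text..", ".."]
# Drop indented placeholder lines; unindented ".." and ordinary text are kept.
kept = [ln for ln in lines if not empty_description_rgx.match(ln)]
print(kept)
```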

jnothman · 2019-04-04T07:30:31Z

numpydoc/docscrape.py

+            ml = self._line_rgx.match(line)
+            description = None
+            if ml:
+                if 'description' in ml.groupdict():


description = ml.group('desc') will suffice here
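
The reason the membership test is unnecessary: `groupdict()` contains every named group of the pattern, with `None` for groups that did not take part in the match, so `'description' in ml.groupdict()` is always true. A sketch with a cut-down pattern (the `name` group here is a stand-in, not the PR's function-name pattern):

```python
import re

rgx = re.compile(
    r"^(?P<name>\w+)(?P<description>\s*:(\s+(?P<desc>\S+.*))?)?\s*$")

no_desc = rgx.match("func_a")
with_desc = rgx.match("func_a : text")

print("description" in no_desc.groupdict())  # True, even though it did not match
print(no_desc.group("desc"))                 # None
print(with_desc.group("desc"))               # text
```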

jnothman · 2019-04-04T07:32:03Z

numpydoc/docscrape.py

@@ -236,10 +236,30 @@ def _parse_param_list(self, content):

        return params

+    _role = r":(?P<role>\w+):"
+    _funcbacktick = r"`(?P<name>(?:~\w+\.)?[a-zA-Z0-9_.-]+)`"


It's confusing to mix the use of \w and explicit character classes:

Suggested change

_funcbacktick = r"`(?P<name>(?:~\w+\.)?[a-zA-Z0-9_.-]+)`"

_funcbacktick = r"`(?P<name>(?:~\w+\.)?[\w.-]+)`"

Actually, I don't think the ~ or . should be obligatory.

Suggested change

_funcbacktick = r"`(?P<name>(?:~\w+\.)?[a-zA-Z0-9_.-]+)`"

_funcbacktick = r"`(?P<name>(?:~?\w+\.)?[\w.-]+|\w+)`"

The regexes appearing in these patterns were taken from the original regex.

_name_rgx = re.compile(r"^\s*(:(?P<role>\w+):" r"`(?P<name>(?:~\w+\.)?[a-zA-Z0-9_.-]+)`|" r" (?P<name2>[a-zA-Z0-9_.-]+))\s*", re.X)

The patterns can be changed but I'd prefer that to be in a separate PR, since that will change the meaning.
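
To illustrate why swapping the explicit class for `\w` changes the meaning: on Python 3, `\w` in a `str` pattern matches Unicode word characters by default, so names that the explicit ASCII class rejects would start to match. The sample name below is invented for the example.

```python
import re

explicit = re.compile(r"[a-zA-Z0-9_.-]+")
word = re.compile(r"[\w.-]+")
word_ascii = re.compile(r"[\w.-]+", re.ASCII)

name = "numpy.caf\u00e9"  # contains a non-ASCII letter

print(explicit.fullmatch(name) is None)    # True: rejected by the explicit class
print(word.fullmatch(name) is not None)    # True: accepted by \w
print(word_ascii.fullmatch(name) is None)  # True: re.ASCII restores the old behaviour
```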

jnothman · 2019-04-04T07:32:12Z

numpydoc/docscrape.py

@@ -236,10 +236,30 @@ def _parse_param_list(self, content):

        return params

+    _role = r":(?P<role>\w+):"
+    _funcbacktick = r"`(?P<name>(?:~\w+\.)?[a-zA-Z0-9_.-]+)`"
+    _funcplain = r"(?P<name2>[a-zA-Z0-9_.-]+)"


Suggested change

_funcplain = r"(?P<name2>[a-zA-Z0-9_.-]+)"

_funcplain = r"(?P<name2>[a-zA-Z0-9_.-]+)"

Suggested change

_funcplain = r"(?P<name2>[a-zA-Z0-9_.-]+)"

_funcplain = r"(?P<name2>[\w.-]+)"

See comment above regarding original _name_rgx.

jnothman · 2019-04-04T07:32:54Z

numpydoc/docscrape.py

+    _funcplain = r"(?P<name2>[a-zA-Z0-9_.-]+)"
+    _funcname = r"(" + _role + _funcbacktick + r"|" + _funcplain + r")"
+    _funcnamenext = _funcname.replace('role', 'rolenext').replace('name', 'namenext')
+    _description = r"(?P<description>\s*:(\s+(?P<desc>\S+.*))?)?\s*$"


(I think we'd like to make the space before the colon obligatory, for consistency with param lists, but I don't think the current code requires it)

I believe that the original _name_rgx did not require it.
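
A sketch that checks the point under discussion. The fragments are copied from the diff above; the way they are combined into `line_rgx` is an assumption for illustration and may differ from the PR. It shows that the description pattern accepts a colon with or without a preceding space, and why the named groups are renamed before the name pattern is reused.

```python
import re

# Fragments quoted from the diff under review.
_role = r":(?P<role>\w+):"
_funcbacktick = r"`(?P<name>(?:~\w+\.)?[a-zA-Z0-9_.-]+)`"
_funcplain = r"(?P<name2>[a-zA-Z0-9_.-]+)"
_funcname = r"(" + _role + _funcbacktick + r"|" + _funcplain + r")"
_funcnamenext = _funcname.replace('role', 'rolenext').replace('name', 'namenext')
_description = r"(?P<description>\s*:(\s+(?P<desc>\S+.*))?)?\s*$"

# Hypothetical composition of the fragments.
line_rgx = re.compile(
    r"^\s*" + _funcname + r"(?:,\s+" + _funcnamenext + r")*" + _description)


def desc_of(line):
    """Return the description captured from a See Also line, or None."""
    m = line_rgx.match(line)
    return m.group("desc") if m else None


print(desc_of("func_a: text"))   # no space before the colon is accepted
print(desc_of("func_a : text"))  # so is a space before the colon

# Reusing _funcname unchanged would define the same group names twice.
try:
    re.compile(_funcname + r",\s+" + _funcname)
    reuse_ok = True
except re.error:
    reuse_ok = False
print(reuse_ok)
```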

jnothman · 2019-04-04T07:47:17Z

numpydoc/docscrape.py

+                    if not text.strip():
+                        break
+                    name, role, m2 = parse_item_name(text)
+                    # m2 = self._func_rgx.match(text)


please remove this commented section

jnothman · 2019-04-04T07:48:39Z

numpydoc/docscrape.py

+                while True:
+                    if not text.strip():
+                        break
+                    name, role, m2 = parse_item_name(text)


It would be cleaner to have parse_item_name return the end rather than the match, especially if you're going to call it the inscrutable m2.
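
One way the suggestion could look, as a sketch only: `parse_item_name` returns the end offset of the match and the caller slices the remaining text. The `parse_names` loop and the use of `ValueError` are assumptions for this example (the diff raises a `ParseError`); the regex fragments are the ones quoted in this review.

```python
import re

_role = r":(?P<role>\w+):"
_funcbacktick = r"`(?P<name>(?:~\w+\.)?[a-zA-Z0-9_.-]+)`"
_funcplain = r"(?P<name2>[a-zA-Z0-9_.-]+)"
_funcname = r"(" + _role + _funcbacktick + r"|" + _funcplain + r")"
_func_rgx = re.compile(r"^\s*" + _funcname + r"\s*")


def parse_item_name(text):
    """Match ':role:`name`' or 'name'; return (name, role, end_offset)."""
    m = _func_rgx.match(text)
    if not m:
        raise ValueError("%s is not an item name" % text)
    role = m.group("role")
    name = m.group("name") if role else m.group("name2")
    return name, role, m.end()


def parse_names(text):
    """Split a comma-separated function list into (name, role) pairs."""
    funcs = []
    while text.strip():
        name, role, end = parse_item_name(text)
        funcs.append((name, role))
        text = text[end:]           # continue after the consumed item
        if text.startswith(","):
            text = text[1:]         # skip the separator
    return funcs


print(parse_names("func_a, :meth:`func_b`, func_c"))
```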

jnothman · 2019-04-04T07:49:36Z

numpydoc/tests/test_docscrape.py

+                        '~baz.obj_r'):
+                assert (not desc), str([func, desc])
+            elif func in ('func_f2', 'func_g2', 'func_h2', 'func_j2'):
+                    assert (desc), str([func, desc])


rm parentheses

jnothman · 2019-04-04T07:49:46Z

numpydoc/tests/test_docscrape.py

+            elif func in ('func_f2', 'func_g2', 'func_h2', 'func_j2'):
+                    assert (desc), str([func, desc])
+            else:
+                assert(desc), str([func, desc])


rm parentheses
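
A short sketch of why the parentheses are better dropped. `assert` is a statement, so `assert (desc), msg` and `assert desc, msg` behave the same, but the parenthesised style sits close to the tuple form `assert (desc, msg)`, which can never fail. The values below are made up.

```python
desc = []
func = "func_a"

# Plain form: condition, then message.
try:
    assert desc, str([func, desc])
    raised = False
except AssertionError as exc:
    raised = True
    message = str(exc)

# The pitfall the parenthesised style invites: a two-element tuple is
# always truthy, so asserting on it would always pass.
tuple_is_truthy = bool((desc, "msg"))

print(raised, tuple_is_truthy)
```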

jnothman

I think this looks reasonable in terms of functionality and tests. Just could be neater code.

Added the spec used by NumpyDocString._parse_see_also. Added a warning whenever an unexpected trailing comma appears in a See Also function list. Removed unnecessary re.X qualifiers from some regular expressions. Renamed some variables with more descriptive names. Removed some unused/commented-out code. Removed some unneeded parentheses. Shortened some lines to keep under 80 characters.

jnothman · 2019-04-08T03:22:40Z

numpydoc/docscrape.py

+    # <FUNCNAME> ( COMMA SPACE+ <FUNCNAME>)* SPACE* COLON SPACE+ <DESC> SPACE*
+
+    # <FUNCNAME> is:
+    # A legal function name, optionally enclosed in backticks.


currently a function name enclosed in backticks requires a role.

Updated doc.

jnothman · 2019-04-08T03:24:57Z

numpydoc/docscrape.py

+            if not m:
+                raise ParseError("%s is not a item name" % text)
+            role = m.group('role')
+            name = (m.group('name') if role else m.group('name2'))


redundant parentheses

jnothman · 2019-04-08T03:28:01Z

numpydoc/docscrape.py

-            else:
-                out[-1] += ", %s" % link
+        for funcs, desc in self['See Also']:
+            assert isinstance(funcs, (list, tuple))


why do we allow a tuple?

Only allow lists.

jnothman · 2019-04-08T03:29:58Z

numpydoc/tests/test_docscrape.py

@@ -331,10 +331,16 @@ def _strip_blank_lines(s):


 def line_by_line_compare(a, b):
+    empty_description = NumpyDocString.empty_description  # '..'


if it's important that these are inserted, why do we want to disregard them in testing?

Removed the disregarding of these lines and updated the expected output to account for that.

Have you forgotten to commit this?

jnothman

Otherwise LGTM

jnothman · 2019-04-10T03:02:21Z

Thanks, @pvanmulbregt. I'd appreciate it if this could also be mentioned in the docs. Another PR?

larsoner approved these changes Apr 2, 2019

pvanmulbregt added 5 commits April 2, 2019 22:29

ENH: Added support for multiple functions+description in a See Also block.

5ca74fe

ENH: Use a zero-width space for empty definitions in See Also blocks.

bb11821

Also remove the zero-width space \u200B when comparing test output.

BUG: Under Py27 use the UTF-8 for ZERO WIDTH SPACE

30ec43d

BUG: Replace Unicode zero-width-space with ".."

1670bbb

STY: Keep line length < 80 for some regexes.

a6911fe

pvanmulbregt force-pushed the seealso branch from e306491 to a6911fe Compare April 3, 2019 02:39

jnothman added the needs-decision label Apr 3, 2019

jnothman removed the needs-decision label Apr 4, 2019

jnothman reviewed Apr 4, 2019

rgommers added this to the v0.9.0 milestone Apr 7, 2019

jnothman reviewed Apr 8, 2019

pvanmulbregt added 2 commits April 8, 2019 22:06

DOC: Adjusted documentation and some parentheses.

a6a1b20

TST: Don't remove '..' lines from text comparing against str(doc) output.

d65f917

jnothman approved these changes Apr 9, 2019

jnothman merged commit 8f1ac50 into numpy:master Apr 10, 2019

larsoner added the type: Enhancement label Apr 11, 2019

larsoner mentioned this pull request Apr 11, 2019

Please release a new version #203

Closed

rgommers mentioned this pull request Apr 14, 2019

Parsing See Also became too picky #206

Closed

WillAyd mentioned this pull request Apr 22, 2019

Doc Check Failures pandas-dev/pandas#26187

Closed
