Skip to content

ENH: Added support for multiple functions+description in a See Also block #172

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Apr 10, 2019

Conversation

pvanmulbregt
Copy link
Contributor

@pvanmulbregt pvanmulbregt commented Apr 12, 2018

Addresses gh-170. Extended capabilities of See Also blocks. No longer silently drops descriptions from lines with multiple functions.
Supports

  1. <FUNCNAME>
  2. <FUNCNAME> <SPACE>* COLON SPACE+ <DESCRIPTION> SPACE*
  3. <FUNCNAME> ( COMMA SPACE+ <FUNCNAME>)* SPACE*
  4. <FUNCNAME> ( COMMA SPACE+ <FUNCNAME>)* SPACE* COLON SPACE+ <DESCRIPTION> SPACE*

Empty <DESCRIPTION> elements are replaced with a Unicode ZERO-WIDTH SPACE " \u200B" in the output of docscrape.py. That is sufficient to convince the subsequent processing that the definition is present for the purposes of continuing the definition list. Further processing replaces the string " \u200B" with "", so that teh zero-width space doesn't appear in the generated HTML.

Closes #28

@pvanmulbregt
Copy link
Contributor Author

Under Py3.6 nosetests passes, but the latexpdf fails with Error: Unicode char \u8:​ not set up for use with LaTeX. [On my local machine the message is slightly different: Package inputenc Error: Unicode char ​ (U+200B) (inputenc) not set up for use with LaTeX.]

Under Py 2.7, one nosetest test fails with UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 71: ordinal not in range(128), inside jinja2's Template.render(), presumably from encountering the UTF-8 for the ZERO WIDTH SPACE, '\xe2\x80\x8b'.

Implication: Inserting a Unicode character into the output, either as Unicode or UTF-8, breaks downstream processing.

@jnothman suggests using ".." instead and that seems to behave better.

@ev-br
Copy link
Contributor

ev-br commented Apr 13, 2018

Also closes gh-28

@pvanmulbregt
Copy link
Contributor Author

Summary:

  1. Using ".." instead of unicode zero-width space allows multiple functions on a line, works under both Py2 and Py3, and keeps the latexpdf builds working.
  2. If any line is missing the description field, then the rendering of the See Also block misaligns the subsequent description fields, the yellow div block is too short, and the last line in the See Also block may interfere with the next section. That appears to be a style-sheet issue, independent of this change.

Copy link
Collaborator

@larsoner larsoner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pvanmulbregt can you rebase to get rid of the merge conflict?

@larsoner
Copy link
Collaborator

larsoner commented Apr 3, 2019

This seems like a useful extension/bugfix to me, and it sounds like using .. instead of unicode fixed the build problems. @jnothman what's the needs-decision point here? I think we just need to make sure that everything works properly, and have another set of eyes on the code.

I rebuilt MNE doc with this and things still looked okay. I also rebuilt SciPy and things looked okay. And when I modified this:

    See Also
    --------
    lfiltic : Construct initial conditions for `lfilter`.
    lfilter_zi : Compute initial state (steady state of step response) for
                 `lfilter`.
    filtfilt : A forward-backward filter, to obtain a filter with linear phase.
    ...

To be this:

    See Also
    --------
    lfiltic, lfilter_zi : Construct initial conditions or steady state, respectively, for `lfilter`.
    filtfilt : A forward-backward filter, to obtain a filter with linear phase.
    ...

It looked good:

Screenshot from 2019-04-03 16-53-32

Copy link
Member

@jnothman jnothman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've not really managed to review for correctness yet.

@@ -331,13 +331,15 @@ def _strip_blank_lines(s):


def line_by_line_compare(a, b):
empty_description = '..'
rgx = re.compile(r"^\s+" + re.escape(empty_description) + "$")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rgx is a terrible name for something that matches an empty description. But I'm also not sure why it's important to disregard these.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed to empty_description_rgx.

ml = self._line_rgx.match(line)
description = None
if ml:
if 'description' in ml.groupdict():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

description = ml.group('desc') will suffice here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -236,10 +236,30 @@ def _parse_param_list(self, content):

return params

_role = r":(?P<role>\w+):"
_funcbacktick = r"`(?P<name>(?:~\w+\.)?[a-zA-Z0-9_.-]+)`"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's confusing to mix the use of \w and explicit character classes:

Suggested change
_funcbacktick = r"`(?P<name>(?:~\w+\.)?[a-zA-Z0-9_.-]+)`"
_funcbacktick = r"`(?P<name>(?:~\w+\.)?[\w.-]+)`"

Actually, I don't think the ~ or . should be obligatory.

Suggested change
_funcbacktick = r"`(?P<name>(?:~\w+\.)?[a-zA-Z0-9_.-]+)`"
_funcbacktick = r"`(?P<name>(?:~?\w+\.)?[\w.-]+|\w+)`"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The regexes appearing in these patterns were taken from the original regex.

    _name_rgx = re.compile(r"^\s*(:(?P<role>\w+):"
                           r"`(?P<name>(?:~\w+\.)?[a-zA-Z0-9_.-]+)`|"
                           r" (?P<name2>[a-zA-Z0-9_.-]+))\s*", re.X)

The patterns can be changed but I'd prefer that to be in a separate PR, since that will change the meaning.

@@ -236,10 +236,30 @@ def _parse_param_list(self, content):

return params

_role = r":(?P<role>\w+):"
_funcbacktick = r"`(?P<name>(?:~\w+\.)?[a-zA-Z0-9_.-]+)`"
_funcplain = r"(?P<name2>[a-zA-Z0-9_.-]+)"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
_funcplain = r"(?P<name2>[a-zA-Z0-9_.-]+)"
_funcplain = r"(?P<name2>[a-zA-Z0-9_.-]+)"
Suggested change
_funcplain = r"(?P<name2>[a-zA-Z0-9_.-]+)"
_funcplain = r"(?P<name2>[\w.-]+)"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment above regarding original _name_rgx.

_funcplain = r"(?P<name2>[a-zA-Z0-9_.-]+)"
_funcname = r"(" + _role + _funcbacktick + r"|" + _funcplain + r")"
_funcnamenext = _funcname.replace('role', 'rolenext').replace('name', 'namenext')
_description = r"(?P<description>\s*:(\s+(?P<desc>\S+.*))?)?\s*$"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I think we'd like to make the space before the colon obligatory, for consistency with param lists, but I don't think the current code requires it)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that the original _name_rgx did not require it.

if not text.strip():
break
name, role, m2 = parse_item_name(text)
# m2 = self._func_rgx.match(text)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please remove this commented section

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

while True:
if not text.strip():
break
name, role, m2 = parse_item_name(text)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be cleaner to have parse_item_name return the end than the match. especially if you're going to call it the inscrutable m2

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

'~baz.obj_r'):
assert (not desc), str([func, desc])
elif func in ('func_f2', 'func_g2', 'func_h2', 'func_j2'):
assert (desc), str([func, desc])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rm parentheses

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

elif func in ('func_f2', 'func_g2', 'func_h2', 'func_j2'):
assert (desc), str([func, desc])
else:
assert(desc), str([func, desc])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rm parentheses

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Member

@jnothman jnothman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks reasonable in terms of functionality and tests. Just could be neater code.

Added the spec used by NumpyDocString._parse_see_also.
Added a warning whenever an unexpected trailing comma appears in a See Also
function list.
Removed unnecessary re.X qualifiers from some regular expressions.
Renamed some variables with more descriptive names.
Removed some unused/commented-out code.
Removed some unnneeded parentheses.
Shortened some lines to keep under 80 characters.
@rgommers rgommers added this to the v0.9.0 milestone Apr 7, 2019
# <FUNCNAME> ( COMMA SPACE+ <FUNCNAME>)* SPACE* COLON SPACE+ <DESC> SPACE*

# <FUNCNAME> is:
# A legal function name, optionally enclosed in backticks.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

currently a function name enclosed in backticks requires a role.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated doc.

if not m:
raise ParseError("%s is not a item name" % text)
role = m.group('role')
name = (m.group('name') if role else m.group('name2'))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

redundant parentheses

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.

else:
out[-1] += ", %s" % link
for funcs, desc in self['See Also']:
assert isinstance(funcs, (list, tuple))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we allow a tuple?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only allow lists.

@@ -331,10 +331,16 @@ def _strip_blank_lines(s):


def line_by_line_compare(a, b):
empty_description = NumpyDocString.empty_description # '..'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if it's important that these are inserted, why do we want to disregard them in testing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the disregarding of these lines and updated the expected output to account for that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you forgotten to commit this?

Copy link
Member

@jnothman jnothman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise LGTM

@jnothman jnothman merged commit 8f1ac50 into numpy:master Apr 10, 2019
@jnothman
Copy link
Member

Thanks, @pvanmulbregt. I'd appreciate if this could also be mentioned in the docs. another pr?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants