Skip to content

ENH: Added support for multiple functions+description in a See Also block #172

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Apr 10, 2019
120 changes: 80 additions & 40 deletions numpydoc/docscrape.py
Original file line number Diff line number Diff line change
Expand Up @@ -236,10 +236,30 @@ def _parse_param_list(self, content):

return params

_role = r":(?P<role>\w+):"
_funcbacktick = r"`(?P<name>(?:~\w+\.)?[a-zA-Z0-9_.-]+)`"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's confusing to mix the use of \w and explicit character classes:

Suggested change
_funcbacktick = r"`(?P<name>(?:~\w+\.)?[a-zA-Z0-9_.-]+)`"
_funcbacktick = r"`(?P<name>(?:~\w+\.)?[\w.-]+)`"

Actually, I don't think the ~ or . should be obligatory.

Suggested change
_funcbacktick = r"`(?P<name>(?:~\w+\.)?[a-zA-Z0-9_.-]+)`"
_funcbacktick = r"`(?P<name>(?:~?\w+\.)?[\w.-]+|\w+)`"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The regexes appearing in these patterns were taken from the original regex.

    _name_rgx = re.compile(r"^\s*(:(?P<role>\w+):"
                           r"`(?P<name>(?:~\w+\.)?[a-zA-Z0-9_.-]+)`|"
                           r" (?P<name2>[a-zA-Z0-9_.-]+))\s*", re.X)

The patterns can be changed but I'd prefer that to be in a separate PR, since that will change the meaning.

_funcplain = r"(?P<name2>[a-zA-Z0-9_.-]+)"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
_funcplain = r"(?P<name2>[a-zA-Z0-9_.-]+)"
_funcplain = r"(?P<name2>[a-zA-Z0-9_.-]+)"
Suggested change
_funcplain = r"(?P<name2>[a-zA-Z0-9_.-]+)"
_funcplain = r"(?P<name2>[\w.-]+)"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment above regarding original _name_rgx.

_funcname = r"(" + _role + _funcbacktick + r"|" + _funcplain + r")"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we intentionally disallowing names in backticks not preceded by roles (i.e. using the default role)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was the effect of the original _name_rgx.

_funcnamenext = _funcname.replace('role', 'rolenext').replace('name', 'namenext')
_description = r"(?P<description>\s*:(\s+(?P<desc>\S+.*))?)?\s*$"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I think we'd like to make the space before the colon obligatory, for consistency with param lists, but I don't think the current code requires it)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that the original _name_rgx did not require it.

_func_rgx = re.compile(r"^\s*" + _funcname + r"\s*", re.X)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why re.X? I don't think you're taking advantage of verbose here anywhere (if you were, you might not need to define all these intermediate variables as you could put comments in the regex)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re.X was leftover from original _name_rgx. Now removed.

_line_rgx = re.compile(
r"^\s*"
+ r"(?P<allfuncs>" # group for all function names
+ _funcname
+ r"(?P<morefuncs>([,]\s+"
+ _funcnamenext + r")*)"
+ r")" # end of "allfuncs"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you were actually taking advantage of verbose mode, you could have a string here and indent....

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found the fully expanded verbose mode less understandable/maintainable than the hierarchical version, though there may be equivalent regexes that are simpler.

+ r"(\s*,)?" # Some function lists have a trailing comma
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we want to support a trailing comma? we are here to define a standard...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without it one of the existing tests failed. test_docscrape.py::test_see_also()
I added a warning if this occurs, and during the tests it now prints

UserWarning: Unexpected comma after function list at index 39 of line "func_f, func_g, :meth:`func_h`, func_j,"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we want to support a trailing comma?

Copy link
Contributor Author

@pvanmulbregt pvanmulbregt Apr 7, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a duplicate of the previous comment?

+ _description,
re.X)

_name_rgx = re.compile(r"^\s*(:(?P<role>\w+):"
r"`(?P<name>(?:~\w+\.)?[a-zA-Z0-9_.-]+)`|"
r" (?P<name2>[a-zA-Z0-9_.-]+))\s*", re.X)

empty_description = '..'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get why this is a public attribugte

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is (now) used in test_docscrape.py:line_by_line_compare.


def _parse_see_also(self, content):
"""
func_name : Descriptive text
Expand All @@ -252,48 +272,62 @@ def _parse_see_also(self, content):

def parse_item_name(text):
"""Match ':role:`name`' or 'name'"""
m = self._name_rgx.match(text)
if m:
g = m.groups()
if g[1] is None:
return g[3], None
else:
return g[2], g[1]
raise ParseError("%s is not a item name" % text)
m = self._func_rgx.match(text)
if not m:
raise ParseError("%s is not a item name" % text)
role = m.groupdict().get('role')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
role = m.groupdict().get('role')
role = m.group('role')

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

if role:
name = m.group('name')
else:
name = m.group('name2')
return name, role, m

def push_item(name, rest):
if not name:
return
name, role = parse_item_name(name)
name, role, m2 = parse_item_name(name)
items.append((name, list(rest), role))
del rest[:]

current_func = None
rest = []

for line in content:
if not line.strip():
continue

m = self._name_rgx.match(line)
if m and line[m.end():].strip().startswith(':'):
push_item(current_func, rest)
current_func, line = line[:m.end()], line[m.end():]
rest = [line.split(':', 1)[1].strip()]
if not rest[0]:
rest = []
elif not line.startswith(' '):
push_item(current_func, rest)
current_func = None
if ',' in line:
for func in line.split(','):
if func.strip():
push_item(func, [])
elif line.strip():
current_func = line
elif current_func is not None:
ml = self._line_rgx.match(line)
description = None
if ml:
if 'description' in ml.groupdict():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

description = ml.group('desc') will suffice here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

description = ml.groupdict().get('desc')
if not description and line.startswith(' '):
rest.append(line.strip())
push_item(current_func, rest)
elif ml:
funcs = []
text = ml.group('allfuncs')
while True:
if not text.strip():
break
name, role, m2 = parse_item_name(text)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be cleaner to have parse_item_name return the end than the match. especially if you're going to call it the inscrutable m2

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

# m2 = self._func_rgx.match(text)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please remove this commented section

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

# if not m2:
# raise ParseError("%s is not a item name" % line)
# role = m2.groupdict().get('role')
# if role:
# name = m2.group('name')
# else:
# name = m2.group('name2')
funcs.append((name, role))
text = text[m2.end():].strip()
if text and text[0] == ',':
text = text[1:].strip()
if description:
rest = [description]
else:
rest = []
items.append((funcs, rest))
else:
raise ParseError("%s is not a item name" % line)
return items

def _parse_index(self, section, content):
Expand Down Expand Up @@ -440,24 +474,30 @@ def _str_see_also(self, func_role):
return []
out = []
out += self._str_header("See Also")
out += ['']
last_had_desc = True
for func, desc, role in self['See Also']:
if role:
link = ':%s:`%s`' % (role, func)
elif func_role:
link = ':%s:`%s`' % (func_role, func)
else:
link = "`%s`_" % func
if desc or last_had_desc:
out += ['']
out += [link]
else:
out[-1] += ", %s" % link
for funcs, desc in self['See Also']:
assert isinstance(funcs, (list, tuple))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we allow a tuple?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only allow lists.

links = []
for func, role in funcs:
if role:
link = ':%s:`%s`' % (role, func)
elif func_role:
link = ':%s:`%s`' % (func_role, func)
else:
link = "`%s`_" % func
links.append(link)
link = ', '.join(links)
out += [link]
if desc:
out += self._str_indent([' '.join(desc)])
last_had_desc = True
else:
last_had_desc = False
out += self._str_indent([self.empty_description])

if last_had_desc:
out += ['']
out += ['']
return out

Expand Down
66 changes: 39 additions & 27 deletions numpydoc/tests/test_docscrape.py
Original file line number Diff line number Diff line change
Expand Up @@ -331,13 +331,15 @@ def _strip_blank_lines(s):


def line_by_line_compare(a, b):
empty_description = '..'
rgx = re.compile(r"^\s+" + re.escape(empty_description) + "$")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rgx is a terrible name for something that matches an empty description. But I'm also not sure why it's important to disregard these.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed to empty_description_rgx.


a = textwrap.dedent(a)
b = textwrap.dedent(b)
a = [l.rstrip() for l in _strip_blank_lines(a).split('\n')]
b = [l.rstrip() for l in _strip_blank_lines(b).split('\n')]
a = [rgx.sub('', l.rstrip()) for l in _strip_blank_lines(a).split('\n')]
b = [rgx.sub('', l.rstrip()) for l in _strip_blank_lines(b).split('\n')]
assert all(x == y for x, y in zip(a, b))


def test_str():
# doc_txt has the order of Notes and See Also sections flipped.
# This should be handled automatically, and so, one thing this test does
Expand Down Expand Up @@ -709,36 +711,46 @@ def test_see_also():
multiple lines
func_f, func_g, :meth:`func_h`, func_j,
func_k
func_f1, func_g1, :meth:`func_h1`, func_j1
func_f2, func_g2, :meth:`func_h2`, func_j2 : description of multiple
:obj:`baz.obj_q`
:obj:`~baz.obj_r`
:class:`class_j`: fubar
foobar
""")

assert len(doc6['See Also']) == 13
for func, desc, role in doc6['See Also']:
if func in ('func_a', 'func_b', 'func_c', 'func_f',
'func_g', 'func_h', 'func_j', 'func_k', 'baz.obj_q',
'~baz.obj_r'):
assert(not desc)
else:
assert(desc)

if func == 'func_h':
assert role == 'meth'
elif func == 'baz.obj_q' or func == '~baz.obj_r':
assert role == 'obj'
elif func == 'class_j':
assert role == 'class'
else:
assert role is None

if func == 'func_d':
assert desc == ['some equivalent func']
elif func == 'foo.func_e':
assert desc == ['some other func over', 'multiple lines']
elif func == 'class_j':
assert desc == ['fubar', 'foobar']
assert len(doc6['See Also']) == 10
for funcs, desc in doc6['See Also']:
for func, role in funcs:
if func in ('func_a', 'func_b', 'func_c', 'func_f',
'func_g', 'func_h', 'func_j', 'func_k', 'baz.obj_q',
'func_f1', 'func_g1', 'func_h1', 'func_j1',
'~baz.obj_r'):
assert (not desc), str([func, desc])
elif func in ('func_f2', 'func_g2', 'func_h2', 'func_j2'):
assert (desc), str([func, desc])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rm parentheses

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

else:
assert(desc), str([func, desc])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rm parentheses

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


if func == 'func_h':
assert role == 'meth'
elif func == 'baz.obj_q' or func == '~baz.obj_r':
assert role == 'obj'
elif func == 'class_j':
assert role == 'class'
elif func in ['func_h1', 'func_h2']:
assert role == 'meth'
else:
assert role is None, str([func, role])

if func == 'func_d':
assert desc == ['some equivalent func']
elif func == 'foo.func_e':
assert desc == ['some other func over', 'multiple lines']
elif func == 'class_j':
assert desc == ['fubar', 'foobar']
elif func in ['func_f2', 'func_g2', 'func_h2', 'func_j2']:
assert desc == ['description of multiple'], str([desc, ['description of multiple']])


def test_see_also_parse_error():
Expand Down