DOC: Implement script to validate list indentation in docs #21520
Comments
Hi @datapythonista, I'm new to the open source community and thought this could be a good first place to contribute. Would this just involve writing a shell command that greps the .rst files for any line that looks like a list item and is indented by something other than zero or 4 spaces?
@anich003 that sounds great. The main challenge here is to correctly detect all lists with wrong indentation while avoiding false positives. I see 2 possible, processing the raw [...] Let me know if you need help with it.
@datapythonista it seems the .rst files mostly have "well-formed" sublists (no spaces or 4 spaces before each *), but some files also have 3-space lines (dsintro.rst, comparison_with_stata.rst) and 5-space lines (io.rst, install.rst). The Stata lines are, for example, comment lines within Stata code, while the io.rst examples might be sublists, but it's not clear. It seems we'd want to ignore the asterisks in code blocks and detect the lines in io.rst. I'm not sure how I'd accomplish this with a series of piped greps (my original plan), so I'll have to look more into the sphinx functions. Thoughts?
Thanks for looking at it @anich003. There shouldn't be many lists with wrong indentation, as I reviewed them manually recently, so it's normal that you found just a few. I don't think detecting them automatically is a simple problem, so it makes sense that piped greps are not good enough. I think we'll have to parse the files ourselves (probably with a Python script), or reuse the sphinx parsing and detect when lists are not being rendered correctly (for indented lists, sphinx creates a block quote around the list).
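As a rough illustration of the "parse the files ourselves" option, here is a minimal sketch using only the standard library. The function name `find_misindented_bullets`, the `allowed_indents` parameter, and the list of directives treated as code are assumptions made for this example, not anything from the pandas code base. The default of 0 or 4 spaces is simply the convention stated in the issue description, and the literal-block handling is a heuristic rather than a real reStructuredText parser.

```python
import re

# A bullet line: optional spaces, a bullet character, at least one space, then text.
BULLET = re.compile(r"^ *[*+-] +\S")
# Directives whose body is code (an assumed, non-exhaustive list).
CODE_DIRECTIVE = re.compile(r"^\.\. (code-block|code|sourcecode|ipython)::")


def find_misindented_bullets(text, allowed_indents=(0, 4)):
    """Return (line_number, indent) pairs for bullets with unexpected indentation.

    Lines inside literal blocks (after a paragraph ending in ``::`` or after
    one of the code directives above) are skipped, so that things like Stata
    comment lines starting with ``*`` are not reported.
    """
    problems = []
    block_indent = None  # indent of the line that opened the current literal block
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped:
            continue  # blank lines never close a literal block
        indent = len(line) - len(line.lstrip(" "))
        if block_indent is not None:
            if indent > block_indent:
                continue  # still inside the literal block
            block_indent = None  # a dedent closes the block
        if BULLET.match(line) and indent not in allowed_indents:
            problems.append((number, indent))
        opens_literal = CODE_DIRECTIVE.match(stripped) or (
            stripped.endswith("::") and not stripped.startswith(".. ")
        )
        if opens_literal:
            block_indent = indent
    return problems
```

Passing a different tuple as `allowed_indents` changes what counts as valid, so the same scan can be reused if the project settles on another indentation convention.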
I think sublists require 2 instead of 4 spaces to indent:
From A ReStructuredText Primer
<ul>
<li><p class="first">if the dtype is unsupported (e.g. <code class="docutils literal notranslate"><span class="pre">np.complex</span></code>) then the <code class="docutils literal notranslate"><span class="pre">default_handler</span></code>, if provided, will be called
for each value, otherwise an exception is raised.</p>
</li>
<li><p class="first">if an object is unsupported it will attempt the following:</p>
<blockquote>
<div><ul class="simple">
<li>check if the object has defined a <code class="docutils literal notranslate"><span class="pre">toDict</span></code> method and call it.
A <code class="docutils literal notranslate"><span class="pre">toDict</span></code> method should return a <code class="docutils literal notranslate"><span class="pre">dict</span></code> which will then be JSON serialized.</li>
<li>invoke the <code class="docutils literal notranslate"><span class="pre">default_handler</span></code> if one was provided.</li>
<li>convert the object to a <code class="docutils literal notranslate"><span class="pre">dict</span></code> by traversing its contents. However this will often fail
with an <code class="docutils literal notranslate"><span class="pre">OverflowError</span></code> or give unexpected results.</li>
</ul>
</div></blockquote>
</li>
</ul> vs <ul class="simple">
<li>if the dtype is unsupported (e.g. <code class="docutils literal notranslate"><span class="pre">np.complex</span></code>) then the <code class="docutils literal notranslate"><span class="pre">default_handler</span></code>, if provided, will be called
for each value, otherwise an exception is raised.</li>
<li>if an object is unsupported it will attempt the following:<ul>
<li>check if the object has defined a <code class="docutils literal notranslate"><span class="pre">toDict</span></code> method and call it.
A <code class="docutils literal notranslate"><span class="pre">toDict</span></code> method should return a <code class="docutils literal notranslate"><span class="pre">dict</span></code> which will then be JSON serialized.</li>
<li>invoke the <code class="docutils literal notranslate"><span class="pre">default_handler</span></code> if one was provided.</li>
<li>convert the object to a <code class="docutils literal notranslate"><span class="pre">dict</span></code> by traversing its contents. However this will often fail
with an <code class="docutils literal notranslate"><span class="pre">OverflowError</span></code> or give unexpected results.</li>
</ul>
</li>
</ul>
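The difference between the two renderings above (a `<blockquote>` wrapped around the nested `<ul>` in the first, none in the second) is something a script could look for in the built HTML. The following is only a sketch of that idea using the standard-library `html.parser`; the class and function names are made up for this example, and it assumes the wrapper always has the `<blockquote>`, optional `<div>`, then list shape shown above.

```python
from html.parser import HTMLParser


class BlockquoteListFinder(HTMLParser):
    """Count <ul>/<ol> elements that are the first content of a <blockquote>."""

    def __init__(self):
        super().__init__()
        self.wrapped_lists = 0
        # True right after <blockquote>, until some real content is seen.
        self._pending = False

    def handle_starttag(self, tag, attrs):
        if tag == "blockquote":
            self._pending = True
        elif tag in ("ul", "ol"):
            if self._pending:
                self.wrapped_lists += 1
            self._pending = False
        elif tag != "div":
            # A bare <div> may sit between <blockquote> and the list (as in the
            # first snippet); any other tag means the quote starts with
            # something else.
            self._pending = False

    def handle_data(self, data):
        if data.strip():
            self._pending = False  # text before the list: a genuine quote


def count_blockquoted_lists(html):
    """Return how many lists in ``html`` are wrapped directly in a block quote."""
    finder = BlockquoteListFinder()
    finder.feed(html)
    return finder.wrapped_lists
```

A non-zero count for a built page would point at a list that was probably over-indented in the source, although a block quote that legitimately begins with a list would be reported too.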
In #21518 it was identified that several lists in the documentation don't follow the reStructuredText standard (no indentation for top-level lists, 4-space indentation for sublists).
In that issue, (hopefully) all the formatting has been fixed manually. But it'd be useful to have a script that validates all the documentation pages and makes sure no wrong formatting exists. Adding this script to `lint.sh` would also prevent lists with the wrong formatting from being added in the future.
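For the script to be usable from a lint step, it mainly needs to walk the documentation sources, print one line per problem, and return a non-zero status when anything is found. A minimal sketch of such a driver follows; `lint_rst_tree` and `naive_bullet_check` are hypothetical names, and the check shown is a deliberately simplistic stand-in (it does not skip code blocks) that a real validator would replace.

```python
from pathlib import Path


def lint_rst_tree(root, check):
    """Run ``check(text)`` on every .rst file under ``root``.

    ``check`` must return an iterable of (line_number, message) pairs.
    Returns an exit status: 0 if nothing was reported, 1 otherwise.
    """
    status = 0
    for path in sorted(Path(root).rglob("*.rst")):
        text = path.read_text(encoding="utf-8")
        for line_number, message in check(text):
            print(f"{path}:{line_number}: {message}")
            status = 1
    return status


def naive_bullet_check(text):
    """Stand-in check: report ``*`` bullets not indented by 0 or 4 spaces."""
    for line_number, line in enumerate(text.splitlines(), start=1):
        indent = len(line) - len(line.lstrip(" "))
        if line.lstrip(" ").startswith("* ") and indent not in (0, 4):
            yield line_number, f"bullet indented by {indent} spaces"
```

A wrapper script could then end with something like `sys.exit(lint_rst_tree(docs_dir, naive_bullet_check))`, where `docs_dir` is wherever the .rst sources live, so that the lint step fails whenever a problem is reported.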