DOC: Validate consistency of title capitalization #26941
Comments
The main things to note here are proper names, e.g. PyTables, Python and IPython (and probably some others) |
I would like to give this a try, if that's ok. |
Is this still an open issue? |
Yes, still pending, and would be great to get this fixed. Thanks! |
When you mention validating as a sphinx extension, do you mean creating a custom extension (like that done in the file 'doc/sphinxext/contributors.py')? I'm also a little confused on the type of extension to create. I've read about Sphinx roles, Directives, and Builders, but I'm not sure if there's any specific one I should choose for this situation. |
I don't know much about sphinx extensions, and I find sphinx itself very confusing. But yes, I assumed that since sphinx is already parsing all the files, it could be possible in a custom extension like the contributors.py one to validate the titles. But an independent script that parses all the files, extracts all the titles, and reports any with an unwanted capitalization is also an option. |
take |
At this point, I have been able to create a Python script that, given a .rst file, parses it, identifies the titles in the resulting doctree, and determines which titles do not follow the capitalization convention mentioned above. I've been using the doc/source/development/contributing.rst file as a test file to check that my code works. When testing, I noticed my code labeled these titles as not following the capitalization convention: Code Base: Before moving on, I was wondering if these titles are special in any way (i.e. proper names, etc.) or if they simply do not follow the capitalization convention. Also, is there any place where I could easily find the proper names? I thought of looking through the pandas API reference (https://pandas.pydata.org/pandas-docs/stable/reference/index.html) but I wasn't sure how to approach finding proper names in that document. Thanks! |
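A rough sketch of the title-extraction step described in the comment above. The comment mentions reading titles from the doctree; this sketch swaps in a simpler heuristic (a text line followed by an underline of punctuation at least as long as the text) so that it needs only the standard library. The names `UNDERLINE_RE` and `find_titles` are hypothetical and are not taken from the actual script.

```python
import re

# A line made of one repeated punctuation character, at least three long.
# The set of characters is an assumed subset of what RST allows.
UNDERLINE_RE = re.compile(r"^([=\-~^#*+])\1{2,}\s*$")


def find_titles(text):
    """Yield (line_number, title) for every line that is followed by an underline."""
    lines = text.splitlines()
    for i in range(len(lines) - 1):
        title = lines[i].strip()
        underline = lines[i + 1].rstrip()
        # Skip blank lines and lines that are themselves underlines/overlines.
        if not title or UNDERLINE_RE.match(title):
            continue
        if UNDERLINE_RE.match(underline) and len(underline) >= len(title):
            yield i + 1, title  # line numbers are 1-based
```

A heuristic like this can misfire on text that merely looks like a heading (for example inside a literal block), which is one reason a doctree-based approach may be preferable.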
Thanks @tonywu1999, that sounds great. Those titles don't have anything special, and should be changed. We don't have a list of proper names we want capitalized, so we'll have to build that list dynamically, as we validate titles. Jeff mentioned a few as examples: PyTables, Python and IPython. But I'm not even sure if those appear in titles. I think the way to move forward is to open a PR with your script, and you use it to validate a couple of files from Once your PR is merged, we can open issues to fix and validate the rest of the files in the docs. Other people can help with this, since there is a significant number of titles to change. I'd have your script accept a file, a list of files, or a directory to search recursively for files. So, these cases would all be valid:
Initially, we'll validate just a subset of files, and when we're done we'll just call the last command. For the exceptions, I think the easiest is that your script has something like:

    CAPITALIZATION_EXCEPTIONS = [
        "pandas",
        "NumFOCUS",
        "Python",
        ...
    ]

The words on the list will have to be in the exact capitalization as defined, no matter if they are the first word of the title, or a following one. The rest of the words should have the correct capitalization. Does all this make sense? |
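A minimal sketch of the rule described in the comment above, assuming whitespace-separated words and an exceptions set holding only the three names given in the thread. The function names `expected_title` and `is_correctly_capitalized` are hypothetical, not the merged script's API.

```python
import string

# Illustrative subset taken from the comment above; the real list is
# expected to grow as more titles are validated.
CAPITALIZATION_EXCEPTIONS = {"pandas", "NumFOCUS", "Python"}


def expected_title(title):
    """Return the title with the capitalization convention applied."""
    by_lower = {word.lower(): word for word in CAPITALIZATION_EXCEPTIONS}
    result = []
    for index, word in enumerate(title.split()):
        core = word.strip(string.punctuation)  # ignore surrounding punctuation
        if core.lower() in by_lower:
            fixed = by_lower[core.lower()]  # exact spelling, in any position
        elif index == 0:
            fixed = core[:1].upper() + core[1:].lower()
        else:
            fixed = core.lower()
        result.append(word.replace(core, fixed, 1) if core else word)
    return " ".join(result)


def is_correctly_capitalized(title):
    """True if the title already matches the convention."""
    return " ".join(title.split()) == expected_title(title)
```

Note that under this sketch any acronym or proper name missing from the exceptions set is lowercased, so the set has to be extended whenever a legitimate name is flagged.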
I have a couple questions regarding your comment:
|
For (1), I'd use as a reference the scripts in
For (2), you should output in the terminal (CI logs) a message as descriptive as possible, so that whoever finds it in the CI can easily understand and fix the problem. You also need to make the script return an exit code different from 0, so the process fails, and the CI fails. Again, you can use the mentioned scripts for reference.
In (3) I meant that when you've got the script, and you open the PR, you can add in
Thanks! |
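The command-line behaviour discussed in this thread (accept files or directories, print a descriptive message per problem, exit with a non-zero code on failure) could be sketched as follows. This is an assumption-laden outline rather than the actual script: `collect_rst_files`, `main`, and the `check_file` callable (expected to return `(line_number, title, expected)` tuples for one file) are hypothetical names.

```python
import argparse
import sys
from pathlib import Path


def collect_rst_files(paths):
    """Expand files and directories into a sorted list of .rst files."""
    found = set()
    for raw in paths:
        path = Path(raw)
        if path.is_dir():
            found.update(path.rglob("*.rst"))  # search directories recursively
        elif path.suffix == ".rst":
            found.add(path)
    return sorted(found)


def main(argv, check_file):
    """Report every badly capitalized heading; return 1 if any was found, else 0."""
    parser = argparse.ArgumentParser(description="Validate heading capitalization")
    parser.add_argument("paths", nargs="+", help="files or directories to check")
    args = parser.parse_args(argv)

    errors = 0
    for path in collect_rst_files(args.paths):
        for line_number, title, expected in check_file(path):
            print(
                f'{path}:{line_number}: heading "{title}" should be "{expected}"',
                file=sys.stderr,
            )
            errors += 1
    return 1 if errors else 0
```

A script would typically end with `sys.exit(main(sys.argv[1:], check_file))`, so that the non-zero return value becomes the process exit code and the CI step fails.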
Hi, I recently opened a pull request with the new script (#31114), but I encountered multiple issues. One big issue I'm having is suppressing the output of helper functions that I imported. In my script, I created a context manager to suppress output, which worked when I ran the script on my local machine, but did not work on GitHub when code_checks.sh ran. Is there any way I can suppress the output of other helper functions? |
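For reference, one common way to write such a context manager with the standard library is sketched below. It is not the code from the pull request, and `suppress_output` and `noisy_helper` are hypothetical names used only for illustration.

```python
import contextlib
import io


@contextlib.contextmanager
def suppress_output():
    """Temporarily redirect Python-level stdout and stderr to an in-memory buffer."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer), contextlib.redirect_stderr(buffer):
        yield buffer


def noisy_helper():
    # Hypothetical stand-in for an imported helper that prints while it works.
    print("parsing...")
    return 42
```

This approach only intercepts writes that go through `sys.stdout` and `sys.stderr` at call time. Output written directly to the underlying file descriptors, produced by a subprocess, or sent to a stream object that was captured before the redirect is not affected, which may or may not be related to the difference seen in CI.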
…mong headings in documentation (pandas-dev#26941) (pandas-dev#31114)
In #26933, we're making the capitalization of the section titles consistent. We used to have many titles capitalized as "This is the Section Title", and we changed all of them (a few were probably missed) to "This is the section title".

To keep this consistency, we should validate that the capitalization is correct in the CI. This can be done by extracting all the titles and making sure that only the first letter of the title is uppercase, except for words defined in a short list, like Series, DataFrame, ...

I think this can be done in two ways:
The first option should be simpler if this can be implemented as a Sphinx extension, but I'm not sure if that's the case.