Search: recursively parse sections #7207
Merged
This is on top of #7204
If we have a structure where a `parent` node contains nested headers, and we start indexing from `parent`, we index all of its children in the first step and then index each header again later. This duplicates content.
This is solved by checking for a nested section up to one level deep.
In this example, the first parsing pass stops when it finds the first h1,
so content isn't duplicated. The following nodes are then indexed as usual.
Also, we can increase the depth of this check when parsing all sections. That way we no longer rely on the div that Sphinx uses to enclose a section, and we avoid indexing duplicated content when other themes don't follow the same structure.
A real example of this is https://github.com/readthedocs/readthedocs.org/blob/a0d645c9b561c0189ba0956a1554f577c413ecdf/readthedocs/search/tests/data/mkdocs/in/gitbook/index.html (from #7208)
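To illustrate the duplication and the depth check described above, here is a minimal sketch. It is not the code from this PR: it uses the standard library's `xml.etree.ElementTree` on a made-up, well-formed snippet, and the names `contains_heading`, `section_text`, `index_sections`, and the sample HTML are all hypothetical.

```python
import xml.etree.ElementTree as ET

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

# Hypothetical markup: a nested <div> holds its own heading.
SAMPLE = """
<div id="parent">
  <h1>Intro</h1>
  <p>Intro text.</p>
  <div>
    <h1>Nested</h1>
    <p>Nested text.</p>
  </div>
</div>
"""


def text_of(node):
    """Return the node's text with whitespace collapsed."""
    return " ".join("".join(node.itertext()).split())


def contains_heading(node, depth):
    """True if node is a heading or has one at most `depth` levels below it."""
    if node.tag in HEADINGS:
        return True
    if depth <= 0:
        return False
    return any(contains_heading(child, depth - 1) for child in node)


def section_text(siblings, depth):
    """Collect text from the nodes after a heading until a new section starts."""
    parts = []
    for node in siblings:
        if contains_heading(node, depth):
            break  # another section begins here; it gets indexed on its own
        parts.append(text_of(node))
    return " ".join(part for part in parts if part)


def index_sections(root, depth=1):
    """Return (title, content) pairs for every heading under root."""
    sections = []
    for parent in root.iter():
        children = list(parent)
        for i, child in enumerate(children):
            if child.tag in HEADINGS:
                content = section_text(children[i + 1:], depth)
                sections.append((text_of(child), content))
    return sections


root = ET.fromstring(SAMPLE)
without_check = index_sections(root, depth=0)  # only direct siblings are checked
with_check = index_sections(root, depth=1)     # also look one level down
```

In this sketch, `depth=0` lets the nested `div` be swallowed into the "Intro" section and then indexed again under "Nested", while `depth=1` stops the first section at the nested heading. Raising `depth` further would catch headings wrapped in more levels of markup, under the same assumptions.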