If we have a structure like
- parent
  - content
  - content
  - h1
    - content
    - content
    - h2
      - content
And we start indexing from `parent`,
the first step indexes all of its children,
and each header is then indexed again later. This duplicates content.
This is solved by checking for a section only up to one level deep.
In this example, the first pass stops when it finds the first h1,
so no content is duplicated. The following nodes are then indexed as usual.
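
A minimal sketch of the idea, not the actual implementation in this commit. The `Node` class, the `HEADERS` set, and the function names are hypothetical and exist only for illustration. It contrasts a naive pass that collects every descendant with a pass that stops at the first nested header.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical set of tags treated as section headers.
HEADERS = {"h1", "h2", "h3", "h4", "h5", "h6"}


@dataclass
class Node:
    """Hypothetical tree node: a tag, optional text, and child nodes."""
    tag: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)


def all_text(node: Node) -> List[str]:
    """Naive pass: collect text from every descendant (duplicates content)."""
    parts = []
    for child in node.children:
        if child.text:
            parts.append(child.text)
        parts.extend(all_text(child))
    return parts


def own_content(node: Node) -> List[str]:
    """Collect text of direct children, stopping at the first nested header."""
    parts = []
    for child in node.children:
        if child.tag in HEADERS:
            break  # the nested section is indexed separately later
        parts.append(child.text)
    return parts


def index_sections(root: Node) -> List[Tuple[str, List[str]]]:
    """Index the root, then each nested header, in document order."""
    sections = []
    stack = [root]
    while stack:
        node = stack.pop()
        sections.append((node.tag, own_content(node)))
        nested = [c for c in node.children if c.tag in HEADERS]
        stack.extend(reversed(nested))
    return sections


# The structure from the example above.
tree = Node("parent", children=[
    Node("content", "a"),
    Node("content", "b"),
    Node("h1", children=[
        Node("content", "c"),
        Node("content", "d"),
        Node("h2", children=[Node("content", "e")]),
    ]),
])

naive = all_text(tree)
sections = index_sections(tree)
```

With the naive pass, `parent` also picks up the text that belongs to `h1` and `h2`. With the one-level check, each piece of content ends up in exactly one section.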