Skip to content

Incorrect expectations in tree-construction webkit02 tests #78

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
iabudiab opened this issue May 15, 2016 · 25 comments
Closed

Incorrect expectations in tree-construction webkit02 tests #78

iabudiab opened this issue May 15, 2016 · 25 comments

Comments

@iabudiab
Copy link
Contributor

Correct if I'm wrong. I could prepare a PR, but wanted to verify this first.

I believe the expectations for the following tests (15, 16, 17 in webkit02) to be incorrect:
<b><em><foo><foo><foo><aside></b></em>
<b><em><foo><foo><foo><foo><foo><foo><foo><foo><foo><foo><aside></b></em>
<b><em><foo><foob><foob><foob><foob><fooc><fooc><fooc><fooc><food><aside></b></em>

This <b><em><foo><foo><foo><aside></b></em> expects the following tree:

| <html>
|   <head>
|   <body>
|     <b>
|       <em>
|         <foo>
|           <foo>
|             <foo>
|     <aside>
|       <b>

but should expect:

| <html>
|   <head>
|   <body>
|     <b>
|       <em>
|         <foo>
|           <foo>
|             <foo>
|     <aside>
|       <b>
|         <em>

The same is true for the other two tests, i.e. the <b> element, child of <aside>, should have an <em> after running the adoption agency algorithm.

@gsnedders
Copy link
Member

gsnedders commented May 16, 2016

Oh. Those tests. I remember them taking me quite a while to convince myself of their correctness. It looks like they're checking the interaction between the Noah's Ark clause (does <aside> introduce a marker?) and the AAA. I'm soon boarding a far-too-many-hours flight, hopefully I'll have convinced myself of their correctness or not by the end. :)

@gsnedders
Copy link
Member

The em is removed because of "If inner loop counter is greater than three and node is in the list of active formatting elements, then remove node from the list of active formatting elements" combined with "If node is not in the list of active formatting elements, then remove node from the stack of open elements and then go back to the step labeled inner loop", no? (When the inner loop counter is 1, the bottommost foo is removed from the stack; when 2 the middle foo; when 3 the top foo.)

@gsnedders
Copy link
Member

We have interop, and I'm pretty sure that's the key to working out what's going on.

@iabudiab
Copy link
Contributor Author

I've doen some calculations myself.

Here is is the state when first entering the AAA with </b> token:

Stack of Open Elements Active Formatting Elements
html, body, b, em, foo, foo, foo, aside b, em
  • In the outer loop we get the following:
Formatting Element Furthest Block Common Ancestor Bookmark
b aside body 0
  • Now the inner loop.

Blink and WebKit limit the inner loop counter to 3. This is also true for the html5lib-python implementation. However, the current spec doesn't state this explicitly, at least not explicitly enough as it is the case for the outer loop Outer loop: If outer loop counter is greater than or equal to eight, then abort these steps.

So in this case the <em> wouldn't be removed via the following step If inner loop counter is greater than three and node is in the list of active formatting elements, then remove node from the list of active formatting elements

So the state directly after the inner loop is as follows:

Stack of Open Elements Active Formatting Elements
html, body, b, em, aside b, em

and the DOM tree is untouched.

  • Now the furthest block <aside> gets inserted into the common ancestor <body>
  • A new element for the token, i.e. <b>, is appended to <aside>
  • <b> is removed from the active formatting elements and reinserted at bookmark, which is 0, so this stays as it is
  • <b> is removed from the stack of open elements and is reinserted after the <aside>
  • The second iteration of the outer loops breaks because there is no furthest block and we end up with:
Stack of Open Elements Active Formatting Elements
html, body, em, aside em

with the following tree:

| <html>
|   <head>
|   <body>
|     <b>
|       <em>
|         <foo>
|           <foo>
|             <foo>
|     <aside>
|       <b>
  • The outer loop of the AAA for the </em> token is almost the same:
Formatting Element Furthest Block Common Ancestor Bookmark
em aside body 0
  • The inner loop however is left immediately, since the element directly above the <aside> in the stack of pen elements is the formatting element.
  • A new element for the token <em> is created and it gets all child nodes from <aside>, as described: Take all of the child nodes of furthest block and append them to the element created in the last step.
  • The <em> is then appended to <aside>
  • The second iteration of the outer loops breaks because there is no furthest block and we end up with:
Stack of Open Elements Active Formatting Elements
html, body, aside

with the following tree:

| <html>
|   <head>
|   <body>
|     <b>
|       <em>
|         <foo>
|           <foo>
|             <foo>
|     <aside>
|       <em>
|         <b>

Notice the ordering of <em> and <b> under <aside>. My initial assumption was wrong, however the test expectation is still incorrect.

I have also tested the HTML snippet in Chrome, Safari & Firefox, all three produce this tree, i.e. aside, em, b.

As I see it, the spec is ambiguous when it comes to the inner loop handling.

@iabudiab
Copy link
Contributor Author

And for the sake of completeness:
WebKit Adoption Agency
html5lib-python Adoption Agency

@gsnedders
Copy link
Member

Given this is getting into clarity of the spec: cc/ @domenic @annevk

@annevk
Copy link
Contributor

annevk commented May 17, 2016

@RReverser or @zcorpan are probably much more able to address this than I am.

@RReverser
Copy link
Contributor

@zcorpan or maybe @inikulin are more likely to be able to answer this.

@inikulin
Copy link
Member

Just re-checked @iabudiab's steps per spec and in parse5 - test's expected value is wrong.

@nox
Copy link
Member

nox commented May 17, 2016

I don't see what in @iabudiab's explanation disproves what @gsnedders stated in #78 (comment).

To me by the time the algorithm reaches <em>, the inner loop counter already reached 3 and thus <em> gets removed.

FWIW, html5ever passes this test.

@zcorpan
Copy link
Contributor

zcorpan commented May 17, 2016

In Gecko I get this tree for http://software.hixie.ch/utilities/js/live-dom-viewer/saved/4199

| <html>
|   <head>
|   <body>
|     <b>
|       <em>
|         <foo>
|           <foo>
|             <foo>
|     <aside>
|       <b>

i.e. no em in the aside. But WebKit/Chromium/IE11 have aside, em, b.

cc @hsivonen

@inikulin
Copy link
Member

I feel like there was some change to the spec which caused fragmentation in results

@gsnedders
Copy link
Member

gsnedders commented May 17, 2016

From #78 (comment) (@iabudiab)

Blink and WebKit limit the inner loop counter to 3. This is also true for the html5lib-python implementation. However, the current spec doesn't state this explicitly, at least not explicitly enough as it is the case for the outer loop Outer loop: If outer loop counter is greater than or equal to eight, then abort these steps.

So in this case the <em> wouldn't be removed via the following step If inner loop counter is greater than three and node is in the list of active formatting elements, then remove node from the list of active formatting elements

You're saying that it doesn't get reached in Blink and WebKit because they break out of the loop when the inner loop counter is three, and hence that never gets hit? Given the spec doesn't have that breakout from the loop, it should therefore get removed, no?

@gsnedders
Copy link
Member

@inikulin whatwg/html@22ce3c31 is why WebKit/Blink/IE11/html5lib-python has that behaviour

@gsnedders
Copy link
Member

gsnedders commented May 17, 2016

As far as I can tell, @iabudiab's proposed behaviour is what would be expected by the spec prior to whatwg/html@22ce3c31. (And hence why it matches WebKit/Blink/IE11/html5lib-python.)

@inikulin
Copy link
Member

@gsnedders same in parse5

@domenic
Copy link

domenic commented May 17, 2016

Edge matches IE11, FWIW.

So is there agreement that that was a good spec change, and we should be filing bugs on WebKit/Blink/Edge/html5lib-python/parse5?

Or do we want to align with the majority, revert the spec change, and file bugs on Gecko/html5ever?

@inikulin
Copy link
Member

It would be nice to know motivation behind whatwg/html@22ce3c3

@gsnedders
Copy link
Member

@domenic html5lib-python just has a generic "update to match the spec" issue (it hasn't been properly updated in a long time) :)

@inikulin https://lists.w3.org/Archives/Public/public-whatwg-archive/2013Jul/0401.html

@gsnedders
Copy link
Member

@inikulin https://lists.w3.org/Archives/Public/public-whatwg-archive/2013Jul/0006.html which that is a response to gives more of it

@gsnedders
Copy link
Member

Notably, <b><i><a><s><tt><div></b>abc</b></div></tt></s></a>xyz</i> gives a non-sensical parse-tree (where xyz precedes abc) in the old spec/WebKit/Blink/Edge.

@inikulin
Copy link
Member

Notably,<b><i><a><s><tt><div></b>abc</b></div></tt></s></a>xyz</i> gives a non-sensical parse-tree (where xyz precedes abc) in the old spec/WebKit/Blink/Edge.

Well, I don't see a problem. With foster parenting you will have similar results: <table><tr><td>123</td></tr><div>456</div></table>. And IMHO such cases are more common than the one with AAA.

@zcorpan
Copy link
Contributor

zcorpan commented May 18, 2016

Reordering in tables is something that all browsers did pre-HTML5. Reordering for misnested formatting elements is not.

Since it seems Gecko has successfully shipped the spec change, we should go ahead and implement it in the other browsers, too (unless we have data suggesting the new spec is less Web-compatible). The change in Gecko was https://bugzilla.mozilla.org/show_bug.cgi?id=901319

@travisleithead any update on this issue for Edge? re https://lists.w3.org/Archives/Public/public-html/2013Aug/0002.html

@gsnedders
Copy link
Member

So we have consensus that the tests are right v. the spec, and impls should change? Closing as a result. (If that understanding is wrong, feel free to reopen it. Or ping me if you can't.)

@travisleithead
Copy link

Mmm. The 2013 bug is likely gone with our old bug database. I opened an issue on our public tracker--you can monitor progress there: https://developer.microsoft.com/en-us/microsoft-edge/platform/issues/9854392/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

9 participants