JSON-LD in HTML #68
Generally this seems great, but I raised some concerns/questions/tweaks inline. Thanks for tackling this, @gkellogg!
Rebasing, so inline comments may be lost.
…e about prospect of dynamically changing base. This reflects discussion from #23 (comment) and resulting TAG advice w3ctag/design-reviews#312 (comment).
We should wait on some of this until we've further discussed the potential opportunity of the TAG's review alongside the risk of prematurely closing #23 without revisiting it in light of the TAG's review and suggested explorations.
…e restrictive than the base HTML definitions. This includes a sketch processing model for removing comments and unescaping the resulting content.
…n extracting examples.
👍 to merging this work. Hopefully with this merged, we can encourage implementation--which should help drive more discussion of actual usage scenarios beyond just the Schema.org one(s).
Actually...before we merge this work, are we certain that the 4.12.1.3. Restrictions for contents of script elements apply to data blocks? That "restrictions" section only seems to pertain to writing Thoughts?
The restriction is on the contents of a script element, and data blocks are a subset of all script elements. I don't follow how you infer that the section doesn't apply to data blocks. It would seem self-evident that data blocks need provisions for making sure their content can be properly escaped. In any case, these rules are necessary for properly processing JSON-LD scripts.
@gkellogg apologies...you're right of course... I'd forgotten I'd tested all that earlier. Toss this into http://jsbin.com/ or similar to see how it breaks:
<script type="application/ld+json" id="stuff">
{
  "@context": {"@vocab": "http://example.com/"},
  "widget": "var example = 'Consider this string: <!-- <script>';console.log(example);"
}
</script>
<script>
<!-- whatever's below here won't run because what's above here is busted... T_T -->
console.log(stuff.textContent);
</script>
Escaping things as you've defined does fix it....sad...but true. 😩
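For readers who want to check a data block before embedding it, here is a minimal sketch that reports which of the sequences singled out by the HTML restrictions on script contents (`<!--`, `<script`, `</script`) occur in a piece of text. The helper name `findRiskySequences` is hypothetical and is not part of the PR or of any JSON-LD API.

```javascript
// Sequences that the HTML restrictions on script contents single out.
// They are compared after lower-casing, so "<SCRIPT" also matches.
const RISKY_SEQUENCES = ["<!--", "<script", "</script"];

// Hypothetical helper: list the risky sequences found in a data block's text.
function findRiskySequences(text) {
  const lower = text.toLowerCase();
  return RISKY_SEQUENCES.filter((seq) => lower.includes(seq));
}

// The "widget" value from the example above.
const widget =
  "var example = 'Consider this string: <!-- <script>';console.log(example);";

console.log(findRiskySequences(widget)); // [ '<!--', '<script' ]
```

The example string contains both a comment-open and a script-open sequence, which is the combination that keeps the later script from running.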
...hence all the red highlighting added by the Markdown parser... 🤕
They're actually needed to not break the rest of the HTML and/or DOM tree afaict. So, it's an HTML parsing requirement that's "infecting" the JSON(-LD) (or JS or YAML or...) when embedded...
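One way to keep the HTML parser from "infecting" the data is to escape at the JSON level when serializing. The sketch below is not the escaping defined in this PR (the thread does not reproduce it); it is a commonly used alternative that assumes the producer controls serialization. The function name `serializeForScript` is hypothetical.

```javascript
// Hypothetical helper: serialize a JSON-LD value for use inside <script>.
function serializeForScript(value) {
  // "<" is not a JSON syntax character, so in JSON.stringify output it can
  // only occur inside string literals, where \u003c is an equivalent escape.
  return JSON.stringify(value, null, 2).replace(/</g, "\\u003c");
}

const doc = {
  "@context": { "@vocab": "http://example.com/" },
  widget: "var example = 'Consider this string: <!-- <script>';",
};

const text = serializeForScript(doc);
console.log(text.includes("<")); // false
console.log(JSON.parse(text).widget === doc.widget); // true
```

Because the result is still plain JSON, a consumer that reads `textContent` and calls `JSON.parse` needs no extra unescaping step. The cost is the one raised in the meeting below: the embedded source is harder for a human to read.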
This issue was discussed in a meeting.
What is ‘base’ for embedded json-ld?
Benjamin Young: we discussed that one at TPAC
Gregg Kellogg: there are 2 open PRs
Adam Soroka: quick question, what are we expected to do with their comments?
Ivan Herman: what they propose is interesting but beyond our charter
Ivan Herman: regarding the PR-93, there is some stuff about having XML
Benjamin Young: the thing I just linked shows how script tags affect html parsing
Gregg Kellogg: what I did in the PR-68 I call out specifics on how to handle those blocks if the media type is application/json
Benjamin Young: the HTML comments stuff has really bothered me since I’ve read it
Ivan Herman: for the comment storing, the whole section is a normative thing
Ivan Herman: we should officially answer to the TAG and will officially add to the standard what they said about base
Gregg Kellogg: comments in html and escaping.. it depends on the encoding
Ivan Herman: it has to be valid json-ld
Gregg Kellogg: that’s something you see quite often
Gregg Kellogg: comments are often used just to make sure there are no other issues embedded in the script elements that would cause any issues
Benjamin Young: I did quite some digging on that issue
Pierre-Antoine Champin: one crazy idea by looking at the json-ld embedded in html comments: you could add a js comment in front of the html comment, making it valid javascript
Benjamin Young: sadly it wouldn’t
Gregg Kellogg: the json-ld would not be allowed to contain anything that could be interpreted as html and/or html comments
Harold Solbrig: why is this a json-ld issue but not a javascript issue?
Gregg Kellogg: [explains why it isn’t]
Gregg Kellogg: I did some test cases for this, exploring corner cases we know of
Gregg Kellogg: it describes script tags and data blocks are a subset
Benjamin Young: what’s breaking it is the potential to close the script tag too early
Ivan Herman: is it so horrible to say, if I put json-ld in a script tag I’m supposed to escape anything that html would need to have escaped
Gregg Kellogg: for someone who’s actually looking at the source, those entities become rather annoying
Ivan Herman: realistically, I don’t know how often this would happen
Benjamin Young: the escaping issue is very similar to putting json-ld inside a text env.
Ivan Herman: I think it’s perfectly reasonable to accept both PRs, close the issue
Gregg Kellogg: it’s an editor’s draft not a working draft
Ivan Herman: we would open an issue right away
Benjamin Young: I would only +1 this, if we add a big red AT RISK disclaimer
Ivan Herman: a lot of very important things are pending for now
Adam Soroka: I don’t think we should use a phrase like “AT RISK” but more something along the lines of “will be part of the final spec but might undergo some changes”
Ivan Herman: we cannot commit ourselves to having always consistent editor’s drafts
Benjamin Young: I’m not sure we have reached consensus on all the things contained
Gregg Kellogg: I cannot work on other open issues
Pierre-Antoine Champin: what about a parameter on the media type hinting at having to do unescaping? (like
Ivan Herman: what does “that” mean?
Benjamin Young: I don’t want to have stuff merged without reaching consensus
Ivan Herman: putting things that are already done “at risk” would be going backwards
Adam Soroka: I have to generally agree with ivan
Adam Soroka: it seems for me very unlikely that we would stop talking about it
Benjamin Young: I’m fine with merging those
Update the JSON-LD in HTML section to be normative, and describe dataset extraction, how to deal with multiple script elements, and script element targeting using fragments.
I took some license in describing how to target a specific script element using fragments, and in the treatment of embedded escape sequences and comments.
Fixes #57.
This should not be merged before appropriate changes are made to json-ld-api.
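To illustrate what targeting a script element by fragment could look like, here is a rough sketch. It is one possible reading of the description above, not the algorithm in the PR or in json-ld-api. The function names are hypothetical, script elements are modelled as plain objects with `id`, `type`, and `textContent`, and returning every JSON-LD script when there is no fragment is an assumption.

```javascript
const JSON_LD_TYPE = "application/ld+json";

// True when the element's type, ignoring any parameters, is JSON-LD.
function isJsonLdScript(el) {
  const essence = (el.type || "").split(";")[0].trim().toLowerCase();
  return essence === JSON_LD_TYPE;
}

// Hypothetical helper: choose which script elements to process for a URL.
// With a fragment: the first JSON-LD script whose id equals the fragment.
// Without one (assumption): every JSON-LD script, in document order.
function selectJsonLdScripts(scripts, documentUrl) {
  const fragment = decodeURIComponent(new URL(documentUrl).hash.slice(1));
  const candidates = scripts.filter(isJsonLdScript);
  if (fragment === "") return candidates;
  return candidates.filter((el) => el.id === fragment).slice(0, 1);
}

const scripts = [
  { id: "", type: "text/javascript", textContent: "console.log(1);" },
  { id: "first", type: "application/ld+json", textContent: '{"a": 1}' },
  { id: "stuff", type: "application/ld+json", textContent: '{"b": 2}' },
];

const picked = selectJsonLdScripts(scripts, "http://example.com/page.html#stuff");
console.log(picked.map((el) => JSON.parse(el.textContent))); // [ { b: 2 } ]
```

In a browser the same objects could come from `Array.from(document.querySelectorAll("script"))`, since script elements expose `id`, `type`, and `textContent`.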