diff --git a/common/common.js b/common/common.js
index b9e93ab1..14be9d70 100644
--- a/common/common.js
+++ b/common/common.js
@@ -271,6 +271,6 @@ function unComment(doc, content) {
   return content
     .replace(//, '')
-    .replace(/< !--/g, '');
+    .replace(/< !\s*-\s*-/g, '');
 }
diff --git a/common/extract-examples.rb b/common/extract-examples.rb
index 89fb6edb..5035f657 100755
--- a/common/extract-examples.rb
+++ b/common/extract-examples.rb
@@ -319,13 +319,18 @@ def save_example(examples:, element:, title:, example_number:, error:, warn:)
         $stdout.write "F".colorize(:red)
         next
       end
+
+      # Get base from document, if present
+      html_base = doc.at_xpath('/html/head/base/@href')
+      ex[:base] = html_base.to_s if html_base
+
       script_content = doc.at_xpath(xpath)
       if script_content
         # Remove (faked) XML comments and unescape sequences
         content = script_content
           .inner_html
-          .sub(/^\s*< !--/, '')
-          .sub(/-- >\s*$/, '')
+          .sub(/^\s*< !\s*-\s*-/, '')
+          .sub(/-\s*- >\s*$/, '')
           .gsub(/&lt;/, '<')
       end
@@ -438,7 +443,15 @@ def save_example(examples:, element:, title:, example_number:, error:, warn:)
       # Set argument to referenced content to be parsed
       args[0] = if examples[ex[:result_for]][:ext] == 'html' && method == :expand
         # If we are expanding, and the reference is HTML, find the first script element.
-        doc = Nokogiri::HTML.parse(examples[ex[:result_for]][:content])
+        doc = Nokogiri::HTML.parse(
+          examples[ex[:result_for]][:content]
+          .sub(/^\s*< !\s*-\s*-/, '')
+          .sub(/-\s*- >\s*$/, ''))
+
+        # Get base from document, if present
+        html_base = doc.at_xpath('/html/head/base/@href')
+        options[:base] = html_base.to_s if html_base
+
         script_content = doc.at_xpath(xpath)
         unless script_content
           errors << "Example #{ex[:number]} at line #{ex[:line]} references example #{ex[:result_for].inspect} with no JSON-LD script element"
@@ -447,12 +460,13 @@ def save_example(examples:, element:, title:, example_number:, error:, warn:)
         end
         StringIO.new(script_content
           .inner_html
-          .sub(/^\s*< !--/, '')
-          .sub(/-- >\s*$/, '')
           .gsub(/&lt;/, '<'))
       elsif examples[ex[:result_for]][:ext] == 'html' && ex[:target]
         # Only use the targeted script
-        doc = Nokogiri::HTML.parse(examples[ex[:result_for]][:content])
+        doc = Nokogiri::HTML.parse(
+          examples[ex[:result_for]][:content]
+          .sub(/^\s*< !\s*-\s*-/, '')
+          .sub(/-\s*- >\s*$/, ''))
         script_content = doc.at_xpath(xpath)
         unless script_content
           errors << "Example #{ex[:number]} at line #{ex[:line]} references example #{ex[:result_for].inspect} with no JSON-LD script element"
@@ -461,8 +475,6 @@ def save_example(examples:, element:, title:, example_number:, error:, warn:)
         end
         StringIO.new(script_content
           .to_html
-          .sub(/^\s*< !--/, '')
-          .sub(/-- >\s*$/, '')
           .gsub(/&lt;/, '<'))
       else
         StringIO.new(examples[ex[:result_for]][:content])
diff --git a/examples/Combining-multiple-JSON-LD-script-elements-into-a-single-dataset-original.html b/examples/Combining-multiple-JSON-LD-script-elements-into-a-single-dataset-original.html
new file mode 100644
index 00000000..cf0107e2
--- /dev/null
+++ b/examples/Combining-multiple-JSON-LD-script-elements-into-a-single-dataset-original.html
@@ -0,0 +1,9 @@
+
+{
+  "@context": {
+    "@vocab": "http://schema.org/"
+  },
+  "@id": "https://digitalbazaar.com/author/dlongley/",
+  "@type": "Person",
+  "name": "Dave Longley"
+}
diff --git a/examples/Combining-multiple-JSON-LD-script-elements-into-a-single-dataset-statements.table b/examples/Combining-multiple-JSON-LD-script-elements-into-a-single-dataset-statements.table
new file mode 100644
index 00000000..82a2bdd7
--- /dev/null
+++ b/examples/Combining-multiple-JSON-LD-script-elements-into-a-single-dataset-statements.table
@@ -0,0 +1,29 @@
+
Subject | Property | Value
---|---|---
https://digitalbazaar.com/author/dlongley/ | rdf:type | schema:Person
https://digitalbazaar.com/author/dlongley/ | schema:name | Dave Longley
http://greggkellogg.net/foaf#me | rdf:type | schema:Person
http://greggkellogg.net/foaf#me | schema:name | Gregg Kellogg

Subject | Property | Value | Value Type
---|---|---|---
http://dbpedia.org/resource/John_Lennon | foaf:name | John Lennon |
http://dbpedia.org/resource/John_Lennon | schema:birthDate | 1940-10-09 | xsd:date
http://dbpedia.org/resource/John_Lennon | schema:spouse | http://dbpedia.org/resource/Cynthia_Lennon |

Subject | Property | Value | Value Type
---|---|---|---
http://dbpedia.org/resource/John_Lennon | foaf:name | John Lennon |
http://dbpedia.org/resource/John_Lennon | schema:birthDate | 1940-10-09 | xsd:date
http://dbpedia.org/resource/John_Lennon | schema:spouse | http://dbpedia.org/resource/Cynthia_Lennon |

Subject | Property | Value
---|---|---
http://greggkellogg.net/foaf#me | rdf:type | schema:Person
http://greggkellogg.net/foaf#me | schema:name | Gregg Kellogg

HTML script elements can be used to embed blocks of data in documents.
This way, JSON-LD content can be easily embedded in HTML [[HTML52]] by placing
it in a script element with the type attribute set to application/ld+json.
- -+
Defining how such data may be used is beyond the scope of this specification. The embedded JSON-LD document might be extracted as is or, e.g., be interpreted as RDF.
-If JSON-LD content is extracted as RDF [[RDF11-CONCEPTS]], it should be expanded into an
+If JSON-LD content is extracted as RDF [[RDF11-CONCEPTS]], it MUST be expanded into an
 RDF Dataset using the Deserialize JSON-LD to RDF Algorithm
-[[JSON-LD11-API]].
+[[JSON-LD11-API]]. Unless a specific script is targeted
+(see ),
+all script elements
+with type application/ld+json MUST be processed and merged
+into a single dataset with equivalent blank node identifiers contained in
+separate script elements treated as if they were in a single document (i.e.,
+blank nodes are shared between different JSON-LD script elements).
+
+
+
+Otherwise, unless a specific script is targeted
+(see ),
+only the first script element of type application/ld+json is used.
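
The two selection rules above (every JSON-LD script element when extracting RDF, only the first one otherwise) can be sketched as follows. This is a minimal illustration, not code from `common/extract-examples.rb`: the `scripts_to_process` method, the `extract_rdf:` keyword, and the array-of-hashes stand-in for parsed script elements are all assumptions, and targeting a specific script is deliberately left out.

```ruby
JSON_LD_TYPE = 'application/ld+json'

# Hypothetical stand-in for parsed <script> elements: an array of hashes
# with :type and :content keys. A real processor would read these from
# the parsed HTML document instead.
def scripts_to_process(scripts, extract_rdf:)
  json_ld = scripts.select { |script| script[:type] == JSON_LD_TYPE }
  # Extracting RDF: every JSON-LD script element contributes to the dataset.
  # Otherwise: only the first JSON-LD script element is used.
  extract_rdf ? json_ld : json_ld.first(1)
end

scripts = [
  { type: 'text/javascript',     content: 'var x = 1;' },
  { type: 'application/ld+json', content: '{"name": "Dave Longley"}' },
  { type: 'application/ld+json', content: '{"name": "Gregg Kellogg"}' }
]

all_scripts = scripts_to_process(scripts, extract_rdf: true)
first_only  = scripts_to_process(scripts, extract_rdf: false)
```

Returning an array in both cases keeps the caller's loop identical for the two modes; merging the resulting statements and sharing blank node identifiers is left to the RDF deserialization step.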
When processing a JSON-LD
+script element,
+only the resolved document location of the
+containing HTML document is used to establish the default base IRI of the enclosed
+JSON-LD content.
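
As a rough illustration of what using the document location as the default base IRI means, the sketch below resolves a relative IRI from embedded JSON-LD against an assumed document location with Ruby's standard `URI.join`. The `resolve_against_document` helper and the example URLs are hypothetical; a real processor performs this resolution during expansion.

```ruby
require 'uri'

# Hypothetical helper: resolve an IRI found in embedded JSON-LD against
# the location of the HTML document that contains it.
def resolve_against_document(document_location, iri)
  URI.join(document_location, iri).to_s
end

# Assumed document location, for illustration only.
document_location = 'http://example.com/people/index.html'

# A relative IRI is resolved against the document location.
relative = resolve_against_document(document_location, 'author/dlongley/')

# An absolute IRI is left unchanged.
absolute = resolve_against_document(document_location,
                                    'https://digitalbazaar.com/author/dlongley/')
```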
+
+script elements
+As HTML entities and comments are not allowable in
+JSON, the use of comments, escapes,
+and HTML Character references
+is subject to further discussion in the Working Group.
Depending on how the HTML document is served, certain strings may need
+to be escaped. In particular, the content MAY be enclosed
+in the HTML comment-open (<!--) and comment-close (-->) text sequences.
As described in HTML Restrictions for contents of <script> elements,
+the textContent of a script element may include balanced comments
+and other text which complicate extracting the JSON-LD content from a data block.
+JSON-LD places further restrictions on the contents of
+script elements containing JSON-LD.
A JSON-LD script element MAY begin with an optional comment-open surrounded by any amount of space characters,
+followed by valid JSON and ending with an optional comment-close surrounded by any amount of space characters.
+Any content within the JSON content which can be confused with a comment-open, script-open,
+comment-close, or script-close MUST be escaped using a REVERSE SOLIDUS (\) character
+as follows:
<!-- → <\!--
<script → <\script
--> → --\>
</script → <\/script
Additionally, content of a script element MAY be escaped using HTML Character references, such as the following:
+
&amp; → & (ampersand, U+0026)
&lt; → < (less-than sign, U+003C)
&gt; → > (greater-than sign, U+003E)
&quot; → " (quotation mark, U+0022)
&apos; → ' (apostrophe, U+0027)
JSON-LD Processors MUST remove surrounding comment-open and comment-close
+sequences, unescape any escaped comment-open, comment-close,
+script-open, and script-close sequences,
+and turn HTML Character references into the corresponding Unicode.
+
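
The three processing steps described above (strip the surrounding comment markers, undo the reverse-solidus escapes, decode character references) might be sketched like this. It is an illustrative sketch under stated assumptions, not the code in `common/extract-examples.rb`: the `clean_script_content` method and both lookup tables are hypothetical, and only the five character references listed above are handled.

```ruby
require 'json'

# Hypothetical tables; only the sequences listed in the text are covered.
ESCAPED_SEQUENCES = {
  '<\\!--'     => '<!--',
  '<\\script'  => '<script',
  '--\\>'      => '-->',
  '<\\/script' => '</script'
}.freeze

CHARACTER_REFERENCES = {
  '&amp;' => '&', '&lt;' => '<', '&gt;' => '>', '&quot;' => '"', '&apos;' => "'"
}.freeze

def clean_script_content(text)
  # 1. Remove an optional surrounding comment-open and comment-close.
  body = text.sub(/\A\s*<!--/, '').sub(/-->\s*\z/, '')
  # 2. Undo the reverse-solidus escapes inside the JSON content.
  ESCAPED_SEQUENCES.each { |escaped, plain| body = body.gsub(escaped, plain) }
  # 3. Decode character references in a single pass, so that "&amp;lt;"
  #    becomes "&lt;" rather than "<".
  body.gsub(/&(?:amp|lt|gt|quot|apos);/) { |ref| CHARACTER_REFERENCES[ref] }
end

raw = "<!--\n{\"name\": \"Tom &amp; Jerry\", \"note\": \"<\\/script\"}\n-->"
data = JSON.parse(clean_script_content(raw))
```

Decoding the character references in one pass avoids double-unescaping, which matters when the JSON itself is meant to contain a literal reference such as `&lt;`.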
+
+
+A specific
+script element
+within an HTML document may be located using
+a fragment identifier matching the unique identifier
+of the script element within the HTML document located by a URL (see [[!DOM]]).
+For example, given an HTML document located at http://example.com/document,
+a script element identified by "name" can be targeted using the URL
+http://example.com/document#name.
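
Locating a script element by fragment identifier could look roughly like the sketch below. The `ScriptElement` struct and `target_script` method are hypothetical stand-ins for a parsed DOM, and the extra check on the type attribute is an assumption of this sketch rather than something the text above requires.

```ruby
require 'uri'

# Hypothetical in-memory stand-in for a parsed <script> element.
ScriptElement = Struct.new(:id, :type, :content)

# Return the JSON-LD script element whose id matches the URL fragment,
# or nil when the URL has no fragment or nothing matches.
def target_script(scripts, url)
  fragment = URI.parse(url).fragment
  return nil if fragment.nil? || fragment.empty?

  scripts.find do |script|
    # Assumption: only JSON-LD script elements are eligible targets.
    script.id == fragment && script.type == 'application/ld+json'
  end
end

scripts = [
  ScriptElement.new('other', 'application/ld+json', '{"@id": "http://example.com/a"}'),
  ScriptElement.new('name',  'application/ld+json', '{"@id": "http://example.com/b"}')
]

targeted = target_script(scripts, 'http://example.com/document#name')
```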
The following is a list of issues open at the time of publication.
Consider using "@type": "@json" to describe native values in the compact form.
Allows a term definition to include an @values block to describe structured values, such as for GeoJSON.
When requesting JSON-LD from an HTTP endpoint, it would be useful to provide a reference to a context or frame which should be used by the server to put the results into the proper format.
Provide a means for referring to a remote context without requiring it to be downloaded.
-Consider a container type, similar to @list for encoding things like schema:ItemList serializations, when the values are schema:ListItem and order is set through schema:position.
Consider the opposite of "@container": "@set"; this would be when there is exactly one entry in an @list, instead of compacting to an array, compact to a single item.
It would be useful if JSON-LD recognized both value (rdf:nil) and list ([]).
Consider a mechanism such as Microdata's @itemref for including objects within another referencing node.
Mechanism to allow freezing terms so that additional contexts don't override them.
Should consider html>head>base@href and xml:base, as appropriate.
Update terminology in the spec from IRI to URL.
-For every example, there should be an equivalent of the example in the expanded form, in a table with the triples, in [[Turtle]] (as close to the JSON-LD structure as possible) and, possibly, as graphs. Not all of them would appear on the screen at the same time but, rather, the reader could choose what to see with some tabs.
-Proposal is to start from scratch, i.e., deprecating @graph and replacing the functionality with something cleaner.
"@version": [1.1, "amazingExtensionFoo", "nicheExtensionBar"] - processors throw if they don't understand every extension listed.
Ensure that the output is consistent in shape. Thus if there can ever be multiple values, the structure is always an array.
- - - - +Consider documentation best practices.
Consider issues surrounding confusion of differing expansion rules for @id, @type, and dictionary members.
Require JSON-LD processors to be able to identify and extract JSON-LD from a script tag with type application/ld+json within an HTML document.
- Instead of normatively requiring an initial context, as RDFa does, JSON-LD has the ability to import contexts. This approach means that the existing context rules are followed, and the best practice context can be updated over time as new norms emerge in the community. If the best practice context is not useful to a particular community, then they don't need to import it. -
+ + + +Node Types in @context.