Skip to content

some edits to the spec for the interim update release #1110

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
39 changes: 26 additions & 13 deletions jsonschema-core.xml
Original file line number Diff line number Diff line change
Expand Up @@ -1417,7 +1417,7 @@
a letter ([A-Za-z]) or underscore ("_"), followed by any number of letters,
digits ([0-9]), hyphens ("-"), underscores ("_"), and periods (".").
This matches the US-ASCII part of XML's
<xref target="xml-names">NCName production</xref>.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"production" is the correct term. Specifically, this:

NCName | ::= | Name - (Char* ':' Char*) | /* An XML Name, minus the ":" */

is the NCName production in the XML Namespaces spec. In section 7, that spec says: "MUST match this specification's production for NCName.", so the terminology lines up. Type is definitely not correct, as that word has other meanings in an XML context.

So for this one we should really keep "production", although see my comments on the date-time formats for alternatives in that case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@karenetheridge does your thumbs up mean you agree and this should be removed from this PR?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it's an "okay, I accept this."

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, do you want to interactive rebase, remove the commit related to productions and force push? =]

<xref target="xml-names">NCName type</xref>.
<cref>
Note that the anchor string does not include the "#" character,
as it is not a URI-reference. An "$anchor": "foo" becomes the
Expand Down Expand Up @@ -2085,7 +2085,8 @@
"items", whose behavior is defined in terms of "prefixItems"
</t>
<t>
"contains", whose behavior is defined in terms of "minContains"
"contains", whose behavior is affected by the presence and value of
"minContains", in the Validation vocabulary
</t>
</list>
</t>
Expand Down Expand Up @@ -2318,6 +2319,7 @@
positions within the instance array, it produces an
annotation result of boolean true, indicating that all remaining array
elements have been evaluated against this keyword's subschema.
This annotation affects the behavior of "unevaluatedItems".
</t>
<t>
Omitting this keyword has the same assertion behavior as
Expand All @@ -2337,15 +2339,18 @@
</t>
<t>
An array instance is valid against "contains" if at least one of
its elements is valid against the given schema. The subschema MUST be
applied to every array element even after the first match has
been found, in order to collect annotations for use by other keywords.
This is to ensure that all possible annotations are collected.
its elements is valid against the given schema, provided that
"minContains" is not present or is a value greater than 0.
</t>
<t>
Logically, the validation result of applying the value subschema to each
item in the array MUST be ORed with "false", resulting in an overall
validation result.
If "minContains" is present and has a value of 0, then an array instance
is valid against this keyword only if NONE of its elements is valid
against the given schema.
</t>
<t>
The subschema MUST be applied to every array element even after the first
match has been found, in order to collect annotations for use by other
keywords. This is to ensure that all possible annotations are collected.
</t>
<t>
This keyword produces an annotation value which is an array of
Expand All @@ -2354,6 +2359,7 @@
the subschema validates successfully when applied to every index of the
instance. The annotation MUST be present if the instance array to which
this keyword's schema applies is empty.
This annotation affects the behavior of "unevaluatedItems".
</t>
</section>
</section>
Expand All @@ -2373,7 +2379,9 @@
<t>
The annotation result of this keyword is the set of instance
property names matched by this keyword.
</t>
This annotation affects the behavior of "additionalProperties" and
"unevaluatedProperties".
</t>
<t>
Omitting this keyword has the same assertion behavior as
an empty object.
Expand All @@ -2396,7 +2404,9 @@
<t>
The annotation result of this keyword is the set of instance
property names matched by this keyword.
</t>
This annotation affects the behavior of "additionalProperties" and
"unevaluatedProperties".
</t>
<t>
Omitting this keyword has the same assertion behavior as
an empty object.
Expand All @@ -2422,6 +2432,7 @@
<t>
The annotation result of this keyword is the set of instance
property names validated by this keyword's subschema.
This annotation affects the behavior of "unevaluatedProperties".
</t>
<t>
Omitting this keyword has the same assertion behavior as
Expand Down Expand Up @@ -2559,6 +2570,7 @@
positions within the instance array, it produces an
annotation result of boolean true, analogous to the
behavior of "items".
This annotation affects the behavior of "unevaluatedItems" in parent schemas.
</t>
<t>
Omitting this keyword has the same assertion behavior as
Expand Down Expand Up @@ -2602,6 +2614,7 @@
<t>
The annotation result of this keyword is the set of instance
property names validated by this keyword's subschema.
This annotation affects the behavior of "unevaluatedProperties" in parent schemas.
</t>
<t>
Omitting this keyword has the same assertion behavior as
Expand Down Expand Up @@ -3846,8 +3859,8 @@ https://example.com/schemas/common#/$defs/count/minimum
<t>"$schema" MAY change for embedded resources</t>
<t>Array-value "items" functionality is now "prefixItems"</t>
<t>"items" subsumes the old function of "additionalItems"</t>
<t>"contains" and "unevaluatedItems" interactions now specified</t>
<t>Rename $recursive* to $dynamic*</t>
<t>"contains" annotation behavior, and "contains" and "unevaluatedItems" interactions now specified</t>
<t>Rename $recursive* to $dynamic*, with behavior modification</t>
<t>$dynamicAnchor defines a fragment like $anchor</t>
<t>$dynamic* (previously $recursive) no longer use runtime base URI determination</t>
<t>Define Compound Schema Documents (bundle) and processing</t>
Expand Down
60 changes: 47 additions & 13 deletions jsonschema-validation.xml
Original file line number Diff line number Diff line change
Expand Up @@ -213,6 +213,23 @@
<eref target="https://json-schema.org/draft/2020-12/meta/validation"/>.
</t>

<section title="Keyword Independence">
<t>
Schema keywords typically operate independently, without
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better not to copy the wording from the core spec- then it will get out of sync, and I kind of like keeping the statements of principles in the core. You can xref it, and/or just note the vocabulary-specific dependencies. But this is not a hill I intend to die on so if you feel strongly feel free to keep it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the Applicator vocabulary itself has this paragraph, it's reasonable for the Validation vocabulary to have it too. How about moving it out into the general section before we start talking about individual vocabularies? Then all the exceptions don't need to be listed explicitly -- we can just give a single example, and direct the reader to pay close attention to individual keyword definitions for the exceptions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well that's embarrassing. I thought it was in the general section (don't look at the change log- it's probably my fault it's not 😅 ). If you feel like factoring it out that would be great, but I do worry a bit about not highlighting the interactions sufficiently. I would be totally fine leaving this the way you have it now (with the duplicated language) and factoring it out in the next draft.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also didn't realize this wasn't in the general section. It definitely should be. There's no reason to duplicate documentation of this common principle. I can help with the refactor if needed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed.
Are you happy to move this to the general section as part of this PR @karenetheridge ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like there is a section in each applicable vocabulary (applicator, validation, unevaluated), but the content is all slightly different. Is it better to have similar-but-slightly-different wording in several places, or just one wording in one place (if this is the case, I would propose putting it in section 7.10)?

affecting each other's outcomes.
</t>
<t>
For schema author convenience, there are some exceptions among the
keywords in this vocabulary:
<list>
<t>
"maxContains" and "minContains", whose behavior is affected by the
presence and value of "contains", in the Applicator vocabulary
</t>
</list>
</t>
</section>

<section title="Validation Keywords for Any Instance Type" anchor="general">
<section title="type">
<t>
Expand Down Expand Up @@ -418,6 +435,12 @@
result is a boolean "true" and the instance array length is less than or
equal to the "maxContains" value.
</t>
<t>
If annotations are not being collected, the validation value may be determined
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In terms of phrasing this should use MAY, and I feel the wording could be condensed a bit, but I do not feel strongly about that so feel free to ignore. (the condensing part- the "may" should really be a MAY).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This paragraph is simply describing the alternative algorithm to use that doesn't involve annotations. MAY in the RFC sense isn't correct here as alternative behaviour is not permitted -- it's only the implementation can vary. How about "can"?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was just going by this language from the additional* applicators:

Implementations MAY choose to implement or optimize this keyword in
another way that produces the same effect, such as by directly
checking the names in "properties" and the patterns in
"patternProperties" against the instance property set.
Implementations that do not support annotation collection MUST do so.

As long as they end up consistent I'm probably fine with whatever. If making it consistent gets bogged down, I'll live with a little inconsistency. It is probably better to use "can" than lower-case "may", although that is definitely not an absolute rule.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like a reaonsable change. Let's go with "can".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm cutting out this entire part of the revision, since (as per other PRs recently submitted) trying to define a parallel implementation using annotations is not going to be identicaly, so we shouldn't suggest it at all.

by considering the number of instance elements that are valid against the
"contains" schema: validation is successful if and only if the number of valid
elements is less than or equal to the "maxContains" value.
</t>
</section>

<section title="minContains">
Expand All @@ -437,10 +460,17 @@
annotation result is a boolean "true" and the instance array length is
greater than or equal to the "minContains" value.
</t>
<t>
If annotations are not being collected, the validation value may be determined
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/may/can/ again here?

by considering the number of instance elements that are valid against the
"contains" schema: validation is successful if and only if the number of valid
elements is greater than or equal to the "minContains" value.
</t>
<t>
A value of 0 is allowed, but is only useful for setting a range
of occurrences from 0 to the value of "maxContains". A value of
0 with no "maxContains" causes "contains" to always pass validation.
of occurrences from 0 to the value of "maxContains". A value of
0 causes "contains" to always pass validation (but validation can still fail
against a "maxContains" keyword).
</t>
<t>
Omitting this keyword has the same behavior as a value of 1.
Expand Down Expand Up @@ -701,30 +731,34 @@
<list style="hanging">
<t hangText="date-time:">
A string instance is valid against this attribute if it is
a valid representation according to the "date-time" production.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As before, production is correct, although I think "rule" is also correct as it's what is used in the RFC for ABNF, so perhaps that would be more agreeable?

Specifically, date and time are from these ABNF rules in section 5.6 of RFC 3339:

   date-fullyear   = 4DIGIT
   date-month      = 2DIGIT  ; 01-12
   date-mday       = 2DIGIT  ; 01-28, 01-29, 01-30, 01-31 based on
                             ; month/year
   time-hour       = 2DIGIT  ; 00-23
   time-minute     = 2DIGIT  ; 00-59
   time-second     = 2DIGIT  ; 00-58, 00-59, 00-60 based on leap second
                             ; rules
   time-secfrac    = "." 1*DIGIT
   time-numoffset  = ("+" / "-") time-hour ":" time-minute
   time-offset     = "Z" / time-numoffset

   partial-time    = time-hour ":" time-minute ":" time-second
                     [time-secfrac]
   full-date       = date-fullyear "-" date-month "-" date-mday
   full-time       = partial-time time-offset

   date-time       = full-date "T" full-time

and duration comes from RFC 3339 Appendix A, ported from the ISO spec:

   dur-second        = 1*DIGIT "S"
   dur-minute        = 1*DIGIT "M" [dur-second]
   dur-hour          = 1*DIGIT "H" [dur-minute]
   dur-time          = "T" (dur-hour / dur-minute / dur-second)
   dur-day           = 1*DIGIT "D"
   dur-week          = 1*DIGIT "W"
   dur-month         = 1*DIGIT "M" [dur-day]
   dur-year          = 1*DIGIT "Y" [dur-month]
   dur-date          = (dur-day / dur-month / dur-year) [dur-time]

   duration          = "P" (dur-date / dur-time / dur-week)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about "date-time" ABNF rule?

FWIW that would also resolve the previous confusion I'd raised about duration, where the ABNF rule in RFC3339 is very much incomplete compared to the actual ISO 8601 definition.. so we can be specific that we're only targeting what's in the RFC. (We can also optionally validate the remainder of the ISO 8601 definition, as those cases are explicitly omitted from the test suite.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@karenetheridge I'm confused, duration already cites the ISO rule as quoted in the RFC. If it's not in the RFC, we are not referencing it. All we are doing is referencing that production. That's why the word "production" was used.

I think "rule" (with the RFC section number, which is already there) is sufficient, although I would not object to ABNF so that's fine.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find the use of the word "production" to be very confusing and would prefer we avoid using it unless it's referencing something very specific, in which case we should link to its definition somewhere in the preamble like we do for MUST/MAY/RECOMMENDS etc.

Copy link
Contributor

@handrews handrews Jun 3, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tone of the comment I made here was too condescending. I am leaving this up for transparency so anyone who wants to can look at the edit history. I am otherwise withdrawing the comment.

</t>
a valid "date-time" representation according to the specification
referenced above
</t>
<t hangText="date:">
A string instance is valid against this attribute if it is
a valid representation according to the "full-date" production.
</t>
a valid "full-date" representation according to the specification
referenced above
</t>
<t hangText="time:">
A string instance is valid against this attribute if it is
a valid representation according to the "full-time" production.
</t>
a valid "full-time" representation according to the specification
referenced above
</t>
<t hangText="duration:">
A string instance is valid against this attribute if it is
a valid representation according to the "duration" production.
</t>
a valid "duration" representation according according to the
specification referenced above
</t>
</list>
</t>
<t>
Implementations MAY support additional attributes using the other
production names defined anywhere in that RFC. If "full-date" or "full-time"
format names defined anywhere in that RFC. If "full-date" or "full-time"
are implemented, the corresponding short form ("date" or "time"
respectively) MUST be implemented, and MUST behave identically.
Implementations SHOULD NOT define extension attributes
with any name matching an RFC 3339 production unless it validates
according to the rules of that production.
with any name matching an RFC 3339 format unless it validates
according to the rules of that format.
<cref>
There is not currently consensus on the need for supporting
all RFC 3339 formats, so this approach of reserving the
Expand Down