-
-
Notifications
You must be signed in to change notification settings - Fork 309
require $schema
in schemas
#1434
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Glad to see these this change happening!
Something that's missing is dialect inheritance in embedded schemas. An embedded schema that doesn't declare $schema
inherits the dialect of the schema it's embedded in (how ever the parent schema's dialect was determined, not necessarily $schema
).
…are not required to support media type parameter inputs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like this could be organized a bit better. There are four rules for determining the dialect, but they're not all in the same place and not presented consistently. The first rule is in a different place than the other three, Those three are in the same place, but the third and fourth are presented differently (as a list) from the second.
I think this could be more clear if the rules for determining the dialect were presented consistently and in one place in a way implementers could follow like an algorithm. I think the third and fourth rules are presented in this way, but it would be nice if the first and second were included as well.
Also, it just occurred to me that this is all defined in the section about the $schema
keyword. Since the $schema
keyword is only one one part of dialect determination, it seems like it would be better to talk about dialect determination in it's own section that references to $schema
keyword as one way the dialect could be determined.
This PR is a realization of the discussion we had in #1420. In that discussion, we identified three prioritized steps:
What is this fourth rule? |
The fourth rule is an embedded schema inheriting from its parent context. I think it belongs after the media-type parameter and before the user provided default. I had forgotten about that one when I wrote that issue, but it's definitely part of the process. If you'd rather address that in a separate PR, I'm ok with that. |
This comment was marked as outdated.
This comment was marked as outdated.
Co-authored-by: Jason Desrosiers <[email protected]>
…org/json-schema-spec into gregsdennis/require-$schema
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't wait to see the tests for this get merged! =D
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great! I left a couple notes, but either could be address separately if we want.
jsonschema-core.md
Outdated
3. Parent dialect - An embedded schema resource which does not itself contain a | ||
`$schema` keyword MUST be processed using the same dialect as the schema | ||
which contains it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this concept also applies to schemas embedded in other types of documents like OpenAPI. I'm not sure how this might be reworded to include that case, or if it should be considered a different case, or if that case should be addressed elsewhere entirely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can say something like:
If the schema is embedded in a larger document (such as OpenAPI), the semantics for determining the dialect may be specified by that document's specification. Unless explicitly dictated by that document's specification, the presence of a $schema
keyword shall always override the dialect imposed by the containing document.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like that @karenetheridge. I'll try to work some of that in.
Unless explicitly dictated by that document's specification, the presence of a $schema keyword shall always override the dialect imposed by the containing document.
I think this may be redundant given this is a prioritized list and the $schema
requirement is higher on the list.
Unless you're saying the containing document's spec can say that $schema
should be ignored. That seems like we're saying it's okay for a consuming specification to override this rule. I'd prefer such specs simply disallow using $schema
so that there's no confusion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. Currently, the OpenAPI spec allows you to define a default dialect.
@karenetheridge What's the use case you see for allowing a containing document specification to override $schema
for it's contained schemas?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, the other way around - the openapi spec has language about following jsonSchemaDialect
to set the dialect for all contained schemas, but if one of those schemas has a $schema
keyword in it, that overrides the jsonSchemaDialect. (Unless OpenAPI wanted to add a rule saying that $schema
is no good here.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, so that falls in line with the precedence order here. $schema
first, then media type (if it exists), then parent document, then config.
I think Ben was wondering about the edge case where a document wanted to override $schema
. I'm not sure we can define requirements for that.
I'm not sure much, if any, of this testable with our current setup. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if it's been discussed elsewhere but there is still a problem with allowing $schema
keywords to appear in non-root locations in the document -- because we also require that the schema must validate against its metaschema, and this requirement can be violated if a contained sub-schema uses a different dialect than the document root that conflicts in some way.
Are we expecting that implementations need to have a special evaluation mode when validating schemas, and somehow switch to a different metaschema when $schema keywords are encountered? We don't spell that out in the spec in any way, and instead assume that when a schema is being evaluated against its metaschema, we treat it just like any other data instance.
2. `application/schema+json` media type with a `schema` parameter - | ||
Implementations which support media type parameter inputs MUST process the | ||
schema according to the dialect the parameter declares. A media type will | ||
generally only be available if the schema has been retrieved from an external | ||
source and only applies to the document root. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can go stronger here and say that if the schema is fetched over HTTP, it MUST respect the media type parameter if it is present, and add a reference to the section that defines the name and syntax of this header. This may be implied by the existing wording, but we can spell it out more explicitly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this not what it's already saying? 895 contains the "MUST" requirement. I don't want to call out HTTP; I'd prefer to stay more general and say that if the media type parameter is available in general. It could come from anywhere, e.g. the OS or app.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess I was confused what "Implementations which support media type parameter inputs" meant. Does that cover "if you used HTTP, you better check the Content-Type header and do what it says"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it covers any scenario, including http, where you might receive a media type. So, yeah.
jsonschema-core.md
Outdated
3. Parent dialect - An embedded schema resource which does not itself contain a | ||
`$schema` keyword MUST be processed using the same dialect as the schema | ||
which contains it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can say something like:
If the schema is embedded in a larger document (such as OpenAPI), the semantics for determining the dialect may be specified by that document's specification. Unless explicitly dictated by that document's specification, the presence of a $schema
keyword shall always override the dialect imposed by the containing document.
Co-authored-by: Karen Etheridge <[email protected]>
…org/json-schema-spec into gregsdennis/require-$schema
Yup.
First, we have to look at the definition of a Compound Schema Document...
I would argue that this defines any OpenAPI definition as a compound schema document. Then, the validation section...
Given this, we might want to tighten up the language of meta-schema useage for validation to make the target a "schema resource" to make this clearer? I don't know. Open to thoughts. |
I'm with Karen. I think a meta-scheam evaluating a schema should be no different than a schema validating an instance. I don't think it's too hard to achieve this. Still, let's work that out separately from this PR, please.
I don't think the comment was about OpenAPI, but rather any time a If the parent schema declares 2020-12 (which disallows array I don't know that the Compound Document definition applies to the above scenario like it does for bundling. I think it could be argued both ways. I agree that there's no real way to handle this right now. Let's take another PR for that and consider this PR as a mere iteration. These two topics combined intersect somewhat with this discussion about |
However, I agree, it applies more broadly.
I agree. As a reference, we discussed this at quite some length already: #936 |
That issue is talking about how to evaluate with a schema when there are embedded |
@karenetheridge I get what you're saying. I'll create an issue for it for us to discuss. |
Resolves #1420