Meta-schemas for vocabularies seem to have incomplete lists of used vocabularies #964

epoberezkin · 2020-08-07T17:16:38Z

Maybe it was reported already - sorry if I didn't find it.

For example, https://json-schema.org/draft/2019-09/meta/core uses applicator and other vocabularies but includes only core.

https://json-schema.org/draft/2019-09/meta/applicator uses core and others, but includes only applicator.

etc.

Am I missing something?

Relequestual · 2020-08-08T08:14:19Z

Hey @epoberezkin, yes you're missing something here =]

> The "$vocabulary" keyword is used in meta-schemas to identify the vocabularies available for use in schemas described by that meta-schema.

https://tools.ietf.org/html/draft-handrews-json-schema-02#section-8.1.2

$vocabulary does not define the vocabularies used in the same schema as itself. The dialect (collection of vocabularies) is identified by $schema. The associated meta-schema includes $vocabulary to identify the vocabularies used and whether they are required.

$vocabulary isn't a keyword for general schema use, but only for use in meta-schemas.

To use a different dialect or create a new one, first create a meta-schema that identifies the vocabularies used, then put the $id of that new meta-schema in the $schema of each schema that wants to use it.
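
As a rough illustration of those two steps, here is a minimal sketch using Python dicts as stand-ins for the JSON documents. The example.com URIs and the choice to make validation optional are assumptions made up for this example; the vocabulary and meta-schema URIs are the draft 2019-09 ones.

```python
import json

BASE = "https://json-schema.org/draft/2019-09"

# Step 1: a new meta-schema that declares the dialect's vocabularies.
# The "$id" below is a hypothetical URI, not a published meta-schema.
my_meta_schema = {
    "$schema": f"{BASE}/schema",
    "$id": "https://example.com/meta/my-dialect",
    "$vocabulary": {
        f"{BASE}/vocab/core": True,         # required
        f"{BASE}/vocab/applicator": True,   # required
        f"{BASE}/vocab/validation": False,  # optional (an arbitrary choice here)
    },
    # Describe the keywords by referencing the single-vocabulary meta-schemas.
    "allOf": [
        {"$ref": f"{BASE}/meta/core"},
        {"$ref": f"{BASE}/meta/applicator"},
        {"$ref": f"{BASE}/meta/validation"},
    ],
}

# Step 2: a schema opts into the dialect through "$schema".
# Note that it carries no "$vocabulary" of its own.
my_schema = {
    "$schema": my_meta_schema["$id"],
    "$id": "https://example.com/schemas/person",
    "type": "object",
    "properties": {"name": {"type": "string"}},
}

print(json.dumps(my_schema, indent=2))
```

The schema itself never lists vocabularies; it only points at the meta-schema that does.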

Shout if you need any further clarification.

I find Appendix D very helpful in understanding how this fits together.
I DO need to re-work some wording in the spec now we have a term for a collection of vocabularies ("feature set" for the keywords, "dialect" for the whole thing).

Relequestual · 2020-08-08T08:23:13Z

To clarify, the core meta-schema you linked uses the full JSON Schema dialect, but is itself JUST the meta-schema for the core vocabulary.
All meta-schemas must use the core vocabulary (it defines $id, $schema, $vocabulary).
Look at the general purpose meta-schema: http://json-schema.org/draft/2019-09/schema
Notice that it doesn't redefine what's already defined in the vocabulary meta-schemas it declares as in use.

handrews · 2020-08-08T17:19:48Z

The intuitive way to think about this is that schema keywords describe the instance.

The only exceptions are $schema (which links the meta-schema, and therefore describes the resource that contains it) and $id, $anchor, and $dynamicAnchor, which create identifiers for the resource that contains them.

Everything else, including $vocabulary, describes the instance to which the schema resource is applied. Since $vocabulary only has meaning when the instance is a schema, it is only useful in meta-schemas. You can have it in a schema applied to a non-schema instance, it just doesn't do anything in that case.

epoberezkin · 2020-08-08T17:41:02Z

> $vocabulary does not define the vocabularies used in the same schema as itself. The dialect (collection of vocabularies) is identified by $schema. The associated meta-schema includes $vocabulary to identify the vocabularies used and whether they are required.

That makes sense, that is what I initially thought based on that sentence you quoted, but there is another place in the spec that made me think that the schema can override vocabularies defined in its meta-schema:

> If "$vocabulary" is absent, an implementation MAY determine behavior based on the meta-schema if it is recognized from the URI value of the referring schema's "$schema" keyword.

This implies that the $vocabulary can be present in the schema to define the vocabularies it is using. Or does it refer to the meta-schema of the meta-schema?

Further, if this is not correct, then I do not understand why the core, applicator, etc. meta-schemas would have $vocabulary at all. Firstly, they are not expected to be used as meta-schemas on their own; maybe core can be used on its own, but the others require core. Secondly, if there is a scenario where they would be used as meta-schemas, they should at least include the core vocabulary, and maybe applicator and some other vocabularies as well, as it is unlikely you can construct a schema without core and applicator...

I am still missing something here...

handrews · 2020-08-08T21:51:19Z

> This implies that the $vocabulary can be present in the schema to define the vocabularies it is using. Or does it refer to the meta-schema of the meta-schema?

It should probably read: If "$vocabulary" is absent from the meta-schema. If S is the schema, and M is the meta-schema, you start by examining S. You follow $schema to M. If M has $vocabulary then that is what determines the vocabulary of S. If M does not have $vocabulary, but the URI of M (the value of $schema in S) is "recognizable", then the implementation can infer the vocabulary from the recognizable URI.

Translation: If someone uses an old pre-$vocabulary meta-schema that your implementation recognizes, you can assume it still means what you thought it meant. Also, this preserves the old behavior that if an implementation recognizes a custom meta-schema URI, it can process whatever extensions it knows that meta-schema indicates. I don't know if anyone ever implemented this, but the spec allowed it.
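
The lookup described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `vocabularies_for`, the `RECOGNIZED_URIS` table, and the placeholder `urn:example:...` identifiers are invented here and are not part of any specification or implementation.

```python
from typing import Dict, Optional

# Hypothetical table of meta-schema URIs an implementation recognizes even
# though they carry no "$vocabulary". The placeholder identifier on the right
# is invented for this sketch; it is not defined by any specification.
RECOGNIZED_URIS: Dict[str, Dict[str, bool]] = {
    "http://json-schema.org/draft-07/schema#": {"urn:example:draft-07-keywords": True},
}


def vocabularies_for(s: dict, meta_schemas: Dict[str, dict]) -> Optional[Dict[str, bool]]:
    """Return the vocabularies in effect for schema S, or None if unknown."""
    meta_uri = s.get("$schema")
    if meta_uri is None:
        return None  # what to do without "$schema" is out of scope here
    m = meta_schemas.get(meta_uri)
    if m is not None and "$vocabulary" in m:
        # M declares its vocabularies, so that is what governs S.
        return dict(m["$vocabulary"])
    # M has no "$vocabulary": fall back to the recognized-URI table, if any.
    return RECOGNIZED_URIS.get(meta_uri)


# A "$vocabulary" inside S itself is never consulted.
M = {
    "$id": "https://example.com/meta/m",
    "$vocabulary": {"https://json-schema.org/draft/2019-09/vocab/core": True},
}
S = {"$schema": "https://example.com/meta/m", "$vocabulary": {"urn:example:ignored": True}}
print(vocabularies_for(S, {M["$id"]: M}))
print(vocabularies_for({"$schema": "http://json-schema.org/draft-07/schema#"}, {}))
```

The first call reads the declaration from M; the second falls back to the recognized-URI table because no meta-schema with `$vocabulary` is available.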

> Further, if this is not correct, then I do not understand why the core, applicator, etc. meta-schemas would have $vocabulary at all. Firstly, they are not expected to be used as meta-schemas on their own; maybe core can be used on its own, but the others require core.

I mean, you could use them that way, and it seemed better to go ahead and put the appropriate value in than to leave it out. Leaving it out when the spec makes a big deal out of $vocabulary in meta-schemas seemed like a bad idea. This way if for some reason you want to use them on their own, you can. I agree that it is unlikely, but consistency seems best here.

> Secondly, if there is a scenario where they would be used as meta-schemas, they should at least include the core vocabulary

I think we wrote it such that core is always assumed to be present, even if you don't list it, but you SHOULD list it? I'd have to go dig through to find that. But you have to assume core to even follow $schema and process $id and $vocabulary so de-facto core is always in use.

> Secondly, if there is a scenario where they would be used as meta-schemas, they should at least include the core vocabulary, and maybe applicator and some other vocabularies as well, as it is unlikely you can construct a schema without core and applicator...

A meta-schema should only declare the vocabularies that it describes. The validation meta-schema only describes the validation assertions, so that's the only vocabulary (other than core, which as noted is a special case) that it declares.

Is it useful on its own like that? Not very. I can't imagine using it on its own.

Is it correct on its own like that? Yes. It is correct and consistent which seemed more important than doing something inconsistent for single-vocabulary meta-schemas just because they're unlikely to be used on their own in the real world.

karenetheridge · 2020-08-29T17:23:57Z

The $vocabulary keyword seems to be being used in two different ways here, and I'm not sure if this is fully spelled out by the specification.

https://json-schema.org/draft/2019-09/schema has the $vocabulary keyword to specify which vocabularies are enabled when this document is used as the metaschema for a schema.
If individual meta/* schemas are to be used as metaschemas themselves, the use of $vocabulary is valid here to define what vocabularies are enabled when this document is to be used as the metaschema. In this use, @epoberezkin's concern is correct that each of these meta/* documents should be listing every vocabulary that they use (which is core, applicator and validation). (See footnote 1.)
However, the way the $vocabulary keyword seems to be used at present in the meta/* documents is as another type of identifier: that is, they are saying "this document defines a vocabulary, and this is the identifier for that vocabulary that may be used in the $vocabulary keyword in metaschemas". That usage is not defined by the spec.

footnote 1: I don't think it's valid to use an individual meta/* document as a metaschema on its own though -- it doesn't spell out the full set of vocabularies for a schema to be useful. e.g. a schema with just applicator keywords can't do much (via properties, prefixItems) except check that certain positions exist in the instance data. So this usecase can't be the intention of the $vocabulary usage here.

So, the issue I'm having is: when parsing a schema (either as a regular schema for the purposes of evaluating instance data, or as a metaschema for determining which vocabularies it supports), what do we do when encountering a $vocabulary keyword? It's clear what the intent is when at the top level metaschema document itself (https://json-schema.org/draft/2019-09/schema) - we are enabling or disabling particular vocabularies for all schemas that use this document as the metaschema. But what do we do when encountering the $vocabulary keyword in any $referenced document, such as meta/applicator? Is this only intended for human consumption and the parser implementation should ignore it? If so, wouldn't it be better to move this data into $comment so it is clear it is only intended for human consumption?

jdesrosiers · 2020-08-29T19:09:00Z

I found the duality of the $vocabulary keyword awkward and confusing as well. I found that my implementation has no use for it if it's not a top level meta-schema, so I just ignore it in those cases and move on. I agree that it's only useful as documentation and it would be less confusing if $vocabulary wasn't co-opted for this documentation.

handrews · 2020-09-01T07:05:38Z

$vocabulary wasn't co-opted for anything. It does the same thing in both places, the fact that some of those places are unlikely to be used is irrelevant. It harms nothing to have them there, and it's consistent and promoting best practices for meta-schemas. There's nothing else going on here, $vocabulary means what it always means and those are the correct values for those meta-schemas.

karenetheridge · 2020-11-16T22:02:32Z

> $vocabulary means what it always means

This isn't terribly helpful, I'm afraid. I'm still not sure what to do re my question above:

> when parsing a schema, what do we (the implementation) do when encountering a $vocabulary keyword? It's clear what the intent is when at the top level metaschema document itself... But what do we do when encountering the $vocabulary keyword in any $referenced document, such as meta/applicator?

Relequestual · 2020-11-17T12:37:50Z

> > $vocabulary means what it always means
>
> This isn't terribly helpful, I'm afraid. I'm still not sure what to do re my question above:
>
> > when parsing a schema, what do we (the implementation) do when encountering a $vocabulary keyword? It's clear what the intent is when at the top level metaschema document itself... But what do we do when encountering the $vocabulary keyword in any $referenced document, such as meta/applicator?

Nothing. It's only useful in the "top-level meta-schema".
It's used in a single-vocabulary meta-schema to show what another meta-schema using that vocabulary should also require, AS WELL AS being logically correct (it specifies which vocabularies must be understood in order to process an instance whose dialect uses it as its meta-schema).

It initially seemed like a duality to me, but in actual fact, it isn't.

If you think about it further, you can actually use this data to automagically construct yourself a new dialect based on a set of vocabulary meta-schemas, I think. But that's more an aside.
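
A speculative sketch of that aside, in Python. The function name `build_dialect`, the example.com URI, and the rule for merging the required flags (a vocabulary is required if any input requires it) are all assumptions of this sketch; the spec does not define such a combining rule.

```python
from typing import Dict, List


def build_dialect(dialect_id: str, vocab_meta_schemas: List[dict]) -> dict:
    """Assemble a dialect meta-schema from single-vocabulary meta-schemas."""
    vocabulary: Dict[str, bool] = {}
    for meta in vocab_meta_schemas:
        for uri, required in meta.get("$vocabulary", {}).items():
            # Assumed merge rule: required if any input marks it required.
            vocabulary[uri] = vocabulary.get(uri, False) or required
    return {
        "$schema": "https://json-schema.org/draft/2019-09/schema",
        "$id": dialect_id,
        "$vocabulary": vocabulary,
        "allOf": [{"$ref": meta["$id"]} for meta in vocab_meta_schemas],
    }


# Trimmed-down stand-ins for the published 2019-09 meta/core and meta/applicator.
core = {
    "$id": "https://json-schema.org/draft/2019-09/meta/core",
    "$vocabulary": {"https://json-schema.org/draft/2019-09/vocab/core": True},
}
applicator = {
    "$id": "https://json-schema.org/draft/2019-09/meta/applicator",
    "$vocabulary": {"https://json-schema.org/draft/2019-09/vocab/applicator": True},
}

dialect = build_dialect("https://example.com/meta/applicator-only", [core, applicator])
print(sorted(dialect["$vocabulary"]))
```

Here the `$vocabulary` of each single-vocabulary meta-schema is read as data by a tool, which is a different activity from evaluating a schema against the resulting dialect.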

> I found the duality of the $vocabulary keyword awkward and confusing as well. I found that my implementation has no use for it if it's not a top-level meta-schema, so I just ignore it in those cases and move on. I agree that it's only useful as documentation and it would be less confusing if $vocabulary wasn't co-opted for this documentation.

This is exactly correct. $vocabulary should be considered only in meta-schemas that are "referenced" by use of $schema. The individual vocabulary meta-schemas are referenced by the applicator $ref, and so their $vocabulary should not be considered, because each is no longer a meta-schema root in that context.

handrews · 2020-11-17T23:04:23Z

@karenetheridge that reply was responding to @jdesrosiers about $vocabulary being "co-opted for documentation."

A meta-schema should declare the vocabularies that it describes, and no more. So the applicator single-vocabulary meta-schema declares that schemas which use it directly rely only on applicator (and core) semantics. Somewhere there is an issue with a use case for this: basically using only applicators and an additional annotation vocabulary for, I think, UI generation. It was a bit contrived but had some relation to an actual project, and is the reason that the applicator vocabulary is in the core spec as a separate vocabulary.

So to address

> footnote 1: I don't think it's valid to use an individual meta/* document as a metaschema on its own though -- it doesn't spell out the full set of vocabularies for a schema to be useful. e.g. a schema with just applicator keywords can't do much (via properties, prefixItems) except check that certain positions exist in the instance data. So this usecase can't be the intention of the $vocabulary usage here.

You don't think so, but you don't define the entire set of use cases ;-P Others have come up with use cases where this is at least plausible. I will admit, not compelling- I'm with you there! But plausible.

More importantly, people will extend the regular schema dialect by referencing its meta-schema from an allOf. They may want to change which vocabularies are optional vs required, and add more vocabularies. But of course the regular meta-schema will not have those vocabularies, nor should it; that's the whole point of extensibility. That is where ignoring referenced $vocabulary keywords is critically important. It would probably be possible to come up with some sort of combining rule, but that is far more complex and no one had a use case (unlike the single-vocabulary meta-schemas declaring only their own vocabulary, which is if anything less complex because you don't have to figure out what "the full" set of vocabularies is supposed to be; at minimum, it is not more complex, it's just not very useful).
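
To make the extension case concrete, here is a small Python sketch. The `extended` meta-schema, the `https://example.com/vocab/ui` vocabulary, and the helper `active_vocabularies` are hypothetical names made up for illustration; `regular` is an abbreviated copy of the `$vocabulary` declaration in the 2019-09 general-purpose meta-schema.

```python
from typing import Dict

BASE = "https://json-schema.org/draft/2019-09"

# Abbreviated copy of the regular meta-schema's declaration ("format" is optional there).
regular = {
    "$id": f"{BASE}/schema",
    "$vocabulary": {
        f"{BASE}/vocab/core": True,
        f"{BASE}/vocab/applicator": True,
        f"{BASE}/vocab/validation": True,
        f"{BASE}/vocab/meta-data": True,
        f"{BASE}/vocab/format": False,
        f"{BASE}/vocab/content": True,
    },
}

# Hypothetical extension: makes "format" required and adds a made-up vocabulary.
extended = {
    "$id": "https://example.com/meta/extended",
    "$vocabulary": {
        **regular["$vocabulary"],
        f"{BASE}/vocab/format": True,
        "https://example.com/vocab/ui": False,
    },
    "allOf": [{"$ref": regular["$id"]}],
}


def active_vocabularies(meta_uri: str, registry: Dict[str, dict]) -> Dict[str, bool]:
    """Read "$vocabulary" from the meta-schema root only.

    "allOf"/"$ref" targets are deliberately not visited, so no combining
    rule between the root and referenced declarations is needed.
    """
    return dict(registry[meta_uri].get("$vocabulary", {}))


registry = {regular["$id"]: regular, extended["$id"]: extended}
active = active_vocabularies(extended["$id"], registry)
print(active[f"{BASE}/vocab/format"])
```

Because only the root declaration counts, the extension can flip `format` to required without conflicting with the `false` declared in the meta-schema it references.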

handrews · 2021-05-06T19:34:44Z

@epoberezkin was your question sufficiently answered (or have you sorted it out elsewhere and no longer need info here)?

@karenetheridge @jdesrosiers @Relequestual the discussion drifted a ways off track from just answering the original question. If there is anything else that needs further discussion, can we file that separately and close this?

epoberezkin · 2021-05-07T06:05:13Z

thank you!

karenetheridge · 2021-05-08T03:34:36Z

I'm going to reserve comment on usage of $vocabulary until I actually successfully implement it, as I'll be better able to articulate the problems that I see with it (or someone will tell me that I'm understanding it wrong, and we'll massage the spec wording to be more clear).

handrews · 2021-05-11T19:08:49Z

@karenetheridge that sounds like you're not specifically worried about the question here, so it's probably best if you just file new issues if and when you find stuff with vocabularies during implementation. I'm going to close this particular issue since the original question is resolved- the current usage is well-defined.

The questions around whether the current usage is long-term desirable are better addressed in #1098, which directly addresses the fact that a lot of people are confused by the current wording.

ghost added the triage label Aug 7, 2020

Relequestual added question and removed triage labels Aug 8, 2020

Relequestual self-assigned this Aug 10, 2020

handrews closed this as completed May 11, 2021
