Commit 3c5ff2a

add vocabulary proposal doc
1 parent cbaa513 commit 3c5ff2a

1 file changed

+182
-17
lines changed

proposals/vocabularies.md

@@ -6,7 +6,7 @@ The current approach to extending JSON Schema by providing custom keywords is
66
very implementation-specific and therefore not interoperable.
77

88
To address this deficiency, this document proposes vocabularies as a concept
9-
and a new Core keyword, `$vocabulary` to support it.
9+
and a new Core keyword, `$vocabulary`, to support it.
1010

1111
While the Core specification will define and describe vocabularies in general,
1212
the Validation specification will also need to change to incorporate some of
@@ -16,9 +16,9 @@ in both documents.
1616
## Current Status
1717

1818
This proposal was originally integrated into both specifications, starting with
19-
the 2019-09 release, and has been extracted as the feature is incomplete. The
20-
feature, at best effort, was extracted in such a way as to retain the
21-
functionality present in the 2020-12 release.
19+
the 2019-09 release. For the upcoming stable release, the feature has been
20+
extracted as it is incomplete. The feature, at best effort, was extracted in
21+
such a way as to retain the functionality present in the 2020-12 release.
2222

2323
Trying to fit the 2020-12 version into the current specification, however,
2424
raises some problems, and further discussion around the design of
@@ -45,28 +45,191 @@ also apply to this document.
4545

4646
### Problem Statement
4747

48-
The specification allows implementations to support user-defined keywords.
49-
However, this vague and open allowance has drawbacks.
48+
To support extensibility, the specification allows implementations to support
49+
keywords that are not defined in the specifications themselves. However, this
50+
vague and open allowance has drawbacks.
5051

51-
1. This isn't a requirement, it is a permission. An implementation could just as
52-
easily (_more_ easily) choose _not_ to support user-defined keywords.
52+
1. Such support is not a requirement; it is a permission. An implementation
53+
could just as easily (_more_ easily) choose _not_ to support extension
54+
keywords.
5355
2. There is no prescribed mechanism by which an implementation should provide
5456
this support. As a result, each implementation that _does_ have the feature
5557
supports it in different ways.
56-
3. Support for any given user-defined keyword will be limited to that
57-
implementation. Unless the user explicitly configures another
58-
implementation, their keywords likely will not be supported.
58+
3. Support for any given user-defined keyword will be limited to the
59+
implementations which are explicitly configured for that keyword. For a user
60+
defining their own keyword, this becomes difficult and/or impossible
61+
depending on the varying support for extension keywords offered by the
62+
implementations the user is using.
5963

60-
This exposes a need for the specification(s) to define a way for implementations
61-
to share knowledge of a keyword or group of keywords.
64+
This exposes a need for an implementation-agnostic approach to
65+
externally-defined keywords as well as a way for implementations to declare
66+
support for them.
6267

6368
### Solution
6469

65-
<!-- What is the solution? Include examples of use. -->
70+
Two new concepts, vocabularies and dialects, will be introduced into the Core
71+
specification.
72+
73+
A vocabulary is identified by an absolute URI and is used to define a set of
74+
keywords. A vocabulary is generally defined in a human-readable _vocabulary
75+
description document_. (The URI for the vocabulary may be the same as the URL of
76+
where this vocabulary description document can be found, but no recommendation
77+
is made either for or against this practice.)
78+
79+
A new keyword, `$vocabulary`, will be introduced into the Core specification as
80+
well. This keyword's value is an object with vocabulary URIs as keys and
81+
booleans as values. This keyword only has meaning within a meta-schema. A
82+
meta-schema which includes a vocabulary's URI in its `$vocabulary` keyword is
83+
said to "include" that vocabulary.
84+
85+
```jsonc
86+
{
87+
"$schema": "https://example.org/draft/next/schema",
88+
"$id": "https://example.org/schema",
89+
"$vocabulary": {
90+
"https://example.org/vocab/vocab1": true,
91+
"https://example.org/vocab/vocab2": true,
92+
"https://example.org/vocab/vocab3": false
93+
},
94+
// ...
95+
}
96+
```
97+
98+
A dialect is the set of vocabularies listed by a meta-schema. It is ephemeral
99+
and carries no identifier.
100+
101+
_**NOTE** It is possible for two meta-schemas, which would have different `$id`
102+
values, to share a common dialect if they both declare the same set of
103+
vocabularies._
104+
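The note above can be illustrated with a minimal sketch, assuming a dialect is modeled as the set of vocabulary URIs a meta-schema lists. The `dialect_of` helper and both example meta-schemas are hypothetical and not part of the proposal. The sketch ignores the boolean values, since the text does not say whether they form part of the dialect.

```python
def dialect_of(meta_schema: dict) -> frozenset:
    """Model a dialect as the set of vocabulary URIs listed in `$vocabulary`.

    Assumption: the required/optional booleans are not part of the dialect.
    """
    return frozenset(meta_schema.get("$vocabulary", {}))


# Two hypothetical meta-schemas with different `$id` values.
meta_a = {
    "$id": "https://example.org/schema-a",
    "$vocabulary": {
        "https://example.org/vocab/vocab1": True,
        "https://example.org/vocab/vocab2": True,
    },
}
meta_b = {
    "$id": "https://example.org/schema-b",
    "$vocabulary": {
        "https://example.org/vocab/vocab2": True,
        "https://example.org/vocab/vocab1": True,
    },
}

# The dialect carries no identifier, so it is compared by content only.
same_dialect = dialect_of(meta_a) == dialect_of(meta_b)
```

Under these assumptions `same_dialect` is true even though the two `$id` values differ.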
105+
A schema that declares a meta-schema (via `$schema`) which contains
106+
`$vocabulary` is declaring that only those keywords defined by the included
107+
vocabularies are to be processed when evaluating the schema. All other keywords
108+
are to be considered "unknown" and handled accordingly.
109+
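As a rough sketch of the rule above, an implementation could split a schema's keywords into those defined by an included vocabulary and those treated as "unknown". The `VOCABULARY_KEYWORDS` registry, the `partition_keywords` function, and the keyword names assigned to each example vocabulary are all assumptions made for illustration. Handling of Core keywords is omitted.

```python
# Hypothetical registry: vocabulary URI -> names of the keywords it defines.
VOCABULARY_KEYWORDS = {
    "https://example.org/vocab/vocab1": {"minDate", "maxDate"},
    "https://example.org/vocab/vocab2": {"currency"},
}


def partition_keywords(schema: dict, meta_schema: dict):
    """Split a schema's keywords into (to_process, unknown).

    A keyword is processed only if some vocabulary included by the
    meta-schema defines it; every other keyword is treated as unknown.
    """
    known = set()
    for uri in meta_schema.get("$vocabulary", {}):
        known |= VOCABULARY_KEYWORDS.get(uri, set())
    to_process = {k: v for k, v in schema.items() if k in known}
    unknown = {k: v for k, v in schema.items() if k not in known}
    return to_process, unknown


# Hypothetical meta-schema that includes only vocab1.
meta_schema = {"$vocabulary": {"https://example.org/vocab/vocab1": True}}
schema = {"minDate": "2024-01-01", "currency": "EUR"}
to_process, unknown = partition_keywords(schema, meta_schema)
```

Here `minDate` is processed because vocab1 is included, while `currency` is treated as unknown because vocab2 is not.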
110+
The boolean values in `$vocabulary` signify implementation requirements for each
111+
vocabulary.
112+
113+
- A `true` value indicates that the implementation must recognize the vocabulary
114+
and be able to process each of the keywords defined by it. If an implementation
115+
does not recognize the vocabulary or cannot process all of its defined
116+
keywords, the implementation must refuse to process the schema. These
117+
vocabularies are also known as "required" vocabularies.
118+
- A `false` value indicates that the implementation is not required to recognize
119+
the vocabulary or its keywords and may continue processing the schema anyway.
120+
However, keywords that are not recognized or supported must be considered
121+
"unknown" and handled accordingly. These vocabularies are also known as
122+
"optional" vocabularies.
123+
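The two cases above can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: `check_vocabularies`, `UnsupportedVocabularyError`, and the `supported` set are hypothetical names, and "supported" is taken to mean that the implementation recognizes the vocabulary and can process all of its keywords.

```python
class UnsupportedVocabularyError(Exception):
    """Raised when a required vocabulary is not supported."""


def check_vocabularies(meta_schema: dict, supported: set) -> set:
    """Return the vocabulary URIs whose keywords will be processed.

    Raises UnsupportedVocabularyError if a required (`true`) vocabulary
    is not in `supported`, which means the schema must be refused.
    """
    active = set()
    for uri, required in meta_schema.get("$vocabulary", {}).items():
        if uri in supported:
            active.add(uri)
        elif required:
            raise UnsupportedVocabularyError(uri)
        # Unsupported optional vocabulary: processing continues, and its
        # keywords are later treated as "unknown".
    return active


meta_schema = {
    "$vocabulary": {
        "https://example.org/vocab/vocab1": True,
        "https://example.org/vocab/vocab2": True,
        "https://example.org/vocab/vocab3": False,
    }
}
supported = {
    "https://example.org/vocab/vocab1",
    "https://example.org/vocab/vocab2",
}
active = check_vocabularies(meta_schema, supported)
```

With this input, vocab3 is skipped without error because it is optional. Removing vocab1 or vocab2 from `supported` would raise instead.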
124+
Typically, though it is not required, a schema will accompany the vocabulary description
125+
document. This _vocabulary schema_ should carry an `$id` value which is distinct
126+
from the vocabulary URI. The purpose of the vocabulary schema is to provide
127+
syntactic validation for the vocabulary's keywords' values when the
128+
schema is being validated by a meta-schema that includes the vocabulary. (A
129+
vocabulary schema is not itself a meta-schema since it does not validate entire
130+
schemas.) To facilitate this extra validation, when a vocabulary schema is
131+
provided, any meta-schema which includes the vocabulary should also contain a
132+
reference (via `$ref`) to the vocabulary schema's `$id` value.
133+
134+
```jsonc
135+
{
136+
"$schema": "https://example.org/draft/next/schema",
137+
"$id": "https://example.org/schema",
138+
"$vocabulary": {
139+
"https://example.org/vocab/vocab1": true,
140+
"https://example.org/vocab/vocab2": true,
141+
"https://example.org/vocab/vocab3": false
142+
},
143+
"allOf": [
144+
{"$ref": "meta/vocab1"}, // https://example.org/meta/vocab1
145+
{"$ref": "meta/vocab2"}, // https://example.org/meta/vocab2
146+
{"$ref": "meta/vocab3"} // https://example.org/meta/vocab3
147+
],
148+
// ...
149+
}
150+
```
151+
152+
Finally, the keywords in both the Core and Validation specifications will be
153+
divided into multiple vocabularies. The keyword definitions will be removed from
154+
the meta-schema and added to vocabulary schemas to which the meta-schema will
155+
contain references. In this way, the meta-schema's functionality remains the same.
156+
157+
```json
158+
{
159+
"$schema": "https://json-schema.org/draft/next/schema",
160+
"$id": "https://json-schema.org/draft/next/schema",
161+
"$vocabulary": {
162+
"https://json-schema.org/draft/next/vocab/core": true,
163+
"https://json-schema.org/draft/next/vocab/applicator": true,
164+
"https://json-schema.org/draft/next/vocab/unevaluated": true,
165+
"https://json-schema.org/draft/next/vocab/validation": true,
166+
"https://json-schema.org/draft/next/vocab/meta-data": true,
167+
"https://json-schema.org/draft/next/vocab/format-annotation": true,
168+
"https://json-schema.org/draft/next/vocab/content": true
169+
},
170+
"$dynamicAnchor": "meta",
171+
172+
"title": "Core and Validation specifications meta-schema",
173+
"allOf": [
174+
{"$ref": "meta/core"},
175+
{"$ref": "meta/applicator"},
176+
{"$ref": "meta/unevaluated"},
177+
{"$ref": "meta/validation"},
178+
{"$ref": "meta/meta-data"},
179+
{"$ref": "meta/format-annotation"},
180+
{"$ref": "meta/content"}
181+
]
182+
}
183+
```
184+
185+
The division of keywords among the vocabularies will be in accordance with the
186+
2020-12 specification (for now).
66187

67188
### Limitations
68189

69-
<!-- Are there any limitations inherent to the proposal? -->
190+
#### Unknown Keywords and Unsupported Vocabularies
191+
192+
This proposal, in its current state, seeks to mimic the behavior defined in the
193+
2020-12 specification. However, the current specification's disallowance of
194+
unknown keywords presents a problem for schemas that use keywords from optional
195+
vocabularies. (This is the topic of the discussion at
196+
https://github.com/orgs/json-schema-org/discussions/342.)
197+
198+
In short, if a schema uses a keyword from an unknown _optional_ vocabulary, the
199+
implementation cannot proceed because unknown keywords are explicitly
200+
disallowed. However, not being able to proceed with evaluation is the behavior
201+
prescribed for _required_ vocabularies. Thus, if the behaviors for required and
202+
optional vocabularies are the same, then the boolean value is moot, which
203+
highlights that the structure of `$vocabulary` needs to be reconsidered.
204+
205+
#### Machine Readability
206+
207+
The vocabulary URI is an opaque value. There is no data that an implementation
208+
can reference to identify the keywords defined by the vocabulary. The vocabulary
209+
schema _implies_ this, but scanning a `properties` keyword isn't very reliable.
210+
Moreover, such a system cannot provide metadata about the keywords. As such, the
211+
user must explicitly ensure that the implementation recognizes and supports the
212+
vocabulary, which isn't much of an improvement over the current state.
213+
214+
Having some sort of "vocabulary definition" file could alleviate this.
215+
216+
One reason for _not_ having such a file is that, at least for functional
217+
keywords, the user generally needs to provide custom code to the implementation
218+
to process the keywords, thus performing that same explicit configuration
219+
anyway. (Such information cannot be gleaned from a vocabulary specification. For
220+
example, an implementation can't know what to do with a hypothetical `minDate`
221+
keyword.)
222+
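To make the point about explicit configuration concrete, here is a minimal sketch of the custom code a user might have to supply for the hypothetical `minDate` keyword. The `KEYWORD_HANDLERS` registry and `register_keyword` function are invented for illustration and do not correspond to any particular implementation. The semantics given to `minDate` (an ISO 8601 date that must not be earlier than the keyword's value) are likewise an assumption.

```python
from datetime import date

# Hypothetical registry: keyword name -> user-supplied handler function.
KEYWORD_HANDLERS = {}


def register_keyword(name, handler):
    """Explicitly configure the implementation to support a keyword."""
    KEYWORD_HANDLERS[name] = handler


def min_date(keyword_value: str, instance: str) -> bool:
    """Assumed semantics: the instance date is not earlier than the keyword value."""
    return date.fromisoformat(instance) >= date.fromisoformat(keyword_value)


# The behavior cannot be derived from the vocabulary URI or the vocabulary
# schema; the user has to provide it.
register_keyword("minDate", min_date)

is_valid = KEYWORD_HANDLERS["minDate"]("2024-01-01", "2024-06-10")
```

A "vocabulary definition" file could list `minDate` as a keyword, but the comparison logic in `min_date` would still have to come from the user.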
223+
#### Implicit Inclusion of Core Vocabulary
224+
225+
Because the Core keywords (the ones that start with `$`) instruct an
226+
implementation on how a schema should be processed, inclusion of the Core Vocabulary is mandatory
227+
and assumed. As such, while excluding the Core Vocabulary from the `$vocabulary`
228+
keyword has no effect, it is generally advised to include the
229+
Core Vocabulary explicitly.
230+
231+
This can be confusing and difficult to use/implement, and we probably need
232+
something better here.
70233

71234
## Change Details
72235

@@ -91,12 +254,14 @@ For example
91254
```
92255
-->
93256

257+
_**NOTE** Since the design of vocabularies will be changing anyway, it's not worth the time and effort to fill in this section just yet. As such, please read the above sections for loose requirements. For tighter requirements, please assume conformance with the 2020-12 Core and Validation specifications._
258+
94259
## [Appendix] Change Log
95260

96-
* [MMMM YYYY] Created
261+
* 2024-06-10 - Created
97262

98263
## [Appendix] Champions
99264

100265
| Champion | Company | Email | URI |
101266
|----------------------------|---------|-------------------------|----------------------------------|
102-
| Your Name | | | < GitHub profile page > |
267+
| Greg Dennis | | [email protected] | https://github.com/gregsennis |
