Commit 61aadd5

Add general validation principles and examples.
This addresses issue #55 plus concerns raised in the comments of issue #101. I replaced "linearity" with "independence" as I think it is more general and intuitive. The general considerations section has been reorganized to start with the behavior of the empty schema, then explain keyword independence, and finally cover container vs child and type applicability, both of which flow directly from keyword independence. In draft 04, the wording obscured the connection between keyword independence and container/child independence. When we rewrote the array and object keywords to explicitly classify each keyword as either validating the container or the child, keyword independence became sufficient to explain container/child independence. The list of non-independent keywords has been updated, and exceptions to the independence of parent and child schemas have been documented. Finally, I added a comprehensive example of the frequently-confusing lack of connection between type and other keywords.
1 parent b5afae7 commit 61aadd5

1 file changed

+138
-31
lines changed

jsonschema-validation.xml

@@ -156,56 +156,163 @@
 
 <section title="General validation considerations">
 
-<section title="Keywords and instance primitive types">
+<section title="Constraints and missing keywords">
 <t>
-Most validation keywords only limit the range of values within a certain primitive type.
-When the primitive type of the instance is not of the type targeted by the keyword, the
-validation succeeds.
+Each JSON Schema validation keyword adds constraints that
+an instance must satisfy in order to successfully validate.
 </t>
 <t>
-For example, the "maxLength" keyword will only restrict certain strings (that are too long) from being valid.
-If the instance is a number, boolean, null, array, or object, the keyword passes validation.
-</t>
+Validation keywords that are missing never restrict validation.
+In some cases, this no-op behavior is identical to a keyword that
+exists with certain values, and these values are noted where relevant.
+</t>
+<figure>
+<preamble>
+From this principle, it follows that all JSON values
+successfully validate against the empty schema:
+</preamble>
+<artwork>
+<![CDATA[
+{}
+]]>
+</artwork>
+</figure>
+<figure>
+<preamble>
+Similarly, it follows that no JSON value successfully
+validates against the empty schema's negation:
+</preamble>
+<artwork>
+<![CDATA[
+{
+"not": {}
+}
+]]>
+</artwork>
+</figure>
 </section>
 
-<section title="Validation of primitive types and child values">
-<t>
-Two of the primitive types, array and object, allow for child values. The validation of
-the primitive type is considered separately from the validation of child instances.
-</t>
+<section title="Keyword independence">
 <t>
-For arrays, primitive type validation consists of validating restrictions on length.
+Validation keywords typically operate independently, without
+affecting each other's outcomes.
 </t>
 <t>
-For objects, primitive type validation consists of validating restrictions on the presence
-or absence of property names.
+For schema author convenience, there are some exceptions:
+<list>
+<t>"additionalProperties", whose behavior is defined in terms of "properties" and "patternProperties"</t>
+<t>"additionalItems", whose behavior is defined in terms of "items"</t>
+<t>"minimum" and "maximum", whose behaviors are modified by "exclusiveMinimum" and "exclusiveMaximum", respectively</t>
+</list>
 </t>
 </section>
 
-<section title="Missing keywords">
+<section title="Validation of primitive types and child values">
 <t>
-Validation keywords that are missing never restrict validation.
-In some cases, this no-op behavior is identical to a keyword that exists with certain values,
-and these values are noted where known.
+Two of the primitive types, array and object, allow for child values.
 </t>
-</section>
-
-<section title="Linearity">
-<!-- I call this "linear" in the same manner e.g. waves are linear, they don't interact with each other -->
 <t>
-Validation keywords typically operate independent of each other, without affecting each other.
+Nearly all keywords are defined to operate on either the primitive
+type of the container instance, or on the child instance(s), but
+not both. Those that operate on child instances are applied to
+each appropriate child instance separately.
 </t>
 <t>
-For author convienence, there are some exceptions:
+It follows from keyword independence that validation of the primitive
+type of the container instance is considered separately from the
+values of the child instances or their validation outcomes.
+</t>
+<t>
+Two keywords are exceptions, as they validate properties of arrays as a whole:
 <list>
-<t>"additionalProperties", whose behavior is defined in terms of "properties" and "patternProperties"; and</t>
-<t>"additionalItems", whose behavior is defined in terms of "items"</t>
+<t>"uniqueItems", which validates a relationship among the child instances</t>
+<t>"contains", which provides a schema for child validation, but need only successfully validate any one child instance rather than applying to all children or to a specific subset of children.</t>
 </list>
 </t>
 </section>
 
 </section>
 
+<section title="Keyword applicability to instance primitive types">
+<t>
+An important implication of keyword independence is
+that most validation keywords only limit the range of values
+within a certain primitive type. When the primitive type of
+the instance is not of the type targeted by the keyword, the
+validation succeeds.
+</t>
+<t>
+For example, the "multipleOf" keyword will only restrict
+certain numbers from being valid.
+If the instance is a string, boolean, null, array, or object
+the keyword passes validation.
+</t>
+<figure>
+<preamble>
+The utility of this is best illustrated by considering
+this schema for odd numbers:
+</preamble>
+<artwork>
+<![CDATA[
+{
+"multipleOf": 1,
+"not": {
+"multipleOf": 2
+}
+}
+]]>
+</artwork>
+</figure>
+<figure>
+<preamble>
+If "multipleOf" implicitly constrained the type of the
+instance to be a number, then both the overall schema
+and the negated subschema would require a numeric instance
+in order to validate. It would be equivalent to:
+</preamble>
+<artwork>
+<![CDATA[
+{
+"type": "number",
+"multipleOf": 1,
+"not": {
+"type": "number",
+"multipleOf": 2
+}
+}
+]]>
+</artwork>
+<postamble>
+It is clearly impossible to satisfy this schema, so keywords
+must not impose constraints on type. Therefore, as originally written
+(without a type constraint) the schema validates both odd integers
+and non-numbers.
+</postamble>
+</figure>
+<figure>
+<preamble>
+The following schema is the correct way to validate
+only odd integers, while failing validation for non-numbers:
+</preamble>
+<artwork>
+<![CDATA[
+{
+"type": "number",
+"multipleOf": 1,
+"not": {
+"multipleOf": 2
+}
+}
+]]>
+</artwork>
+<postamble>
+This negates only the even-ness of numbers, without
+affecting validation of the instance type within the "not".
+The instance type is only constrained outside of the negation.
+</postamble>
+</figure>
+</section>
+
 <section title="Validation keywords">
 <t>
 Validation keywords in a schema impose requirements for successfully validating an instance.
@@ -505,7 +612,7 @@
 </t>
 <t>
 For all such properties, child validation succeeds if the child instance
-validates agains the "additionalProperties" schema.
+validates against the "additionalProperties" schema.
 </t>
 </section>
 
@@ -663,7 +770,7 @@
 <t>
 Both of these keywords can be used to decorate a user interface with
 information about the data produced by this user interface. A title will
-preferrably be short, whereas a description will provide explanation about
+preferably be short, whereas a description will provide explanation about
 the purpose of the instance described by this schema.
 </t>
 <t>
@@ -812,11 +919,11 @@
 
 <section title="Security considerations">
 <t>
-JSON Schema validation defines a vocabulary for JSON Schema core and conserns all the security considerations listed there.
+JSON Schema validation defines a vocabulary for JSON Schema core and concerns all the security considerations listed there.
 </t>
 <t>
 JSON Schema validation allows the use of Regular Expressions, which have numerous different (often incompatible) implementations.
-Some implementations allow the embedding of arbritrary code, which is outside the scope of JSON Schema and MUST NOT be permitted.
+Some implementations allow the embedding of arbitrary code, which is outside the scope of JSON Schema and MUST NOT be permitted.
 Regular expressions can often also be crafted to be extremely expensive to compute (with so-called "catastrophic backtracking"),
 resulting in a denial-of-service attack.
 </t>
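
The principles added in this commit (missing keywords never restrict, keywords act independently, and type-specific keywords pass for instances of other types) can be tried out with a small sketch. The code below is not part of the commit or of any real JSON Schema library: `validate`, `is_number`, and `TYPE_CHECKS` are hypothetical names, and only "type" (as a single string), "multipleOf", and "not" are handled, which is an assumption made to keep the example short.

```python
def is_number(value):
    # JSON numbers map to int/float in Python; bool is a subclass of int,
    # so it has to be excluded explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


# Hypothetical lookup table: one predicate per JSON primitive type.
TYPE_CHECKS = {
    "number": is_number,
    "string": lambda v: isinstance(v, str),
    "boolean": lambda v: isinstance(v, bool),
    "null": lambda v: v is None,
    "array": lambda v: isinstance(v, list),
    "object": lambda v: isinstance(v, dict),
}


def validate(schema, instance):
    """Return True if the instance satisfies every keyword present in schema."""
    results = []  # one entry per keyword that actually applies

    if "type" in schema:
        results.append(TYPE_CHECKS[schema["type"]](instance))

    if "multipleOf" in schema and is_number(instance):
        # Only numbers are constrained; for any other primitive type the
        # keyword contributes no constraint at all.
        results.append(instance % schema["multipleOf"] == 0)

    if "not" in schema:
        results.append(not validate(schema["not"], instance))

    # A missing keyword adds nothing to results, so the empty schema
    # yields all([]) == True for every instance.
    return all(results)


ODD_NUMBER = {"type": "number", "multipleOf": 1, "not": {"multipleOf": 2}}

print(validate({}, "anything"))          # empty schema
print(validate({"not": {}}, 42))         # negation of the empty schema
print(validate({"multipleOf": 2}, "x"))  # keyword does not apply to strings
print(validate(ODD_NUMBER, 3), validate(ODD_NUMBER, 4), validate(ODD_NUMBER, "x"))
```

One consequence worth checking by hand: in this sketch a non-number passes the inner {"multipleOf": 2} schema vacuously, so the surrounding "not" rejects it. Under that literal reading, the untyped odd-number schema also rejects strings, which differs from the description in the second figure's postamble.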
