Description
The specification currently says that the behavior of a schema without a "$schema" keyword is "implementation-defined" (emphasis added):
The "$schema" keyword SHOULD be used in the document root schema object, and MAY be used in the root schema objects of embedded schema resources. It MUST NOT appear in non-resource root schema objects. If absent from the document root schema, the resulting behavior is implementation-defined.
This under-constrains validation and permits behavior that I would never expect to be possible: it wouldn't be wrong to say that {"type":"string"} permits null. It also provides no guarantees about reverse compatibility; updates to the meta-schema that are reverse-incompatible will have broad effects. (That is to say, dialects don't actually achieve the intended goal of forward compatibility; they just push this responsibility onto implementations, probably in platform-specific ways.)
Further, there are at least a few implementations that do not read "$schema", and even if they should, requiring this keyword would be impractical. It's perfectly clear what is meant when someone writes {"type":"string"} without any context.
It's understood that implementations aren't always up-to-date on a spec; that some implementations only follow older versions of a spec normally goes without saying. When we say that the behavior of documents without "$schema" is implementation-defined, we're either being redundant or saying something more than that.
The "$schema" keyword is, of course, useful. It can declare a subset of the JSON Schema vocabulary (e.g. some document databases might not be able to implement the full vocabulary), or it can declare user- or implementation-specific keywords (vocabularies). And it can be used as a heuristic to decide if older behavior for a keyword should be used.
But it can't solve all versioning problems. A custom meta-schema might not define compatibility with any draft/release of JSON Schema. We still need to define the behavior for these situations.
By comparison, text/html and application/xhtml+xml have a single document that defines how to interpret all documents, even ones marked with an older version number. Some HTML versions allow you to specify a DTD, but DTDs restrict the elements you're allowed to use; they aren't required, they don't actually change the semantics of the elements, and their omission has a standardized behavior.
A simple test for potential solutions is: null (and other non-string values) cannot be valid against {"type":"string"}. Currently the behavior is under-constrained (under-specified) and there's nothing to say that accepting null would be wrong.
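The test can be written down as an executable check. The sketch below is only illustrative: `is_valid` is a hypothetical, deliberately tiny stand-in for an implementation under test (it understands nothing but a single-string "type" keyword), and `passes_simple_test` is a made-up helper name, not part of any real library.

```python
import json

# Hypothetical stand-in for an implementation under test. It only
# understands the "type" keyword with a single string value, which is
# all that the simple test needs.
_TYPE_CHECKS = {
    "string": lambda v: isinstance(v, str),
    "null": lambda v: v is None,
    "boolean": lambda v: isinstance(v, bool),
    # bool is a subclass of int in Python, so exclude it explicitly.
    "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "object": lambda v: isinstance(v, dict),
    "array": lambda v: isinstance(v, list),
}


def is_valid(schema, instance):
    expected = schema.get("type")
    if expected is None:
        return True  # no "type" constraint present
    return _TYPE_CHECKS[expected](instance)


# The schema from the issue; note that it carries no "$schema" keyword.
SCHEMA = json.loads('{"type":"string"}')


def passes_simple_test(validate):
    """True if `validate` accepts a string and rejects every non-string value."""
    non_strings = [None, 0, 1.5, True, [], {}]
    accepts_string = validate(SCHEMA, "hello")
    rejects_others = not any(validate(SCHEMA, v) for v in non_strings)
    return accepts_string and rejects_others
```

An implementation that treats a missing "$schema" as licence to accept anything, for example `lambda schema, instance: True`, fails this check, yet nothing in the quoted text says it is wrong.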
Potential solutions:
- Error on a missing $schema keyword: This would break a very large number of existing implementations and documents. null wouldn't be valid, but it wouldn't necessarily be invalid either. (And even if we were authoring from scratch, requiring version identifiers in documents is strongly discouraged in Internet media types, and very unpopular among document authors.)
- Implementations assume the latest known, standard $schema: This would give stronger guarantees about cross-platform compatibility (presumably, "type" would always mean the same thing). However, this would defeat the intention of "$schema" as a dialect identifier. If an implementation updates its default "$schema", this would break reverse compatibility if the new dialect is not reverse-compatible. (This is also true wherever the behavior is undefined or implementation-defined.)
- Single media type specification: Newer drafts replace older drafts in their entirety, though implementations may choose to implement older behavior for reverse compatibility reasons. Because the URI of the meta-schema changes with every draft, "$schema" can be used as a heuristic to determine if a schema is expecting an older, superseded behavior. All releases would have to be reverse compatible, or at least, changes would have to be carefully weighed, especially if there are orthogonal implementations. This was the behavior through draft-07.
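To make the difference between the three options concrete, here is a rough sketch of how an implementation might pick its rule set under each one. Everything in it is an assumption for illustration: the function name `select_rules`, the policy labels, the `MissingSchemaError` class, and the choice of 2020-12 as the "latest known" dialect are hypothetical. Only the meta-schema URIs are the published ones. How unknown "$schema" values are treated under each policy is also a guess, since the options above don't spell it out.

```python
# Published meta-schema URIs mapped to short labels for the rule set to use.
KNOWN_DIALECTS = {
    "http://json-schema.org/draft-07/schema#": "draft-07",
    "https://json-schema.org/draft/2019-09/schema": "2019-09",
    "https://json-schema.org/draft/2020-12/schema": "2020-12",
}

# Assumption: the newest dialect this hypothetical implementation knows.
LATEST = "2020-12"


class MissingSchemaError(ValueError):
    """Raised under the "error" policy when "$schema" is absent."""


def select_rules(schema, policy):
    """Return the label of the rule set to apply to `schema` under `policy`."""
    declared = schema.get("$schema") if isinstance(schema, dict) else None
    if not isinstance(declared, str):
        declared = None  # treat a malformed "$schema" as absent in this sketch

    if policy == "single-spec":
        # Option 3: "$schema" is only a heuristic. A known older URI opts in
        # to legacy behavior; anything else gets the current specification.
        return KNOWN_DIALECTS.get(declared, LATEST)

    if declared is not None:
        # Options 1 and 2: "$schema" is a dialect identifier and must be known.
        if declared not in KNOWN_DIALECTS:
            raise ValueError(f"unsupported dialect: {declared}")
        return KNOWN_DIALECTS[declared]

    if policy == "error":
        # Option 1: refuse to process the schema at all.
        raise MissingSchemaError('schema has no "$schema" keyword')
    if policy == "assume-latest":
        # Option 2: the answer changes whenever LATEST is bumped.
        return LATEST
    raise ValueError(f"unknown policy: {policy}")
```

In this sketch "assume-latest" and "single-spec" return the same label for a schema without "$schema". The difference lies in what a later release may do: under "assume-latest" a new default dialect can change what {"type":"string"} means, while under "single-spec" each release is expected to stay reverse compatible.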