Commit 923c825

Rework ref dereferencing
This is an attempt to simplify dereferencing refs and address issues exposed by new json-schema-test-suite tests.

The main thing here is giving schema objects a `base_uri` that's used when resolving ids (and as the initial base URI for instances). Child schemas use the `ref_uri` that was used to resolve them as their `base_uri` in order to get nested refs and ids to resolve properly.

More details about specific issues:

- `$id` is ignored when `$ref` is present (`instance.base_uri` is updated after `validate_ref`), because `$ref` isn't supposed to take sibling `$id` values into account. `@base_uri` handles nested refs.

  Exposed by: json-schema-org/JSON-Schema-Test-Suite#493

- JSON pointers are evaluated relative to the ref URI. Previously, pointers were always evaluated using the root schema. Now they're evaluated relative to the schema with a matching `$id` (usually nearest parent with an `$id`; or specific id (see below); default is root).

  Exposed by: json-schema-org/JSON-Schema-Test-Suite#457

- JSON pointers are evaluated for id refs. This allows a ref to look up a schema by `$id` and then apply a JSON pointer to use a subschema. This uses the same logic as above. The important part is removing the fragment from `ref_uri` if it's a JSON pointer so that the lookup in `ids` works properly. The fragment is kept if it's not a JSON pointer to support location-independent ids.

  Exposed by: json-schema-org/JSON-Schema-Test-Suite#578

- JSON pointer refs are always joined with the base URI. I [started handling them][0] separately because of an [issue][1] with invalid URIs. But now I think that was incorrect and that fragment pointers need to be encoded properly for URIs. The [specification says][2]:

  > In all cases, dereferencing a "$ref" reference involves first
  > resolving its value as a URI reference against the current base URI.

- Empty fragments are removed in `join_uri` to have consistent URIs to lookup in `ids`. Meta schemas, for example, have empty fragments in their top-level ids (eg, `http://json-schema.org/draft-07/schema#`) and removing the JSON pointer fragments causes them not to be found.

[0]: b91115e
[1]: #54
[2]: https://datatracker.ietf.org/doc/html/draft-handrews-json-schema-01#section-8.3.2
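To make the last two points concrete, here is a minimal sketch of resolving a ref against a base URI and then dropping an empty fragment. It uses only Ruby's standard `uri` library. The helper name `normalized_join` and the `example.com` URLs are assumptions made for illustration; this is not the library's `join_uri`, which also handles relative bases and `nil` arguments.

```ruby
require 'uri'

# Hypothetical helper (an illustration, not the library's `join_uri`):
# resolve `ref` against `base`, then drop an empty fragment so that
# `...schema#` and `...schema` end up as the same lookup key.
def normalized_join(base, ref)
  uri = ref ? URI.join(base, ref) : URI.parse(base)
  uri.fragment = nil if uri.fragment == ''
  uri
end

# Assumed example base URI.
BASE = 'http://example.com/schemas/root.json'

# A fragment-only ref keeps the base document and gains the pointer fragment.
POINTER_REF = normalized_join(BASE, '#/definitions/foo')

# A relative ref replaces the last path segment of the base.
SIBLING_REF = normalized_join(BASE, 'child.json#/definitions/foo')

# A meta schema id with an empty fragment loses the trailing `#`.
META_ID = normalized_join('http://json-schema.org/draft-07/schema#', nil)

puts POINTER_REF
puts SIBLING_REF
puts META_ID
```

With the empty fragment removed, a ref whose JSON pointer fragment has been stripped and a meta schema id that ends in `#` produce the same string, which is what makes the `ids` lookup consistent.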
1 parent 5b55d3b commit 923c825

3 files changed: +58 -84 lines changed

3 files changed

+58
-84
lines changed

lib/json_schemer.rb

+8 -11
@@ -57,28 +57,25 @@ def schema(schema, default_schema_class: Schema::Draft7, **options)
       when String
         schema = JSON.parse(schema)
       when Pathname
-        uri = URI.parse(File.join('file:', URI::DEFAULT_PARSER.escape(schema.realpath.to_s)))
-        if options.key?(:ref_resolver)
-          schema = FILE_URI_REF_RESOLVER.call(uri)
+        base_uri = URI.parse(File.join('file:', URI::DEFAULT_PARSER.escape(schema.realpath.to_s)))
+        options[:base_uri] = base_uri
+        schema = if options.key?(:ref_resolver)
+          FILE_URI_REF_RESOLVER.call(base_uri)
         else
           ref_resolver = CachedResolver.new(&FILE_URI_REF_RESOLVER)
-          schema = ref_resolver.call(uri)
           options[:ref_resolver] = ref_resolver
+          ref_resolver.call(base_uri)
         end
-        schema[draft_class(schema, default_schema_class)::ID_KEYWORD] ||= uri.to_s
       end
-      draft_class(schema, default_schema_class).new(schema, **options)
-    end
-
-  private

-    def draft_class(schema, default_schema_class)
-      if schema.is_a?(Hash) && schema.key?('$schema')
+      schema_class = if schema.is_a?(Hash) && schema.key?('$schema')
         meta_schema = schema.fetch('$schema')
         SCHEMA_CLASS_BY_META_SCHEMA[meta_schema] || raise(UnsupportedMetaSchema, meta_schema)
       else
         default_schema_class
       end
+
+      schema_class.new(schema, **options)
     end
   end
 end

lib/json_schemer/schema/base.rb

+49 -72
@@ -4,17 +4,17 @@ module Schema
     class Base
       include Format

-      Instance = Struct.new(:data, :data_pointer, :schema, :schema_pointer, :parent_uri, :before_property_validation, :after_property_validation) do
+      Instance = Struct.new(:data, :data_pointer, :schema, :schema_pointer, :base_uri, :before_property_validation, :after_property_validation) do
         def merge(
           data: self.data,
           data_pointer: self.data_pointer,
           schema: self.schema,
           schema_pointer: self.schema_pointer,
-          parent_uri: self.parent_uri,
+          base_uri: self.base_uri,
           before_property_validation: self.before_property_validation,
           after_property_validation: self.after_property_validation
         )
-          self.class.new(data, data_pointer, schema, schema_pointer, parent_uri, before_property_validation, after_property_validation)
+          self.class.new(data, data_pointer, schema, schema_pointer, base_uri, before_property_validation, after_property_validation)
         end
       end

@@ -53,6 +53,7 @@ def merge(

       def initialize(
         schema,
+        base_uri: nil,
         format: true,
         insert_property_defaults: false,
         before_property_validation: nil,
@@ -64,6 +65,7 @@ def initialize(
       )
         raise InvalidSymbolKey, 'schemas must use string keys' if schema.is_a?(Hash) && !schema.empty? && !schema.first.first.is_a?(String)
         @root = schema
+        @base_uri = base_uri
         @format = format
         @before_property_validation = [*before_property_validation]
         @before_property_validation.unshift(INSERT_DEFAULT_PROPERTY) if insert_property_defaults
@@ -82,15 +84,17 @@ def initialize(
       end

       def valid?(data)
-        valid_instance?(Instance.new(data, '', root, '', nil, @before_property_validation, @after_property_validation))
+        valid_instance?(Instance.new(data, '', root, '', @base_uri, @before_property_validation, @after_property_validation))
       end

       def validate(data)
-        validate_instance(Instance.new(data, '', root, '', nil, @before_property_validation, @after_property_validation))
+        validate_instance(Instance.new(data, '', root, '', @base_uri, @before_property_validation, @after_property_validation))
       end

     protected

+      attr_reader :root
+
       def valid_instance?(instance)
         validate_instance(instance).none?
       end
@@ -120,13 +124,13 @@ def validate_instance(instance, &block)
         ref = schema['$ref']
         id = schema[id_keyword]

-        instance.parent_uri = join_uri(instance.parent_uri, id)
-
         if ref
           validate_ref(instance, ref, &block)
           return
         end

+        instance.base_uri = join_uri(instance.base_uri, id)
+
         if format? && custom_format?(format)
           validate_custom_format(instance, formats.fetch(format), &block)
         end
@@ -228,7 +232,7 @@ def ids

     private

-      attr_reader :root, :formats, :keywords, :ref_resolver, :regexp_resolver
+      attr_reader :formats, :keywords, :ref_resolver, :regexp_resolver

       def id_keyword
         ID_KEYWORD
@@ -246,10 +250,11 @@ def spec_format?(format)
         !custom_format?(format) && supported_format?(format)
       end

-      def child(schema)
+      def child(schema, base_uri:)
         JSONSchemer.schema(
           schema,
           default_schema_class: self.class,
+          base_uri: base_uri,
           format: format?,
           formats: formats,
           keywords: keywords,
@@ -306,50 +311,38 @@ def validate_type(instance, type, &block)
       end

       def validate_ref(instance, ref, &block)
-        if ref.start_with?('#')
-          schema_pointer = ref.slice(1..-1)
-          if valid_json_pointer?(schema_pointer)
-            ref_pointer = Hana::Pointer.new(URI.decode_www_form_component(schema_pointer))
-            subinstance = instance.merge(
-              schema: ref_pointer.eval(root),
-              schema_pointer: schema_pointer,
-              parent_uri: (pointer_uri(root, ref_pointer) || instance.parent_uri)
-            )
-            validate_instance(subinstance, &block)
-            return
-          end
-        end
-
-        ref_uri = join_uri(instance.parent_uri, ref)
+        ref_uri = join_uri(instance.base_uri, ref)

+        ref_uri_pointer = ''
         if valid_json_pointer?(ref_uri.fragment)
-          ref_pointer = Hana::Pointer.new(URI.decode_www_form_component(ref_uri.fragment))
-          ref_root = resolve_ref(ref_uri)
-          ref_object = child(ref_root)
-          subinstance = instance.merge(
-            schema: ref_pointer.eval(ref_root),
-            schema_pointer: ref_uri.fragment,
-            parent_uri: (pointer_uri(ref_root, ref_pointer) || ref_uri)
-          )
-          ref_object.validate_instance(subinstance, &block)
-        elsif id = ids[ref_uri.to_s]
-          subinstance = instance.merge(
-            schema: id.fetch(:schema),
-            schema_pointer: id.fetch(:pointer),
-            parent_uri: ref_uri
-          )
-          validate_instance(subinstance, &block)
+          ref_uri_pointer = ref_uri.fragment
+          ref_uri.fragment = nil
+        end
+
+        ref_object = if ids.key?(ref_uri) || ref_uri.to_s == @base_uri.to_s
+          self
         else
-          ref_root = resolve_ref(ref_uri)
-          ref_object = child(ref_root)
-          id = ref_object.ids[ref_uri.to_s] || { schema: ref_root, pointer: '' }
-          subinstance = instance.merge(
-            schema: id.fetch(:schema),
-            schema_pointer: id.fetch(:pointer),
-            parent_uri: ref_uri
-          )
-          ref_object.validate_instance(subinstance, &block)
+          child(resolve_ref(ref_uri), base_uri: ref_uri)
+        end
+
+        ref_schema, ref_schema_pointer = ref_object.ids[ref_uri] || [ref_object.root, '']
+
+        ref_uri_pointer_parts = Hana::Pointer.parse(URI.decode_www_form_component(ref_uri_pointer))
+        schema, base_uri = ref_uri_pointer_parts.reduce([ref_schema, ref_uri]) do |(obj, uri), token|
+          if obj.is_a?(Array)
+            [obj.fetch(token.to_i), uri]
+          else
+            [obj.fetch(token), join_uri(uri, obj[id_keyword])]
+          end
         end
+
+        subinstance = instance.merge(
+          schema: schema,
+          schema_pointer: "#{ref_schema_pointer}#{ref_uri_pointer}",
+          base_uri: base_uri
+        )
+
+        ref_object.validate_instance(subinstance, &block)
       end

       def validate_custom_format(instance, custom_format)
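The `reduce` in the new `validate_ref` walks the JSON pointer one token at a time and carries the base URI along, joining in the `$id` of each object it passes through. A standalone sketch of that idea follows, using only the standard library. The names `join_id`, `pointer_tokens`, and `walk`, the `$id` keyword constant, and the example schema are all assumptions for illustration; the library uses `Hana::Pointer.parse` and its own `join_uri` instead.

```ruby
require 'uri'

ID_KEYWORD = '$id' # assumption: draft-06/07 style id keyword

# Simplified stand-in for the commit's `join_uri` (hypothetical helper).
def join_id(base, id)
  return base unless id.is_a?(String)
  base ? URI.join(base, id) : URI.parse(id)
end

# Split a JSON pointer into unescaped tokens (RFC 6901: ~1 is '/', ~0 is '~').
def pointer_tokens(pointer)
  pointer.split('/', -1).drop(1).map { |t| t.gsub('~1', '/').gsub('~0', '~') }
end

# Walk `pointer` from `schema`. Returns the subschema it points at together
# with the base URI accumulated from the ids of the objects traversed.
def walk(schema, base_uri, pointer)
  pointer_tokens(pointer).reduce([schema, base_uri]) do |(obj, uri), token|
    if obj.is_a?(Array)
      [obj.fetch(Integer(token)), uri]
    else
      [obj.fetch(token), join_id(uri, obj[ID_KEYWORD])]
    end
  end
end

# Assumed example schema with a nested id.
ROOT = {
  '$id' => 'http://example.com/root.json',
  'definitions' => {
    'a' => {
      '$id' => 'nested/a.json',
      'definitions' => { 'b' => { 'type' => 'integer' } }
    }
  }
}

subschema, base = walk(ROOT, URI('http://example.com/root.json'), '/definitions/a/definitions/b')
puts subschema.inspect
puts base
```

Note that the id of the final subschema itself is not joined during the walk; in the diff above that happens later, when `validate_instance` updates `instance.base_uri` for a schema without a `$ref`.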
@@ -631,7 +624,7 @@ def escape_json_pointer_token(token)

       def join_uri(a, b)
         b = URI.parse(b) if b
-        if a && b && a.relative? && b.relative?
+        uri = if a && b && a.relative? && b.relative?
           b
         elsif a && b
           URI.join(a, b)
@@ -640,35 +633,19 @@ def join_uri(a, b)
         else
           a
         end
+        uri.fragment = nil if uri.is_a?(URI) && uri.fragment == ''
+        uri
       end

-      def pointer_uri(schema, pointer)
-        uri_parts = nil
-        pointer.reduce(schema) do |obj, token|
-          next obj.fetch(token.to_i) if obj.is_a?(Array)
-          if obj_id = obj[id_keyword]
-            uri_parts ||= []
-            uri_parts << obj_id
-          end
-          obj.fetch(token)
-        end
-        uri_parts ? URI.join(*uri_parts) : nil
-      end
-
-      def resolve_ids(schema, ids = {}, parent_uri = nil, pointer = '')
+      def resolve_ids(schema, ids = {}, base_uri = @base_uri, pointer = '')
         if schema.is_a?(Array)
-          schema.each_with_index { |subschema, index| resolve_ids(subschema, ids, parent_uri, "#{pointer}/#{index}") }
+          schema.each_with_index { |subschema, index| resolve_ids(subschema, ids, base_uri, "#{pointer}/#{index}") }
         elsif schema.is_a?(Hash)
-          uri = join_uri(parent_uri, schema[id_keyword])
+          uri = join_uri(base_uri, schema[id_keyword])
           schema.each do |key, value|
             case key
             when id_keyword
-              unless uri == parent_uri
-                ids[uri.to_s] = {
-                  schema: schema,
-                  pointer: pointer
-                }
-              end
+              ids[uri] ||= [schema, pointer]
             when 'items', 'allOf', 'anyOf', 'oneOf', 'additionalItems', 'contains', 'additionalProperties', 'propertyNames', 'if', 'then', 'else', 'not'
               resolve_ids(value, ids, uri, "#{pointer}/#{key}")
             when 'properties', 'patternProperties', 'definitions', 'dependencies'
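The change to `resolve_ids` stores each entry as a `[schema, pointer]` pair and lets the first schema registered under a URI win (`||=`). The sketch below shows that shape of index in isolation. It is a simplification under stated assumptions: the helper name `index_ids` is hypothetical, it recurses into every key rather than only the schema keywords listed in the diff, and it does not escape pointer tokens.

```ruby
require 'uri'

# Hypothetical, simplified id index (not the library's `resolve_ids`).
# Maps each resolved id URI to a [schema, pointer] pair; the first schema
# registered under a URI is kept.
def index_ids(schema, ids = {}, base_uri = nil, pointer = '')
  case schema
  when Array
    schema.each_with_index { |sub, i| index_ids(sub, ids, base_uri, "#{pointer}/#{i}") }
  when Hash
    id = schema['$id']
    uri = base_uri
    if id.is_a?(String)
      uri = base_uri ? URI.join(base_uri, id) : URI.parse(id)
      ids[uri] ||= [schema, pointer]
    end
    schema.each do |key, value|
      # Simplification: descend into every key, not just schema keywords.
      index_ids(value, ids, uri, "#{pointer}/#{key}") unless key == '$id'
    end
  end
  ids
end

# Assumed example schema.
EXAMPLE = {
  '$id' => 'http://example.com/root.json',
  'definitions' => {
    'a' => { '$id' => 'a.json', 'type' => 'string' }
  }
}

IDS = index_ids(EXAMPLE)
IDS.each { |uri, (_schema, ptr)| puts "#{uri} -> #{ptr.inspect}" }
```

Each entry pairs a subschema with the JSON pointer at which it was found, which is what lets `validate_ref` prepend that pointer to any pointer fragment carried by the ref.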

test/ref_test.rb

+1 -1
@@ -215,7 +215,7 @@ def test_it_handles_nested_refs
   def test_it_handles_json_pointer_refs_with_special_characters
     schema = JSONSchemer.schema({
       'type' => 'object',
-      'properties' => { 'foo' => { '$ref' => '#/definitions/~1some~1{id}'} },
+      'properties' => { 'foo' => { '$ref' => '#/definitions/~1some~1%7Bid%7D'} },
       'definitions' => { '/some/{id}' => { 'type' => 'string' } }
     })
     assert(schema.valid?({ 'foo' => 'bar' }))
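The test change follows from treating every `$ref` as a URI reference: `{` and `}` are not legal in a URI fragment, so the pointer has to be percent-encoded. A small sketch of the round trip with the standard `uri` library is shown here; the constant names are assumptions for illustration, and the token unescaping is a hand-written stand-in for what Hana does in the library.

```ruby
require 'uri'

RAW_REF     = '#/definitions/~1some~1{id}'
ENCODED_REF = '#/definitions/~1some~1%7Bid%7D'

# The raw ref is not a valid URI reference, so parsing it raises.
RAW_PARSES = begin
  URI.parse(RAW_REF)
  true
rescue URI::InvalidURIError
  false
end

# The encoded ref parses; decoding its fragment recovers the JSON pointer.
POINTER = URI.decode_www_form_component(URI.parse(ENCODED_REF).fragment)

# RFC 6901 unescaping (~1 is '/', ~0 is '~') then yields the property names.
TOKENS = POINTER.split('/', -1).drop(1).map { |t| t.gsub('~1', '/').gsub('~0', '~') }

puts RAW_PARSES
puts POINTER
puts TOKENS.inspect
```

The last token matches the `'/some/{id}'` key under `definitions` in the test schema, which is why the encoded ref still resolves.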
