Add some simple tests for URN base URIs. #578

Merged
merged 3 commits into from
Jul 31, 2022

Julian
Member

@Julian Julian commented Jul 29, 2022

See https://datatracker.ietf.org/doc/html/draft-bhutton-json-schema-01#section-8.2.1
(which specifies that $id is any URI from RFC 3986, which URNs are an example of).

Earlier drafts (6 and 7) didn't fully disallow $ids with fragments (they left the
behavior undefined, and their metaschemas didn't guard against them), so we
leave off the f-component test in these drafts.
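
To make the f-component point concrete, here is a minimal sketch of the kind of guard a metaschema can apply to `$id`. It assumes the pattern `^[^#]*#?$`, which is my understanding of what the 2019-09 and 2020-12 core metaschemas use for `$id` (worth checking against the published metaschema); the helper name `id_allowed_by_pattern` and the example URNs are made up for illustration.

```python
import re

# Assumption: the "$id" pattern from the 2019-09 / 2020-12 core metaschemas.
# It allows at most an empty trailing fragment ("#").
ID_PATTERN = re.compile(r"^[^#]*#?$")


def id_allowed_by_pattern(schema_id: str) -> bool:
    """Return True when schema_id carries no non-empty fragment (hypothetical helper)."""
    return ID_PATTERN.search(schema_id) is not None


plain_urn = "urn:example:foo-bar-baz-qux"
urn_with_f_component = "urn:example:foo-bar-baz-qux#somepart"

print(id_allowed_by_pattern(plain_urn))
print(id_allowed_by_pattern(urn_with_f_component))
```

A draft 6 or 7 metaschema has no such pattern, so the second `$id` would not be rejected there, which is why the f-component test is left out for those drafts.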

Closes: #179

@Julian Julian requested a review from a team as a code owner July 29, 2022 13:24

@ndellosa95 ndellosa95 left a comment

Do you have a file:/// test anywhere using a relative ref? I'm seeing the issue with relative refs for both / and : style (non-http) URIs. I believe the example specified here doesn't work if you change the $id from https to file.

@Julian
Member Author

Julian commented Jul 29, 2022

Writing portable tests for file is somewhat hard. When you say "doesn't work" though I assume you mean in a particular implementation of JSON Schema (and presumably mine) -- if so let's take that elsewhere (to my issue tracker), but it works fine here. Be sure you're not forgetting the trailing slash.

@ndellosa95

> Writing portable tests for file is somewhat hard. When you say "doesn't work" though I assume you mean in a particular implementation of JSON Schema (and presumably mine) -- if so let's take that elsewhere (to my issue tracker), but it works fine here. Be sure you're not forgetting the trailing slash.

Did some more digging: `file` was a bad example, because it turns out it does work. What I was looking for is a way to add custom schemes to `uses_relative`.
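
For readers unfamiliar with `uses_relative`: assuming it refers to the module-level list `urllib.parse.uses_relative` in Python's standard library (the thread does not say so explicitly), `urljoin` only performs relative resolution for schemes found in that list. The sketch below temporarily registers a scheme before joining. The helper `join_with_scheme` is hypothetical, and mutating this list is a workaround rather than a supported extension point.

```python
from urllib.parse import urljoin, uses_relative

BASE = "urn:uuid:deadbeef-1234-ffff-ffff-4321feebdaed"


def join_with_scheme(base: str, ref: str, scheme: str) -> str:
    """Join ref against base with scheme temporarily treated as relative-capable."""
    added = scheme not in uses_relative
    if added:
        uses_relative.append(scheme)
    try:
        return urljoin(base, ref)
    finally:
        # Restore the module-level list so other callers are unaffected.
        if added:
            uses_relative.remove(scheme)


joined = join_with_scheme(BASE, "#/$defs/bar", "urn")
print(joined)
```

Wrapping the mutation in `try`/`finally` keeps the change local to one call. A resolver that handles fragment-only references itself would avoid touching the global list at all.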

Member

@gregsdennis gregsdennis left a comment

Can we add a couple more tests, please?

  • $ref to urn:blah#/pointer/to/somewhere
  • $ref to urn:blah#anchor

I had brought these up here but they're apparently valid.

The rest look really great!

@Julian
Member Author

Julian commented Jul 30, 2022

Yep good point, added!

@gregsdennis
Member

I'd like to run these through my implementation, but it looks good visually.

@karenetheridge
Member

There are a lot of duplicate UUIDs. Please could you make them unique between tests?

@Julian
Member Author

Julian commented Jul 31, 2022

> There are a lot of duplicate UUIDs. Please could you make them unique between tests?

Oops, yes, done!

@Julian Julian merged commit 685ac63 into main Jul 31, 2022
@Julian Julian deleted the urn branch July 31, 2022 05:50
@Julian
Member Author

Julian commented Jul 31, 2022

Whee, there are 3 more cases we need here, one of which is hard to test. (All 3 will come in a follow up PR, soon as I figure out how to do the hard one, or once I give up and send the other 2):

  • A ref test with {"$ref": "urn:uuid:deadbeef-1234-ffff-ffff-4321feebdaed", "$defs": {"subschema": {"$id": "urn:uuid:deadbeef-1234-ffff-ffff-4321feebdaed", "$defs": {"bar": {"type": "string"}}, "$ref": "#/$defs/bar"}}}, i.e. where the $ref points to an inlined URN schema
  • A refRemote test with {"$ref": "http://localhost:1234/urn.json"} where urn.json has {"$id": "urn:uuid:deadbeef-1234-ffff-ffff-4321feebdaed", "$defs": {"bar": {}}, "$ref": "#/$defs/bar"} (i.e. it has a URN $id and some other non-URN URI it's looked up from)
  • A refRemote test with {"$ref": "urn:uuid:deadbeef-1234-ffff-ffff-4321feebdaed"} resolving to {"$defs": {"bar": {}}, "$ref": "#/$defs/bar"} (this one is hard because we don't have a way without jsonschema_suite remotes to signal "the URI for this thing is some non-filepath", so any implementation directly reading files off the filesystem from remote/ rather than executing bin/jsonschema_suite remotes will not be able to run this)
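
The first case (a `$ref` pointing at an inlined URN schema) can be sketched as follows. This is a deliberately naive illustration, not how any particular implementation works: `index_ids` and `follow_pointer` are hypothetical helpers, every `$id` key found while walking is treated as an identifier, and JSON pointer escapes are ignored.

```python
def index_ids(schema, store=None):
    """Map every "$id" found in a schema document to the subschema carrying it."""
    store = {} if store is None else store
    if isinstance(schema, dict):
        if isinstance(schema.get("$id"), str):
            store[schema["$id"]] = schema
        for value in schema.values():
            index_ids(value, store)
    elif isinstance(schema, list):
        for item in schema:
            index_ids(item, store)
    return store


def follow_pointer(resource, pointer):
    """Apply a simple JSON pointer (escapes not handled) to a schema resource."""
    node = resource
    for token in pointer.split("/")[1:]:
        node = node[token]
    return node


URN = "urn:uuid:deadbeef-1234-ffff-ffff-4321feebdaed"
root = {
    "$ref": URN,
    "$defs": {
        "subschema": {
            "$id": URN,
            "$defs": {"bar": {"type": "string"}},
            "$ref": "#/$defs/bar",
        }
    },
}

store = index_ids(root)
target = store[root["$ref"]]  # the inlined URN schema
# The nested "#/$defs/bar" is relative to the URN resource, not to the root.
resolved = follow_pointer(target, target["$ref"][1:])
print(resolved)
```

Evaluating the same pointer against `root` fails because the root has no `$defs/bar`, which is the mistake this case is meant to catch.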

Julian added a commit that referenced this pull request Aug 29, 2022
Covers:

    * A $ref to an absolute URI present in the retrieved schema
    * A schema with URN $id with a nested $ref
    * Retrieving a schema with a different $id than the retrieval URI
    * Retrieving a schema with $ref whose retrieval URI was an HTTP URI
      but whose $id is a URN

(These involve sibling $refs so don't apply to pre-2019).

Missing is a test for a URN retrieval URL, which unfortunately we have
no way of communicating at the minute.
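
As an illustration of the "different `$id` than the retrieval URI" cases, a resolver can record a retrieved document under both URIs and use the declared `$id` as the base for nested references. This is a minimal sketch under simplifying assumptions: `register` is a hypothetical helper, and the `$id` is taken to be absolute already (a relative `$id` would first need resolving against the retrieval URI).

```python
def register(store, retrieval_uri, document):
    """Record a retrieved document under its retrieval URI and its "$id", if any.

    Returns the URI to use as the base for references nested in the document.
    """
    store[retrieval_uri] = document
    declared = document.get("$id")
    if isinstance(declared, str):
        store[declared] = document
        return declared
    return retrieval_uri


store = {}
urn_json = {
    "$id": "urn:uuid:deadbeef-1234-ffff-ffff-4321feebdaed",
    "$defs": {"bar": {}},
    "$ref": "#/$defs/bar",
}
base_uri = register(store, "http://localhost:1234/urn.json", urn_json)
print(base_uri)
```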

Refs: #578 (comment)
davishmcclurg added a commit to davishmcclurg/json_schemer that referenced this pull request May 16, 2023
This allows a `$ref` to look up a schema by `$id` and then apply a JSON
pointer to use a subschema.

The test suite example schema looks like this:

```json
{
  "$id": "urn:uuid:deadbeef-1234-0000-0000-4321feebdaed",
  "properties": {
    "foo": {"$ref": "urn:uuid:deadbeef-1234-0000-0000-4321feebdaed#/$defs/bar"}
  },
  "$defs": {
    "bar": {"type": "string"}
  }
}
```

Previously, ids were looked up using a URI that included the fragment
(ie, json pointer). Now if the fragment is a valid json pointer, the
schema is looked up without the fragment and the pointer is applied to
whatever is found.
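
The lookup described above can be sketched in Python using the example schema. This is not the json_schemer code (which is Ruby); `resolve_pointer`, `resolve_ref`, and the `ids` table are hypothetical names for illustration only.

```python
from urllib.parse import unquote, urldefrag


def resolve_pointer(document, pointer):
    """Evaluate an RFC 6901 JSON pointer against an already-parsed document."""
    node = document
    for raw in pointer.split("/")[1:]:
        # RFC 6901 unescaping: "~1" first, then "~0".
        token = raw.replace("~1", "/").replace("~0", "~")
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node


def resolve_ref(ids, ref):
    """Look up ref by id; otherwise drop a JSON pointer fragment and apply it."""
    if ref in ids:
        return ids[ref]
    uri, fragment = urldefrag(ref)
    pointer = unquote(fragment)  # fragments are percent-encoded inside URIs
    if pointer == "" or pointer.startswith("/"):
        return resolve_pointer(ids[uri], pointer)
    raise KeyError(ref)


URN = "urn:uuid:deadbeef-1234-0000-0000-4321feebdaed"
schema = {
    "$id": URN,
    "properties": {"foo": {"$ref": URN + "#/$defs/bar"}},
    "$defs": {"bar": {"type": "string"}},
}
ids = {URN: schema}
target = resolve_ref(ids, schema["properties"]["foo"]["$ref"])
print(target)
```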

I tried to simplify things as much as I could, but it still ended up
quite complicated. A slightly separate path is still necessary for refs
that start with `#` because they don't have to be encoded as URIs and
aren't always [valid URIs][0].

Refs are sent to the ref resolver if they aren't found by id and aren't
local json pointers.

`parent_uri` is a little tricky and I'm not sure I got it totally right
in all scenarios, but it's passing all the tests.

Exposed by: json-schema-org/JSON-Schema-Test-Suite#578

[0]: b91115e
davishmcclurg added a commit to davishmcclurg/json_schemer that referenced this pull request May 22, 2023
This is an attempt to simplify dereferencing refs and address issues
exposed by new json-schema-test-suite tests. The main thing here is
giving schema objects a `base_uri` that's used when resolving ids (and
as the initial base URI for instances). Child schemas use the `ref_uri`
that was used to resolve them as their `base_uri` in order to get nested
refs and ids to resolve properly.

More details about specific issues:

- `$id` is ignored when `$ref` is present (`instance.base_uri` is
  updated after `validate_ref`), because `$ref` isn't supposed to take
  sibling `$id` values into account. `@base_uri` handles nested refs.
  Exposed by: json-schema-org/JSON-Schema-Test-Suite#493

- JSON pointers are evaluated relative to the ref URI. Previously,
  pointers were always evaluated using the root schema. Now they're
  evaluated relative to the schema with a matching `$id` (usually
  nearest parent with an `$id`; or specific id (see below); default is
  root).
  Exposed by: json-schema-org/JSON-Schema-Test-Suite#457

- JSON pointers are evaluated for id refs. This allows a ref to look up
  a schema by `$id` and then apply a JSON pointer to use a subschema.
  This uses the same logic as above. The important part is removing the
  fragment from `ref_uri` if it's a JSON pointer so that the lookup in
  `ids` works properly. The fragment is kept if it's not a JSON pointer
  to support location-independent ids.
  Exposed by: json-schema-org/JSON-Schema-Test-Suite#578

- JSON pointer refs are always joined with the base URI. I [started
  handling them][0] separately because of an [issue][1] with invalid
  URIs. But now I think that was incorrect and that fragment pointers
  need to be encoded properly for URIs. The [specification says][2]:

  > In all cases, dereferencing a "$ref" reference involves first
  > resolving its value as a URI reference against the current base URI.

- Empty fragments are removed in `join_uri` so that URIs looked up in
  `ids` are consistent. Meta schemas, for example, have empty fragments
  in their top-level ids (eg, `http://json-schema.org/draft-07/schema#`),
  and removing JSON pointer fragments from refs would otherwise cause
  them not to be found.
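
The fragment handling described in the list above (strip empty fragments, strip JSON pointer fragments before the lookup, keep plain-name fragments) can be sketched in Python. The function `split_lookup_key` is a hypothetical illustration, not the `join_uri` implementation from the commit.

```python
from urllib.parse import unquote, urldefrag


def split_lookup_key(ref_uri):
    """Return (key for the id table, JSON pointer or None) for a resolved ref URI."""
    uri, fragment = urldefrag(ref_uri)
    if fragment == "":
        # Covers both "no fragment" and the empty fragment "#".
        return uri, None
    decoded = unquote(fragment)
    if decoded.startswith("/"):
        # JSON pointer: look up the bare URI, then apply the pointer.
        return uri, decoded
    # Plain-name fragment (location-independent id): part of the key itself.
    return ref_uri, None


print(split_lookup_key("http://json-schema.org/draft-07/schema#"))
print(split_lookup_key("urn:blah#/pointer/to/somewhere"))
print(split_lookup_key("urn:blah#anchor"))
```

Ids would need the same empty-fragment normalization when they are stored, so that both sides of the lookup agree.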

[0]: b91115e
[1]: #54
[2]: https://datatracker.ietf.org/doc/html/draft-handrews-json-schema-01#section-8.3.2
Development

Successfully merging this pull request may close these issues.

Add tests for referencing / identifying schemas via URN URIs
4 participants