Skip to content

"additionalPatternProperties": validate properties by pattern, excluding ones listed in "properties" #76

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
dmikov opened this issue Oct 9, 2016 · 28 comments
Labels
clarification Items that need to be clarified in the specification Type: Enhancement
Milestone

Comments

@dmikov
Copy link

dmikov commented Oct 9, 2016

I encountered a very cumbersome issue, which might be related to implementation of validators, but the ones I tried all do it.

{
    "$schema": "http://json-schema.org/draft-04/schema#",
    "type": "object",
    "properties": {
        "version": {
            "type": "string"
        },
        "image64":{
            "type": "string"
        }
    },
    "required": [
        "version"
    ],
    "patternProperties": {
        "^[A-Za-z0-9_]+$": {
            "title": "Task Name"
            "type": "object",
            "properties": {
                "action":{
                    "type": "string"
                },
                "schedule": {
                    "type": "object",
                    "properties": {
                        "timedelta": {
                            "type": "object",
                            "properties": {
                                "minutes": {
                                    "type": "integer"
                                }
                            },
                            "required": [
                                "minutes"
                            ]
                        }
                    },
                    "required": [
                        "timedelta"
                    ]
                }
            }
        }
    }
}

The version and image64 property fail validation since they are catched by regex on pattern and therefore must be object and not string. Yes I can write a regex excluding them, but what happen if I need to add one more static property? Edit it again. I think requiring evaluation of static properties first should be mandatory to avoid this issue.

@handrews
Copy link
Contributor

handrews commented Oct 9, 2016

There is intentionally no order to properties vs patternProperties. It is effectively an allOf if the property name matches both.

Something like propertyNames could help express the concept of "except these properties" which is currently difficult to do- right now you can only "exclude" properties by doing something like {"not": {"properties": {"a": {}, "b": {}}}} which means they can't be there at all.

@dmikov
Copy link
Author

dmikov commented Oct 9, 2016

I don't understand, the link you provided actually specifies the order. Property->PatternProperty->AdditionalProperties. Am I reading it wrong?

8.3.3.2. First step: schemas in "properties"

If set "p" contains value "m", then the corresponding schema in "properties" is added to "s".

@handrews
Copy link
Contributor

"s" is the set of schemas against which to validate the child schema.

Step 1: If there's a match in properties, add that schema to "s"
Step 2: If there are any matches in patternProperties, add all of those schemas to "s"
Step 3: If and only if "s" is empty, add the additionalProperties schema to "s" (or fail validation if additionalProperties is false)

Implicit step 4: Validate the child instance against all schemas in "s" (effectively, an "allOf": [s0, s1, s2...])

@awwright
Copy link
Member

@dmikov A property can be evaluated by both "properties" and "patternProperties", and there's no order to keywords, so the validation result isn't effected by which one the validator considers first.

Does that make sense?

@dmikov
Copy link
Author

dmikov commented Oct 10, 2016

@awwright Hence my enhancement proposal. Why would you want to evaluate by both? Once you defined static property. Choosing specific over generic is a much more natural way. Kinda how lexers and tokenisers work.

@awwright
Copy link
Member

Ok, so you'd like to see any matching properties in "patternProperties"
skipped if validated against one in "properties".

I'm not aware of anyone depending on the current behavior, though that
still seems like the more sensible approach to me.

Can you provide an example where your approach makes more sense?

On Oct 10, 2016 4:14 AM, "Dmitriy Krasnikov" [email protected]
wrote:

@awwright https://github.com/awwright Hence my enhancement proposal.
Why would you want to evaluate by both? Once you defined static property.
Choosing specific over generic is a much more natural way.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#76 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAatDVCImJUUBIYqyqeHHmf0yQaBh3aXks5qyh4mgaJpZM4KSJYM
.

@dmikov
Copy link
Author

dmikov commented Oct 10, 2016

Gladly. I am developing workflow engine right now. As a workflow dsl language, we decided on Mistral DSL 2.0 http://docs.openstack.org/developer/mistral/dsl/dsl_v2.html

Here is an excerpt from it:

---
version: '2.0'

create_vm:
  description: Simple workflow example
  type: direct

  input:
    - vm_name
    - image_ref
    - flavor_ref
  output:
    vm_id: <% $.vm_id %>

  tasks:
    create_server:
      action: nova.servers_create name=<% $.vm_name %> image=<% $.image_ref %> flavor=<% $.flavor_ref %>
      publish:
        vm_id: <% task(create_server).result.id %>
      on-success:
        - wait_for_instance

    wait_for_instance:
      action: nova.servers_find id=<% $.vm_id %> status='ACTIVE'
      retry:
        delay: 5
        count: 15

As you can see the workflow name create_vm is a property name and will always have different name, but always same type definition. version and in my case schema (added for backward compatibility in the future) are static properties with type string. Using pattern property for workflow name doesn't work without excluding 'schema' and 'version' with positive lookahead regex. Which is painful and will be even more if I will want to add author or license in the future.

I personally can hardly find the case when combining rules from static and pattern property is beneficial. Whatever the goal is, this combination is just too unpredictable not to be dangerous, plus I imagine it can still be accomplished by combining multiple pattern properties, that would at least be somewhat expected behavior.

@jdesrosiers
Copy link
Member

I'm against this for the same reasons @handrews laid out in #77. If we did this, it would make patternProperties as problematic as addtionalProperties is now. No validation keyword should be dependent on the value or validation result of another. We should be removing things like this not adding more.

@epoberezkin
Copy link
Member

epoberezkin commented Oct 11, 2016

@jdesrosiers I actually disagree with the idea in #77, both in relation to maximum/minimum and also as a general approach. Allowing keyword validation depend on siblings is fine, it allows to keep keywords values shallower. Being religious about keyword independence is the opposite extreme of other proposed changes where keywords become aware not only of the siblings but of everything, this extreme is equally bad.
Why can't we leave things as they are and only change things that add or improve some functionality, rather than just change the syntax?

@epoberezkin
Copy link
Member

Having said that, making patternProperties dependent on properties is not a bad idea - in many cases it is not possible to write a regular expression that would exclude a certain string. In some cases there are workarounds but they make schema very verbose, in some cases there isn't really.
So excluding properties from patternProperties (and also patternGroups) sounds like an enhancement.

@awwright
Copy link
Member

Maybe we introduce another property called "additionalPatternProperties", keeping with the tradition of prefixing non-linear keywords with "additional".

@epoberezkin
Copy link
Member

epoberezkin commented Oct 11, 2016

Maybe it's better indeed than changing semantic of patternProperties... I would at least deprecate patternProperties in this case, so they don't exist at the same time.

Honestly, there are few use cases where you mix patternProperties and properties, it usually indicates questionable data design. There are even fewer cases when you want to apply the same schema to both properties and patternProperties, and even if you do, you can use allOf with $refs. So renaming patternProperties to additionalPatternProperties is probably ok - it may make transition easier. At the same time, shorter keywords are better, so I am ok with just changing the semantics too - it would affect very few people, majority of users don't mix, and it seems that every time they do mix they raise it as an issue.

@dmikov
Copy link
Author

dmikov commented Oct 11, 2016

@jdesrosiers I want keyword independence actually. #77 is not related conceptually to this at all.
I want my version property be evaluated and validated independent of if I have patternProperty in the same schema or not. Currently if I have existing schema with properties and then add patternProperty that might match name of any of them - the json fails validation because the field cannot be object and string at the same time. This is a very unexpected side effect of adding another property, don't you think?

@handrews
Copy link
Contributor

@dmikov

I want keyword independence actually

Keyword independence means that patternProperties applies or doesn't apply to a given property name without regard to whether that property name is specified in properties. So what we have now is keyword independence. It may not be the most convenient behavior, but it is independent.

@dmikov
Copy link
Author

dmikov commented Oct 11, 2016

@handrews for sure during execution they evaluated independently. However during design phase my property is not independent, it will not be validated correctly because of other - patternProperty being present somewhere in the schema. That's independence I would like to work with. For usability I would think the later is more important, since implementing validator to this is just a technicality. Sorry for confusion.

@epoberezkin
Copy link
Member

@dmikov so essentially you want the behaviour of patternProperties keyword be dependent on the presence of properties keyword...

@handrews
Copy link
Contributor

@dmikov , @awwright , how do we bring this to a resolution? It seems to have quieted down.

There's clear opposition to the original proposal (changing the behavior of patternProperties while keeping its name) from at least three people on this thread. There is also the "additionalPatternProperties" proposal, either alongside the current patternProperties or as a replacement. Do we want to pursue that approach further?

@dmikov
Copy link
Author

dmikov commented Nov 8, 2016

Well additionalPatternProperties will solve the issue for me or at least allow to proceed. But I have to say keeping patternProperties as is - is not clean - too many side effects. The opposition here is really more about saying how people should write json (and not about jsonshema), which I think should not be discussed here at all.

@dmikov
Copy link
Author

dmikov commented Nov 8, 2016

@epoberezkin I thought I didn't need to point out, that it does it anyway in one way or the other. If it combines the definition, the behavior of patternProperties is affected. That's just a nature of collision of definitions and will always be there when you evaluate tokens.
i.e I neither have a pattern property definition pure or single property definition pure applied to my field in json document when they are colliding.
What I propose is just more of an expected behavior in line with other lexer, tokenizers familiar to people. It's not better or worse by itself.
And yes as you mentioned before patternProperty is nice, but too vague a concept and might need to be depreciated at some point when additionalPatternProperties come and when maybe some property inheritance will be implemented.

@epoberezkin
Copy link
Member

@dmikov not sure I follow the argument. Schema validation has no side effects at all, not sure what you mean here. patternProperties is independent of properties. And you suggest to make it dependent. I don't think we should do it.

@handrews
Copy link
Contributor

handrews commented Nov 8, 2016

@dmikov I think if you want to push for additionalPatternProperties there is at least some support for that. It's probably worth filing it in its own issue, as the original goal of this issue does not seem to be getting support. I don't know at what point we declare an idea to be rejected for lack of support- that's a question for @awwright or @Relequestual

@epoberezkin
Copy link
Member

@handrews back to the process question :)
We need a voting panel :)

On the substance, additionalPatternProperties seems weird...

@handrews
Copy link
Contributor

@epoberezkin I suspect @awwright would (not unreasonably) quote The Tao of IETF:

"We reject kings, presidents and voting."

which is followed by the familiar "rough consensus and running code" quote. But I do believe that we need a better-defined notion of "rough consensus", particularly for figuring out when we can reject proposals that are not gathering support. I feel like we have a lot of those just hanging out in the queue cluttering things up.

For more of this discussion, let's take it to #137.

@awwright awwright removed this from the draft-next milestone Dec 1, 2016
@awwright
Copy link
Member

I don't really see a problem with having both "patternProperties" and "additionalPatternProperties". Except perhaps keyword proliferation, which is understandable. But I think if there's a good reason to have "additionalPatternProperties", that's a good reason just to add a new keyword without replacing one.

Does anyone actually think this is a bad idea?

I understand how it might encourage bad design practices. Usually patterns shouldn't be able to look like keywords.

There's also the propertyNames keyword. @dmikov, would #172 work for you instead? That is, using a combination of "additionalProperties" and "propertyNames": { pattern: "^[A-Za-z0-9_]+$" }

@awwright awwright changed the title Make validators require evaluating properties before pattern properties. v4, v6 "additionalPatternProperties": validate properties by pattern, excluding ones listed in "properties" Dec 16, 2016
@epoberezkin
Copy link
Member

@awwright I am against adding "additionalPatternProperties" before figuring out how to manage "additionalProperties" problems - it would only add to the confusion. At the moment "additionalProperties" is defined to validate everything that doesn't match any pattern in "patternProperties" as well, so not sure I follow why we need a separate keyword - the implication here is that we also should change the semantics of patternProperties, which is a bigger issue and many people seem to be of the opinion that it is a bad idea...

@handrews
Copy link
Contributor

handrews commented Dec 20, 2016

@epoberezkin the point of "additionalPatternProperties" would be to avoid changing the semantics of "patternProperties".

@awwright I'm not sure about "additionalPatternProperties". Consider the following scenarios:

  • A property matches both a pattern in "patternProperties" and one in "additionalPatternProperties". Do both apply, or only "patternProperties"? In other words, is the "additional" only "additional after "properties"" or is it "additional after everything"?
  • A property matches multiple patterns in "additionalPatternProperties". Do both affect it? If not, how do you pick one? If so, what if you then want "additionalAdditionalPatternProperties" to avoid double-matching?

I think we would be better advised to make it easier to exclude things from matching than to throw in more keywords with increasingly complex exclusion rules. You are right to bring up "propertyNames", but "propertyNames" imposes a constraint on all properties in the object. So you can't just do what you suggested above. This was an intentional part of @epoberezkin proposal- the idea was that you should not mix different "propertyNames" validation in the same object.

I feel like object property rules are getting dangerously messy and should be reconsidered holistically in a near-future draft. Consider the various other competing proposals in #106 and #117 (and probably others) as well.

@Relequestual
Copy link
Member

I think we would be better advised to make it easier to exclude things from matching than to throw in more keywords with increasingly complex exclusion rules.
@handrews

Agreed. As such, can we close this issue in favour of another solution?

@awwright
Copy link
Member

awwright commented Jul 6, 2017

@dmikov Hasn't responded to my question if "propertyNames" takes care of the use case.

So we can close this out until someone can come up with a case where "propertyNames" is insufficient, or propose a different solution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clarification Items that need to be clarified in the specification Type: Enhancement
Projects
None yet
Development

No branches or pull requests

7 participants