-
-
Notifications
You must be signed in to change notification settings - Fork 592
only check strings for patternProperties matches #286
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Hi. Bleh, this is... an interesting thing. Thanks for filing it. I'm not 100% sure whether this is the right solution. I'm not not sure it is either, but at least initially: JSON doesn't have non-string keys. So technically what you're validating there isn't valid deserialized JSON. That being said (and I'm being a bit short here), I'm not against fixing this I suppose, it's a bit though that the most obvious place to add a test for this is as a JSON test case in the Thoughts? |
Hmm... interesting, indeed. Initially, I didn't consider that a dict with non-string keys could not have been deserialized from a valid JSON document. I think it's reasonable to require the object being validate to be [strictly] JSON-serializable itself. I'm agreeable to the resolution to #285 being that a Would it be reasonable to validate that all keys of an object are a string type in the |
For example, this def type_draft4(validator, types, instance, schema):
types = _utils.ensure_list(types)
if not any(validator.is_type(instance, type) for type in types):
yield ValidationError(_utils.types_msg(instance, types))
if validator.is_type(instance, "object"):
for key in instance:
if not validator.is_type(key, "string"):
yield ValidationError(
"%r has property %r, which is not a string" % (
instance, key
)
) When given: from jsonschema import validate
schema = {
'type': 'object',
'patternProperties': {
'^[0-9]{3}$': {'type': 'string'}
},
}
validate({200: 'hello'}, schema) Results in:
|
I haven't had another chance to think about this, but I definitely don't think the properties validator should change or that this should be a ValidationError instead of what it is -- this is someone providing a "nonsensical" schema in the context of JSON Schema -- so in my mind this should either be an exception as it is now, or, if we wanted to "stretch" JSON Schema into a more general solution for things beyond valid-deserialized-JSON, the solution would be to ignore non-string keys, or if we felt really nasty, to replicate JS's behavior of coercing non-strings to strings. I haven't thought enough about those 3, but definitely not ValidationError to me. |
Just to clarify, it's a "nonsensical" instance, with a valid schema If I can add some color to the 3 options you proposed...
The I follow your reasoning for it not being a ValidationError, but I still think that it should be some type of error that is defined and documented by jsonschema (if an error is to be raised - see comments on type coercion).
Oh please not this - an instance may pass validation only to fail later by violating the programmer's assumption that successful validation implies that the instance conforms to the schema. If non-string keys are skipped, subsequent serialization may produce json which does not conform. For example: import json
from jsonschema import validate
schema = {
'type': 'object',
'patternProperties': {
'^[0-9]{3}$': {'type': 'string'}
},
}
instance = {1234: 'what'}
# passes if ignored
validate(instance, schema)
# fails after int is coerced to str during serialization
validate(json.loads(json.dumps(instance)), schema)
While I agree that this is "nasty", it is actually what the python JSONEncoder does: import json
import sys
json.dump({200: 'ok'}, sys.stdout) outputs: {"200": "ok"} Maybe this is a reasonable choice because |
I'm still being a bit hard-lined, but just to broadly disagree with this, I don't think this tends to be the case in Python. Objects can raise any sort of exception if the user of the object doesn't provide an object of the correct type. If one does provide an object of the wrong type, all bets are generally off and you get whatever exception that type raises when the wrong typed object gets poked the wrong way, but library authors do not (and should not) just wrap exceptions for the sake of uniformity there.
There's an assumption here that instances should round trip via serialization -- that happens to not be true even for I'm not sure if I'm leaning personally one way or the other though here. I'll still have to think a bit and possibly see what validators in other languages do. |
@fge can I ask you -- yours operates on Java objects right, not strings -- what does it do in this case, where you provide a schema whose keys "could never" have been JSON but then asserts about patternProperties? |
I think at the very least, we should see a clear error that indicates the object is invalid as JSON. I'm working around this with |
So at a high level, I think a reasonable resolution to this issue is at least two things:
I'm inclined to think that we should think about the philosophical stance first, and then to derive the appropriate behavior from there, but some consideration for backwards compatibility also belongs. |
How about a new type of exception to categorize these types of errors. Going more low level, (naming is hard...) it could be called I could see this change breaking someone's code if they're trying to do something like this in response to the problem:
Happy holidays! |
@epoberezkin what does ajv do for this? |
@Julian keys in the map/dictionary in JavaScript can be only strings, so ajv doesn't do anything special about it. |
@epoberezkin even for ES6 |
Ajv only validates enumerable properties and ES6 Maps keys are not properties. ajv will treat ES6 Maps as empty objects. |
Interesting -- OK, thanks for the info!
…On Aug 20, 2017 16:39, "Evgeny Poberezkin" ***@***.***> wrote:
Ajv only validates enumerable properties and ES6 Maps keys are not
properties. ajv will treat them as empty objects.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#286 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAUIXtAm56J7iMZcmhk3CpqOylDix_B_ks5saJmZgaJpZM4IlzUr>
.
|
@handrews this question straddles the boundaries of JSON Schema, Python, and this libraries essence I suppose, but ping just in case you have any thoughts. |
Agreed. If you pass an unexpected/incorrect type, a type error should be expected. At most, I would just document this, but it's true across pretty much all Python libraries.
JSON Schema definitely should not have more stringent requirements than JSON itself. It should possibly have less, given that, as documented in the spec, certain values in the JSON Schema data model may have multiple representations in a JSON document, and detecting and working with such differences is outside of the scope of the spec.
There are very solid reasons for this which I am not going to get into now, but it should remain true that all keywords apply to one or more types, and ignore other types, and definitely do not coerce or assert types.
This could work- I would call it something like For performance, I would go with garbage-in-garbage-out, and say the caller is responsible for providing a data structure that meets the data model, and failing to do so would be expected to produce a For user-friendliness, I could see allowing (opt-in?) a check for data model violations and providing a specific exception for it. |
I agree with all the above (well, unsurprisingly I suppose). For now, I think my gut says to just close this (and my gut also says that something like https://github.com/Julian/Seep might have better answers for this type of thing, but that is still vaporware unfortunately...) If anyone feels like arguing for anything else strongly, please do follow up. |
433ab2f0 Merge pull request python-jsonschema#286 from Zac-HD/not-patterns dbaa3aac Fix data - escape unicode, not regex pattern git-subtree-dir: json git-subtree-split: 433ab2f0fd6b981527e838cd149c91d823bdc367
4e6c78b4 Make the draft3 ECMA regex tests consistently named. 433ab2f0 Merge pull request #286 from Zac-HD/not-patterns dbaa3aac Fix data - escape unicode, not regex pattern git-subtree-dir: json git-subtree-split: 4e6c78b4d878aa427d7b41e7b65cde0ea6f503ea
We just hit this, and while I agree with the general resolution, it would have been very helpful for the Just something as simple as a |
@OddBloke that kind of thing probably happy to review a PR for if you're interested -- with tests :), and with a new exception type rather than But yeah definitely open to just wrapping the exception with a more specific one. |
Fixes #285
I looked over the tests, but wasn't able to infer the appropriate place to add test cases for these errors. Should I add two new tests to test_validators.py - one for the patternProperties validator and another for the additionalProperties validator?