Add more tests for valid emails #253
Conversation
As much as it pains me to say it, it appears that the JSON schema spec only requires hostnames to be compliant with RFC 1034, which does not permit hostname labels to begin with digits. This is clearly a mistake on their part, since RFC 1123 has been around since 1989, but I guess we're stuck with it. Should I remove my new hostname tests from this MR? |
@bvisness probably file a ticket upstream in the spec repo (or better a PR!) to reference the newer RFC going forward in future drafts |
So @Julian, am I crazy or is the CI for the test suite really rejecting my new tests because the JS implementation has too narrow a definition of an email address? |
That certainly seems like what's happening, though that must have changed without me noticing -- @epoberezkin do you know anything about that? We really shouldn't have the test suite's own checks running an implementation against the tests (not mine, not yours, nor any other's) -- if the implementation has bugs, it will block people from sending patches (which is possibly what's happening here). |
@Julian a long comment here - my day off :)
I don't think that is correct. I believe it is better to slow down adding tests than to have a completely untested test suite. When your tests confirm that your code is compliant with them, and your code confirms that the tests are compliant with it, there is no way to prove that either is correct or fit for purpose; you are only proving that they are consistent with each other. Having an untested test suite makes it completely useless... Would you rely on visual PR inspection alone to merge?

I think the opposite - we should have a bigger test matrix of validators / draft versions, visible in Travis as separate jobs, so we can see which draft version's tests passed against which validators. Once we have several validators running the tests for each draft version, we can define criteria for the partial pass required to merge changes. As we only have one job now, we should not be adding additional tests without them passing that job. And given that format support is optional in the spec, format tests should be minimal rather than comprehensive, and when we create a test matrix, formats should be in separate jobs.

Also, @bvisness, this test suite was never intended to be complete; I believe that "there is no way to prove that a validator fully complies with the spec" - or at least, it is … Some time ago I tested several JS validators against the tests Ajv accumulated over time (when fixing bugs, addressing limitations, etc.) - https://github.com/epoberezkin/test-validators. Not a single validator passes all Ajv tests. That doesn't prove that Ajv is better or even more compliant than other validators; it is quite possible that Ajv would fail a larger number of other validators' tests. But it does mean that Ajv is better for its users, as the bugs and limitations they reported were addressed, and they are not addressed by other JS validators.

Another demo I created at the same time was a fake validator that fully complies with this test suite and yet is not only non-compliant with the spec but absolutely useless - https://github.com/epoberezkin/fjv. This validator fails validation only in the cases that this test suite prescribes, and passes in all other cases (it simply compares schemas and data instances with the examples in this test suite). That example … |
Maybe there is a more generic way to write tests that would be more comprehensive, but a fully comprehensive test suite would only prove compliance with the spec if we agree that this test suite IS the spec, and that there are no requirements to comply with other than passing this test suite. It may be possible to express JSON Schema as a set of tests written in a language that has an advanced type system, e.g. Haskell. |
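(Editor's aside: to make the "fake validator" argument above concrete, here is a minimal sketch of the idea in Python. This is my own illustration, not the actual fjv code, and it assumes the suite's usual layout of JSON files, each an array of groups with `schema` and `tests` entries.)

```python
import json
from pathlib import Path


def load_cases(suite_dir):
    """Collect (schema, instance, expected) triples from the suite's JSON files."""
    cases = []
    for path in Path(suite_dir).glob("*.json"):
        for group in json.loads(path.read_text()):
            for test in group["tests"]:
                cases.append((group["schema"], test["data"], test["valid"]))
    return cases


class FakeValidator:
    """Passes every case in the suite by lookup, with no JSON Schema semantics at all."""

    def __init__(self, cases):
        self.cases = cases

    def is_valid(self, schema, instance):
        for known_schema, known_instance, expected in self.cases:
            if known_schema == schema and known_instance == instance:
                return expected
        return True  # anything the suite never mentions is simply declared valid
```

A validator like this would be green against the whole suite while being useless in practice, which is the point being made.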
What's a day off, that sounds nice :(
That's not how it works for any normal test suite (for any normal code base) -- when you write tests, you don't write tests for your tests, do you? I don't, and it's considered an antipattern to do so -- tests need to be correct by visual inspection -- someone needs to be able to look at the test and tell that it tests correct behavior, not by writing another layer of tests ensuring the test tests the right behavior. Having your validator (or mine, etc.) pass or fail over some new tests being written provides 0 confidence -- both false positives and false negatives are possible -- after all the whole point of a new test is it's a thing the suite previously didn't cover. The point of the "tests" for this test suite is totally different, and they operate at a completely different mental level -- they test things like "are all the descriptions unique" -- something totally different from the level of "does this test pass or fail on instances", which is what the test suite itself is meant to do. You cannot test that, you just have to write correct tests.
Yes precisely, the same way as one does for a normal code base :)
Are you suggesting we merge tests by consensus of validators? Because yeah, I strongly disagree with that one -- again, new tests are "things that are in the spec, that we forgot to test for before". It's highly likely that some, most, or all validators fail a new test being added. If you're instead suggesting we also run this suite against a bunch of validators and report what happens, but do not use that to block PRs, well, that's a great idea -- it's #20 :) and yeah, we should, on every commit, run against all the validators who contribute a way to have their tests run, and then report what happened. But that's totally separate from this process of adding new tests to the suite.
I haven't read … The point of tests is to make the correct implementation be the simplest implementation that still passes all the tests, and therefore make a correct implementation more likely than an incorrect one. Anything beyond that requires formal verification methods. |
I'm just trying to keep people's implementations from rejecting real email addresses! Which the JS one is apparently doing! As @Julian and I were discussing in #254, really strict tests put a large burden on implementers for what is just an optional feature. That's why I have only added valid test cases. Obviously no set of tests can guarantee correctness (as you have so elegantly proven), but when has that ever been a reason not to add more tests? |
My intent with these tests is to make implementers do one of the following:
- fix their email validation so that it accepts these addresses, or
- stop validating the email format altogether.
Either approach will pass the tests, and neither approach will reject real, valid email addresses like the ones that caused our production code to explode this week. |
Of course you do visual inspection, and the tests are supposed to define the behaviour, and they are supposed to be simple and declarative enough to serve as documentation for your code. I am not challenging any of that, and I do not write tests for tests (and neither do we do that here). But you are not saying that you have never had a situation where a mistake in a test was caught by actually running it against your code, are you? I think everybody has (or at least most people have). So unless you run the tests, you cannot be certain they are correct, can you? This is what we do here - we actually run the tests against the code base(s) that is (are) supposed to be correct against these tests; we are not writing tests for tests.
"both false positives and false negatives are possible" - it is definitely correct, but from it doesn't follow that "Having your validator (or mine, etc.) pass or fail over some new tests being written provides 0 confidence" - it actually quite the opposite, having tests actually run on every change did provide me the confidence they are correct, tests that are not executed against the codebase whenever tests change are less valuable. Any test failure warrants the investigation of both the test and the code and the decision which is correct.
Visual inspection is only the first step; then you run them, and then you investigate why they fail - the mistake can be in either the test or the code. If you don't run them, how would you catch mistakes that can creep into a PR that contains 120 lines (like this one) and can be missed during visual inspection?
Tests are not part of the spec. If they were, they should be published together with the spec. They are not even referenced from the spec. Some form of consensus, I didn't suggest any specific process, is a normal way to resolve any contradictions. By the way I didn't challenge whether these tests are correct - they probably are. I only challenged whether they are necessary. If we believe they are, we need to fix validator(s) that failed them, re-run the tests and then merge.
I agree, I was only saying that there is a limited value in adding non-orthogonal tests that test behaviours that are optional.
I am suggesting that we agree on what minimal set of test jobs needs to pass to allow merging new tests. As we only have one test job now - well, from my point of view, it is the minimal set required to merge, unless the problem is not fixed for a long time (no idea what defines "long", but given that there is no urgency to extend the test suite, it can be anything between a few weeks and a few months, whatever feels appropriate). Otherwise, it's better to have un-merged tests for some time than to lose the ability to run them against this code base. To make it more flexible we need to break the one job into several jobs and agree on merging criteria. |
@bvisness on the subject of this PR, leaving test theory aside: only two tests fail with Ajv:
It may be worth splitting it into two PRs: merge the one that passes, and merge the other once it's investigated / fixed in Ajv. |
(Thanks @epoberezkin, I still very strongly disagree I think, but it seems likely that if we chat it out we'll figure out just where we disagree, which possibly I haven't noticed yet.) Will respond to that comment in more detail, but exactly your last comment is what I very strongly disagree with -- it is not @bvisness's responsibility to wait for / fix / investigate the behavior of your validator (nor mine). If tests fail, he should not have to wait until ajv investigates its bugs (again, or mine). Of course many validators failing on a test is a good sign that either the spec is unclear or the test is wrong, which prompts you to be really, really sure the test is correct, but I don't think that's what's happening here, and the way things are set up right now isn't very conducive to that, because the build will remain red until ajv fixes its bug. |
That's why I suggest we split it into a matrix (with formats separate) so at least we can see what is wrong rather than just blindly merge. And fixing won't take long - I just need to confirm that these tests are indeed correct - they have somewhat unusual emails. |
Oh and by the way, I am implicitly assuming a requirement which maybe we all have never agreed upon (but which all my work teams go by) which is "you can't merge until the build is green". |
Ah cool, great! Then we agree. That was exactly my point :), and that we mark all the validator runs as optional for merge. |
Er, well, hopefully we do, as long as this is a typo:
Not formats separate, ajv separate, yes? |
I don't like the idea that everything is optional. I am ok with some jobs being optional; the process should be to flag it, create an issue in the validator in question (which I did), and then decide to merge. Maybe the process should require double (or triple) approval for tests that fail... I just want us to agree on the process to follow whenever not all jobs pass (or at least when more jobs fail than before, what the time is after which the jobs have to be removed, etc.). Let's take this conversation elsewhere :) I will probably fix and we will merge (as I said, I have no idea whether these two unusual emails are supposed to pass - and whether we want to have such tests is another question entirely). |
Both separate. Formats are optional in the spec, and they are more likely to fail, particularly in some pathological cases, like in this PR. So we would have this matrix:
- draft-3
  - keywords: jsonschema
  - optional: jsonschema
- draft-4
  - keywords: jsonschema, ajv
  - optional: jsonschema, ajv
- draft-6
  - keywords: jsonschema, ajv
  - optional: jsonschema, ajv
- draft-7
  - keywords: jsonschema, ajv
  - optional: jsonschema, ajv

8 jobs in total. Even if formats or some other optional tests fail, we would still know that the mandatory tests pass. Ajv only tests some old formats, not the newer ones and not the content encoding keywords. |
If you only run tests against one codebase it is a normal thing. Here we run tests against multiple external codebases we have no control over, so the requirement can be reduced. I just want there to be some clarity about these requirements rather than "it's always ok to merge" |
I disagree with any validator's run being mandatory. Let's discuss wherever is convenient. |
With the Travis build matrix you can define which jobs are allowed to fail, so we can have all optional tests as optional jobs. The whole build will pass, but we will see which tests failed against which validators. And I would really encourage all validator authors to have their tests here. What merging criteria we have is secondary to me. |
Or we can have only the tests that validate the structural validity of the test files - i.e. that the JSON files are valid JSON, even test the tests against their schema, etc. - as mandatory. That would definitely reduce the chances of errors and typos. |
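(Editor's aside: for illustration, a minimal sketch of the kind of mandatory structural check being described; the directory path and the exact assertions here are my own assumptions, not the suite's actual sanity-test code.)

```python
import json
from pathlib import Path

# Structural checks only: every file parses as JSON, every group has the
# expected shape, and group descriptions are unique within a file.
# No validator is ever run against the test cases themselves.
for path in Path("tests/draft7").glob("*.json"):
    groups = json.loads(path.read_text())
    descriptions = [group["description"] for group in groups]
    assert len(descriptions) == len(set(descriptions)), f"duplicate description in {path}"
    for group in groups:
        assert isinstance(group["schema"], (dict, bool)), f"bad schema in {path}"
        for test in group["tests"]:
            assert isinstance(test["valid"], bool), f"missing 'valid' flag in {path}"
```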
Right, so what we're now coming to is, I think, a lot like what I had in mind for #20 -- so I think it's a great idea, but I also want to get @bvisness unblocked -- in other words, get the build green :). There are a bunch of details that need hashing out, I think; e.g. I suspect #53 is even related -- but yeah, if possible I would like to get the build green again so that we can then decide the examples here are correct, merge this, and then figure out the details on #20, rather than making him wait while we discuss. Are you OK with doing that? |
Frankly, you can take as much time as you need. I'm not turning email validation back on for our code any time soon. 😛 |
As I wrote, I am not sure:
So I cannot confirm either way. Definitely happy to merge other tests without these 2 (and the other tests also pass). These two can be separated to another PR. |
Yes sure, agreed, which I haven't done myself because I noticed this setup, but of course we can check fairly easily whether they're permitted under the spec.
If they are correct, your implementation has a bug and rejects them :), so I'd find it hard to hear an argument they don't belong here, the goal of this suite is to help implementations not have bugs. |
See above :) 1) All implementations have bugs; I am more interested in whether these bugs affect any users. 2) No finite set of tests exists that would catch all bugs (at least, not in this format). So the decision whether to include tests or not should be based on something else, otherwise you end up with an ever-growing set of tests that do not reflect actual usage of the spec. The important question is whether a test case reflects a realistic usage scenario - what is the use case for using such emails in any context? It seems it's purely for the sake of theoretical correctness. |
"Don't write tests because it's hopeless and there will always be bugs"?
Really don't see what your argument is, sorry.
Again, this suite was designed to ensure working implementations. If yours
doesn't want to support behavior mandated by the spec you are free to skip
those tests.
My reading of RFC 5322 (the email spec) is that these two failing emails are incorrect in this context. RFC 5322 says:
To re-phrase it, RFC 5322 says that the domain part of an email address should comply with the addressing syntax of the context where it is used. It does list RFC 1123, but only as an example. The context for the email syntax here is JSON Schema validation, and the JSON Schema spec also defines which RFC should be used for hostname validation - it is RFC 1034. So I don't think RFC 1123 should apply to validation of the domain part of the email; RFC 1034 should apply instead. |
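(Editor's aside: for readers following the RFC back-and-forth, the practical difference is only in the first character of a hostname label. A rough illustration with deliberately simplified regexes -- my own, ignoring label and overall length limits and other subtleties of the grammars:)

```python
import re

# RFC 1034: a hostname label must start with a letter.
RFC1034_LABEL = re.compile(r"^[A-Za-z]([A-Za-z0-9-]*[A-Za-z0-9])?$")
# RFC 1123: a hostname label may start with a letter or a digit.
RFC1123_LABEL = re.compile(r"^[A-Za-z0-9]([A-Za-z0-9-]*[A-Za-z0-9])?$")

for label in ["example", "1and1", "163"]:
    print(label,
          bool(RFC1034_LABEL.match(label)),   # "1and1" and "163" fail here...
          bool(RFC1123_LABEL.match(label)))   # ...but pass here
```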
No, this is not what I wrote :) What I wrote equates to "do not write more tests than necessary to test realistic use cases". Beyond a certain number of tests, they do not add value and only increase maintenance and cognitive costs. |
Actually, this #253 (comment) relates to the second failing test. |
@epoberezkin May I again point out that I have added test cases for real, valid email addresses because implementations were rejecting real, valid email addresses? What do you think these tests are for? This bug actually affected me, an actual user. Validating actual customer email addresses is certainly "actual usage of the spec". I have included tests that catch a real bug in a real implementation that I really used. I cannot fathom why this is so controversial. |
If you think that it's ok for your email address validation to only support RFC 1034 domains, then you are living in a fantasy world where no customers have email addresses at RFC 1123 domains. This is false; we have customers with email addresses at RFC 1123 domains, and I had a schema implementation break because of it. The JSON schema spec is only as useful as its application in the real world! |
@bvisness This is what I am looking at. Real and valid are not the same thing. There is no universal definition of validity; there are various RFCs that define it. The fact that an email is real and used in some context does not mean that it should be valid against JSON Schema's email format. If the JSON Schema definition is obsolete, then the spec should be updated and then the test suite should be updated, not the other way around. Most validators allow replacing the implementation of format validation (or even turning it off completely) - so you can always supply an email definition that works for your usage scenario, but it does not mean that any particular library has a bug that needs to be fixed. |
... and even if a validator needs to be fixed, proposing a PR that fixes a real issue is more valuable than a test case that makes this test suite deviate from the spec. We're only talking about the [...] domain part here, btw; a quoted string should be valid, it seems. |
Whether we're talking about local part or domain part, every case I've provided is a valid RFC 5322 email address. Maybe RFC 5322 truly doesn't specify anything about the domain part - in that case, implementations should not reject emails because of the domain. There is no need to update any spec to address the test cases I have added. I've already addressed this issue in my actual validator of choice: santhosh-tekuri/jsonschema#20 All I want is for the same bug not to occur in other implementations! |
@bvisness So I still haven't gotten a chance to actually form an opinion here, so apologies :/ I would say that I agree the test suite should not be ahead of the spec though -- so if it's "reasonable" to interpret the spec in some way (even if it's ridiculous, like your hostname RFC bug), we definitely should fix it, but not here, first in the spec, then in the draft that's published with the fix. But yeah will try and carve some time out to read your examples in the next couple of days. @epoberezkin I again still would like to see a PR removing ajv from the build, regardless of this ticket. I still can't tell why you're pushing back on that. Again, your validator should not be more special than any other with regards to a new PR that adds previously unseen tests, and if we want to hit #20, we can do so in a way that does not disrupt changes to the test suite. |
I am only pushing back against these additional tests that possibly don't match the current spec. I agree that we should split out running validators against the test cases, so that PRs only fail on structural validation of the test cases. We can introduce a more complex grid later, but I was going to split running all tests into an optional job so we can still screen PRs like this. |
Here's my opinion: implementations that choose to validate format (and "email" in particular) should probably all behave the same. First, we should ensure that the spec references the correct production. Second, we should add as many tests as necessary to ensure that implementations are using the correct production. Finally, as we add more examples it's likely we'll get something wrong, so we should incorporate an email address test library to ensure that our examples are correct. |
OK cool! Phew, glad we agree -- great. That helps me focus on just whether I agree these tests are conformant or not! Sorry for the confusion. @awwright thanks for weighing in, that's roughly what I think too (though I think the first part of your comment is not very likely to ever be true in practice :D) So let's get concrete here again, because I think I'm coming around to @epoberezkin's concrete recommendation but want to make sure we're saying exactly the same thing (which maybe for the last part of this we are not):
In the face of ambiguity, both behaviors are therefore "acceptable" for the current drafts. So first, if we want one behavior or the other, we update the spec to make sure it clearly says so. That I think we already have, since we no longer ask for RFC 1034. So regardless: these tests should be merged in for draft 8. As for earlier drafts, again, either behavior sounds acceptable (agree? disagree?) but validators still need to "pick" one, and in theory be testing it -- e.g., in my validator, I'd like the liberal behavior, so I want tests making sure that happens even on draft 7. So in some sense there are 2 sets of mutually incompatible results, and a need for each validator to pick one of the two, yes? I think we'd all agree that a validator that, say, returned false for one of the 2 disputed examples and true for the other is definitely not in compliance. So, ridiculous as it sounds, maybe what we want is two conflicting tests with opposite results, and then each downstream user runs the ones for the behavior they want and skips the other test? Otherwise you run into the fundamental problem that the suite tries to solve, which is downstream validators writing their own not-shared tests. |
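(Editor's aside: to make the "two conflicting tests" idea concrete, this is roughly what such a pair could look like, sketched as Python literals in the suite's group shape; the descriptions and the address are purely illustrative, not proposed wording.)

```python
# Hypothetical pair of mutually exclusive optional/format groups; a downstream
# validator would run only the group matching the behaviour it chose.
liberal_email_group = {
    "description": "email with a digit-leading domain label (RFC 1123 reading)",
    "schema": {"format": "email"},
    "tests": [{"description": "digit-leading domain label is accepted",
               "data": "joe@1example.com", "valid": True}],
}

strict_email_group = {
    "description": "email with a digit-leading domain label (RFC 1034 reading)",
    "schema": {"format": "email"},
    "tests": [{"description": "digit-leading domain label is rejected",
               "data": "joe@1example.com", "valid": False}],
}
```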
@Julian correct me if I am wrong (it’s a bit confusing :) - you say that a validator can freely choose to return true or false in the scenario of this test, and in both cases it would be correct from draft-07 point of view, but not from draft-08 point of view where only one result should be correct. If so, then I believe that this test should NOT be added to draft-07 (as both interpretations ARE valid, none is REQUIRED to be correct, so no need to test it either way). This test should be added to draft-08 though. As for validators introducing their own tests - I think they all do it already, so it’s not a problem. Tests appear in validators in the process of addressing the issues that can be very specific to only one implementation - I don’t think it is practical to add them all to the suite. Although if you think otherwise, I don’t mind adding Ajv tests here - some of them are VERY specific though, and many people would not understand why they are needed. |
Yeah, I think I would rather have no tests than two sets of tests (especially for an optional part of the spec). I still believe that my new test cases (though esoteric) are valid RFC 5322 emails, even though JSON Schema's understanding of a hostname is flawed. I read the JSON Schema spec as validating hostname and email independently, but at this point it really seems like it's not worth it anyway. I balk at the idea of adding a full email test suite to any version of the JSON Schema tests. The fact that such a comprehensive test suite would be required to enforce an optional "convenience" feature of the spec makes me think that JSON Schema shouldn't have … |
json-schema-org/json-schema-spec#716 I'd have at-mentioned you but I had forgotten this was a test suite issue and couldn't find it when I put up the PR. |
This is one of the most likely directions for … My personal preference would be to outright remove the assertion behavior (instead of having it be optional, with randomly variable support in practice), say that implementations MAY offer validation as an extension but authors MUST NOT rely on any such thing, and instead encourage applications to leverage … |
That sounds like an excellent direction to me! |
@epoberezkin are you saying you have tests beyond those in this repo? I definitely don't, not ones for schema behavior, just for my own language-level API, so if so yes definitely would love to see them in this repo! Obviously hard to know without seeing them but it seems likely it should at least be possible to describe each one in a transparent enough way. |
@Julian yes, lots of JSON schema tests in Ajv repo. I will commit them here in batches. |
Note that draft … |
Per RFC 1123, hostname labels are allowed to start with either a letter or a digit. The RFC 5322 email spec specifies that the domain part of an email is subject to the modifications in RFC 1123. However, the tests for hostnames and emails do not include any examples of hostname labels beginning with a digit.
This PR adds cases to the hostname and email tests for all draft versions.
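(Editor's aside: as an illustration of the kind of case being added -- the exact strings in the PR may differ -- here is the shape of a digit-leading-label test for each format, written as Python literals rather than the suite's raw JSON.)

```python
# Hypothetical examples of valid cases with a domain label starting with a digit.
hostname_case = {
    "description": "hostname with a label starting with a digit is valid",
    "data": "1host.example.com",
    "valid": True,
}

email_case = {
    "description": "email whose domain label starts with a digit is valid",
    "data": "joe.bloggs@1example.com",
    "valid": True,
}
```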