Attempt to do “product design” of JSON Schema addressing #725
Not sure what this label means :) |
Problem
The conflict among the maintainers, contributors and implementers has been ongoing since before I learned of JSON Schema's existence. We all seem to have forgotten that JSON Schema was created for some purpose, and the attempts by many maintainers to shut down and ridicule any dissent from the current spec direction go beyond what is acceptable in professional conversation, both in tone and in spirit. We definitely do not go by any consensus model, as @ucarion suggested.
This attitude of JSON Schema maintainers has already driven away many valuable contributors (e.g. @fge) and I am very close to not caring about the future of the current specification as well.
The current version of the spec is already too difficult to implement and I can make a safe bet that there is not a single JSON Schema validator that fully supports the spec, even excluding optional features. Draft-08 raises the implementation game to the next level. Even though I am supporting the validator that is more used than any other JS validator, my concerns about implementation complexity are completely ignored.
Possible way forward
I do not think it is a lost cause though. I believe that by doing a "product design" of JSON Schema we can align all the interested parties and agree on a simpler version of the spec that achieves 95% of the current value at 50% of the implementation and cognitive complexity.
By product design I mean that for each group of users (new adopters, users, advanced users, spec maintainers, implementers) we would define the "objectives" that the spec delivers to them (in the language of external outcomes, not in terms of using JSON Schema) and the user stories (again as what they want to achieve, without any references to the JSON Schema vocabulary, which is equivalent to product UX). This work has never been done, and the lack of this clarity leads to the current conflicts and feature bloat.
These scenarios should start from high level and general, such as:
to very specific
Once we agree on the objectives and scenarios we may need, we also need to assign a value to each scenario for the community, based on some agreed criteria - including the number of groups that need this scenario, the benefits, etc.
Once we understand the value of all required scenarios, we can consider various options for how these scenarios can be achieved - including the current solutions that exist in the spec, the solutions proposed in draft-08, and any alternatives. For each solution we need some implementation complexity assessment, but it should only be done by the core maintainers of widely used validators - theoretical considerations of implementation complexity are not helpful.
The rest is simple - we define the priority of each scenario as the ratio of benefit to implementation complexity, and if this ratio is lower than an agreed threshold the feature should not be included in the spec, whether the feature is old or new.
As I wrote, I believe that this product analysis of the JSON Schema spec will lead to a radical simplification of the spec, will retain 95% of the value at half of the implementation cost, will make implementations more consistent, and will eventually lead to JSON Schema becoming an RFC - a win for everybody.
Alternative ways forward
|
First, this seems somewhat ambitious. Sometimes that's warranted! But right now I know many of us are fatigued.
Consensus is indeed an art; I've tried very hard to make sure we're consensus-driven in the IETF style (as opposed to other styles like at ISO, ECMA, or similar alternatives). It's caused a large amount of strife at times but remarkably we've largely come around to the same opinions by talking through things. However, we're primarily a forum for implementations, and a key thing to keep in mind is that mostly we need consensus to change things: we want some certainty that when we make a change, we know it's an improvement.
In the typical process of standards development, I'd say you're looking for a list of use cases. Sometimes specifications formally publish a document of use cases, sometimes it's just the examples, but they're always important. The use cases all follow from some sort of charter or purpose.
The purpose of JSON Schema is to make assertions about JSON documents. That's it. This makes JSON Schema useful for so many things: validating user input, publishing expectations that servers have of clients, and autogenerating documentation (among other things). These all come from JSON Schema's ability to make assertions about JSON documents. And if it doesn't have to do with that, then we say good luck, but that's out-of-scope.
Now there's an implicit part of this, that we're publishing an Internet standard, so it has to fit into the Internet and Web architecture. This means things like:
And so on. These don't advance the purpose per se, but we have different routes to solve the problem, and it's just easier and more accessible for users if we adopt the Internet ecosystem. E.g. we could invent our own form of identifier, but if we (carefully) use the Internet's preferred identifier (the URI), JSON Schema is suddenly more useful as an Internet standard.
So, if we can stay within this process, and if you can trust the IETF process, then I'd say let's try to come up with a list of use cases for JSON Schema. The wiki is a pretty good place to publish research like this; let's make a page there. List a use case, a few tests (positive and negative), and comparisons to other technologies (my favorite is HTML).
That said, I'm skeptical there's as much of an implementation problem as you seem to describe.
First, it's our job as implementors to take on complexity on behalf of users. This is the Priority of Constituencies. This is not to dismiss all concerns; if there's a good argument that a feature is causing performance issues for users, that's something to consider.
Second, if a feature is making life difficult for schema authors, then sure, let's work on that. But I'm not sure it gets much easier than "use
Third, I don't even think it's that difficult to implement. I've written two validators that work as described in the recent drafts. This process only requires I index the names (the
There are lots of arguments to be had about theoretical purity ("BUT what if you have an $id naming a property path!!!!"), and let's consider those, but at the end of the day that's our least important concern.
Finally, sometimes there are just multiple correct ways of doing things, and not everyone is going to try the same thing first. If we've picked the best alternative and given people ways to recover from making the wrong first guess, that's going to be the best we can do.
So in short, we're kind of burned out, but I was actually looking at starting a Wiki page documenting how references are used "in the wild"; if you want to spearhead that & a list of use-cases (maybe including things people wanted to do but couldn't figure out), let's go for it. |
@awwright thanks for the reply
There are two areas where consensus has not been reached: addressing (that's what this issue is about) and schema re-use - extending properties. So the suggestion was to address this bit by bit, not all at once. Agreed on "assertion" being the high-level scope.
I agree with that; I believe that in our case JSON Schema users are authors, in most cases. For users/authors to benefit, the main value of JSON Schema - consistent assertions across platforms - should be seen as a higher priority than flexibility and feature set. As an example, I don't see how users can benefit from the unevaluatedProperties keyword, even if it can solve a real problem, if it is not consistently supported across all/most platforms. For users to benefit from new features, you need a commitment from the core maintainers of validators used on various platforms (at least one per platform) to support a feature within a certain amount of time. If there is no such commitment, introducing such a feature to the spec would not benefit users, quite the opposite. So the "implementors over specifiers" principle is not always followed here.
Without implementations consistently supporting a feature across all platforms, authors' lives are definitely going to be difficult. I can see a lot of attention paid to the ease and flexibility of writing schemas, and insufficient attention to the implementation complexity that it may cause.
In 2015 I thought Ajv would be a project for several weekends. And then I spent 2 years fixing various scenarios reported by users in the $ref area. I previously suggested running any validator against these tests; they are in the same format as in the test suite: https://github.com/epoberezkin/ajv/tree/master/spec/tests (they are using these remotes: https://github.com/epoberezkin/ajv/tree/master/spec/remotes). When I ran other JS validators against the Ajv test suite, none of them was passing the $ref tests (https://github.com/epoberezkin/test-validators). The whole motivation for creating Ajv was that I could not find a single validator that consistently supported $ref (and I tested 11 JS validators against my schemas - none of them complied with the spec). So while I agree that implementing the spec to support the most common scenarios is relatively straightforward, implementing it to support all scenarios for combinations of recursion with base URI change becomes very difficult. I am happy to talk further once you've tested your validators and can confirm they pass all these tests and, if they don't, whether it was easy to fix.
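To make "recursion with base URI change" concrete, here is a minimal sketch in Python. It is not how Ajv or any other validator resolves references; the function name `absolute_refs`, the schema and the example.com URIs are all made up for illustration. It only shows that the same relative $ref text resolves to different targets depending on which $id is in scope.

```python
from urllib.parse import urljoin


def absolute_refs(schema, base=""):
    """Return every "$ref" found in `schema`, resolved against the base URI in scope.

    Deliberately naive: it walks every value, so a property that happens to be
    named "$id" or "$ref" would be misread as a keyword.
    """
    found = []
    if isinstance(schema, dict):
        if isinstance(schema.get("$id"), str):
            # "$id" changes the base URI for this subschema and everything below it.
            base = urljoin(base, schema["$id"])
        if isinstance(schema.get("$ref"), str):
            found.append(urljoin(base, schema["$ref"]))
        for value in schema.values():
            found.extend(absolute_refs(value, base))
    elif isinstance(schema, list):
        for item in schema:
            found.extend(absolute_refs(item, base))
    return found


# Hypothetical schema: the nested "$id" moves the base URI into "folder/".
schema = {
    "$id": "http://example.com/root.json",
    "properties": {
        "owner": {"$ref": "user.json"},
        "tree": {
            "$id": "folder/tree.json",
            "definitions": {
                "node": {
                    "properties": {
                        # Recursive reference: points back into tree.json, not root.json.
                        "children": {"items": {"$ref": "#/definitions/node"}}
                    }
                }
            },
            "properties": {
                "root": {"$ref": "#/definitions/node"},
                "owner": {"$ref": "user.json"},
            },
        },
    },
}

refs = absolute_refs(schema)
for ref in refs:
    print(ref)
```

The two "user.json" references end up at different absolute URIs, and the recursive "#/definitions/node" reference is anchored to folder/tree.json rather than to the root document. A real validator also has to locate, cache and compile each target, which is where I would expect the difficult cases to come from.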
That would indeed help. |
From @erosb in #727 (consolidating here): I thought it might make sense to give some feedback from an implementor's point of view about the problems raised and the effort involved. First, I also have the feeling that the JSON Schema specification doesn't have accurate-enough goals. Many keywords seem to be quite ad hoc and unjustified (like
So having formally defined goals & use cases, and keeping these in mind while accepting or rejecting a specification change, would be beneficial.
Complexity-wise: up to draft-8 I could survive; the library I maintain now supports 3 draft versions (4, 6, 7) and I didn't even have to make breaking API changes. That's good. The most painful point was/is understanding, using and implementing
Simplifying the specification has multiple effects:
|
@erosb I'm not sure I agree with 100% of what you said, though I likely agree with 99% of it, but "let's be more careful with backwards incompatible changes", if I oversimplify the parts I really strongly agree with, is way different from anyone else's points I've seen on these long drama posts. That one I definitely agree with, if it's the thrust of your comment. |
@erosb @Julian in the forthcoming draft, we made a point to make the two potentially incompatible changes ( New implementations do not need to actively support the old keywords, and old implementations can continue to support them even in newer drafts. So you can't rely on interoperability between old and new, but if you were using the old keywords and you know that your implementation continues to support them, you can rely on that. And more importantly, you don't suddenly get different behavior from a keyword that looks the same.
|
I'm deeply sorry if you feel I've contributed to the above.
In terms of working principles, I want to present a principle I work with: burden of proof. As the maintainers of JSON Schema, we are actively engaged with the community, on Slack and elsewhere: replying to questions, helping people, monitoring StackOverflow. We see the community at work. I feel the developments since draft-4 are evidence that this model has worked, as we look for general consensus. Issues are a platform for open discussion on changes, plus the 1 month mandatory review and feedback period before a new draft is published.
When someone from the community presents a requirement directly, unless we have experienced and seen otherwise, it looks like an n = 1 issue. If I have not seen a requirement from the community for a need, I feel it's reasonable to request evidence from an issue author to support their request. Burden of proof not for editors, but for new suggestions. To me, this is how I understand the consensus model in practice, given that all editors working on this are not being paid to do so. It's a community effort, supported by individuals giving up their own time to make it a priority.
I can see a situation where, if a group of implementers get together and all say "this is too complex", and another group of implementers don't jump up to refute that, we may have to re-evaluate changes made in draft-8. Although I'm the first to argue that although the spec is a draft, it has people who use it in production and we should treat it as such, BUT, we know there's a lag between publication and support (as you stated). If during the 1 month review process, lots of implementers say this is not an OK change, then we have to listen to that. Currently I'm hearing comments from both ends of the spectrum. One of the reasons for allowing the 1 month review process is to try and make decisions based on working code. (iirc that was an IETF principle.)
@epoberezkin Maybe now is the right time to do a scoping exercise. I think it's helpful to look at @ucarion's closing comments for #710...
Maybe we could settle on making the path to RFC a stronger consideration when evaluating what goes into draft-9 and 10. I think this is open for discussion, and should reflect feedback on draft-8.
I hope you feel this is a fair and measured response, and that we can continue to move forward in a collaborative spirit. I certainly don't want anyone to feel alienated or unwelcome. |
I think @Relequestual addressed this thoroughly, and it's sat open for 8 months now without further engagement. Closing. |
By “addressing” I mean anything that we achieve today via IDs and REFs. I, honestly, don’t have a full list of these things that we achieve.
I mean the standard way product design is usually done: by defining objectives/goals/outcomes and writing user stories for various groups, without any references to UX (which in this context would be $id, $ref etc.)
I will elaborate more on why I believe it is needed, but please reflect on how what we do here is similar to creating a user facing product via mocking up UX, skipping the “product” part entirely. It rarely leads to good results and invariably leads to feature bloat.