-
Notifications
You must be signed in to change notification settings - Fork 65
Use 294 as the partial status code #346
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
spec/GraphQLOverHTTP.md
Outdated
[IETF RFC2616 Section 6.1.1](https://datatracker.ietf.org/doc/html/rfc2616#section-6.1.1) | ||
states "codes are fully defined in section 10" implying that though more codes | ||
are expected to be supported over time, valid codes must be present in this | ||
document. For compatibility reasons, using HTTP status `203` which has no |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure it needs to be read like that.
Later in section 6.1.1, they keep an open door for extension:
HTTP status codes are extensible. HTTP applications are not required
to understand the meaning of all registered status codes, though such
understanding is obviously desirable. However, applications MUST
understand the class of any status code, as indicated by the first
digit, and treat any unrecognized response as being equivalent to the
x00 status code of that class, with the exception that an
unrecognized response MUST NOT be cached. For example, if an
unrecognized status code of 431 is received by the client, it can
safely assume that there was something wrong with its request and
treat the response as if it had received a 400 status code. In such
cases, user agents SHOULD present to the user the entity returned
with the response, since that entity is likely to include human-
readable information which will explain the unusual status.
Additional status codes that are not part of section 10 are defined in RFC 6585, which officially Updates: 2616
.
But new status codes are also defined outside of the scope of 2616
, e.g. status code 207
is defined in RFC4918 without any mention of "updating" RFC 2616
, it's only referenced in different context there.
So, going by that, we wouldn't necessarily be restricted to the RFC 2616 status codes - but of course, introducing a new one might bring it's own risk.
That said, RFC 4918 reads to me like it only applies to very specific contents types, so maybe 207
with a different content type might actually not provoke a conflict?
A Multi-Status response conveys information about multiple resources
in situations where multiple status codes might be appropriate. The
default Multi-Status response body is a text/xml or application/xml
HTTP entity with a 'multistatus' root element. Further elements
contain 200, 300, 400, and 500 series status codes generated during
the method invocation. 100 series status codes SHOULD NOT be recorded
in a 'response' XML element.
My thoughts: 203 Non-Authoratative InformationBenefits:
Cons:
206 Partial ContentBenefits:
Cons:
207 Multi-StatusBenefits:
Cons:
294 Partial Response (or another GraphQL-defined code)Benefits:
Cons:
|
It should be noted that since most all requests will be HTTPS, the above firewall considerations only apply to application-level firewalls (Secure Web Gateway), as regular firewalls would not have access to the response code. |
I would select 207 because its exact description matches our use case. But first I would prefer to see some testing with commonly used Secure Web Gateways. I wouldn't worry about server-end proxies because the GraphQL server operator would set that up as needed (which may include changing the response code of the GraphQL server to always be 200). And the use of Secure Web Gateways are probably limited to large companies or governments which are likely enough to have a team ready to allow (or disallow) access to specific websites. So testing becomes less important. The only leaves other proxies, such as a cell provider might install to make their network appear faster, which I don't think proxies would be an issue with 207. So I'd go with 207. (Just my two cents.) 203 is probably the "safest" choice.... |
Note that 201/202 also are as "safe" of a selection as 203, perhaps moreso. But again, the meaning doesn't carry over to GraphQL's intended use at all. |
Super excited we are exploring this! This has been a very common pain points for new Apps adopting GraphQL from REST. Very often, we hear developers complain about everything is 200s, and their reliability metrics from REST can't be reused anymore. This effort would be a great help to mitigate this pain point |
While 203 sounds safest from a proxy perspective, anyone encountering this status in production (without particular GraphQL knownledge) and checking the MDN for HTTP 203 would be confused. They would think that an HTTP proxy in their infrastructure has modified the headers or the response body. So I agree with @Shane32 on that one, it's important that the HTTP semantics match the GraphQL intent. 207 is definitely a closer match. Right now the MDN shows an example of a WebDav response with the status information enclosed. I think several GraphQL implementations do convey similar information as error extensions (for example, graphql-java's. I'm ruling out 206 (because of Content-Range) and 294 (for reasons explained in previous comments). Taking a step back, I think that one of the goals of this spec is to make information about the GraphQL exchanges "leak" as much as possible in the HTTP transport when semantics match; this would help existing tools and newcomers to better understand what's happening without having to parse the actual document. Even if HTTP proxies work fine with the chosen HTTP status, I'm wondering if existing observability tools/dashboards/etc. will really consider another 20x in a different fashion. For example, if responding with HTTP 207 somehow breaks observability dashboards and confuses tools, then this would be a net negative for production usage. |
This was my main concern with using 207 - it seems 100% related to WebDAV and thus tools may not expect it to be used elsewhere, may expect it to contain XML, and might even try and parse the XML. Further, I'd argue that 207 doesn't align completely with GraphQL's partial success; it's defined in RFC4918 as:
i.e. it's more designed for batched processing rather than GraphQL's nested (and dependent/chained) resolution model. I agree that 203 doesn't align semantically 100% (though I think there is a little more than 0% alignment - if errors occur then the data returned is kind of non-authoritative because you couldn't actually retrieve all the data), but I'm still leaning towards it from a compatibility POV. Here's a summary from my POV: ❌ 201 Created - explicitly means "created" and I think it might be useful in future (e.g. for creating a continuable streaming endpoint to represent a subscription or incremental delivery) so I wanted to avoid using that. Moreover it seems 100% inappropriate for use in query operations with no side effects. ❌ 202 Accepted - explicitly means it has not been completed - not appropriate. 🤷♂️ 203 Non-Authoritive Information - slight alignment (due to errors - the authoritive data isn't available) and doesn't have any expectations. ❌ 204 No Content - definitely doesn't work! ❌ 205 Reset Content - no content ❌ 206 Partial Content - Requires 🤔 207 Multi-Status - stronger (but still loose) alignment, tightly tied to WebDAV, has many expectations ❌ 208 Already Reported - inappropriate 🥺 294 Partial Success - 100% semantic alignment because it specifically means "partial success" in the GraphQL sense; no existing tooling expectations, HTTP RFC says to treat as success, unlikely to conflict with other uses, may be a challenge to standardize Ideally, rather than opining on this, we should actually test it - we need to find out how well the various status codes work with native HTTP clients in different environments (browser, android, iOS, various server-side programming languages, desktop apps, etc), different proxies (including things like Cloudflare), different API gateways, different observability tooling, IDSes, etc. Ideally this would be crowd-sourced. Alternatively we just pick one and see what happens. If we're picking one I would either pick 203 (has the least risk of conflict) or 294 (explicit new status code). I could be convinced to use 207 if it were demonstrated that it doesn't break existing tooling (and that 294 does). People do make and use their own HTTP status codes, a number of these can be seen on Wikipedia, so I don't think we should 100% count it out. |
Hi y'all 👋 As we've just discussed in the graphql-over-http wg, the plan is to go with |
The decision to go 294 was based on the discussions above and related conversations. Further, since 207 relates to multiple independent operations I think it's worth reserving for GraphQL batching which does in fact represent multiple independent operations (as opposed to regular GraphQL execution where the operations are interdependent). If you can find anything that 294 breaks, please let us know ASAP! |
If the _GraphQL response_ contains the {data} entry and it is not {null}, then | ||
the server MUST reply with a `2xx` status code and SHOULD reply with `200` | ||
status code. | ||
the server MUST reply with a `2xx` status code. | ||
|
||
If the _GraphQL response_ contains the {data} entry and does not contain the | ||
{errors} entry, then the server SHOULD reply with `200` status code. | ||
|
||
Note: There are no circumstances where the GraphQL specification allows for a | ||
response having {data} as {null} without {errors} being present. | ||
|
||
If the _GraphQL response_ contains both the {data} entry (even if it is {null}) | ||
and the {errors} entry, then the server SHOULD reply with `294` status code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be simplified. Whether {data} is {null} should have no bearing on response semantics and should be removed to simplify the specification -- if {data} exists, then there is no request error and 2xx is required, 200 if no {errors}. If {data} exists alongside {errors}, that's a 294; but if there's {errors} without {data}, that's a 4xx. I would not add specifications for an illegal response, where {data} is {null} and {errors} does not exist.
If the _GraphQL response_ contains the {data} entry and it is not {null}, then | |
the server MUST reply with a `2xx` status code and SHOULD reply with `200` | |
status code. | |
the server MUST reply with a `2xx` status code. | |
If the _GraphQL response_ contains the {data} entry and does not contain the | |
{errors} entry, then the server SHOULD reply with `200` status code. | |
Note: There are no circumstances where the GraphQL specification allows for a | |
response having {data} as {null} without {errors} being present. | |
If the _GraphQL response_ contains both the {data} entry (even if it is {null}) | |
and the {errors} entry, then the server SHOULD reply with `294` status code. | |
If the _GraphQL response_ contains the {data} entry, then | |
the server MUST reply with a `2xx` status code. | |
If the _GraphQL response_ contains the {data} entry and does not contain the | |
{errors} entry, then the server SHOULD reply with `200` status code. | |
If the _GraphQL response_ contains both the {data} entry (even if it is {null}) | |
and the {errors} entry, then the server SHOULD reply with `294` status code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See https://spec.graphql.org/October2021/#sec-Response-Format
If the request included execution, the response map must contain an entry with key
data
.
If the request "executes", then it's 2xx.
If the request failed before execution, due to a syntax error, missing information, or validation error, this entry must not be present.
If it doesn't execute, it's a 4xx / 5xx.
Neither depend on whether data
is null
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do have this:
If an error was raised before execution begins, the data entry should not be present in the result.
If an error was raised during the execution that prevented a valid response, the data entry in the response should be null.
So we can only know that execution may have started. Seems like 2xx to me, anyway. And it makes the spec simpler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the simplification
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your diff reads really confusing because of some weird GitHub glitch, but I think what you’re saying is to change the requirement for when data is null from “allowed to use any status code” to “required that it be a 2xx (and preferably 294)”. This has been discussed in the past and some people felt that data:null should code as an error (there’s no data so there’s no partial success!), whereas other felt it should code as a partial success (execution occurred!). The original express-graphql coded it as a 400 IIRC (maybe a 500? It definitely wasn’t 2xx whatever it was) and many servers followed that pattern, hence why the spec text carefully excludes that case allowing the implementer to make their own choice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would not add specifications for an illegal response, where {data} is {null} and {errors} does not exist.
There is no specification for this, the schema adds a non-normative note indicating it’s not possible in case the reader isn’t aware and wonders what to do in that perceived edge case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way I see it, data
being null is a simple result of null bubbling, as is the case anywhere else in the schema. So if your schema is:
type Query {
dog(id: ID!): Dog!
}
type Dog {
id: ID!
name: String!
}
And you request { dog(id: "3") { name } }
but either the dog's name isn't defined or the dog can't be found, null bubbling occurs and data
ends up being null
. This is no different than if it were to occur anywhere else in the schema. If dog
returned a nullable Dog
type, the client must check if dog
is null. Basically, any GraphQL consumer must know how to handle nullable types (including in this case the root type itself) and/or look for the errors
property to see if any errors were returned.
I just don't see why there is an exception to the rule for a root type, and further, an exception that only occurs in the scenario where the GraphQL server were to provide an illegal response (a null data
without an errors
key).
The original express-graphql coded it as a 400 IIRC (maybe a 500? It definitely wasn’t 2xx whatever it was) and many servers followed that pattern, hence why the spec text carefully excludes that case.
Understood but this does not apply. Any backwards compatibility should be defined for application/json
, which we have defined to always return 200 for reasons already discussed. The language here is for the new response type which uses a different set of rules for status codes.
This has been discussed in the past and some people felt that data:null should code as an error (there’s no data so there’s no partial success!), whereas other felt it should code as a partial success (execution occurred!).
While there's no partial data, it cannot be known that there's no partial success. Granted this may be a bad server design, but if we extended the above sample to mutations, it is certainly possible that a partial save occurred, but null bubbling occurred and obscured the result. Sample:
type Mutation {
setDogName(id: ID!, name: String!): Dog!
}
mutation {
setDogNameA: setDogName(id: "234", name: "Jack") { id } # sample: successful
setDogNameB: setDogName(id: "345", name: "Jack") { id } # sample: fails due to invalid id
}
Here null bubbling would have occurred causing data: null
even though setDogNameA
executed successfully.
See changes for explanation.
Note: this will need testing before we can merge it.