Skip to content

Use 294 as the partial status code #346

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: main
Choose a base branch
from
Open

Conversation

benjie
Copy link
Member

@benjie benjie commented May 8, 2025

See changes for explanation.

Note: this will need testing before we can merge it.

@benjie benjie marked this pull request as ready for review May 8, 2025 19:22
[IETF RFC2616 Section 6.1.1](https://datatracker.ietf.org/doc/html/rfc2616#section-6.1.1)
states "codes are fully defined in section 10" implying that though more codes
are expected to be supported over time, valid codes must be present in this
document. For compatibility reasons, using HTTP status `203` which has no
Copy link

@phryneas phryneas May 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure it needs to be read like that.

Later in section 6.1.1, they keep an open door for extension:

HTTP status codes are extensible. HTTP applications are not required
to understand the meaning of all registered status codes, though such
understanding is obviously desirable. However, applications MUST
understand the class of any status code, as indicated by the first
digit, and treat any unrecognized response as being equivalent to the
x00 status code of that class, with the exception that an
unrecognized response MUST NOT be cached. For example, if an
unrecognized status code of 431 is received by the client, it can
safely assume that there was something wrong with its request and
treat the response as if it had received a 400 status code. In such
cases, user agents SHOULD present to the user the entity returned
with the response, since that entity is likely to include human-
readable information which will explain the unusual status.

Additional status codes that are not part of section 10 are defined in RFC 6585, which officially Updates: 2616.
But new status codes are also defined outside of the scope of 2616, e.g. status code 207 is defined in RFC4918 without any mention of "updating" RFC 2616, it's only referenced in different context there.

So, going by that, we wouldn't necessarily be restricted to the RFC 2616 status codes - but of course, introducing a new one might bring it's own risk.

That said, RFC 4918 reads to me like it only applies to very specific contents types, so maybe 207 with a different content type might actually not provoke a conflict?

A Multi-Status response conveys information about multiple resources
in situations where multiple status codes might be appropriate. The
default Multi-Status response body is a text/xml or application/xml
HTTP entity with a 'multistatus' root element. Further elements
contain 200, 300, 400, and 500 series status codes generated during
the method invocation. 100 series status codes SHOULD NOT be recorded
in a 'response' XML element.

@Shane32
Copy link
Contributor

Shane32 commented May 9, 2025

My thoughts:

203 Non-Authoratative Information

Benefits:

  • Proxies, firewalls and clients are unlikely to refuse a document because the code is 203; this would be a normal and expected status code to see, which does not imply any improper behavior.

Cons:

  • Name and description do not have any relation to GraphQL partial response
  • Proxies may be in use along a HTTP communication path, and may change responses to 203 or consume it. However, for server side logging, it doesn't matter what a proxy does with the status code after it's left the server

206 Partial Content

Benefits:

  • Name somewhat matches intended use

Cons:

  • Proxies, firewalls or even clients may attempt to validate the content, and (1) seeing no Content-Range response header, or (2) not expecting a partial response, reject the request

207 Multi-Status

Benefits:

  • Name and description closely matches intended use
    • Although '207' is used as the overall response status code, the recipient needs to consult the contents of the multistatus response body for further information about the success or failure of the method execution. The response MAY be used in success, partial success and also in failure situations.
  • The status code is commonly in use and should be supported by proxies and firewalls
  • The status code is defined in WebDAV's RFC, layered under HTTP, and not the HTTP protocol itself, much the same way this GraphQL RFC is layered under the HTTP protocol. As such, proxies should be oblivious to the content and pass along the request.

Cons:

  • Firewalls may attempt to validate the content type of data returned with a 207 response type (at a minimum), and finding it is not XML, reject the response.

294 Partial Response (or another GraphQL-defined code)

Benefits:

  • No overlap with other status codes
    • Logging providers can choose to recognize 294 as a GraphQL-specific response code

Cons:

  • Firewalls may reject responses with this unrecognized status code
  • Exactly duplicates the purpose of status code 207
  • Getting IANA approval for a new officially recognized status code may prove difficult, given the presence of 207

@Shane32
Copy link
Contributor

Shane32 commented May 9, 2025

It should be noted that since most all requests will be HTTPS, the above firewall considerations only apply to application-level firewalls (Secure Web Gateway), as regular firewalls would not have access to the response code.

@Shane32
Copy link
Contributor

Shane32 commented May 9, 2025

I would select 207 because its exact description matches our use case. But first I would prefer to see some testing with commonly used Secure Web Gateways. I wouldn't worry about server-end proxies because the GraphQL server operator would set that up as needed (which may include changing the response code of the GraphQL server to always be 200). And the use of Secure Web Gateways are probably limited to large companies or governments which are likely enough to have a team ready to allow (or disallow) access to specific websites. So testing becomes less important. The only leaves other proxies, such as a cell provider might install to make their network appear faster, which I don't think proxies would be an issue with 207. So I'd go with 207. (Just my two cents.)

203 is probably the "safest" choice....

@Shane32
Copy link
Contributor

Shane32 commented May 9, 2025

Note that 201/202 also are as "safe" of a selection as 203, perhaps moreso. But again, the meaning doesn't carry over to GraphQL's intended use at all.

@Keweiqu
Copy link

Keweiqu commented May 15, 2025

Super excited we are exploring this! This has been a very common pain points for new Apps adopting GraphQL from REST. Very often, we hear developers complain about everything is 200s, and their reliability metrics from REST can't be reused anymore. This effort would be a great help to mitigate this pain point

@bclozel
Copy link

bclozel commented May 16, 2025

While 203 sounds safest from a proxy perspective, anyone encountering this status in production (without particular GraphQL knownledge) and checking the MDN for HTTP 203 would be confused. They would think that an HTTP proxy in their infrastructure has modified the headers or the response body. So I agree with @Shane32 on that one, it's important that the HTTP semantics match the GraphQL intent.

207 is definitely a closer match. Right now the MDN shows an example of a WebDav response with the status information enclosed. I think several GraphQL implementations do convey similar information as error extensions (for example, graphql-java's. ErrorType). I'm wondering if promoting this information at the error map level in the main graphql spec would align with this choice. What's your opinion?

I'm ruling out 206 (because of Content-Range) and 294 (for reasons explained in previous comments).

Taking a step back, I think that one of the goals of this spec is to make information about the GraphQL exchanges "leak" as much as possible in the HTTP transport when semantics match; this would help existing tools and newcomers to better understand what's happening without having to parse the actual document. Even if HTTP proxies work fine with the chosen HTTP status, I'm wondering if existing observability tools/dashboards/etc. will really consider another 20x in a different fashion. For example, if responding with HTTP 207 somehow breaks observability dashboards and confuses tools, then this would be a net negative for production usage.

@benjie
Copy link
Member Author

benjie commented May 16, 2025

if responding with HTTP 207 somehow breaks observability dashboards and confuses tools, then this would be a net negative for production usage.

This was my main concern with using 207 - it seems 100% related to WebDAV and thus tools may not expect it to be used elsewhere, may expect it to contain XML, and might even try and parse the XML.

Further, I'd argue that 207 doesn't align completely with GraphQL's partial success; it's defined in RFC4918 as:

The 207 (Multi-Status) status code provides status for multiple independent operations

i.e. it's more designed for batched processing rather than GraphQL's nested (and dependent/chained) resolution model.

I agree that 203 doesn't align semantically 100% (though I think there is a little more than 0% alignment - if errors occur then the data returned is kind of non-authoritative because you couldn't actually retrieve all the data), but I'm still leaning towards it from a compatibility POV.


Here's a summary from my POV:

❌ 201 Created - explicitly means "created" and I think it might be useful in future (e.g. for creating a continuable streaming endpoint to represent a subscription or incremental delivery) so I wanted to avoid using that. Moreover it seems 100% inappropriate for use in query operations with no side effects.

❌ 202 Accepted - explicitly means it has not been completed - not appropriate.

🤷‍♂️ 203 Non-Authoritive Information - slight alignment (due to errors - the authoritive data isn't available) and doesn't have any expectations.

❌ 204 No Content - definitely doesn't work!

❌ 205 Reset Content - no content

❌ 206 Partial Content - Requires Range header - not appropriate.

🤔 207 Multi-Status - stronger (but still loose) alignment, tightly tied to WebDAV, has many expectations

❌ 208 Already Reported - inappropriate

🥺 294 Partial Success - 100% semantic alignment because it specifically means "partial success" in the GraphQL sense; no existing tooling expectations, HTTP RFC says to treat as success, unlikely to conflict with other uses, may be a challenge to standardize


Ideally, rather than opining on this, we should actually test it - we need to find out how well the various status codes work with native HTTP clients in different environments (browser, android, iOS, various server-side programming languages, desktop apps, etc), different proxies (including things like Cloudflare), different API gateways, different observability tooling, IDSes, etc. Ideally this would be crowd-sourced.

Alternatively we just pick one and see what happens. If we're picking one I would either pick 203 (has the least risk of conflict) or 294 (explicit new status code). I could be convinced to use 207 if it were demonstrated that it doesn't break existing tooling (and that 294 does).

People do make and use their own HTTP status codes, a number of these can be seen on Wikipedia, so I don't think we should 100% count it out.

@benjie benjie changed the title Use 203 as the partial status code Use 294 as the partial status code Jun 2, 2025
@martinbonnin
Copy link
Contributor

Hi y'all 👋

As we've just discussed in the graphql-over-http wg, the plan is to go with 294 for a spec freeze on July 1st (same time as the main spec freeze). Please test it out and if you have any objection, please reply below.

@benjie
Copy link
Member Author

benjie commented Jun 2, 2025

The decision to go 294 was based on the discussions above and related conversations. Further, since 207 relates to multiple independent operations I think it's worth reserving for GraphQL batching which does in fact represent multiple independent operations (as opposed to regular GraphQL execution where the operations are interdependent).

If you can find anything that 294 breaks, please let us know ASAP!

Comment on lines 665 to +675
If the _GraphQL response_ contains the {data} entry and it is not {null}, then
the server MUST reply with a `2xx` status code and SHOULD reply with `200`
status code.
the server MUST reply with a `2xx` status code.

If the _GraphQL response_ contains the {data} entry and does not contain the
{errors} entry, then the server SHOULD reply with `200` status code.

Note: There are no circumstances where the GraphQL specification allows for a
response having {data} as {null} without {errors} being present.

If the _GraphQL response_ contains both the {data} entry (even if it is {null})
and the {errors} entry, then the server SHOULD reply with `294` status code.
Copy link
Contributor

@Shane32 Shane32 Jun 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be simplified. Whether {data} is {null} should have no bearing on response semantics and should be removed to simplify the specification -- if {data} exists, then there is no request error and 2xx is required, 200 if no {errors}. If {data} exists alongside {errors}, that's a 294; but if there's {errors} without {data}, that's a 4xx. I would not add specifications for an illegal response, where {data} is {null} and {errors} does not exist.

Suggested change
If the _GraphQL response_ contains the {data} entry and it is not {null}, then
the server MUST reply with a `2xx` status code and SHOULD reply with `200`
status code.
the server MUST reply with a `2xx` status code.
If the _GraphQL response_ contains the {data} entry and does not contain the
{errors} entry, then the server SHOULD reply with `200` status code.
Note: There are no circumstances where the GraphQL specification allows for a
response having {data} as {null} without {errors} being present.
If the _GraphQL response_ contains both the {data} entry (even if it is {null})
and the {errors} entry, then the server SHOULD reply with `294` status code.
If the _GraphQL response_ contains the {data} entry, then
the server MUST reply with a `2xx` status code.
If the _GraphQL response_ contains the {data} entry and does not contain the
{errors} entry, then the server SHOULD reply with `200` status code.
If the _GraphQL response_ contains both the {data} entry (even if it is {null})
and the {errors} entry, then the server SHOULD reply with `294` status code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See https://spec.graphql.org/October2021/#sec-Response-Format

If the request included execution, the response map must contain an entry with key data.

If the request "executes", then it's 2xx.

If the request failed before execution, due to a syntax error, missing information, or validation error, this entry must not be present.

If it doesn't execute, it's a 4xx / 5xx.

Neither depend on whether data is null.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do have this:

If an error was raised before execution begins, the data entry should not be present in the result.

If an error was raised during the execution that prevented a valid response, the data entry in the response should be null.

So we can only know that execution may have started. Seems like 2xx to me, anyway. And it makes the spec simpler.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the simplification

Copy link
Member Author

@benjie benjie Jun 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your diff reads really confusing because of some weird GitHub glitch, but I think what you’re saying is to change the requirement for when data is null from “allowed to use any status code” to “required that it be a 2xx (and preferably 294)”. This has been discussed in the past and some people felt that data:null should code as an error (there’s no data so there’s no partial success!), whereas other felt it should code as a partial success (execution occurred!). The original express-graphql coded it as a 400 IIRC (maybe a 500? It definitely wasn’t 2xx whatever it was) and many servers followed that pattern, hence why the spec text carefully excludes that case allowing the implementer to make their own choice.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not add specifications for an illegal response, where {data} is {null} and {errors} does not exist.

There is no specification for this, the schema adds a non-normative note indicating it’s not possible in case the reader isn’t aware and wonders what to do in that perceived edge case.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way I see it, data being null is a simple result of null bubbling, as is the case anywhere else in the schema. So if your schema is:

type Query {
  dog(id: ID!): Dog!
}

type Dog {
  id: ID!
  name: String!
}

And you request { dog(id: "3") { name } } but either the dog's name isn't defined or the dog can't be found, null bubbling occurs and data ends up being null. This is no different than if it were to occur anywhere else in the schema. If dog returned a nullable Dog type, the client must check if dog is null. Basically, any GraphQL consumer must know how to handle nullable types (including in this case the root type itself) and/or look for the errors property to see if any errors were returned.

I just don't see why there is an exception to the rule for a root type, and further, an exception that only occurs in the scenario where the GraphQL server were to provide an illegal response (a null data without an errors key).

The original express-graphql coded it as a 400 IIRC (maybe a 500? It definitely wasn’t 2xx whatever it was) and many servers followed that pattern, hence why the spec text carefully excludes that case.

Understood but this does not apply. Any backwards compatibility should be defined for application/json, which we have defined to always return 200 for reasons already discussed. The language here is for the new response type which uses a different set of rules for status codes.

This has been discussed in the past and some people felt that data:null should code as an error (there’s no data so there’s no partial success!), whereas other felt it should code as a partial success (execution occurred!).

While there's no partial data, it cannot be known that there's no partial success. Granted this may be a bad server design, but if we extended the above sample to mutations, it is certainly possible that a partial save occurred, but null bubbling occurred and obscured the result. Sample:

type Mutation {
  setDogName(id: ID!, name: String!): Dog!
}

mutation {
  setDogNameA: setDogName(id: "234", name: "Jack") { id }  # sample: successful
  setDogNameB: setDogName(id: "345", name: "Jack") { id }  # sample: fails due to invalid id
}

Here null bubbling would have occurred causing data: null even though setDogNameA executed successfully.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants