OpenAI API conformance tests #513
Comments
This should be validated with the major model servers (e.g. vLLM, Triton, TGI, and potentially SGLang or JetStream).
Suggest we validate this by starting with the OpenAI client as called from Python, since that is (a) how most ecosystem tools will interact with the gateway and (b) likely how many ML engineers will invoke the gateway for non-trivial interactions.
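For illustration, a happy-path check through the Python client could look something like the sketch below. `GATEWAY_BASE_URL`, `GATEWAY_API_KEY`, and `TEST_MODEL` are placeholder names for this example, not anything the project defines today.

```python
# Minimal sketch of a happy-path conformance check using the openai Python client (v1.x).
# The environment-variable names and defaults below are placeholders.
import os

import openai

client = openai.OpenAI(
    base_url=os.environ.get("GATEWAY_BASE_URL", "http://localhost:8080/v1"),
    api_key=os.environ.get("GATEWAY_API_KEY", "not-needed"),
)

def test_chat_completion_basic():
    # Happy-path request through /v1/chat/completions.
    resp = client.chat.completions.create(
        model=os.environ.get("TEST_MODEL", "my-model"),
        messages=[{"role": "user", "content": "Say hello."}],
        max_tokens=16,
    )
    # Shape checks against the OpenAI API spec.
    assert resp.object == "chat.completion"
    assert resp.choices and resp.choices[0].message.role == "assistant"
    assert resp.usage is not None and resp.usage.total_tokens > 0
```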
I also suggest we accelerate validating the OpenAI client in both regular and error configurations before v0.3, since we are increasing visibility at KubeCon.
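An error-path check might then assert that failures surface as OpenAI-style errors through the client. A rough sketch, assuming the gateway maps an unknown model to a 404 that the client raises as `openai.NotFoundError` (how the EPP actually handles this is exactly what the suite would pin down):

```python
# Sketch of an error-path check; GATEWAY_BASE_URL / GATEWAY_API_KEY are placeholders.
import os

import openai
import pytest

client = openai.OpenAI(
    base_url=os.environ.get("GATEWAY_BASE_URL", "http://localhost:8080/v1"),
    api_key=os.environ.get("GATEWAY_API_KEY", "not-needed"),
)

def test_unknown_model_returns_openai_style_404():
    # Assumes the gateway returns a 404 with an OpenAI-style error body for
    # unknown models, which the client surfaces as openai.NotFoundError.
    with pytest.raises(openai.NotFoundError):
        client.chat.completions.create(
            model="model-that-does-not-exist",
            messages=[{"role": "user", "content": "hi"}],
        )
```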
Just to clarify, this issue also tracks making the current EPP conformant, not just adding conformance tests. |
True, but the only aspect of the EPP that matters here is how we handle errors. A conformance suite that gives a user confidence that their specific blend of model servers + EPP will still work with the OpenAI API spec is strongly valuable (and a good suite should catch where the EPP is nonconformant).
I'm considering moving this to a discussion, as the OpenAI API seems to be only partially implemented by model servers (such as vLLM) and there is no concrete contract as to which endpoints are supported. Colloquially, the /v1/completions and /v1/chat/completions endpoints are supported, but /v1/completions is considered legacy. Perhaps instead we should specify what IGW currently expects: right now we just expect that there is a
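To make whatever contract we do settle on explicit, a small probe could report which endpoints a given backend actually answers. A sketch with plain `requests`; the endpoint list, payloads, and addresses are assumptions for illustration, not a statement of what IGW requires:

```python
# Rough probe of which OpenAI-style endpoints a backend exposes.
# BASE_URL and MODEL are placeholders.
import requests

BASE_URL = "http://localhost:8080"
MODEL = "my-model"

payloads = {
    "/v1/chat/completions": {
        "model": MODEL,
        "messages": [{"role": "user", "content": "hi"}],
        "max_tokens": 4,
    },
    "/v1/completions": {"model": MODEL, "prompt": "hi", "max_tokens": 4},
}

for path, body in payloads.items():
    r = requests.post(BASE_URL + path, json=body, timeout=30)
    # A 404/405 here suggests the endpoint is simply not implemented.
    print(f"{path}: HTTP {r.status_code}")
```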
What would you like to be added: A suite of conformance tests to validate that Inference Gateway + the underlying model servers comply with the OpenAI API spec. We can start by searching to see whether such conformance tests already exist; if not, it would be good for us to provide such a test suite.
Why is this needed: Users are eventually going to have heterogeneous model servers, and we should make sure they all conform to the same spec. Since Inference Gateway is where a user would experience the variance between model servers, we should be able to provide a test suite that ensures they still get consistent API behavior.
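One possible shape for such a suite, assuming each backend (or a gateway route to it) is reachable at its own base URL, is to parametrize the same spec checks across backends. The URLs and model name below are placeholders:

```python
# Sketch of running the same spec checks across several backends.
# The backend map and model name are placeholders for illustration.
import openai
import pytest

BACKENDS = {
    "vllm": "http://vllm.example:8000/v1",
    "tgi": "http://tgi.example:8080/v1",
    "triton": "http://triton.example:9000/v1",
}

@pytest.fixture(params=list(BACKENDS.items()), ids=lambda kv: kv[0])
def client(request):
    _, base_url = request.param
    return openai.OpenAI(base_url=base_url, api_key="not-needed")

def test_chat_completion_conforms(client):
    resp = client.chat.completions.create(
        model="my-model",  # placeholder model name
        messages=[{"role": "user", "content": "ping"}],
        max_tokens=8,
    )
    assert resp.object == "chat.completion"
    assert resp.choices[0].finish_reason in {"stop", "length"}
```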