OpenAI API conformance tests #513


Open
Tracked by #493
kfswain opened this issue Mar 17, 2025 · 6 comments
Assignees: kfswain
Labels: triage/needs-information (Indicates an issue needs more information in order to work on it.)

Comments

@kfswain (Collaborator) commented Mar 17, 2025

What would you like to be added: A suite of conformance tests to validate that Inference Gateway + underlying model servers are compliant with the OpenAI API spec. We can start by searching for existing conformance tests; if none exist, it would be good for us to build and support such a test suite.

Why is this needed: Users will eventually run heterogeneous model servers, and we should make sure they all conform to the same spec. Since Inference Gateway is where a user would experience the variance between model servers, we should provide a test suite that ensures consistent API behavior across them.
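A minimal sketch of what one such check could look like; the gateway URL and model name are placeholders, not real defaults, and the assertions cover only fields the OpenAI chat completions response schema requires:

```python
# Hedged sketch: GATEWAY_URL and "my-model" are placeholders, not real
# defaults. The assertions check only fields the OpenAI chat completions
# response schema requires.
import requests

GATEWAY_URL = "http://localhost:8080"

def test_chat_completions_response_shape():
    resp = requests.post(
        f"{GATEWAY_URL}/v1/chat/completions",
        json={
            "model": "my-model",
            "messages": [{"role": "user", "content": "ping"}],
        },
        timeout=30,
    )
    assert resp.status_code == 200
    body = resp.json()
    assert body["object"] == "chat.completion"
    assert body["choices"] and "message" in body["choices"][0]
    assert "usage" in body
```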

@kfswain (Collaborator, Author) commented Mar 17, 2025

This should be validated against the major model servers (e.g. vLLM, Triton, TGI, and potentially sglang or jetstream).
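For illustration, the same suite could be parametrized over backends. The endpoints below are placeholders (sglang and jetstream omitted for brevity); /v1/models is one endpoint every OpenAI-compatible server should expose identically:

```python
# Placeholder endpoints; in a real run each backend would sit behind the
# gateway. /v1/models is one endpoint every OpenAI-compatible server
# should expose identically.
import pytest
import requests

MODEL_SERVERS = {
    "vllm": "http://vllm.test:8000",
    "triton": "http://triton.test:8000",
    "tgi": "http://tgi.test:8080",
}

@pytest.mark.parametrize("backend", sorted(MODEL_SERVERS))
def test_models_endpoint(backend):
    resp = requests.get(f"{MODEL_SERVERS[backend]}/v1/models", timeout=30)
    assert resp.status_code == 200
    assert resp.json()["object"] == "list"
```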

@smarterclayton (Contributor):

Suggest we validate this by starting with the openai client as called from Python, as that is (a) how most ecosystem tools will interact with the gateway and (b) likely how many ML engineers will invoke the gateway for non-trivial interactions.
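A sketch of that approach using the official `openai` Python client (v1.x); the base_url and model name are placeholders for whatever the gateway exposes:

```python
# Sketch using the official `openai` Python client (v1.x). base_url and
# the model name are placeholders for whatever the gateway exposes.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",  # the gateway, not api.openai.com
    api_key="unused",  # self-hosted servers typically ignore the key
)

completion = client.chat.completions.create(
    model="my-model",
    messages=[{"role": "user", "content": "ping"}],
)
assert completion.choices[0].message.content is not None
```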

@smarterclayton (Contributor):

I also suggest we accelerate validation of the openai client in both regular and error configurations before v0.3, since we are increasing visibility at KubeCon.
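For the error configuration, a sketch of the kind of check meant here, assuming the backend maps an unknown model to a 404 with an OpenAI-style error object as the OpenAI API does; whether a given server actually does this is exactly what the suite would verify:

```python
# Error-path sketch: assumes the backend maps an unknown model to a 404
# with an OpenAI-style `error` object, as the OpenAI API does. Whether a
# given server actually does this is exactly what the suite would verify.
import requests

resp = requests.post(
    "http://localhost:8080/v1/chat/completions",  # placeholder gateway URL
    json={
        "model": "no-such-model",
        "messages": [{"role": "user", "content": "ping"}],
    },
    timeout=30,
)
assert resp.status_code == 404
err = resp.json()["error"]
assert "message" in err and "type" in err
```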

kfswain mentioned this issue Mar 17, 2025
@ahg-g (Contributor) commented Mar 17, 2025

Just to clarify, this issue also tracks making the current EPP conformant, not just adding conformance tests.

@kfswain (Collaborator, Author) commented Mar 17, 2025

True, but the only part of the EPP in scope here is how we handle errors. A conformance suite that gives a user confidence that their specific blend of model servers + EPP will still work with the OpenAI API spec is strongly valuable (and a good suite should catch where the EPP is nonconformant).

kfswain self-assigned this Mar 17, 2025
@kfswain (Collaborator, Author) commented Apr 22, 2025

I'm considering moving this to a discussion, as the OpenAI API seems to be only partially implemented by model servers (such as vLLM), and there is no concrete contract as to which endpoints are supported. Colloquially, the /v1/completions and /v1/chat/completions endpoints are supported, but /v1/completions is considered legacy.

Perhaps instead we should specify what IGW currently expects: right now we just expect a model param in the body, and eventually that the prompt field is named prompt.
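To make that contract concrete, a hypothetical minimal request satisfying what IGW inspects today (URL and model name are placeholders):

```python
# Hypothetical minimal request satisfying what IGW inspects today: only
# the `model` field is required, with `prompt` as the eventual expectation
# for the legacy /v1/completions endpoint. URL and model are placeholders.
import requests

body = {
    "model": "my-model",  # IGW routes on this field
    "prompt": "Hello",    # legacy completions prompt field
}
resp = requests.post(
    "http://localhost:8080/v1/completions", json=body, timeout=30
)
print(resp.status_code, resp.json())
```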

kfswain added the triage/needs-information label Apr 24, 2025