It should be possible to use InferencePool without ext_proc #660

Open
howardjohn opened this issue Apr 8, 2025 · 7 comments
Labels
triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@howardjohn
Contributor

What would you like to be added:
The InferencePool currently hardcodes that a user MUST specify an endpoint picker:

EndpointPickerConfig `json:",inline"`

As a vendor-agnostic API, we should instead let the API drive behavior, not implementation. An implementation should be free to choose its own implementation of the semantics of InferencePool.

@ahg-g
Contributor

ahg-g commented Apr 8, 2025

Agreed. The idea behind the EndpointPickerConfig is to provide a one-of config. Currently the only option is ExtensionRef, and we were planning to relax the required constraint on it to allow other choices. The ExtensionRef path is needed to allow for bring-your-own EPP.

Do you have something concrete in mind to spec as a second option to ExtensionRef? Or do you want to allow not setting a configuration at all and leave it completely to the provider? The latter is also possible.

@howardjohn
Contributor Author

I think having an "implementation specific" option that is just up to the provider is probably a good start. I'm not sure what else would be standardized.
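
For illustration, the "leave it to the provider" option could look roughly like the Go sketch below. It is a minimal sketch, not the project's actual types: `Extension` is trimmed to a single hypothetical `Name` field, `InferencePoolSpec` is a reduced subset, and `pickerMode` is a made-up helper. The assumption is that an unset `ExtensionRef` means the implementation selects endpoints however it sees fit.

```go
package main

import "fmt"

// Extension is a simplified stand-in for the extension reference type.
// Assumption: only a name is needed for this illustration.
type Extension struct {
	Name string `json:"name"`
}

// EndpointPickerConfig with ExtensionRef relaxed from required to optional
// (hypothetical shape; a nil pointer means "not configured").
type EndpointPickerConfig struct {
	ExtensionRef *Extension `json:"extensionRef,omitempty"`
}

// InferencePoolSpec is a reduced, hypothetical subset of the pool spec that
// embeds the picker config inline, as in the snippet quoted in the issue.
type InferencePoolSpec struct {
	TargetPortNumber     int32 `json:"targetPortNumber"`
	EndpointPickerConfig `json:",inline"`
}

// pickerMode is a hypothetical helper showing how an implementation might
// branch: use the referenced extension if present, otherwise fall back to
// its own endpoint selection.
func pickerMode(spec InferencePoolSpec) string {
	if spec.ExtensionRef == nil {
		return "implementation-specific"
	}
	return "extension:" + spec.ExtensionRef.Name
}

func main() {
	// Bring-your-own EPP: the user points at an extension.
	byo := InferencePoolSpec{
		TargetPortNumber:     8000,
		EndpointPickerConfig: EndpointPickerConfig{ExtensionRef: &Extension{Name: "my-epp"}},
	}
	// No picker configured: behavior is left to the provider.
	bare := InferencePoolSpec{TargetPortNumber: 8000}

	fmt.Println(pickerMode(byo))
	fmt.Println(pickerMode(bare))
}
```

The trade-off in this shape is that "unset" is implicit: nothing in the resource records that the user deliberately chose provider-defined behavior.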

@shaneutt
Member

shaneutt commented Apr 8, 2025

Agreed. 👍 The API needs to be more generally applicable, and not coupled so tightly with a specific implementation. Some kind of resolution here would be ideal as a requirement for a GA release.

@mlavacca
Member

mlavacca commented Apr 8, 2025

+1 Having the ext_proc requirement in the API makes this API not vendor-agnostic, as many implementations don't currently support this protocol.

@ahg-g
Contributor

ahg-g commented Apr 8, 2025

Note that the current API is designed with that in mind. As I mentioned above, the EndpointPickerConfig was proposed as a union of options: we can add another option to EndpointPickerConfig and relax the "required" constraint on ExtensionRef. Both are non-breaking changes that we can make at any point.

If there is agreement on the above, the next step is to propose a second option within EndpointPickerConfig, perhaps add an enum to it and make it officially a union?
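
To make the "enum plus union" idea concrete, here is a minimal Go sketch under stated assumptions. The `Type` discriminator, the `ImplementationSpecific` value, and the `Validate` method are hypothetical names chosen for illustration and are not part of the current API; `Extension` is again reduced to a single field.

```go
package main

import (
	"errors"
	"fmt"
)

// EndpointPickerType is a hypothetical discriminator for the union.
type EndpointPickerType string

const (
	// EndpointPickerExtensionRef selects the bring-your-own EPP path.
	EndpointPickerExtensionRef EndpointPickerType = "ExtensionRef"
	// EndpointPickerImplementationSpecific leaves endpoint selection to
	// the provider (assumed name, not from the API).
	EndpointPickerImplementationSpecific EndpointPickerType = "ImplementationSpecific"
)

// Extension is a simplified stand-in for the extension reference type.
type Extension struct {
	Name string
}

// EndpointPickerConfig sketched as an explicit union: Type says which
// member is in use, and ExtensionRef is set only for the matching type.
type EndpointPickerConfig struct {
	Type         EndpointPickerType
	ExtensionRef *Extension
}

// Validate enforces the union rule: the populated member must match Type.
func (c EndpointPickerConfig) Validate() error {
	switch c.Type {
	case EndpointPickerExtensionRef:
		if c.ExtensionRef == nil {
			return errors.New("extensionRef must be set when type is ExtensionRef")
		}
	case EndpointPickerImplementationSpecific:
		if c.ExtensionRef != nil {
			return errors.New("extensionRef must be unset when type is ImplementationSpecific")
		}
	default:
		return fmt.Errorf("unknown endpoint picker type %q", c.Type)
	}
	return nil
}

func main() {
	configs := []EndpointPickerConfig{
		{Type: EndpointPickerExtensionRef, ExtensionRef: &Extension{Name: "my-epp"}},
		{Type: EndpointPickerImplementationSpecific},
		{Type: EndpointPickerExtensionRef}, // invalid: member missing
	}
	for _, c := range configs {
		fmt.Printf("type=%s valid=%t\n", c.Type, c.Validate() == nil)
	}
}
```

In a real CRD this rule would more likely be expressed as schema validation than as a Go method; the method here only demonstrates the semantics of a discriminated union.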

@hzxuzhonghu
Member

+1 to optional

@kfswain kfswain mentioned this issue Apr 23, 2025
17 tasks
@kfswain
Collaborator

kfswain commented Apr 24, 2025

Agreed

@kfswain kfswain added the triage/accepted Indicates an issue or PR is ready to be actively worked on. label Apr 24, 2025
6 participants