InferencePool status should track the number of ready endpoints #342

Closed
ahg-g opened this issue Feb 15, 2025 · 4 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature.

Comments

@ahg-g
Contributor

ahg-g commented Feb 15, 2025

What would you like to be added:

InferencePool status should track the number of ready endpoints

Why is this needed:

To track the health of the InferencePool endpoints.
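
A minimal sketch of what such a status field could look like, following the usual Kubernetes readiness-counter conventions; the field name, API group, and version here are illustrative assumptions, not the actual API:

```go
package v1alpha2 // API group/version assumed for illustration

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// InferencePoolStatus sketches how a ready-endpoint counter could be surfaced.
type InferencePoolStatus struct {
	// ReadyEndpoints is the number of endpoints selected by the pool that are
	// currently passing readiness checks. (Hypothetical field.)
	// +optional
	ReadyEndpoints int32 `json:"readyEndpoints,omitempty"`

	// Conditions describe the current state of the pool.
	// +optional
	Conditions []metav1.Condition `json:"conditions,omitempty"`
}
```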

@kfswain added the kind/feature label Feb 19, 2025
@ahg-g
Contributor Author

ahg-g commented Feb 20, 2025

The question is what component should do that?

@hzxuzhonghu
Member

From my perspective, since the EPP already monitors the model server metrics, it should be able to do the status check asynchronously as well.
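
A minimal sketch of what that could look like in the EPP, assuming a controller-runtime client and an existing source of per-endpoint health; the `readyEndpoints` status field, the `countReady` callback, and the group/version are assumptions for illustration:

```go
package epp // illustrative only

import (
	"context"
	"time"

	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// inferencePoolGVK identifies the InferencePool resource (group/version assumed).
var inferencePoolGVK = schema.GroupVersionKind{
	Group:   "inference.networking.x-k8s.io",
	Version: "v1alpha2",
	Kind:    "InferencePool",
}

// updateReadyEndpoints periodically writes a ready-endpoint count into the
// pool's status, reusing whatever health signal the EPP already tracks.
func updateReadyEndpoints(ctx context.Context, c client.Client, key client.ObjectKey, countReady func() int64) error {
	ticker := time.NewTicker(10 * time.Second)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
			pool := &unstructured.Unstructured{}
			pool.SetGroupVersionKind(inferencePoolGVK)
			if err := c.Get(ctx, key, pool); err != nil {
				return err
			}
			// Hypothetical status field; see the sketch above.
			if err := unstructured.SetNestedField(pool.Object, countReady(), "status", "readyEndpoints"); err != nil {
				return err
			}
			if err := c.Status().Update(ctx, pool); err != nil {
				return err
			}
		}
	}
}
```

Using an unstructured object here just keeps the sketch independent of the generated API types.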

@Kuromesi
Contributor

Can we set up our own informer specifically for the InferencePool instead of using the one provided by controller-runtime? That would make it easier to trigger InferencePool updates. I also think we should surface more info in the InferencePool status, such as endpoint metrics, available models, etc., which would be more user-friendly. A customized informer would let us add events that trigger InferencePool reconciliation after endpoints or models are updated.

A customized informer may also help resolve #369, since we would have better control over the lifecycle of the informers.
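
A minimal sketch of such a dedicated informer, assuming client-go's dynamic client; the group/version/resource and the `onChange` callback are assumptions, and wiring the events into a reconcile queue is left out:

```go
package epp // illustrative only

import (
	"time"

	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/dynamic/dynamicinformer"
	"k8s.io/client-go/tools/cache"
)

// inferencePoolGVR identifies the InferencePool resource (group/version assumed).
var inferencePoolGVR = schema.GroupVersionResource{
	Group:    "inference.networking.x-k8s.io",
	Version:  "v1alpha2",
	Resource: "inferencepools",
}

// startInferencePoolInformer runs an informer for InferencePool outside
// controller-runtime's shared cache, so its lifecycle and resync period can be
// controlled directly and extra events (endpoint/model changes) can be fed in.
func startInferencePoolInformer(dc dynamic.Interface, stopCh <-chan struct{}, onChange func(obj interface{})) cache.SharedIndexInformer {
	factory := dynamicinformer.NewDynamicSharedInformerFactory(dc, 30*time.Second)
	inf := factory.ForResource(inferencePoolGVR).Informer()
	inf.AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc:    onChange,
		UpdateFunc: func(_, newObj interface{}) { onChange(newObj) },
		DeleteFunc: onChange,
	})
	factory.Start(stopCh)
	factory.WaitForCacheSync(stopCh)
	return inf
}
```

Because the factory is created and started here rather than inside controller-runtime's manager, its resync period and stop channel can be managed independently, which is the lifecycle control mentioned for #369.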

@ahg-g
Contributor Author

ahg-g commented Feb 26, 2025

We discussed in the last community meeting that managing the InferencePool object status is not the responsibility of the EPP; it is the responsibility of the gateway controller.

Also, if we view the InferencePool as a lightweight version of the Service API, then tracking the number of ready endpoints is not actually in scope for the InferencePool status. Users can look at the Deployment status to check that.
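
For reference, a minimal sketch of reading that information from the Deployment backing the pool (client setup and names assumed):

```go
package epp // illustrative only

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// printReadyReplicas reports how many replicas of the model-server Deployment
// are ready, which is where this information already lives today.
func printReadyReplicas(ctx context.Context, cs kubernetes.Interface, namespace, name string) error {
	dep, err := cs.AppsV1().Deployments(namespace).Get(ctx, name, metav1.GetOptions{})
	if err != nil {
		return err
	}
	fmt.Printf("%s/%s: %d/%d replicas ready\n", namespace, name, dep.Status.ReadyReplicas, dep.Status.Replicas)
	return nil
}
```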
