InferencePool status should track the number of ready endpoints #342
Comments
The question is: which component should do that?
From my perspective, since the EPP already monitors model server metrics, it should be able to do the status check asynchronously as well.
Can we set up our own informer specifically for the InferencePool instead of using the one provided by controller-runtime? This should make it easier to trigger InferencePool updates. I think we should also surface more information in the InferencePool, such as endpoint metrics and available models, which would be more user-friendly. A customized informer would let us add events that trigger InferencePool reconciliation after endpoints or models are updated. It may also resolve issue #369, since we would have better control over the lifecycle of the informers.
We discussed in the last community meeting that managing the InferencePool object status is the responsibility of the gateway controller, not the EPP. Also, if we view InferencePool as a lightweight version of the Service API, then tracking the number of ready endpoints is not actually in scope for the InferencePool status. Users can check the Deployment status for that.
What would you like to be added:
InferencePool status should track the number of ready endpoints
Why is this needed:
To track the health of the InferencePool endpoints.
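As a rough illustration of what the request amounts to, the sketch below counts ready endpoints and stores the result in a status-like struct. The `Endpoint` and `PoolStatus` types and the `summarize` function are hypothetical names made up for this example; they are not part of the actual InferencePool API, and the sketch does not say which component would own the update.

```go
package main

import "fmt"

// Endpoint is a hypothetical stand-in for one model server backing the pool.
type Endpoint struct {
	Name  string
	Ready bool
}

// PoolStatus is a hypothetical status shape, NOT the real InferencePool status.
type PoolStatus struct {
	ReadyEndpoints int
	TotalEndpoints int
}

// summarize counts how many of the given endpoints report ready.
func summarize(endpoints []Endpoint) PoolStatus {
	status := PoolStatus{TotalEndpoints: len(endpoints)}
	for _, ep := range endpoints {
		if ep.Ready {
			status.ReadyEndpoints++
		}
	}
	return status
}

func main() {
	// Example data only; real readiness would come from the cluster.
	endpoints := []Endpoint{
		{Name: "model-server-0", Ready: true},
		{Name: "model-server-1", Ready: false},
		{Name: "model-server-2", Ready: true},
	}
	status := summarize(endpoints)
	fmt.Printf("%d/%d endpoints ready\n", status.ReadyEndpoints, status.TotalEndpoints)
}
```

Whichever component ends up owning the status would need to recompute this count whenever endpoint readiness changes.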