[Metrics] Add grafana dashboard for Inference extension and vLLM metrics #237
Hi @JeffLuoo. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
✅ Deploy Preview for gateway-api-inference-extension ready!
/ok-to-test
@JeffLuoo can you provide the steps to reproduce the dashboard?
Hi @danehans, I added instructions for uploading the JSON dashboard to Grafana to the PR description. There are several ways to deploy Prometheus and configure it to collect metrics, so I can add those instructions in a separate change and keep this one focused.
@JeffLuoo I am trying to test this PR. What is your Slack handle so I can ping you with specific questions instead of polluting this PR?
@danehans I just pinged you on Slack.
Branch updated from 8fb9526 to e9f2de7.
${DS_PROMETHEUS}.
I would move this under tools, not examples.
I'm also working with @JeffLuoo to repro the dashboard in my test environment.
Hi @danehans, I think that's a templating variable in the dashboard. When you load the dashboard, it is automatically replaced with the ID of your data source. I replace it with
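For anyone who wants to see what that substitution amounts to, here is a minimal sketch. It assumes the dashboard JSON refers to its data source through the `${DS_PROMETHEUS}` placeholder quoted above. The function name `bind_datasource`, the sample panel, and the UID `my-prometheus-uid` are hypothetical and only for illustration; Grafana does this itself when you pick a data source during import.

```python
import json


def bind_datasource(dashboard_json: str, datasource_uid: str,
                    placeholder: str = "${DS_PROMETHEUS}") -> dict:
    """Return the dashboard as a dict with the placeholder replaced in every string value."""

    def walk(node):
        # Recurse through dicts and lists; substitute only inside string values.
        if isinstance(node, dict):
            return {key: walk(value) for key, value in node.items()}
        if isinstance(node, list):
            return [walk(item) for item in node]
        if isinstance(node, str):
            return node.replace(placeholder, datasource_uid)
        return node

    return walk(json.loads(dashboard_json))


# Hypothetical one-panel dashboard, not the JSON from this PR.
sample = json.dumps({
    "title": "example",
    "panels": [{
        "title": "Token Throughput",
        "datasource": {"type": "prometheus", "uid": "${DS_PROMETHEUS}"},
    }],
})

bound = bind_datasource(sample, "my-prometheus-uid")
print(bound["panels"][0]["datasource"]["uid"])
```

Walking the parsed structure rather than doing a raw text replace keeps the JSON valid even if the UID contains characters that would need escaping.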
Branch updated from e9f2de7 to 4096b5b.
@ahg-g Moved it to
Great work on this! 🎉 I was able to successfully reproduce the dashboard in my test environment and configured Prometheus to scrape vLLM metrics. Now, I can see the "Token Throughput" and other metrics appearing in the dashboard.
However, shouldn't these metrics be displayed under the "vLLM" tab? Currently, I see 0 panels for both the InferencePool and vLLM tabs.
Additionally, could you provide more information about the InferencePool tab? Are these metrics sourced from Envoy, or do they come from another component?
Thanks again for your hard work—this is a great addition! 🚀
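A quick way to confirm that Prometheus is scraping vLLM before opening the dashboard is an instant query against the Prometheus HTTP API. The sketch below only builds the request URL. It assumes Prometheus is reachable at `http://localhost:9090` and that vLLM exposes the counter `vllm:generation_tokens_total`; the PromQL expression is an illustration and is not taken from the dashboard JSON in this PR.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def instant_query_url(base_url: str, promql: str) -> str:
    """Build a Prometheus instant-query URL (GET /api/v1/query)."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


# Assumed expression: generated tokens per second, summed over all scraped vLLM pods.
promql = "sum(rate(vllm:generation_tokens_total[1m]))"
url = instant_query_url("http://localhost:9090", promql)
print(url)
```

Fetching that URL (for example with `curl`) should return a JSON body whose `data.result` list is non-empty once scraping works.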
Branch updated from 4096b5b to 79b41dc.
@danehans Moved the vLLM-related panels under the vLLM row. Removed the InferencePool row, since InferencePool metrics are not yet implemented.
Branch updated from 79b41dc to ab7b2ed.
/lgtm Thanks @JeffLuoo!
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: ahg-g, JeffLuoo
Dashboard preview:
Load dashboard using json file: https://grafana.com/docs/grafana/latest/dashboards/build-dashboards/import-dashboards/#import-a-dashboard
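As an alternative to the UI import, the dashboard can probably also be uploaded through Grafana's HTTP API (`POST /api/dashboards/db`). The sketch below only builds the request body. The helper name `build_import_payload` and the sample dashboard are hypothetical, and any `${DS_PROMETHEUS}` placeholder in the JSON may need to be resolved to a real data source first, since this is assumed to be handled only by the UI import flow.

```python
import json


def build_import_payload(dashboard: dict, overwrite: bool = True) -> str:
    """Wrap a dashboard dict in the body expected by POST /api/dashboards/db."""
    body = dict(dashboard)  # shallow copy so the caller's dict is left untouched
    body["id"] = None       # a null id asks Grafana to create a new dashboard
    return json.dumps({"dashboard": body, "overwrite": overwrite})


# Hypothetical stand-in for the dashboard JSON file added in this PR.
payload = build_import_payload({"id": 42, "title": "example dashboard", "panels": []})
print(payload)
```

The body would then be sent with a `Content-Type: application/json` header and an API token for the target Grafana instance.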