[Metrics] Add grafana dashboard for Inference extension and vLLM metrics #237

Merged
merged 1 commit into from
Feb 3, 2025

Conversation

JeffLuoo
Contributor

@JeffLuoo JeffLuoo commented Jan 28, 2025

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jan 28, 2025
@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Jan 28, 2025
@k8s-ci-robot
Contributor

Hi @JeffLuoo. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Jan 28, 2025
@JeffLuoo
Contributor Author

@liu-cong @ahg-g @courageJ

netlify bot commented Jan 28, 2025

Deploy Preview for gateway-api-inference-extension ready!

Name Link
🔨 Latest commit ab7b2ed
🔍 Latest deploy log https://app.netlify.com/sites/gateway-api-inference-extension/deploys/67a132bc659aac0009507692
😎 Deploy Preview https://deploy-preview-237--gateway-api-inference-extension.netlify.app

@ahg-g
Contributor

ahg-g commented Jan 28, 2025

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jan 28, 2025
@danehans danehans changed the title [Metrics] Add grafana dashboard for inferecen gateway and vLLM metrics [Metrics] Add grafana dashboard for Inference extension and vLLM metrics Jan 28, 2025
@danehans
Contributor

@JeffLuoo can you provide the steps to reproduce the dashboard?

@JeffLuoo
Contributor Author

Hi @danehans, I added instructions for uploading the JSON dashboard to Grafana in the PR description. There are several ways to deploy Prometheus and configure it to collect metrics, so I can add those instructions in a separate change dedicated to them.

@danehans
Contributor

@JeffLuoo I am trying to test this PR. What is your Slack handle so I can ping you with specific questions instead of polluting this PR?

@JeffLuoo
Contributor Author

JeffLuoo commented Jan 29, 2025

@danehans I just pinged you on Slack.

@JeffLuoo JeffLuoo force-pushed the grafana-dashboard branch 2 times, most recently from 8fb9526 to e9f2de7 Compare January 30, 2025 18:22
@danehans
Contributor

@JeffLuoo

still needs to be replaced with ${DS_PROMETHEUS}.

@ahg-g
Contributor

ahg-g commented Jan 31, 2025

I would move this under tools, not examples.

@danehans
Contributor

I'm also working with @JeffLuoo to repro the dashboard in my test environment.

@JeffLuoo
Contributor Author

JeffLuoo commented Jan 31, 2025

Hi @danehans, I think that's the dashboard's templating value. When you load the dashboard, it is automatically replaced with the ID of your data source. Whether I replace it with ${DS_PROMETHEUS} or an empty string, it is still configured to use my data source automatically.
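
For anyone who wants the exported JSON to carry the `${DS_PROMETHEUS}` placeholder explicitly, here is a minimal sketch of how the substitution could be scripted. It assumes the dashboard stores each data source reference as a `datasource` object with `type` and `uid` fields; the sample dashboard, the `abc123` UID, and the `templatize_datasources` helper are hypothetical and are not taken from this PR.

```python
import json
from typing import Any

# Placeholder discussed in this thread.
PLACEHOLDER = "${DS_PROMETHEUS}"


def templatize_datasources(node: Any) -> Any:
    """Return a copy of `node` with every Prometheus datasource UID
    replaced by the placeholder. Other values are copied unchanged."""
    if isinstance(node, dict):
        out = {}
        for key, value in node.items():
            is_prom_ref = (
                key == "datasource"
                and isinstance(value, dict)
                and value.get("type") == "prometheus"
            )
            if is_prom_ref:
                # Keep the other fields, overwrite only the UID.
                out[key] = {**value, "uid": PLACEHOLDER}
            else:
                out[key] = templatize_datasources(value)
        return out
    if isinstance(node, list):
        return [templatize_datasources(item) for item in node]
    return node


# Hypothetical, heavily trimmed dashboard used only for illustration.
dashboard = {
    "title": "example",
    "panels": [
        {
            "title": "Token Throughput",
            "datasource": {"type": "prometheus", "uid": "abc123"},
            "targets": [
                {
                    "datasource": {"type": "prometheus", "uid": "abc123"},
                    "expr": "up",
                }
            ],
        }
    ],
}

result = templatize_datasources(dashboard)
print(json.dumps(result, indent=2))
```

The function returns a new structure rather than editing in place, so the original file contents can be diffed against the result before committing.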

@JeffLuoo
Contributor Author

@ahg-g Moved it to /tools

Contributor

@danehans danehans left a comment

Great work on this! 🎉 I was able to successfully reproduce the dashboard in my test environment and configured Prometheus to scrape vLLM metrics. Now, I can see the "Token Throughput" and other metrics appearing in the dashboard.

However, shouldn't these metrics be displayed under the "vLLM" tab? Currently, I see 0 panels for both the InferencePool and vLLM tabs.

Additionally, could you provide more information about the InferencePool tab? Are these metrics sourced from Envoy, or do they come from another component?

Thanks again for your hard work—this is a great addition! 🚀

@JeffLuoo
Contributor Author

JeffLuoo commented Feb 3, 2025

@danehans I moved the vLLM-related panels under the vLLM row.

I removed the InferencePool row, since InferencePool metrics are not yet implemented.

@JeffLuoo JeffLuoo requested a review from danehans February 3, 2025 17:56
@JeffLuoo JeffLuoo requested a review from ahg-g February 3, 2025 21:19
@ahg-g
Contributor

ahg-g commented Feb 3, 2025

/lgtm
/approve

Thanks @JeffLuoo !

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 3, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ahg-g, JeffLuoo

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 3, 2025
@k8s-ci-robot k8s-ci-robot merged commit 528ea6e into kubernetes-sigs:main Feb 3, 2025
8 checks passed
kfswain pushed a commit to kfswain/llm-instance-gateway that referenced this pull request Apr 29, 2025
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
4 participants