Adaptive metrics probing periods #667

Open
ahg-g opened this issue Apr 8, 2025 · 2 comments
Labels
triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@ahg-g
Contributor

ahg-g commented Apr 8, 2025

What would you like to be added:

Adaptive metrics probing periods. I can think of two options:

  1. Change the probing period based on the observed QPS: shorten it as QPS increases and lengthen it as QPS decreases, always within pre-defined bounds.
  2. Explore a form of synchronous probing.

Why is this needed:

Currently the EPP has a fixed probing period of 50ms. At higher QPS this period may be too long, leading to stale metrics and suboptimal scheduling decisions.

@liu-cong
Contributor

I discussed this in #678

@kfswain
Collaborator

kfswain commented Apr 24, 2025

Issue Context: A PR that implements this would need to show improved benchmark results to better quantify the value.

kfswain added the triage/accepted label Apr 24, 2025