Skip to content

update algorithm parameters from env variables #580

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Mar 28, 2025

Conversation

kaushikmitr
Copy link
Contributor

This pull request introduces a significant refactor to make scheduler configuration values dynamically configurable through environment variables. The changes include creating a new Config struct to hold configuration values, updating various functions to use these new configuration values, and adding utility functions to load configuration from environment variables.

Key changes include:

Configuration Management:

  • Added Config struct to hold configuration values (KVCacheThreshold, QueueThresholdCritical, QueueingThresholdLoRA, LoraAffinityThreshold) and initialized it using environment variables (pkg/epp/scheduling/scheduler.go).
  • Implemented utility functions getEnvFloat and getEnvInt to retrieve configuration values from environment variables with default fallbacks (pkg/epp/scheduling/scheduler.go).

Function Updates:

  • Updated leastQueuingFilterFunc and loRASoftAffinityFilter to use the new configuration values (pkg/epp/scheduling/filter.go). [1] [2]

Testing Enhancements:

  • Modified TestLoRASoftAffinityDistribution to set and restore configuration values for testing purposes (pkg/epp/scheduling/filter_test.go). [1] [2] [3]

Scheduler Enhancements:

  • Added logging of current configuration values in the Schedule method for better debugging (pkg/epp/scheduling/scheduler.go).
  • Added UpdateLoraAffinityThreshold function to allow dynamic updates of the LoRA affinity threshold (pkg/epp/scheduling/scheduler.go).[Copilot is generating a summary...]

tested with:
- name: KV_CACHE_THRESHOLD
value: "0.9"
- name: QUEUE_THRESHOLD_CRITICAL
value: "10"
- name: QUEUING_THRESHOLD_LORA
value: "23"
- name: LORA_AFFINITY_THRESHOLD
value: "0.7"
logs printed:
2025-03-26T20:44:36Z LEVEL(-3) scheduling/scheduler.go:206 Scheduler configuration {"request": {"Model":"meta-llama/Llama-2-7b-hf","TargetModels":null,"ResolvedTargetModel":"meta-llama/Llama-2-7b-hf","Critical":true}, "KVCacheThreshold": 0.9, "QueueThresholdCritical": 10, "QueueingThresholdLoRA": 23, "LoraAffinityThreshold": 0.7}
2025-03-26T20:44:36Z LEVEL(-3) scheduling/scheduler.go:214 Scheduling a request.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Mar 26, 2025
@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Mar 26, 2025
@k8s-ci-robot
Copy link
Contributor

Hi @kaushikmitr. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Copy link

netlify bot commented Mar 26, 2025

Deploy Preview for gateway-api-inference-extension ready!

Name Link
🔨 Latest commit bae8e28
🔍 Latest deploy log https://app.netlify.com/sites/gateway-api-inference-extension/deploys/67e600f3a77d070009784ca7
😎 Deploy Preview https://deploy-preview-580--gateway-api-inference-extension.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Mar 26, 2025
@kaushikmitr kaushikmitr changed the title update algorithm parameters form env variables [WIP] update algorithm parameters form env variables Mar 26, 2025
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 26, 2025
)

// getEnvFloat gets a float64 from an environment variable with a default value
func getEnvFloat(key string, defaultVal float64, logger logr.Logger) float64 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we move these env var parsing helper functions to /util/env/env.go

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree.

It would likely blow up the size of this PR tho. Global vars are run before the main func. And our filter chain is a global var, which consumes these variables.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or do you just mean the parsing funcs themselves.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it makes sense to have a separate env pkg within utils which has the env parsers. updated accordingly

Copy link
Collaborator

@kfswain kfswain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few comments, I would generally suggest restructuring this.

However, with kubecon so close, and we need this for v0.3.0, and if you are short on time... It may be acceptable to mark TODOs all over this, and add issues the refactor this

)

// getEnvFloat gets a float64 from an environment variable with a default value
func getEnvFloat(key string, defaultVal float64, logger logr.Logger) float64 {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or do you just mean the parsing funcs themselves.

}

// LoadConfig loads configuration from environment variables
func LoadConfig() Config {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typically env variables are processed in the main func. It would be considered surprising to process an env var in a package like this and load it into a global var.

The unification of these config variables allows a reader to be able to reliably find the config, rather than having to hunt for its consumption.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i agree with this, will tackle it in a separate issue as suggested above

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can add a global struct to hold all env vars for discoverability. I am OK with a follow up.

@@ -108,6 +177,15 @@ var (
}
)

// UpdateLoraAffinityThreshold updates the LoRA affinity threshold value
// This is useful for testing or dynamic reconfiguration
func UpdateLoraAffinityThreshold(newValue float64, logger logr.Logger) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like it's only being consumed in tests, could we make this a test util func for now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, moved it to test


// Log current configuration values for debugging purposes.
logger.V(logutil.VERBOSE).Info("Scheduler configuration",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TRACE is probably preferred since this would run on every request, and its unlikely to change frequently.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes agreed.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 27, 2025
@kaushikmitr kaushikmitr changed the title [WIP] update algorithm parameters form env variables update algorithm parameters form env variables Mar 27, 2025
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 27, 2025
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 27, 2025
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pls add unit tests

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added tests

LoraAffinityThreshold float64
}

var (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: these can be const


// Log current configuration values for debugging purposes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the config is only loaded once, why do we log it in every request? it's already logged in LoadConfig.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its logutil.TRACE, so should be logged in unless tracing is on

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still, I would expect config parameters to be logged once on startup, you can move this log line to NewScheduler for now, and later we should move it to main

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no need for this since this is already logged on startup. Also logging such environment configurable should be DEFAULT level.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

LoraAffinityThreshold: envutil.GetEnvFloat("LORA_AFFINITY_THRESHOLD", defaultLoraAffinityThreshold, baseLogger),
}

baseLogger.V(logutil.DEFAULT).Info("Scheduler configuration loaded",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: You can simply log the config object itself, logger.Info("Scheduler configuration loaded", "config", config)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}

// LoadConfig loads configuration from environment variables
func LoadConfig() Config {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can add a global struct to hold all env vars for discoverability. I am OK with a follow up.

Copy link
Contributor

@liu-cong liu-cong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with @kfswain that there are followups on this, but given the short kubecon timeline I am fine.

The change looks good to me except for a couple nits, and adding unit tests for the env parsing. Will let others approve.

// Set a specific test value for this test
testThreshold := 0.75 // 75%
// Inline update of threshold value
func(newValue float64, logger logr.Logger) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry perhaps I wasn't super clear by inline the function. Let's just do config.LoraAffinityThreshold = newValue. I am fine without the logging since this is just in tests. If you want logging, simply add t.Logf().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done


// Log current configuration values for debugging purposes.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no need for this since this is already logged on startup. Also logging such environment configurable should be DEFAULT level.

@ahg-g
Copy link
Contributor

ahg-g commented Mar 27, 2025

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Mar 27, 2025
@kaushikmitr kaushikmitr changed the title update algorithm parameters form env variables update algorithm parameters from env variables Mar 27, 2025
@kaushikmitr
Copy link
Contributor Author

added unit test for env parser

@ahg-g
Copy link
Contributor

ahg-g commented Mar 28, 2025

The PR has some lint errors, run make verify to see them locally or click the failed test link: https://prow.k8s.io/view/gs/kubernetes-ci-logs/pr-logs/pull/kubernetes-sigs_gateway-api-inference-extension/580/pull-gateway-api-inference-extension-verify-main/1905364589357305856

@ahg-g
Copy link
Contributor

ahg-g commented Mar 28, 2025

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 28, 2025
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ahg-g, kaushikmitr

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 28, 2025
@k8s-ci-robot k8s-ci-robot merged commit 16ded66 into kubernetes-sigs:main Mar 28, 2025
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants