update algorithm parameters from env variables (#580)
* update algorithm parameters from env variables
* move env parsers to a new pkg in utils
* add unit test for env parser
* remove logging env variables during scheduling
* add test for env parser
+// Config holds all the configuration values for the scheduler
+type Config struct {
+	KVCacheThreshold       float64
+	QueueThresholdCritical int
+	QueueingThresholdLoRA  int
+	LoraAffinityThreshold  float64
+}
+
 const (
-	// TODO(https://github.com/kubernetes-sigs/gateway-api-inference-extension/issues/16) Make this configurable.
-	kvCacheThreshold = 0.8
-	// TODO(https://github.com/kubernetes-sigs/gateway-api-inference-extension/issues/16) Make this configurable.
-	queueThresholdCritical = 5
-	// TODO(https://github.com/kubernetes-sigs/gateway-api-inference-extension/issues/16) Make this configurable.
-	// the threshold for queued requests to be considered low below which we can prioritize LoRA affinity.
-	// The value of 128 is arrived heuristicically based on experiments.
-	queueingThresholdLoRA = 128
-	// TODO(https://github.com/kubernetes-sigs/gateway-api-inference-extension/issues/16) Make this configurable.
-	// loraAffinityThreshold indicates the probability with which we prefer a pod with LoRA affinity over a pod without but having room to fit more LoRA adapters.
-	loraAffinityThreshold = 0.999
+	// Default values to use if environment variables are not set
+	defaultKVCacheThreshold       = 0.8
+	defaultQueueThresholdCritical = 5
+	defaultQueueingThresholdLoRA  = 128
+	defaultLoraAffinityThreshold  = 0.999
 )
 
+// LoadConfig loads configuration from environment variables
+func LoadConfig() Config {
+	// Use a default logger for initial configuration loading