passing headers to scheduler plugins #775

nirrozenbaum · 2025-05-02T13:38:40Z

This PR makes request headers available in SchedulingContext so that scheduler plugins can use them.
Additionally, it fixes some minor typos and cosmetic issues.

Signed-off-by: Nir Rozenbaum <[email protected]>
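
As a rough illustration of what header access enables for a plugin, here is a minimal sketch. The `SchedulingContext` shape below is a simplified, hypothetical stand-in (the `Headers map[string]string` field is an assumption, not copied from the PR), and the `x-session-pod` header and `sessionAffinityScore` helper are invented for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// SchedulingContext is a simplified, hypothetical stand-in for the real type;
// only the Headers field matters for this sketch.
type SchedulingContext struct {
	Model   string
	Headers map[string]string
}

// headerValue looks a header up case-insensitively, since HTTP header names
// are not case-sensitive. It returns the value and whether it was present.
func headerValue(ctx *SchedulingContext, name string) (string, bool) {
	for k, v := range ctx.Headers {
		if strings.EqualFold(k, name) {
			return v, true
		}
	}
	return "", false
}

// sessionAffinityScore is a hypothetical scoring helper: it prefers the pod
// named in the (made-up) "x-session-pod" header and is neutral otherwise.
func sessionAffinityScore(ctx *SchedulingContext, podName string) float64 {
	if want, ok := headerValue(ctx, "x-session-pod"); ok && want == podName {
		return 1.0
	}
	return 0.0
}

func main() {
	ctx := &SchedulingContext{
		Model:   "example-model",
		Headers: map[string]string{"X-Session-Pod": "pod-a"},
	}
	fmt.Println(sessionAffinityScore(ctx, "pod-a"), sessionAffinityScore(ctx, "pod-b"))
}
```

The point is only that a plugin can branch on request metadata without parsing the body; the real plugin interfaces in the repository may look different.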

netlify · 2025-05-02T13:38:57Z

✅ Deploy Preview for gateway-api-inference-extension ready!

Name	Link
🔨 Latest commit	`7b8a83e`
🔍 Latest deploy log	https://app.netlify.com/sites/gateway-api-inference-extension/deploys/6815183a3a3d2d0008f9d0da
😎 Deploy Preview	https://deploy-preview-775--gateway-api-inference-extension.netlify.app

nirrozenbaum · 2025-05-02T13:41:33Z

pkg/epp/scheduling/types/types.go

@@ -29,16 +29,18 @@ import (
 // LLMRequest is a structured representation of the fields we parse out of the LLMRequest body.
 type LLMRequest struct {
 	Model string
-	// Target models is a map of target model name to weight.
-	TargetModels map[string]int


Removed TargetModels from LLMRequest; it is never set and is always empty.
LLMRequest holds the fields that are passed to the scheduler. The scheduler doesn't care about the target models, only about the ResolvedTargetModel (traffic splitting is done in request.go, so by the time LLMRequest is passed to the scheduler the target model has already been resolved).
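
To make the ordering concrete, here is a minimal sketch of resolving a weighted target model before the request reaches the scheduler. This is not the code in request.go: `resolveTarget` and its `draw` parameter are hypothetical, and the model names and weights are made up.

```go
package main

import (
	"fmt"
	"sort"
)

// LLMRequest mirrors the trimmed struct from the diff: only the resolved
// target model reaches the scheduler.
type LLMRequest struct {
	Model               string
	ResolvedTargetModel string
}

// resolveTarget picks a target model from a name-to-weight map. The draw
// argument stands in for a random number in [0, total weight); passing it in
// keeps this sketch deterministic. Names are sorted so the result does not
// depend on Go's randomized map iteration order.
func resolveTarget(targets map[string]int, draw int) string {
	names := make([]string, 0, len(targets))
	for name := range targets {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		if draw < targets[name] {
			return name
		}
		draw -= targets[name]
	}
	return "" // no targets, or draw is not below the total weight
}

func main() {
	targets := map[string]int{"model-v1": 90, "model-v2": 10}
	req := LLMRequest{Model: "model", ResolvedTargetModel: resolveTarget(targets, 95)}
	fmt.Println(req.ResolvedTargetModel)
}
```

Once the weights have been consumed this way, the scheduler only needs the single resolved name, which is why the map has no place on the struct.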

Not this PR, but I was wondering if we should even keep the LLMRequest object. I worry that reqCtx is this monolithic super-object. But also, it's been very useful to have, and it's not really expensive to pass around since you'd just be passing a pointer.

In terms of best practices, it would probably be best to expose to each layer only the things it needs.
One monolithic super-object may be very convenient, but it opens the door to a layer updating fields that it shouldn't.
We might be able to keep the single object but work with layer interfaces that expose to each layer only the fields it needs.
Anyway, as you said, this is not in scope for this PR.

Agreed. I've been noodling on how best to deal with this since I commented, and I think I have some ideas. WIP though.
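
The layer-interface idea discussed in this thread could look roughly like the sketch below. Everything here is hypothetical: `RequestContext`, `SchedulerView`, the `x-tenant` header, and the `schedule` function are illustrative names, not types from the repository.

```go
package main

import "fmt"

// RequestContext is a hypothetical stand-in for the monolithic per-request object.
type RequestContext struct {
	Model               string
	ResolvedTargetModel string
	Headers             map[string]string
	ResponseSize        int // example of a field the scheduler should not touch
}

// SchedulerView is a hypothetical read-only interface for the scheduling layer.
type SchedulerView interface {
	TargetModel() string
	Header(name string) (string, bool)
}

// TargetModel exposes the already-resolved model name.
func (r *RequestContext) TargetModel() string { return r.ResolvedTargetModel }

// Header exposes a single header lookup; reading a nil map is safe in Go.
func (r *RequestContext) Header(name string) (string, bool) {
	v, ok := r.Headers[name]
	return v, ok
}

// schedule receives only the narrow view, so it has no direct way to modify
// the underlying object (short of a type assertion).
func schedule(req SchedulerView) string {
	if tenant, ok := req.Header("x-tenant"); ok {
		return req.TargetModel() + "@" + tenant
	}
	return req.TargetModel()
}

func main() {
	reqCtx := &RequestContext{
		ResolvedTargetModel: "model-v2",
		Headers:             map[string]string{"x-tenant": "a"},
	}
	fmt.Println(schedule(reqCtx))
}
```

The same pointer is still passed around, so the cost stays low; only the compile-time surface each layer sees is narrowed.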

nirrozenbaum · 2025-05-02T13:41:58Z

cc @kfswain

kfswain · 2025-05-02T16:00:02Z

pkg/epp/scheduling/types/types.go

 	ResolvedTargetModel string
-	Critical            bool
+	// Critical is a boolean that specifies if a request is critical or not.
+	Critical bool


We should make this an int, since we already have 3 criticality levels and it will likely be extended to have more.

In the current code, the scheduler changes its algorithm depending on whether the request is critical or not.
I agree that this field should be changed, but that requires adaptations in more places than just this field.
It should be done in a separate PR dedicated to this point (it is not in the scope of this PR).
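
For reference, one possible shape for an integer criticality is sketched below. This is not part of the PR: the `Criticality` type, its ordering, and the `IsCritical` helper are assumptions made for illustration, with level names chosen to match the three criticality levels mentioned above.

```go
package main

import "fmt"

// Criticality is a hypothetical ordered level; a higher value means more critical.
type Criticality int

const (
	Sheddable Criticality = iota // lowest level, assumed droppable under load
	Standard
	Critical
)

// String returns a readable name for logging.
func (c Criticality) String() string {
	switch c {
	case Sheddable:
		return "Sheddable"
	case Standard:
		return "Standard"
	case Critical:
		return "Critical"
	default:
		return fmt.Sprintf("Criticality(%d)", int(c))
	}
}

// IsCritical keeps the yes/no question the scheduler asks today, so existing
// call sites could migrate gradually.
func (c Criticality) IsCritical() bool { return c >= Critical }

func main() {
	for _, c := range []Criticality{Sheddable, Standard, Critical} {
		fmt.Println(c, c.IsCritical())
	}
}
```

An ordered int leaves room for additional levels later, while a helper like `IsCritical` would let the scheduler's current two-way branch keep working during the transition.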

kfswain · 2025-05-02T16:01:17Z

LGTM for the most part. The criticality change is technically out of scope for this PR, but I would love to see it changed. Up to the author; will hold to let them decide.
/lgtm
/approve
/hold

k8s-ci-robot · 2025-05-02T16:01:25Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kfswain, nirrozenbaum

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [kfswain]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

kfswain · 2025-05-02T19:27:57Z

/lgtm

nirrozenbaum · 2025-05-02T19:30:42Z

/unhold

* passing headers to scheduler plugins
  Signed-off-by: Nir Rozenbaum <[email protected]>
* addressed code review comments
  Signed-off-by: Nir Rozenbaum <[email protected]>
---------
Signed-off-by: Nir Rozenbaum <[email protected]>

passing headers to scheduler plugins

581a6a8

Signed-off-by: Nir Rozenbaum <[email protected]>

k8s-ci-robot requested a review from Jeffwan May 2, 2025 13:38

k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label May 2, 2025

k8s-ci-robot requested a review from robscott May 2, 2025 13:38

k8s-ci-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label May 2, 2025

nirrozenbaum commented May 2, 2025

shaneutt reviewed May 2, 2025

kfswain reviewed May 2, 2025

k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 2, 2025

k8s-ci-robot assigned kfswain May 2, 2025

k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 2, 2025

k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 2, 2025

danehans reviewed May 2, 2025

danehans reviewed May 2, 2025

addressed code review comments

7b8a83e

Signed-off-by: Nir Rozenbaum <[email protected]>

nirrozenbaum force-pushed the headers-for-scheduler branch from b2b3244 to 7b8a83e Compare May 2, 2025 19:08

k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 2, 2025

k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 2, 2025

k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 2, 2025

k8s-ci-robot merged commit 6a2a3ec into kubernetes-sigs:main May 2, 2025
8 checks passed

nirrozenbaum deleted the headers-for-scheduler branch May 2, 2025 19:35
