Skip to content

[UPSTREAM-SYNC] Feat: Add invocation of Post response plugins #132

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 7 commits into
base: upstream-sync
Choose a base branch
from

Conversation

shmuelk
Copy link
Collaborator

@shmuelk shmuelk commented May 7, 2025

This PR adds the invocation of any configured PostResponse plugins during the processing of responses from the chosen inference server. It also enables the PostResponse plugin to add/modify headers that are ultimately sent to the client.

@shmuelk shmuelk requested review from shaneutt and elevran May 7, 2025 11:00
@elevran elevran changed the title [UPSTREAM-SYNC} Feat: Add invocation of Post response plugins [UPSTREAM-SYNC] Feat: Add invocation of Post response plugins May 7, 2025
Copy link
Collaborator

@elevran elevran left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small typo and a question. Other than that:
/lgtm
/approve

result, err = s.scheduler.OnResponse(ctx, llmReq, reqCtx.TargetPod)
if err != nil {
logger.V(logutil.DEFAULT).Error(err, "Error handling response")
reqCtx.ResponseStatusCode = errutil.ModelServerError
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

q: will returning an error not send the response back to the user?
If this is an internal logic failure and we've already compute the LLM response, might be wasteful.

@elevran
Copy link
Collaborator

elevran commented May 7, 2025

IIUC, this depends on #135?
Will you draft a new PR after #135 is merged or rebase this one?
Also, consider putting it in draft status until it is ready for review

Comment on lines +221 to +226
for _, pod := range pods {
if pod.GetPod().NamespacedName.String() == targetPodName {
targetPod = pod
break
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if a pod was selected scheduler and crashed before this call, what happens here?
targetPod remains nil?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants