Skip to content
This repository was archived by the owner on May 15, 2025. It is now read-only.

[UPSTREAM-SYNC] Feat: Add invocation of Post response plugins #132

Closed
wants to merge 7 commits into from

Conversation

shmuelk
Copy link
Collaborator

@shmuelk shmuelk commented May 7, 2025

This PR adds the invocation of any configured PostResponse plugins during the processing of responses from the chosen inference server. It also enables the PostResponse plugin to add/modify headers that are ultimately sent to the client.

@shmuelk shmuelk requested review from shaneutt and elevran May 7, 2025 11:00
@elevran elevran changed the title [UPSTREAM-SYNC} Feat: Add invocation of Post response plugins [UPSTREAM-SYNC] Feat: Add invocation of Post response plugins May 7, 2025
Copy link
Collaborator

@elevran elevran left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small typo and a question. Other than that:
/lgtm
/approve

result, err = s.scheduler.OnResponse(ctx, llmReq, reqCtx.TargetPod)
if err != nil {
logger.V(logutil.DEFAULT).Error(err, "Error handling response")
reqCtx.ResponseStatusCode = errutil.ModelServerError
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

q: will returning an error not send the response back to the user?
If this is an internal logic failure and we've already compute the LLM response, might be wasteful.

@elevran
Copy link
Collaborator

elevran commented May 7, 2025

IIUC, this depends on #135?
Will you draft a new PR after #135 is merged or rebase this one?
Also, consider putting it in draft status until it is ready for review

Comment on lines +221 to +226
for _, pod := range pods {
if pod.GetPod().NamespacedName.String() == targetPodName {
targetPod = pod
break
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if a pod was selected scheduler and crashed before this call, what happens here?
targetPod remains nil?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct.
The whole question of failure handling is not sufficiently specified in the system and needs to be addressed. There are many failure scenarios (e.g., EPP crashes during different points in processing, envoy crashes, ...) across and inside components.

@shaneutt
Copy link
Collaborator

Since we're going to archive this fork shortly I think we can close this. Any objections?

@shmuelk shmuelk closed this May 13, 2025
@shmuelk shmuelk deleted the post-response branch May 15, 2025 13:13
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants