[build]: Updating vllm deployment to support the latest images and scorers. #112

Merged
1 commit merged into neuralmagic:dev on May 5, 2025

kfirtoledo

Update the vLLM P2P deployment to support KV-cache and load scorers.

@kfirtoledo kfirtoledo requested review from shaneutt and elevran May 4, 2025 12:16
key: ${HF_SECRET_KEY}
- name: ENABLE_KVCACHE_AWARE_SCORER
Collaborator

We'll need to find a more elegant way of loading/enabling specific plugins (more than just filters and/or scorers), but this is OK for now.
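
To make the concern above concrete, here is a minimal sketch of one alternative: a single table that maps environment flags to plugins, instead of one hard-coded check per flag. It is illustrative only and is not the project's actual loading code. `ENABLE_KVCACHE_AWARE_SCORER` comes from the diff under review; `ENABLE_LOAD_AWARE_SCORER`, both plugin names, and the `enabled_plugins` helper are hypothetical.

```python
import os
from typing import Mapping

# Flag name -> plugin name.
# ENABLE_KVCACHE_AWARE_SCORER appears in this PR's diff; the second flag and
# both plugin names are hypothetical and exist only for illustration.
PLUGIN_FLAGS = {
    "ENABLE_KVCACHE_AWARE_SCORER": "kvcache-aware-scorer",
    "ENABLE_LOAD_AWARE_SCORER": "load-aware-scorer",
}

# Values treated as "enabled" (an assumption, compared case-insensitively).
_TRUTHY = {"1", "true", "yes", "on"}


def enabled_plugins(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the plugin names whose flag is set to a truthy value in env."""
    return [
        plugin
        for flag, plugin in PLUGIN_FLAGS.items()
        if env.get(flag, "").strip().lower() in _TRUTHY
    ]


if __name__ == "__main__":
    # Prints whichever plugins the current process environment enables.
    print(enabled_plugins())
```

With a table like this, adding a plugin means adding one entry rather than another conditional, which is one possible direction for the "more elegant way" mentioned above.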

@elevran
Collaborator

elevran commented May 4, 2025

Please assess whether the CI/CD failure is related to the code added.

@kfirtoledo
Author

@elevran, it is not related to my code. The same failure occurred in the last commit that was merged:
https://github.com/neuralmagic/gateway-api-inference-extension/runs/41610009981
Also, not all PRs trigger the full pipeline, which is why some of them show only part of it passing.

@kfirtoledo kfirtoledo merged commit 403fae6 into neuralmagic:dev May 5, 2025
1 check failed
clubanderson pushed a commit that referenced this pull request May 7, 2025
Update the vLLM P2P deployment to support KV-cache and load scorers.

Signed-off-by: Kfir Toledo <[email protected]>