EPP Architecture proposal #683

Merged 4 commits on Apr 21, 2025

Conversation

kfswain
Collaborator

@kfswain kfswain commented Apr 14, 2025

No description provided.

@k8s-ci-robot
Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Apr 14, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kfswain

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Apr 14, 2025

netlify bot commented Apr 14, 2025

Deploy Preview for gateway-api-inference-extension ready!

| Name | Link |
|------|------|
| 🔨 Latest commit | 35052ce |
| 🔍 Latest deploy log | https://app.netlify.com/sites/gateway-api-inference-extension/deploys/68068efc6af4700008501f31 |
| 😎 Deploy Preview | https://deploy-preview-683--gateway-api-inference-extension.netlify.app |

@kfswain kfswain force-pushed the refactor branch 2 times, most recently from 04f18ab to 5de9837 Compare April 18, 2025 21:21
@kfswain kfswain changed the title [WIP] EPP Architecture proposal EPP Architecture proposal Apr 18, 2025
@kfswain kfswain marked this pull request as ready for review April 18, 2025 21:23
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 18, 2025
## Overview
At a quick glance, the EPP is being broken into specific layers. The `Data Layer` is of note, as it is a vertical that will be accessed by all the others. The data layer manages the k8s data and the metric & usage data, and processes that data to determine resource scarcity regimes.

The other layers are handled as a sequential process, starting with the **Ext-Proc** call. The request is buffered and then sent to the **Routing Layer**, which first processes any user-defined per-InferenceModel routing rules and request enrichment (at the time of writing, that is just translating the InferenceModel name to a weight-split actual model). Then _all_ requests pass through the **Flow Controller**, which ensures that every request entering the pool adheres to the guidelines set by the Priority, Fairness, & Queueing configuration. Finally, the **Scheduling Layer** is the load-balancing algorithm that intelligently routes requests based on the current state of the InferencePool.
Contributor

I am concerned about merging this PR due to Flow Controller references without having #674 resolved. Can you link to this issue and provide the add'l context?

Contributor

nit: s/Fairness/ Fairness/ (add a space after the comma).

Collaborator Author

Reworded and linked to the issue

@nirrozenbaum
Contributor

overall, seems good to me. left one last minor comment.

@kfswain
Collaborator Author

kfswain commented Apr 21, 2025

/retest

@nirrozenbaum
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 21, 2025
@kfswain kfswain mentioned this pull request Apr 21, 2025
@k8s-ci-robot k8s-ci-robot merged commit c546506 into kubernetes-sigs:main Apr 21, 2025
7 checks passed
@ahg-g
Contributor

ahg-g commented Apr 21, 2025

The subdirectory name should probably be updated, currently 00x-epp-compliance-proposal. Other than giving it a number, why are we calling it "compliance proposal", isn't this more of a EPP "architecture proposal"?

@kfswain
Collaborator Author

kfswain commented Apr 21, 2025

> The subdirectory name should probably be updated, currently 00x-epp-compliance-proposal. Other than giving it a number, why are we calling it "compliance proposal", isn't this more of a EPP "architecture proposal"?

++ at the first pass, I was going to suggest that an EPP implementer should follow this architecture and we would have tests to validate compliance. But decided against it. Will rename in another PR

@kfswain kfswain deleted the refactor branch April 21, 2025 20:56
rlakhtakia pushed a commit to rlakhtakia/gateway-api-inference-extension that referenced this pull request Apr 25, 2025
* initial changes

* Adding to proposal to give a quick barebones definition to refactor

* feedback changes

* more feedback addressing
k8s-ci-robot pushed a commit that referenced this pull request Apr 25, 2025
* Add unit test coverage for pod APIs under datastore/pkg

* Add unit test coverage for pod APIs under datastore/pkg

* Add unit test coverage for pod APIs under datastore/pkg

* Add unit test coverage for pod APIs under datastore/pkg

* EPP Architecture proposal (#683)

* initial changes

* Adding to proposal to give a quick barebones definition to refactor

* feedback changes

* more feedback addressing

* removed unused Fake struct (#723)

Signed-off-by: Nir Rozenbaum <[email protected]>

* epp: return correct response for trailers (#726)

This looks like a copy paste error.

* Refactor scheduler to run plugins (#677)

* Refactor scheduler to run plugins

* Add scheduler plugin latency metric

* Address comments

* Address comments

* Complete the InferencePool documentation (#673)

* Initial guide for inference pool

* Add extensionReference to the InferencePool spec

* Fix list formatting

* Remove unused labels

* Autogenerate the spec

* Update site-src/api-types/inferencepool.md

Co-authored-by: Rob Scott <[email protected]>

* Update site-src/api-types/inferencepool.md

Co-authored-by: Rob Scott <[email protected]>

* Update site-src/api-types/inferencepool.md

Co-authored-by: Rob Scott <[email protected]>

* Update site-src/api-types/inferencepool.md

Co-authored-by: Rob Scott <[email protected]>

* Update site-src/api-types/inferencepool.md

Co-authored-by: Rob Scott <[email protected]>

* Update site-src/api-types/inferencepool.md

Co-authored-by: Rob Scott <[email protected]>

* Rename llm-pool names in rollout example

* Add use cases for replacing an inference pool

* Rewording the background section

* Create replacing-inference-pool.md

* Replace instructions with a link for how to replace an inference pool

* Update replacing-inference-pool.md

* Update mkdocs.yml

* Update replacing-inference-pool.md

* Update inferencemodel_types.go

* Update inferencepool.md

* Update site-src/guides/replacing-inference-pool.md

Co-authored-by: Rob Scott <[email protected]>

---------

Co-authored-by: Rob Scott <[email protected]>

* reduce log level in metrics logger not to trash the log (#708)

* reduce log level in metrics logger not to trash the log

Signed-off-by: Nir Rozenbaum <[email protected]>

* rename flush metrics to refresh metrics

Signed-off-by: Nir Rozenbaum <[email protected]>

* revert log level

Signed-off-by: Nir Rozenbaum <[email protected]>

---------

Signed-off-by: Nir Rozenbaum <[email protected]>

* few updates in datastore (#713)

* few updates in datastore

Signed-off-by: Nir Rozenbaum <[email protected]>

* PoolSet documentation

Signed-off-by: Nir Rozenbaum <[email protected]>

* error phrasing

Signed-off-by: Nir Rozenbaum <[email protected]>

* removed unused pool arg from PodUpdateOrAddIfNotExist

Signed-off-by: Nir Rozenbaum <[email protected]>

* linter

Signed-off-by: Nir Rozenbaum <[email protected]>

---------

Signed-off-by: Nir Rozenbaum <[email protected]>

* scheduler refactoring (#730)

Signed-off-by: Nir Rozenbaum <[email protected]>

* filter irrelevant pod in pod_reconciler (#696)

* EPP: Update GetRandomPod() to return nil if no pods exist (#731)

Signed-off-by: Daneyon Hansen <[email protected]>

* Move filter and scorer plugins registration to a separate file (#729)

* Move filters and scorers registration to filter/scorer specific files

* Default scheduler config contains empty list of scorers

Signed-off-by: Maya Barnea <[email protected]>

* Default plugin is not a scorer any more

Signed-off-by: Maya Barnea <[email protected]>

* fix scheduler test + lint comments

Signed-off-by: Maya Barnea <[email protected]>

---------

Signed-off-by: Maya Barnea <[email protected]>

* Update issue templates (#738)

* Update issue templates

* Updates artifacts for v0.3.0-rc.1 release

Signed-off-by: Kellen Swain <[email protected]>

* Updates bbr chart for v0.3.0-rc.1 release

Signed-off-by: Kellen Swain <[email protected]>

* Updates artifacts for v0.3.0 release

Signed-off-by: Kellen Swain <[email protected]>

* Adding blank issue template so that all issues start with  label

---------

Signed-off-by: Kellen Swain <[email protected]>

* Add unit test coverage for pod APIs under datastore/pkg

* few updates in datastore (#713)

* few updates in datastore

Signed-off-by: Nir Rozenbaum <[email protected]>

* PoolSet documentation

Signed-off-by: Nir Rozenbaum <[email protected]>

* error phrasing

Signed-off-by: Nir Rozenbaum <[email protected]>

* removed unused pool arg from PodUpdateOrAddIfNotExist

Signed-off-by: Nir Rozenbaum <[email protected]>

* linter

Signed-off-by: Nir Rozenbaum <[email protected]>

---------

Signed-off-by: Nir Rozenbaum <[email protected]>

* few updates in datastore (#713)

* few updates in datastore

Signed-off-by: Nir Rozenbaum <[email protected]>

* PoolSet documentation

Signed-off-by: Nir Rozenbaum <[email protected]>

* error phrasing

Signed-off-by: Nir Rozenbaum <[email protected]>

* removed unused pool arg from PodUpdateOrAddIfNotExist

Signed-off-by: Nir Rozenbaum <[email protected]>

* linter

Signed-off-by: Nir Rozenbaum <[email protected]>

---------

Signed-off-by: Nir Rozenbaum <[email protected]>

* Add unit test coverage for pod APIs under datastore/pkg

---------

Signed-off-by: Nir Rozenbaum <[email protected]>
Signed-off-by: Daneyon Hansen <[email protected]>
Signed-off-by: Maya Barnea <[email protected]>
Signed-off-by: Kellen Swain <[email protected]>
Co-authored-by: Kellen Swain <[email protected]>
Co-authored-by: Nir Rozenbaum <[email protected]>
Co-authored-by: John Howard <[email protected]>
Co-authored-by: Cong Liu <[email protected]>
Co-authored-by: Nicole Xin <[email protected]>
Co-authored-by: Rob Scott <[email protected]>
Co-authored-by: nayihz <[email protected]>
Co-authored-by: Daneyon Hansen <[email protected]>
Co-authored-by: Maya Barnea <[email protected]>
kaushikmitr pushed a commit to kaushikmitr/llm-instance-gateway that referenced this pull request Apr 28, 2025
rlakhtakia pushed a commit to rlakhtakia/gateway-api-inference-extension that referenced this pull request Apr 28, 2025
kfswain added a commit to kfswain/llm-instance-gateway that referenced this pull request Apr 29, 2025

---------

Signed-off-by: Nir Rozenbaum <[email protected]>

* few updates in datastore (kubernetes-sigs#713)

* few updates in datastore

Signed-off-by: Nir Rozenbaum <[email protected]>

* PoolSet documentation

Signed-off-by: Nir Rozenbaum <[email protected]>

* error phrasing

Signed-off-by: Nir Rozenbaum <[email protected]>

* removed unused pool arg from PodUpdateOrAddIfNotExist

Signed-off-by: Nir Rozenbaum <[email protected]>

* linter

Signed-off-by: Nir Rozenbaum <[email protected]>

---------

Signed-off-by: Nir Rozenbaum <[email protected]>

* scheduler refactoring (kubernetes-sigs#730)

Signed-off-by: Nir Rozenbaum <[email protected]>

* filter irrelevant pod in pod_reconciler (kubernetes-sigs#696)

* EPP: Update GetRandomPod() to return nil if no pods exist (kubernetes-sigs#731)

Signed-off-by: Daneyon Hansen <[email protected]>

* Move filter and scorer plugins registration to a separate file (kubernetes-sigs#729)

* Move filters and scorers registration to filter/scorer specific files

* Default scheduler config contains empty list of scorers

Signed-off-by: Maya Barnea <[email protected]>

* Default plugin is not a scorer any more

Signed-off-by: Maya Barnea <[email protected]>

* fix scheduler test + lint comments

Signed-off-by: Maya Barnea <[email protected]>

---------

Signed-off-by: Maya Barnea <[email protected]>

* Update issue templates (kubernetes-sigs#738)

* Update issue templates

* Updates artifacts for v0.3.0-rc.1 release

Signed-off-by: Kellen Swain <[email protected]>

* Updates bbr chart for v0.3.0-rc.1 release

Signed-off-by: Kellen Swain <[email protected]>

* Updates artifacts for v0.3.0 release

Signed-off-by: Kellen Swain <[email protected]>

* Adding blank issue template so that all issues start with  label

---------

Signed-off-by: Kellen Swain <[email protected]>

* Add unit test coverage for pod APIs under datastore/pkg

* few updates in datastore (kubernetes-sigs#713)

* few updates in datastore

Signed-off-by: Nir Rozenbaum <[email protected]>

* PoolSet documentation

Signed-off-by: Nir Rozenbaum <[email protected]>

* error phrasing

Signed-off-by: Nir Rozenbaum <[email protected]>

* removed unused pool arg from PodUpdateOrAddIfNotExist

Signed-off-by: Nir Rozenbaum <[email protected]>

* linter

Signed-off-by: Nir Rozenbaum <[email protected]>

---------

Signed-off-by: Nir Rozenbaum <[email protected]>

* Add unit test coverage for pod APIs under datastore/pkg

---------

Signed-off-by: Nir Rozenbaum <[email protected]>
Signed-off-by: Daneyon Hansen <[email protected]>
Signed-off-by: Maya Barnea <[email protected]>
Signed-off-by: Kellen Swain <[email protected]>
Co-authored-by: Kellen Swain <[email protected]>
Co-authored-by: Nir Rozenbaum <[email protected]>
Co-authored-by: John Howard <[email protected]>
Co-authored-by: Cong Liu <[email protected]>
Co-authored-by: Nicole Xin <[email protected]>
Co-authored-by: Rob Scott <[email protected]>
Co-authored-by: nayihz <[email protected]>
Co-authored-by: Daneyon Hansen <[email protected]>
Co-authored-by: Maya Barnea <[email protected]>
JeffLuoo pushed a commit to JeffLuoo/gateway-api-inference-extension that referenced this pull request May 2, 2025
* initial changes

* Adding to proposal to give a quick barebones definition to refactor

* feedback changes

* more feedback addressing
JeffLuoo pushed a commit to JeffLuoo/gateway-api-inference-extension that referenced this pull request May 2, 2025
* Add unit test coverage for pod APIs under datastore/pkg

* EPP Architecture proposal (kubernetes-sigs#683)

* initial changes

* Adding to proposal to give a quick barebones definition to refactor

* feedback changes

* more feedback addressing

* removed unused Fake struct (kubernetes-sigs#723)

Signed-off-by: Nir Rozenbaum <[email protected]>

* epp: return correct response for trailers (kubernetes-sigs#726)

This looks like a copy paste error.

* Refactor scheduler to run plugins (kubernetes-sigs#677)

* Refactor scheduler to run plugins

* Add scheduler plugin latency metric

* Address comments

* Address comments

* Complete the InferencePool documentation (kubernetes-sigs#673)

* Initial guide for inference pool

* Add extensionReference to the InferencePool spec

* Fix list formatting

* Remove unused labels

* Autogenerate the spec

* Update site-src/api-types/inferencepool.md

Co-authored-by: Rob Scott <[email protected]>

* Update site-src/api-types/inferencepool.md

Co-authored-by: Rob Scott <[email protected]>

* Update site-src/api-types/inferencepool.md

Co-authored-by: Rob Scott <[email protected]>

* Update site-src/api-types/inferencepool.md

Co-authored-by: Rob Scott <[email protected]>

* Update site-src/api-types/inferencepool.md

Co-authored-by: Rob Scott <[email protected]>

* Update site-src/api-types/inferencepool.md

Co-authored-by: Rob Scott <[email protected]>

* Rename llm-pool names in rollout example

* Add use cases for replacing an inference pool

* Rewording the background section

* Create replacing-inference-pool.md

* Replace instructions with a link for how to replace an inference pool

* Update replacing-inference-pool.md

* Update mkdocs.yml

* Update replacing-inference-pool.md

* Update inferencemodel_types.go

* Update inferencepool.md

* Update site-src/guides/replacing-inference-pool.md

Co-authored-by: Rob Scott <[email protected]>

---------

Co-authored-by: Rob Scott <[email protected]>

* reduce log level in metrics logger not to trash the log (kubernetes-sigs#708)

* reduce log level in metrics logger not to trash the log

Signed-off-by: Nir Rozenbaum <[email protected]>

* rename flush metrics to refresh metrics

Signed-off-by: Nir Rozenbaum <[email protected]>

* revert log level

Signed-off-by: Nir Rozenbaum <[email protected]>

---------

Signed-off-by: Nir Rozenbaum <[email protected]>

* few updates in datastore (kubernetes-sigs#713)

* few updates in datastore

Signed-off-by: Nir Rozenbaum <[email protected]>

* PoolSet documentation

Signed-off-by: Nir Rozenbaum <[email protected]>

* error phrasing

Signed-off-by: Nir Rozenbaum <[email protected]>

* removed unused pool arg from PodUpdateOrAddIfNotExist

Signed-off-by: Nir Rozenbaum <[email protected]>

* linter

Signed-off-by: Nir Rozenbaum <[email protected]>

---------

Signed-off-by: Nir Rozenbaum <[email protected]>

* scheduler refactoring (kubernetes-sigs#730)

Signed-off-by: Nir Rozenbaum <[email protected]>

* filter irrelevant pod in pod_reconciler (kubernetes-sigs#696)

* EPP: Update GetRandomPod() to return nil if no pods exist (kubernetes-sigs#731)

Signed-off-by: Daneyon Hansen <[email protected]>

* Move filter and scorer plugins registration to a separate file (kubernetes-sigs#729)

* Move filters and scorers registration to filter/scorer specific files

* Default scheduler config contains empty list of scorers

Signed-off-by: Maya Barnea <[email protected]>

* Default plugin is not a scorer any more

Signed-off-by: Maya Barnea <[email protected]>

* fix scheduler test + lint comments

Signed-off-by: Maya Barnea <[email protected]>

---------

Signed-off-by: Maya Barnea <[email protected]>

* Update issue templates (kubernetes-sigs#738)

* Update issue templates

* Updates artifacts for v0.3.0-rc.1 release

Signed-off-by: Kellen Swain <[email protected]>

* Updates bbr chart for v0.3.0-rc.1 release

Signed-off-by: Kellen Swain <[email protected]>

* Updates artifacts for v0.3.0 release

Signed-off-by: Kellen Swain <[email protected]>

* Adding blank issue template so that all issues start with  label

---------

Signed-off-by: Kellen Swain <[email protected]>

* Add unit test coverage for pod APIs under datastore/pkg

* few updates in datastore (kubernetes-sigs#713)

* few updates in datastore

Signed-off-by: Nir Rozenbaum <[email protected]>

* PoolSet documentation

Signed-off-by: Nir Rozenbaum <[email protected]>

* error phrasing

Signed-off-by: Nir Rozenbaum <[email protected]>

* removed unused pool arg from PodUpdateOrAddIfNotExist

Signed-off-by: Nir Rozenbaum <[email protected]>

* linter

Signed-off-by: Nir Rozenbaum <[email protected]>

---------

Signed-off-by: Nir Rozenbaum <[email protected]>

* Add unit test coverage for pod APIs under datastore/pkg

---------

Signed-off-by: Nir Rozenbaum <[email protected]>
Signed-off-by: Daneyon Hansen <[email protected]>
Signed-off-by: Maya Barnea <[email protected]>
Signed-off-by: Kellen Swain <[email protected]>
Co-authored-by: Kellen Swain <[email protected]>
Co-authored-by: Nir Rozenbaum <[email protected]>
Co-authored-by: John Howard <[email protected]>
Co-authored-by: Cong Liu <[email protected]>
Co-authored-by: Nicole Xin <[email protected]>
Co-authored-by: Rob Scott <[email protected]>
Co-authored-by: nayihz <[email protected]>
Co-authored-by: Daneyon Hansen <[email protected]>
Co-authored-by: Maya Barnea <[email protected]>
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
6 participants