Dynamically select pod instead of statically passing Pods/PodIPs into ext proc #12
In the near future, a new CR will be introduced; see llm-instance-gateway/docs/proposals/002-api-proposal/proposal.md, lines 197 to 221 in d385c80, for its tentative definition.
This resource will reference the services exposing inference servers that share certain characteristics, mainly the same set of loaded adapters. The final name of this resource is still being discussed; the related document has more information: https://docs.google.com/document/d/1v1Rp6v_AfY5EfwpLqDadDpAaCg7OcnrUutzBUNxGoJE/edit?pli=1

Once it is introduced, the idea of referencing services directly could be re-evaluated, given that using selectors was the original approach. However, directly referencing pods would involve managing a structure similar to

Let me know your thoughts on this; I'd be happy to discuss it further.
Currently we pass static pod names and IPs to the ext proc server. We should use a more dynamic approach and fetch this data from the Kubernetes cluster, for example by using label selectors.
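To make the selector idea concrete, here is a minimal sketch of equality-based label matching over an in-memory pod list. It is not the project's implementation: `Pod`, `select_pods`, `refresh_endpoints`, and the `list_pods` callback are hypothetical names, and `list_pods` stands in for whatever call would actually query the Kubernetes API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Pod:
    """Hypothetical, simplified pod record: only what ext proc would need."""
    name: str
    ip: str
    labels: Dict[str, str] = field(default_factory=dict)
    ready: bool = True


def matches(labels: Dict[str, str], selector: Dict[str, str]) -> bool:
    """Equality-based matching: every selector key must be present with the same value.

    An empty selector matches every pod.
    """
    return all(labels.get(key) == value for key, value in selector.items())


def select_pods(pods: List[Pod], selector: Dict[str, str]) -> List[Pod]:
    """Keep pods that match the selector, are ready, and already have an IP."""
    return [p for p in pods if p.ready and p.ip and matches(p.labels, selector)]


def refresh_endpoints(list_pods: Callable[[], List[Pod]],
                      selector: Dict[str, str]) -> Dict[str, str]:
    """Rebuild the name -> IP map from current state instead of a static list.

    `list_pods` is an assumed callback that returns the pods currently known
    to the cluster; it is called on every refresh.
    """
    return {p.name: p.ip for p in select_pods(list_pods(), selector)}


if __name__ == "__main__":
    # Assumed example data; the label keys and values are made up.
    cluster = [
        Pod("vllm-0", "10.0.0.1", {"app": "vllm", "pool": "a"}),
        Pod("vllm-1", "10.0.0.2", {"app": "vllm", "pool": "a"}, ready=False),
        Pod("other-0", "10.0.0.3", {"app": "other"}),
    ]
    print(refresh_endpoints(lambda: cluster, {"app": "vllm"}))
```

Calling `refresh_endpoints` periodically, or whenever a pod change is observed, would replace the static names and IPs passed at startup; how changes are observed is left open here.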