Periodic concurrency mode #411

Merged
matthewkotila merged 7 commits into main from periodic-concurrency-mode
Oct 7, 2023

Conversation

matthewkotila
Contributor

Epic branch. Already tested. Just merging into main.

matthewkotila and others added 7 commits October 6, 2023 17:20
…workflow (#401)

* Implement periodic concurrency manager/worker and inference profiler workflow

* Address feedback

* Address feedback
* Add macros to reuse it for checking range options

* Add tests for periodic-concurrency-range option

* Add periodic-concurrency-range and request-period options

* Add doc for periodic-concurrency-range and request-period

* Add test for request-period option

* Revert macro and add reusable test function

* Add more tests

* Small refactor

* Refactor a subcase

* Require bi-directional gRPC streaming for periodic concurrency mode

* Address feedback

* Refine the error message

* Add bi-directional gRPC streaming options for periodic concurrency mode

* Add request-parameter option and refactor

* Refactor

* Add valid case for request-parameter option

* Add --request-parameter doc and edit periodic concurrency description

* Custom request parameter is currently only supported by gRPC

* Parse and store the type of request parameter

* Add checks between act vs. exp

* Remove uint type and rebase

* Change doc

* Minor fix

* Address feedback
* Initial draft

* Small edit

* Add note

* Address feedback

* Minor fix
…n infinite loop (#403)

* Throw exception when request period larger than max tokens rather than infinite loop

* Update periodic_concurrency_worker.cc
* Initial parameter passing support

* Fix parameter ordering

* Remove commented code

* Remove unnecessary type in request parameter

* Fix includes and map assignment

* Update grpc request parameters to use only strings in PA
* Add continuous batch size benchmark to LLM guide

* Update llm.md

* Update llm.md
matthewkotila merged commit 9748ba1 into main Oct 7, 2023
3 checks passed
matthewkotila deleted the periodic-concurrency-mode branch October 7, 2023 00:30

3 participants