Add Patch for LLM Server (#379)
Small patch reflecting some recent changes in `sf.Program` and
`sf.ProgramFunction`.

This was originally included in #373, which adds an integration test
for shortfin LLM serving.

Splitting it out here, since that PR may need more time for adjustments
and for adding a workflow file.

Without it, you get the following error when trying to launch the
server:

```text
[2024-10-30 11:59:09.939] [info] [manager.py:40] System manager command processor stopped
[2024-10-30 11:59:09.991] [error] [on.py:121] Traceback (most recent call last):
  File "/home/amd/stephen/repos/forks/SHARK-Platform/.venv/lib/python3.12/site-packages/starlette/routing.py", line 693, in lifespan
    async with self.lifespan_context(app) as maybe_state:
  File "/home/amd/.pyenv/versions/3.12.5/lib/python3.12/contextlib.py", line 210, in __aenter__
    return await anext(self.gen)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/amd/stephen/repos/forks/SHARK-Platform/.venv/lib/python3.12/site-packages/shortfin_apps/llm/server.py", line 42, in lifespan
    service.start()
  File "/home/amd/stephen/repos/forks/SHARK-Platform/.venv/lib/python3.12/site-packages/shortfin_apps/llm/components/service.py", line 69, in start
    self.inference_program = sf.Program(
                             ^^^^^^^^^^^
TypeError: __new__(): incompatible function arguments. The following argument types are supported:
    1. __new__(cls: object, modules: collections.abc.Sequence[_shortfin_default.lib.local.ProgramModule], *, devices: collections.abc.Sequence[_shortfin_default.lib.local.Device], trace_execution: bool = False, isolation: _shortfin_default.lib.local.ProgramIsolation = ProgramIsolation.PER_FIBER) -> _shortfin_default.lib.local.Program

Invoked with types: nanobind.nb_type_0, kwargs = { modules: list, fiber: _shortfin_default.lib.local.Fiber, trace_execution: bool }

[2024-10-30 11:59:09.991] [error] [on.py:59] Application startup failed. Exiting.
```

With it, you're able to start the server, send requests, and receive
responses.
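
The mismatch in the traceback can be reproduced in miniature without shortfin installed. The sketch below uses a hypothetical stand-in class, `FakeProgram`, whose keyword-only parameters mirror the signature printed in the error (`modules`, then `devices` and `trace_execution` as keyword-only arguments). It is not the shortfin API, and the helper `try_construct` exists only for this illustration.

```python
# Stand-in for illustration only; this is NOT the shortfin API.
class FakeProgram:
    """Mimics the keyword-only shape of the constructor shown in the traceback."""

    def __init__(self, modules, *, devices, trace_execution=False):
        self.modules = list(modules)
        self.devices = list(devices)
        self.trace_execution = trace_execution


def try_construct(**kwargs):
    """Return the constructed object, or the TypeError message on a mismatch."""
    try:
        return FakeProgram(**kwargs)
    except TypeError as exc:
        return str(exc)


# Old call shape: passes fiber=, which the stand-in signature does not accept.
old_result = try_construct(modules=["m"], fiber="main_fiber", trace_execution=False)

# New call shape: passes devices= instead, matching the signature.
new_result = try_construct(modules=["m"], devices=["dev0"], trace_execution=False)
```

Here `old_result` ends up holding an error message while `new_result` is a constructed object, which is the same before/after behavior the patch produces at server startup.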
stbaione authored Oct 30, 2024
1 parent 866691e commit 66a4043
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions shortfin/python/shortfin_apps/llm/components/service.py
@@ -73,7 +73,7 @@ def start(self):
 )
 ]
 + self.inference_modules,
-fiber=self.main_fiber,
+devices=self.sysman.ls.devices,
 trace_execution=False,
 )
 # Resolve prefill entrypoints.
@@ -393,7 +393,7 @@ async def run(self):
 "".join([f"\n {i}: {ary.shape}" for i, ary in enumerate(args)]),
 )
 # Invoke. Logits are of shape [bs, bsl, d].
-(logits,) = await fn(*args)
+(logits,) = await fn(*args, fiber=self.fiber)

 # Return results.
 for i in range(req_count):
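
The second hunk moves the fiber from program construction to each invocation. A minimal sketch of that calling convention follows, assuming a hypothetical coroutine `fake_fn` that takes the fiber as a keyword-only argument. The names `fake_fn`, `run_once`, and the string placeholders are illustrative and are not part of shortfin.

```python
import asyncio


# Hypothetical stand-in for a program function; NOT the shortfin API.
async def fake_fn(*args, fiber):
    # fiber is keyword-only, so every call must name where it runs.
    return (f"logits for {len(args)} args on {fiber}",)


async def run_once():
    # Same unpacking pattern as the patched line: a one-element result tuple.
    (logits,) = await fake_fn("tokens", "seq_lens", fiber="fiber0")
    return logits


result = asyncio.run(run_once())
```

Calling `fake_fn("tokens", "seq_lens")` without `fiber=` would raise a `TypeError` in this sketch, since the keyword-only argument has no default.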
