Stream LLM Output #46

Open
hyperdriveguy opened this issue Aug 1, 2024
Labels: enhancement (New feature or request)

Comments


wirthlin commented Jun 26, 2024
Can you provide a more detailed summary of why this is difficult? It is something we can look at in the future, but we will hold off on it for this summer.

hyperdriveguy commented Jun 26, 2024
Streaming uses (potentially asynchronous) generators. At the moment, LLM generation is abstracted behind several classes so that responses can be properly logged and the right "chat branch" for a question can be used. Streaming will require generators at every level of abstraction, from the LLM generation to the web server, and the frontend will also need to consume the output as some kind of stream.
Basically, implementing it means changing the chat interfaces, the chat session manager (including the logging system), the Flask web server, and the frontend JavaScript.

The pros outweigh the cons, and at this point it should be more feasible to implement.
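
As a rough illustration of the plumbing described above, here is a minimal Flask sketch (the route name, helper names, and the stand-in LLM generator are all hypothetical, not taken from this repository): a wrapper generator passes tokens through to a streamed `Response` while accumulating the full reply so the logging system still sees the complete text.

```python
from flask import Flask, Response, request, stream_with_context

app = Flask(__name__)

def llm_token_stream(prompt):
    """Stand-in for the real LLM backend; yields tokens as they are generated."""
    for token in ("Streaming ", "is ", "working."):
        yield token

def logged_stream(tokens, log_fn):
    """Pass tokens through unchanged while accumulating the full reply for logging."""
    parts = []
    for token in tokens:
        parts.append(token)
        yield token
    log_fn("".join(parts))  # runs once the generator is exhausted

@app.post("/api/chat")  # hypothetical route name
def chat():
    prompt = request.get_json().get("prompt", "")
    tokens = logged_stream(llm_token_stream(prompt), log_fn=print)
    # stream_with_context keeps the request context alive while Flask
    # iterates the generator and flushes each chunk to the client.
    return Response(stream_with_context(tokens), mimetype="text/plain")
```

On the frontend side, JavaScript can consume such a response incrementally with `fetch` and `response.body.getReader()` rather than waiting for the full body.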

hyperdriveguy added the enhancement label on Aug 1, 2024