Add support for Mistral models

Mistral models use the same prompt template as Mixtral, and supporting them dramatically improves full-pipeline performance on a laptop using llama.cpp: even in GGUF form, Mixtral is too large for most consumer hardware, so completions and prompting with it hang for long periods.

Also add optional debug logging that prints all of the prompts in each LLMBlock as well as the prompt currently being generated, and a progress bar for each LLMBlock pass that increments each time a prompt response comes back.

Signed-off-by: Charlie Doern <[email protected]>
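The debug-logging and progress-reporting behavior described above can be sketched roughly as follows. This is a minimal illustration, not the actual LLMBlock implementation: the function name `run_prompts`, the `complete` callback, and the text-based progress indicator are all hypothetical stand-ins.

```python
import logging

logger = logging.getLogger(__name__)


def run_prompts(prompts, complete):
    """Run each prompt through a completion callback.

    Hypothetical sketch: logs every prompt at DEBUG level (visible only
    when debug logging is enabled) and prints a simple progress counter
    that increments each time a response comes back.
    """
    # Optional debug log of the full prompt list for this pass
    logger.debug("prompts for this pass: %s", prompts)

    outputs = []
    total = len(prompts)
    for i, prompt in enumerate(prompts, start=1):
        # Log the prompt currently being generated
        logger.debug("generating for prompt: %s", prompt)
        outputs.append(complete(prompt))
        # Increment the progress display after each response
        print(f"\rcompleted {i}/{total}", end="", flush=True)
    print()
    return outputs
```

In the real pipeline the progress display would typically be a proper progress bar (e.g. tqdm) and `complete` would call the serving backend, but the control flow is the same: one log line per prompt, one progress tick per response.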
Showing 2 changed files with 7 additions and 5 deletions.