Caching previous prompts for later reuse, part 1 #3073

Draft · wants to merge 6 commits into base: main

Commits on Oct 10, 2024
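
Illustrative C++ sketches of several of these changes follow the commit list below.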

  1. fix some missing #includes

    Signed-off-by: Jared Van Bortel <[email protected]>
    cebtenzzre committed Oct 10, 2024 · commit 2267b02
  2. llmodel: simplify tokenize()

    The value of n_past is no longer used here, so remove the parameter and
    the logic that tries to maintain it.
    
    Signed-off-by: Jared Van Bortel <[email protected]>
    cebtenzzre committed Oct 10, 2024 · commit be93986
  3. llmodel: use a span for evalTokens batch to avoid copy

    Signed-off-by: Jared Van Bortel <[email protected]>
    cebtenzzre committed Oct 10, 2024 · commit 9848c89
  4. chatapi: clean up stubs for unimplemented methods

    These were taking up too many lines, were too repetitive, and weren't
    marked [[noreturn]] even though they all throw unconditionally.
    
    Signed-off-by: Jared Van Bortel <[email protected]>
    cebtenzzre committed Oct 10, 2024 · commit 163d5fd
  5. backend: don't expose write access to n_ctx and tokens via PromptContext

    Writing to these directly behind the implementation's back is
    ill-defined, and implementing internal save/restore for `tokens` gets
    confusing when the chat UI is also trying to manage it directly.
    
    Signed-off-by: Jared Van Bortel <[email protected]>
    cebtenzzre committed Oct 10, 2024 · commit b1c22db
  6. llmodel: use the token cache to help reuse previous results

    If n_past is smaller in a subsequent call to prompt() but the input
    shares a common prefix with the token cache after n_past, we can reuse
    the tokens that are already in the KV cache.
    
    Signed-off-by: Jared Van Bortel <[email protected]>
    cebtenzzre committed Oct 10, 2024 · commit c7d1232
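
A minimal sketch of the simplification in commit be93986, assuming a free-standing tokenize() helper; the signature and the toy whitespace tokenizer below are placeholders, not the actual gpt4all-backend code.

```cpp
// Before (assumed): std::vector<Token> tokenize(int32_t &n_past, const std::string &text);
// After: the n_past parameter and the logic that maintained it are gone.
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

using Token = std::int32_t;

std::vector<Token> tokenize(const std::string &text)
{
    // Stand-in tokenizer: one placeholder id per whitespace-separated word.
    std::vector<Token> out;
    std::istringstream in(text);
    for (std::string word; in >> word; )
        out.push_back(static_cast<Token>(out.size()));
    return out;
}
```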
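
Commit 9848c89's batch-by-span idea could look like the sketch below; evalTokens and Token are assumptions about the backend interface, not its exact API. Taking std::span<const Token> lets a caller hand over a slice of an existing buffer without materializing a temporary std::vector.

```cpp
#include <cstdint>
#include <span>
#include <vector>

using Token = std::int32_t;

// Decode a batch of tokens; stubbed out here. Accepting a span means the
// batch is a view over memory the caller already owns, so nothing is copied.
bool evalTokens(std::span<const Token> batch)
{
    return !batch.empty();
}

int main()
{
    std::vector<Token> history{1, 2, 3, 4, 5, 6, 7, 8};
    // A view over the last four tokens; no copy is made for the batch.
    return evalTokens(std::span(history).subspan(4)) ? 0 : 1;
}
```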
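
For commit 163d5fd, the stub cleanup might look roughly like this; the class and method names are illustrative rather than the real chatapi declarations. Each stub throws unconditionally, so [[noreturn]] documents that and keeps the one-line form warning-free.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

class ChatAPIWorkerSketch {
public:
    // One line per unimplemented method instead of a repetitive multi-line body.
    [[noreturn]] std::vector<float> embedding(const std::string &)
    { throw std::logic_error("not implemented"); }

    [[noreturn]] std::size_t stateSize() const
    { throw std::logic_error("not implemented"); }
};
```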
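
Commit b1c22db's encapsulation could take a shape like the following, with n_ctx and the token history exposed read-only so callers such as the chat UI can no longer write to them directly; this is an assumed layout, not the real PromptContext.

```cpp
#include <cstdint>
#include <span>
#include <vector>

using Token = std::int32_t;

class PromptContextSketch {
public:
    explicit PromptContextSketch(std::int32_t nCtx) : m_nCtx(nCtx) {}

    // Read-only views for callers outside the backend.
    std::int32_t nCtx() const { return m_nCtx; }
    std::span<const Token> tokens() const { return m_tokens; }

    // Stands in for whatever internal path the model implementation uses to
    // grow the token history; the UI never mutates the members directly.
    void appendToken(Token t) { m_tokens.push_back(t); }

private:
    std::int32_t m_nCtx;
    std::vector<Token> m_tokens;
};
```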
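
Finally, the prefix-reuse check described in commit c7d1232 could be sketched as below; reusableFromCache and its arguments are hypothetical names. The idea is that if a later prompt() call rewinds n_past but the new input still matches the cached tokens from that point on, the matching prefix never has to be re-decoded into the KV cache.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using Token = std::int32_t;

// How many leading tokens of `input` already match the token cache starting
// at position n_past, i.e. how far decoding can skip ahead for free.
std::size_t reusableFromCache(const std::vector<Token> &cache,
                              const std::vector<Token> &input,
                              std::size_t n_past)
{
    std::size_t reused = 0;
    while (n_past + reused < cache.size()
           && reused < input.size()
           && cache[n_past + reused] == input[reused])
        ++reused;
    return reused;
}

int main()
{
    std::vector<Token> cache {10, 11, 12, 13, 14};
    std::vector<Token> input {12, 13, 99};      // new input after n_past = 2
    std::size_t reused = reusableFromCache(cache, input, /*n_past=*/2);
    // reused == 2: only input[2] needs evaluating; n_past advances to 4.
    return reused == 2 ? 0 : 1;
}
```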