I keep getting `<|eot_id|>` or `</s>` in my outputs when using chat mode for llama-3-8b-instruct and mistral-7b-instruct. #786

michael-newsrx · 2024-04-30T16:53:33Z

I'm using system(), user(), and assistant() but I keep getting either </s> (for Mistral) or <|eot_id|><|start_header_id|>assistant for llama-3-8b-instruct in my outputs.

What am I doing wrong?

I'm using the following to load the models:

import guidance  # needed for the guidance.models.* references below
from guidance.models import LlamaCpp
from guidance.models import Model
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

def load_llama_3_8b_instruct_chat(verbose: bool = False, n_ctx=8192) -> Model:
    offload_kqv = True

    repo_id = "lmstudio-community/Meta-Llama-3-8B-Instruct-GGUF"
    gguf_filename = "Meta-Llama-3-8B-Instruct-Q8_0.gguf"
    gguf_path = hf_hub_download(repo_id=repo_id, filename=gguf_filename)
    layers = -1

    model_path = str(gguf_path)
    model_kwargs = dict(n_ctx=n_ctx, n_threads=32, n_gpu_layers=layers, verbose=verbose,
                        offload_kqv=offload_kqv, n_threads_batch=32, logits_all=True, )
    model = Llama(model_path=model_path, **model_kwargs)
    llm = guidance.models.LlamaCppChat(model=model)
    llm.echo = False
    return llm


def load_mistral_7b_chat(verbose: bool = False, n_ctx=8192) -> LlamaCpp:
    offload_kqv = True

    repo_id = "TheBloke/Mistral-7B-Instruct-v0.2-GGUF"
    gguf_filename = "mistral-7b-instruct-v0.2.Q5_K_M.gguf"
    gguf_path = hf_hub_download(repo_id=repo_id, filename=gguf_filename)
    layers = -1

    model_path = str(gguf_path)
    model_kwargs = dict(n_ctx=n_ctx, n_threads=32, n_gpu_layers=layers, verbose=verbose,
                        offload_kqv=offload_kqv, n_threads_batch=32, logits_all=True, )
    model = Llama(model_path=model_path, **model_kwargs)
    print(model.chat_format)
    llm = guidance.models.MistralChat(model=model)

    llm.echo = False
    return llm

and I'm using the following for my tests:

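import pathlib
from textwrap import dedent

import guidance
import mdformat
from guidance import assistant, gen, user
from guidance.models import Model

@guidance(stateless=True)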
def analyze_article(lm: Model, article: str, temperature: float = 0.0, extra_instruct: str = "")\
        -> Model:
    with user():
        options = {"wrap": 9999, "number": True}
        prompt = mdformat.text(dedent(f"""        
        Create a very succinct "cliff notes" summary in a bulleted format
        for the following article.
        
        Focus only on the information needed
        to create a new article for a peer reviewed print only journal.
        
        Use the funnelling method.
        """), options=options)
        if extra_instruct.strip():
            prompt += "\n\n" + mdformat.text(extra_instruct.strip(), options=options)
        prompt += "\n\nArticle: "
        prompt += "\n\n" + mdformat.text(f"{article}".strip(), options=options)
        lm += prompt
    with assistant():
        reason_regex = "( *(\\*|\\-|\u2022) [^\n]+\n)+"
        lm += "\n\nSummary:\n\n"
        lm += gen(regex=reason_regex, name="analysis", max_tokens=2048,  #
                  stop="\n\n", temperature=temperature)
        return lm

def main():
    # from local_models.mixtral_guidelines import load_mistral_7b_chat as test_model
    from local_models.mixtral_guidelines import load_llama_3_8b_instruct_chat as test_model
    # from local_models.mixtral_guidelines import load_llama_2_7b_chat as test_model
    n_ctx: int = 16384  # Context window size
    with BlockTimer() as timer:
        llm: Model = test_model(verbose=False, n_ctx=n_ctx)
    print()
    print(f"Model load elapsed: {timer.formatted}")
    print()
    llm += system_prompt()
    article = pathlib.Path("test-article.md").read_text()
    print("=== Cliff notes")
    llm += analyze_article(article=straight_quotes(article))
    print(llm["analysis"])
    print()

Test article: test-article.md
Bad output Llama 3: bad output llama 3.md
Bad output Mistral 7b: bad output mistral.md

michael-newsrx · 2024-05-01T13:47:56Z

Author

If I change stop= to ["\n\n", "</s>"] or ["\n\n", "<|eot_id|>"], the behavior does not change: </s> and <|eot_id|> still appear in the resulting output.
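As a stopgap while the stop strings have no effect, the captured text can be trimmed at the first special token after generation. A minimal sketch in plain Python, assuming the marker strings listed are the only ones that leak; `truncate_at_special_token` is a hypothetical helper and not part of guidance:

```python
# Marker strings seen leaking in this thread (assumed to be the full set).
SPECIAL_TOKENS = ("<|eot_id|>", "<|start_header_id|>", "</s>")


def truncate_at_special_token(text: str, tokens=SPECIAL_TOKENS) -> str:
    """Return `text` cut at the earliest occurrence of any marker in `tokens`."""
    cut = len(text)
    for token in tokens:
        idx = text.find(token)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]
```

It would be applied to the captured value, for example `truncate_at_special_token(llm["analysis"])`. This hides the symptom only; the prompt sent to the model is unchanged.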

Harsha-Nori · 2024-05-02T18:56:31Z

Maintainer

I think this may have to do with incorrectly set role tags on the guidance side. We're starting to play with auto-loading fixes for these here (#791). For the time being, subclassing the base classes with properly configured role tags, like we do for class LlamaChat(TransformersChat, Llama) (guidance/guidance/models/transformers/_llama.py, line 8 in acb38d1), might solve this issue, but I will continue looking into it more. Thanks for reporting this, and sorry for the troubles!
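For reference, the role tags such a subclass would need to produce for Mistral-7B-Instruct can be sketched as plain functions. This assumes the `[INST] ... [/INST]` instruct format with `</s>` closing each assistant turn and no separate system role; the function names are illustrative and not part of guidance, and the leading `<s>` is assumed to be added by the tokenizer:

```python
def mistral_role_start(role_name: str) -> str:
    """Opening tag for a role in the assumed Mistral instruct format."""
    if role_name == "user":
        return "[INST] "
    if role_name == "assistant":
        return ""
    # Assumption: the instruct format has no dedicated system role.
    raise ValueError(f"unsupported role for Mistral instruct: {role_name!r}")


def mistral_role_end(role_name: str) -> str:
    """Closing tag for a role in the assumed Mistral instruct format."""
    if role_name == "user":
        return " [/INST]"
    if role_name == "assistant":
        return "</s>"
    raise ValueError(f"unsupported role for Mistral instruct: {role_name!r}")


def render_mistral_prompt(messages) -> str:
    """Join (role, content) pairs using the tags above."""
    return "".join(
        mistral_role_start(role) + content + mistral_role_end(role)
        for role, content in messages
    )
```

Comparing the rendered string with the prompt the model actually receives is one way to spot a mismatched or missing tag.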

michael-newsrx · 2024-05-06T16:51:29Z

Author

Part of my original problem appears to be related to putting the various chat roles inside of @guidance annotated functions.

I finally got what I think is a working Llama3Chat class as follows:

class Llama3Chat(LlamaCpp, Chat):
    # Tracks whether <|begin_of_text|> has already been emitted.
    _begin_of_text: bool = False

    def get_role_start(self, role_name, **kwargs):
        # Emit <|begin_of_text|> only once, before the first role header.
        prefix = "" if self._begin_of_text else "<|begin_of_text|>"
        self._begin_of_text = True

        if role_name == "user":
            return prefix + "<|start_header_id|>user<|end_header_id|>\n\n"
        elif role_name == "assistant":
            return prefix + "<|start_header_id|>assistant<|end_header_id|>\n\n"
        elif role_name == "system":
            return prefix + "<|start_header_id|>system<|end_header_id|>\n\n"

    def get_role_end(self, role_name=None):
        # Every Llama 3 turn is closed by the same end-of-turn token.
        return "<|eot_id|>"
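To sanity-check these tags, the same start and end strings can be applied to a list of messages in plain Python and compared with the prompt layout Llama 3 Instruct expects. A sketch only; `render_llama3_prompt` is a hypothetical helper that mirrors the class above rather than calling guidance:

```python
def render_llama3_prompt(messages, add_generation_prompt: bool = True) -> str:
    """Render (role, content) pairs using the Llama 3 header and end-of-turn tags."""
    parts = ["<|begin_of_text|>"]  # emitted once, at the very start
    for role, content in messages:
        parts.append(f"<|start_header_id|>{role}<|end_header_id|>\n\n")
        parts.append(content)
        parts.append("<|eot_id|>")
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)
```

If the text guidance builds differs from this rendering (for example a missing blank line after a header), the model is more likely to emit stray header tokens.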

However, <|eot_id|>… still "leaks" into the captured output when the regex expects a "\n" or a similar terminator before the start of the "<|" sequence.

Is there an additional step I need to add to the Llama3Chat class or otherwise need to poke or prod to ensure the guidance library detects <|eot_id|> correctly, even if the regex match is incomplete?

Response fragment:

Write a single-line social media hashtags paragraph suitable for improved SEO ranking. Provide exactly 3 hashtags. Separate each hashtag with a space. Start each hashtag with a #. Always add a blank line after the hashtags.
<|eot_id|><|start_header_id|>assistant<|end_header_id|>

Social Media Hashtags: #GlobalHealthMatters #COVID19Impact #HealthEquityNow<|eot_id|><|start_header_id|>user<|end_header_id|>

The function and regex below cause the captured "hashtags" value to end up with <|eot_id|>… as part of the return value:

def hashtags(lm: Model, temperature: float = 0.0, extra_instruct: str = "") -> Model:
    options = {"wrap": 9999, "number": True}
    with user():
        prompt = mdformat.text(dedent(f"""
        Write a single-line social media hashtags paragraph suitable for improved SEO ranking.
        Provide exactly 3 hashtags.
        Separate each hashtag with a space.
        Start each hashtag with a #.
        Always add a blank line after the hashtags.
        """), options=options)
        if extra_instruct.strip():
            prompt += "\n\n" + mdformat.text(extra_instruct.strip(), options=options)
        lm += mdformat.text(prompt, options=options)
    with assistant():
        lm += "Social Media Hashtags: "
        hash_regex_a = "#[^#, ]+ #[^#, ]+ #[^#, ]+\n"
        lm += gen(regex=hash_regex_a, temperature=temperature, name="hashtags")
    return lm
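One thing worth checking independently of the role tags: the character class `[^#, ]` excludes only `#`, `,` and space, so the regex itself accepts `<|eot_id|>…` as part of the third hashtag. The sketch below shows this with Python's `re` module on the response fragment above. The stricter pattern is an illustrative alternative, and whether guidance's regex handling behaves identically is an assumption:

```python
import re

# Pattern from the hashtags() function above.
ORIGINAL = "#[^#, ]+ #[^#, ]+ #[^#, ]+\n"
# Illustrative stricter pattern: hashtags limited to letters, digits and "_".
STRICT = "#[A-Za-z0-9_]+ #[A-Za-z0-9_]+ #[A-Za-z0-9_]+\n"

leaked = (
    "#GlobalHealthMatters #COVID19Impact #HealthEquityNow"
    "<|eot_id|><|start_header_id|>user<|end_header_id|>\n"
)
clean = "#GlobalHealthMatters #COVID19Impact #HealthEquityNow\n"


def accepts(pattern: str, text: str) -> bool:
    """True if the whole of `text` matches `pattern`."""
    return re.fullmatch(pattern, text) is not None


print("original accepts leaked:", accepts(ORIGINAL, leaked))
print("strict accepts leaked:  ", accepts(STRICT, leaked))
print("strict accepts clean:   ", accepts(STRICT, clean))
```

The same reasoning applies to `[^\n]+` in the earlier bullet-list regex, which also permits `<|eot_id|>` and `</s>` as ordinary text.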
