When using a decoder-only model (Llama 2) from Hugging Face, the length of prompt_token_ids (the input token IDs) is never changed based on max_new_tokens. Is there a reason why you truncate prompt_token_ids based on max_new_tokens in the else branch?
https://github.com/bentoml/OpenLLM/blob/6eb2ed5028dcaa7e6c7ba60e2ec8dc3377c353be/openllm-python/src/openllm/_runners.py#L181C1-L185
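For context, the usual reason for this kind of truncation is that in a decoder-only model the prompt and the generated tokens share a single context window, so the prompt must be trimmed to leave room for max_new_tokens. Below is a minimal sketch of that pattern, assuming a fixed model context length; the function and parameter names are illustrative, not OpenLLM's actual code:

```python
# Hypothetical sketch of the truncation pattern being asked about.
# Assumes max_new_tokens < model_max_length.

def prepare_prompt_token_ids(prompt_token_ids, model_max_length, max_new_tokens, is_encoder_decoder):
    """Trim the prompt so that prompt length + max_new_tokens fits the context window."""
    if is_encoder_decoder:
        # Encoder-decoder models (e.g. T5): encoder input and decoder output
        # have separate budgets, so the prompt only needs to fit the encoder limit.
        return prompt_token_ids[:model_max_length]
    else:
        # Decoder-only models (e.g. Llama 2): prompt and generated tokens share
        # one context window, so room is reserved for max_new_tokens. Keeping
        # the last tokens preserves the end of the prompt.
        budget = model_max_length - max_new_tokens
        return prompt_token_ids[-budget:]


# Example: a 4096-token window with max_new_tokens=256 leaves a 3840-token
# prompt budget, so a 4000-token prompt is trimmed to its last 3840 tokens.
ids = list(range(4000))
trimmed = prepare_prompt_token_ids(ids, model_max_length=4096, max_new_tokens=256, is_encoder_decoder=False)
assert len(trimmed) == 3840
```

Plain Hugging Face `generate` does not trim the prompt this way by default, which may be the source of the discrepancy the question describes.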