Issue

Had a weird bug where the architect command was ending way before I reached the max tokens set in .aider.model.metadata.json and explicitly supported by the provider. I would get the following error:
```
Model together_ai/Qwen/Qwen2.5-72B-Instruct-Turbo has hit a token limit!
Token counts below are approximate.
Input tokens: ~4,688 of 32,768
Output tokens: ~2,022 of 32,768
Total tokens: ~6,710 of 32,768
```
I modified aider to output debug info from litellm and realized the response was coming back with finish_reason: 'length'. Although it isn't documented in the Together AI docs, it looks like the default max_tokens is 2048 when it isn't provided in the request. If I change max_tokens to 32768 I get a different error, something like `inputs tokens + max_new_tokens must be <= 32769`. If I set max_tokens: 20000 as extra_params in .aider.model.settings.yml, everything works just fine and I get past 2048 output tokens without a problem. I believe I've hit this problem in the past with another model, I just don't remember which one.
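For reference, this is roughly what I was checking outside of aider (a minimal litellm sketch, assuming a Together AI key is configured in the environment; the prompt is just a placeholder):

```python
import litellm

# Ask for a long completion without setting max_tokens; if the provider's
# default cap (~2048) kicks in, the response should report finish_reason 'length'.
response = litellm.completion(
    model="together_ai/Qwen/Qwen2.5-72B-Instruct-Turbo",
    messages=[{"role": "user", "content": "Write a very long, detailed essay about tokenizers."}],
)
print(response.choices[0].finish_reason)   # 'length' when the output was truncated
print(response.usage.completion_tokens)    # should stop around 2048 if the default cap applies

# Workaround: pass max_tokens explicitly (roughly what extra_params ends up doing).
response = litellm.completion(
    model="together_ai/Qwen/Qwen2.5-72B-Instruct-Turbo",
    messages=[{"role": "user", "content": "Write a very long, detailed essay about tokenizers."}],
    max_tokens=20000,
)
```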
I found that base_coder has the following:
```python
except FinishReasonLength:
    # We hit the output limit!
    if not self.main_model.info.get("supports_assistant_prefill"):
        exhausted = True
        break
# ...
if exhausted:
    self.show_exhausted_error()
    self.num_exhausted_context_windows += 1
    return
# ...
def show_exhausted_error(self):
    # ...
    res.append(f"Model {self.main_model.name} has hit a token limit!")
    res.append("Token counts below are approximate.")
```
Setting a fixed max tokens works for this specific case, but a more dynamic approach seems ideal, where max_tokens shrinks based on the amount of tokenized input (see the sketch below). I'm not sure that's the best approach, though, since models more often have a large context window and a fixed max completion limit of 4096 or 8192 tokens. Perhaps a use_max_dynamic_tokens setting in .aider.model.settings.yml that requires context_size?
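To illustrate what I mean by "dynamic", a rough sketch of the calculation (not aider code, and use_max_dynamic_tokens is just a proposed name; it assumes litellm knows the model's limits via get_model_info, otherwise context_size would have to come from .aider.model.metadata.json):

```python
import litellm

def dynamic_max_tokens(model: str, messages: list, reserve: int = 512) -> int:
    """Cap the completion at whatever room is left in the context window."""
    info = litellm.get_model_info(model)
    # Fall back to max_tokens if the provider doesn't report max_input_tokens.
    context_size = info.get("max_input_tokens") or info.get("max_tokens")
    prompt_tokens = litellm.token_counter(model=model, messages=messages)
    # Keep a small reserve so "inputs tokens + max_new_tokens <= context" still holds.
    return max(1, context_size - prompt_tokens - reserve)
```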
I'm willing to do the work for this, just need direction and what to look at since I'm not familiar with the codebase for aider and litellm.
- name: "together_ai/Qwen/Qwen2.5-72B-Instruct-Turbo"edit_format: "diff"use_repo_map: trueexamples_as_sys_msg: truereminder: "sys"streaming: false# extra_params:# max_tokens: 20000 # workaround to get around the issue
Version and model info
Aider v0.60.1
Model: together_ai/Qwen/Qwen2.5-72B-Instruct-Turbo
.aider.model.settings.yml: see the YAML block above.
.aider.model.metadata.json:
Rough idea of tokens: see the approximate counts in the error output above.