reduce max_completion_tokens to 15 to speed up generation
renxida committed Nov 15, 2024
Commit cbe008a (parent: 64b758b)
Showing 1 changed file with 1 addition and 1 deletion.
app_tests/integration_tests/llm/cpu_llm_server_test.py (1 addition, 1 deletion)
@@ -37,7 +37,7 @@ def do_generate(prompt, port):
     # Create a GenerateReqInput-like structure
     data = {
         "text": prompt,
-        "sampling_params": {"max_completion_tokens": 50, "temperature": 0.7},
+        "sampling_params": {"max_completion_tokens": 15, "temperature": 0.7},
         "rid": uuid.uuid4().hex,
         "return_logprob": False,
         "logprob_start_len": -1,
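For context, here is a minimal sketch of the request payload this hunk modifies, assuming the surrounding test code follows the shape shown in the diff. The helper name `build_generate_payload` is illustrative (the diff shows the dict being built inline inside `do_generate`), and fields after `logprob_start_len` are truncated in the hunk, so they are omitted here:

```python
import uuid


def build_generate_payload(prompt: str) -> dict:
    # Mirrors the GenerateReqInput-like structure from the diff; only the
    # fields visible in the hunk are reproduced.
    return {
        "text": prompt,
        # Lowered from 50 to 15 by this commit so test generations finish faster.
        "sampling_params": {"max_completion_tokens": 15, "temperature": 0.7},
        "rid": uuid.uuid4().hex,  # unique request id per call
        "return_logprob": False,
        "logprob_start_len": -1,
    }
```

The test would then POST this payload to the running LLM server (the exact endpoint is not shown in the diff), e.g. something like `requests.post(f"http://localhost:{port}/generate", json=payload)`.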
