vlm: increase the default max_num_batched_tokens for multimodal models #1458
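This change touches the default for `max_num_batched_tokens`, the scheduler option that caps how many tokens can be batched per engine step; multimodal models typically need a larger budget because image inputs expand into many placeholder tokens. As a minimal sketch (assuming the option is the vLLM engine argument of the same name; the model name and the value below are illustrative, not the default chosen by this PR), the option can also be overridden explicitly:

```python
# Sketch: explicitly raising max_num_batched_tokens for a multimodal model.
# Model name and value are illustrative assumptions, not the PR's new default.
from vllm import LLM

llm = LLM(
    model="llava-hf/llava-1.5-7b-hf",   # example vision-language model (assumption)
    max_num_batched_tokens=8192,        # per-step token budget for the scheduler
)
```

The same override is available on the command line via `--max-num-batched-tokens` when serving with vLLM.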