Why does streaming inference on an A800 take 2 seconds per output token? #88
Comments
It may be related to your hardware and software environment.
Under normal conditions, how long should one token take?
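My hardware and software environment:
My test results: I ran 3 iterations and computed the average first-token latency and the average decoding latency per token (this is a personal test only and does not represent official benchmark data).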
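For anyone who wants to reproduce this kind of measurement, here is a minimal sketch of how the two averages could be computed. It is not the script used above. `measure_stream` and `fake_token_stream` are hypothetical names, and the fake stream only simulates latency with `time.sleep`; in practice you would pass a function that starts your model's streaming generation and returns its token iterator.

```python
import time
from statistics import mean


def measure_stream(make_stream, iterations=3, clock=time.perf_counter):
    """Return (avg first-token latency, avg per-token decode latency) in seconds.

    make_stream: hypothetical zero-argument callable that starts one streaming
    generation and returns an iterator yielding tokens as they are produced.
    """
    first_token_latencies = []
    decode_latencies = []
    for _ in range(iterations):
        start = clock()
        # Record the arrival time of every token as the stream yields it.
        arrival_times = [clock() for _ in make_stream()]
        if not arrival_times:
            continue  # nothing was generated in this run
        first_token_latencies.append(arrival_times[0] - start)
        if len(arrival_times) > 1:
            # Average gap between consecutive tokens after the first one.
            decode_span = arrival_times[-1] - arrival_times[0]
            decode_latencies.append(decode_span / (len(arrival_times) - 1))
    avg_first = mean(first_token_latencies) if first_token_latencies else None
    avg_decode = mean(decode_latencies) if decode_latencies else None
    return avg_first, avg_decode


def fake_token_stream():
    """Stand-in for a real model stream (assumption: simulated latencies)."""
    time.sleep(0.02)  # simulated prefill before the first token
    yield "tok0"
    for i in range(1, 5):
        time.sleep(0.005)  # simulated per-token decode step
        yield f"tok{i}"


if __name__ == "__main__":
    first, decode = measure_stream(fake_token_stream, iterations=3)
    print(f"avg first-token latency: {first:.4f} s")
    print(f"avg decode latency:      {decode:.4f} s/token")
```

The first-token latency includes prompt processing, so it is reported separately from the per-token decoding latency; a figure like 2 seconds per token should be compared against the second number.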
Thanks. I'll try again.