
Specifying a GPU in vllm_model_server.py on the dev branch seems to have no effect #79

Jimmy-L99 opened this issue Nov 8, 2024 · 1 comment
Jimmy-L99 commented Nov 8, 2024

# excerpt from vllm_model_server.py; argparse, uvicorn, ModelWorker,
# and app are imported/defined earlier in the file
if __name__ == "__main__":
    parser = argparse.ArgumentParser()

    parser.add_argument("--host", type=str, default="localhost")
    parser.add_argument("--dtype", type=str, default="bfloat16")
    parser.add_argument("--device", type=str, default="cuda:1")
    parser.add_argument("--port", type=int, default=10000)
    parser.add_argument("--model-path", type=str, default="models/glm-4-voice-9b")
    args = parser.parse_args()

    worker = ModelWorker(args.model_path, args.dtype, args.device)
    uvicorn.run(app, host=args.host, port=args.port, log_level="info")

The argument default here is set to "cuda:1", and I also tried passing --device cuda:1 on the command line, but the model still loads on GPU 0:

2586962 root   0  Compute   0%  20632MiB  25%     0%   6584MiB python GLM4-Voice/GLM-4-Voice-dev/vllm_model_server.py --device cuda:1
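
A likely explanation (an assumption, not confirmed in this thread) is that vLLM places the engine on the first visible CUDA device and does not honor an explicit cuda:1 device string. A minimal sketch of the usual workaround is to restrict GPU visibility with CUDA_VISIBLE_DEVICES before torch/vllm are imported:

import argparse
import os

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--device", type=str, default="cuda:1")
    args, _ = parser.parse_known_args()

    # Map "cuda:1" -> "1" and hide every other GPU from this process.
    # This must happen before torch/vllm are imported; once CUDA has
    # enumerated the devices, changing the variable has no effect.
    if args.device.startswith("cuda:"):
        os.environ["CUDA_VISIBLE_DEVICES"] = args.device.split(":", 1)[1]

    # ... rest of vllm_model_server.py (ModelWorker, app, uvicorn.run)

With visibility restricted this way, the model loads on what the process sees as cuda:0, which is physical GPU 1. The same effect is available without code changes by launching the server as CUDA_VISIBLE_DEVICES=1 python GLM4-Voice/GLM-4-Voice-dev/vllm_model_server.py.
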
@sixsixcoder

This issue has been fixed in a PR.
