[core][compiled graph] Support inter-execution compute-communication overlap #47944

Open
ruisearch42 opened this issue Oct 8, 2024 · 0 comments
Labels
compiled-graph
enhancement: Request for new feature and/or capability
triage: Needs triage (eg: priority, bug/not-bug, and owning component)

Comments

@ruisearch42
Contributor

Description

The existing compute-communication overlap in compiled graphs supports only intra-execution overlap: that is, only operations from the same execution loop can be overlapped. This is insufficient for some use cases (e.g., vLLM), where the performance gain comes mainly from overlapping compute and communication operations that belong to different executions.

We need to design and implement a mechanism to support inter-execution overlap.
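
To make the distinction concrete, here is a minimal, library-independent sketch of the potential gain. It is not the compiled graph implementation and uses no Ray APIs. The function names, the fixed per-execution durations, and the assumption that an execution's compute does not wait on the previous execution's communication are all illustrative assumptions. It compares the total time of `n` executions when each one runs compute and then communication back to back, against a schedule where the communication of execution `i` may run while execution `i + 1` is computing.

```python
def serial_makespan(n: int, compute: float, comm: float) -> float:
    """Total time when no operation overlaps with another execution.

    Assumption: each execution has one compute step followed by one
    communication step, and executions run strictly one after another.
    """
    return n * (compute + comm)


def inter_execution_overlap_makespan(n: int, compute: float, comm: float) -> float:
    """Total time when communication of execution i may overlap with
    compute of execution i + 1.

    Assumptions (hypothetical model): one compute stream and one
    communication stream; compute of execution i + 1 does not depend on
    the communication result of execution i.
    """
    compute_done = 0.0  # time at which the compute stream becomes free
    comm_done = 0.0     # time at which the communication stream becomes free
    for _ in range(n):
        # The compute stream is never blocked by communication.
        compute_done += compute
        # Communication needs its own compute output and a free comm stream.
        comm_start = max(compute_done, comm_done)
        comm_done = comm_start + comm
    return comm_done


if __name__ == "__main__":
    n, compute, comm = 4, 3.0, 2.0
    print("serial:", serial_makespan(n, compute, comm))
    print("overlapped:", inter_execution_overlap_makespan(n, compute, comm))
```

Under this model the overlapped schedule is bounded by the slower of the two streams rather than by their sum, which is why workloads that issue many back-to-back executions stand to benefit most.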

Use case

No response

@ruisearch42 added the enhancement, triage, and compiled-graph labels on Oct 8, 2024