Add pipeline composition RFC #723
base: main
docs/rfcs/PipelineComposition.md
Outdated
LogicalResult run(Operation *op);
};
```
The `PipelineSchedule` object encapsulates the compiled pipeline graph. Its main method is `LogicalResult run(Operation *op);`, which follows the existing MLIR `PassManager::run`.
What do you mean by the compiled pipeline graph?
The `PipelineGraph` object is populated with a set of pipelines and their dependencies, and then it compiles them into an internal representation that runs those pipelines in order, according to those dependencies.
Is the schedule the result of the linearization of the DAG or is it the class that will linearize it?
`PipelineSchedule` is the linearized DAG; `createPipelineSchedule` will do the linearization.
docs/rfcs/PipelineComposition.md
Outdated
ArrayRef<StringRef> predecessors,
ArrayRef<StringRef> successors,
ArrayRef<StringRef> jumpTargets,
std::function<void(OpPassManager &)> populateFunc);
So the pipeline is a set of Patterns populated in populateFunc?
A pipeline is a set of passes.
docs/rfcs/PipelineComposition.md
Outdated

## Motivation

TBD use cases from IREE, TPP
It would help the reader to start with a motivation. I assume that the dependency-based graph would avoid mistakes where a user configures the pipeline manually and unintentionally breaks a dependency. Is that correct?
Expanded motivation.
docs/rfcs/PipelineComposition.md
Outdated
After the user has populated the graph object, they must call the `createPipelineSchedule` method to compile the resulting graph into a runnable schedule.
`createPipelineSchedule` will build a DAG from the pipeline dependencies provided by the user and will try to find a linear execution order that satisfies these dependencies.

If two pipelines have no direct or indirect dependencies, the order in which they are executed is not specified, but stable.
I think that's asking too much of this framework. I'd say "stability depends on the passes accepting canonical forms from each other", and we make sure we always run canonicalization between DAG nodes.
Here I only meant that the order of pipelines/passes is stable regardless of the order in which the `registerPipeline` calls were made (in the POC implementation I'm just sorting by pipeline name first to make it stable), but yes, I can remove this for more implementation freedom.
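One way to get a registration-independent order, sketched here as an illustration and not as the POC's actual code, is Kahn's algorithm with a name-ordered ready set. The `linearize` helper and its signature are hypothetical, and the sketch uses only the standard library, not MLIR types.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper, not part of the RFC: linearize a pipeline DAG so that
// the result does not depend on the order in which pipelines were registered.
// `edges` holds (runs first, runs second) pairs over the names in `names`.
std::optional<std::vector<std::string>>
linearize(const std::set<std::string> &names,
          const std::set<std::pair<std::string, std::string>> &edges) {
  std::map<std::string, int> pending; // unmet predecessor count per pipeline
  std::map<std::string, std::vector<std::string>> next;
  for (const auto &name : names)
    pending[name] = 0;
  for (const auto &[before, after] : edges) {
    assert(names.count(before) && names.count(after) && "unknown pipeline");
    ++pending[after];
    next[before].push_back(after);
  }

  // Ordered set: ties between independent pipelines are broken by name.
  std::set<std::string> ready;
  for (const auto &[name, count] : pending)
    if (count == 0)
      ready.insert(name);

  std::vector<std::string> order;
  while (!ready.empty()) {
    std::string current = *ready.begin();
    ready.erase(ready.begin());
    order.push_back(current);
    for (const auto &succ : next[current])
      if (--pending[succ] == 0)
        ready.insert(succ);
  }
  if (order.size() != names.size())
    return std::nullopt; // a dependency cycle prevented full linearization
  return order;
}
```

Because both inputs are ordered sets, calling `linearize` with the same pipelines and dependencies yields the same sequence no matter how the caller assembled them.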
docs/rfcs/PipelineComposition.md
Outdated
Passes inside a pipeline can set this attribute to indicate that they want the compilation flow to jump to a specific point.
After the current pipeline has finished, the runtime will check whether the module object has the attribute set and, if it does, jump to the selected pipeline and clear the attribute.

Setting the attribute to a value that wasn't in `jumpTargets` for the current pipeline will result in an error and abort the compilation flow.
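The jump rule described above can be modeled roughly as follows. This is a minimal sketch under stated assumptions: the module attribute is represented by a plain `std::optional<std::string>`, and `ScheduledPipeline` and `nextPipeline` are hypothetical names that do not come from the RFC.

```cpp
#include <algorithm>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical model of one entry in the linearized schedule.
struct ScheduledPipeline {
  std::string name;
  std::vector<std::string> jumpTargets; // names this pipeline may jump to
};

// Decide what runs after `schedule[current]` has finished.
// Returns the index of the next pipeline (schedule.size() means "done"),
// or std::nullopt if the requested jump is not allowed. The request is
// always cleared, mirroring "jump ... and clear the attribute".
std::optional<std::size_t>
nextPipeline(const std::vector<ScheduledPipeline> &schedule,
             std::size_t current, std::optional<std::string> &jumpRequest) {
  if (!jumpRequest)
    return current + 1; // no jump requested: continue in linear order

  std::string target = *jumpRequest;
  jumpRequest.reset();

  const auto &allowed = schedule[current].jumpTargets;
  if (std::find(allowed.begin(), allowed.end(), target) == allowed.end())
    return std::nullopt; // target was not declared in jumpTargets: error

  auto it = std::find_if(
      schedule.begin(), schedule.end(),
      [&](const ScheduledPipeline &p) { return p.name == target; });
  if (it == schedule.end())
    return std::nullopt; // declared target is not part of the schedule
  return static_cast<std::size_t>(it - schedule.begin());
}
```

A driver loop would call `nextPipeline` after each pipeline and abort the compilation flow when it returns `std::nullopt`.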
`jumpTargets` seem to be used for control flow.
I'd create conditional and looping semantics instead, as a type of sub-graph.
For example, a pipeline node that lists a bunch of passes (or sub-nodes) and has arity (ex. `until-converge-max-n`). Or another that has two sub-nodes with a `select` from an IR property (ex. DLTI target information).
Giving users the ability to jump to arbitrary targets is a foot gun that we might not want to create.
FYI, for looping until convergence/fixed point I've added llvm/llvm-project#87166.
It works fine for simple cases like canonicalization+CSE, but in numba-mlir I had a dozen passes in the potential loop from multiple different pipelines, so I wanted explicit control over when to loop.
Some items are missing from our discussion, mainly how to build the DAG and how to schedule it.
Building the DAG
Building a DAG is simply taking all passes and inserting each in the first available slot (similar to tree insertion). Since there is no implicit ordering for passes, this may be restricted to `O(n^2)`.
We could reduce the complexity by creating sub-graphs inside sub-graphs and connecting the super-graphs together.
For example:
- All passes before bufferization are a sub-graph that leads into `bufferization`. There is no implicit order (needs to be scheduled). This is equivalent to saying bufferization depends on all of those passes, but explicitly joining all nodes into a single one.
- Bufferization as a node with all cleanups
- Same for vectorization, lowering, etc.
         /----\         /----\         /----\
Ingress -------- Buff -------- Vect -------- Lower -> HW
         \----/         \----/         \----/
Where `Buff`, `Vect` and `Lower` are fixed sequences of passes (per target, so can be conditional).
Scheduling
Each of those sub-graphs above will need to be scheduled. This is just graph scheduling, and can be much simpler if we hide loops and conditionals inside nodes.
Loops become a single node that is guaranteed to finish (run until convergence, but stop hard at `N` iterations, where `N` is configurable but less than a global `MaxN`).
If we follow the sub-graph design, then scheduling is always restricted to the sub-graph. This works well with a recursive algorithm that schedules the outer-most graph, then descends into sub-graphs, expanding them in linear form.
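A loop node with a hard iteration cap could look roughly like the sketch below. The names `runUntilConverged`, `LoopResult` and `kGlobalMaxIterations`, as well as the cap value of 32, are assumptions made for illustration; the loop body is reduced to a callback that reports whether it changed anything.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>

// Assumed global cap ("MaxN" in the discussion); the value is arbitrary.
constexpr int kGlobalMaxIterations = 32;

struct LoopResult {
  bool converged; // true if the body reported "no change" before the cap
  int iterations; // number of times the body was executed
};

// Hypothetical loop node: `step` runs the loop body once and returns true
// if it changed the IR. The loop stops at the first unchanged iteration or
// after min(maxIterations, kGlobalMaxIterations) iterations.
LoopResult runUntilConverged(const std::function<bool()> &step,
                             int maxIterations) {
  assert(maxIterations > 0 && "per-node limit must be positive");
  int limit = std::min(maxIterations, kGlobalMaxIterations);
  for (int i = 1; i <= limit; ++i)
    if (!step())
      return {true, i};
  return {false, limit};
}
```

Since the node always terminates, the scheduler can treat it like any other node in the linear order; whether hitting the cap counts as a failure is left open here.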
Cleaning up
After the graph is linear (with potential loop and conditional nodes), we can start the cleanup, for example, de-duplicating passes that have no writable transforms in between.
Failure
Failure can happen at any stage above and the error message must make clear which stage and what happened. Failed creating a DAG, sub-DAG, scheduling some sub-graph, etc.
void registerPipeline(
StringRef name,
ArrayRef<StringRef> predecessors,
ArrayRef<StringRef> successors,
I'd also avoid having both `predecessors` and `successors`. This feels like a duplication and hard to get right on larger graphs.
What I had in mind is just:
- Dependencies: Passes that you must run before (analyses and transforms)
- Post-clean up: Canonicalization that can help the following passes
Dependencies can be bundles or specific passes. Bundles can be just a list of passes (ex. buff+ownership), a loop or a conditional (see below). Both bundles and passes have deps/cleanups and we can simplify the graph after linearization.
Post-cleanups would also be simplified (de-duped) if one pass lists it as its cleanups and the following pass lists it as its dependencies.
Regarding having both predecessors and successors, consider the following (hypothetical) pipeline:
numpy-to-linalg   torch-to-linalg
            \       /
          bufferization
            /       \
linalg-to-cpu     linalg-to-gpu
We don't want bufferization to know about specific `***-to-linalg` pipelines, as those are frontend details irrelevant to bufferization, and we don't want it to know about `linalg-to-***` either, as those are backend details.
So the pipeline should look like
numpy-to-linalg: [], [bufferization]
torch-to-linalg: [], [bufferization]
bufferization: [], []
linalg-to-cpu: [bufferization], []
linalg-to-gpu: [bufferization], []
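To make the listing above concrete, here is a minimal sketch of how both lists can be folded into a single edge set. `PipelineEdges` is a hypothetical stand-in for the RFC's graph object that models only the dependency bookkeeping; passes and jump targets are left out, and plain standard-library types replace `StringRef`/`ArrayRef`.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

using Edge = std::pair<std::string, std::string>; // (runs first, runs second)

// Hypothetical stand-in for the graph object; names are illustrative only.
struct PipelineEdges {
  std::set<std::string> names;
  std::set<Edge> edges;

  void registerPipeline(const std::string &name,
                        const std::vector<std::string> &predecessors,
                        const std::vector<std::string> &successors) {
    bool inserted = names.insert(name).second;
    assert(inserted && "pipeline registered twice");
    (void)inserted; // silence unused-variable warnings when NDEBUG is set
    for (const auto &pred : predecessors)
      edges.insert({pred, name}); // pred must run before name
    for (const auto &succ : successors)
      edges.insert({name, succ}); // name must run before succ
  }
};

// The hypothetical frontend/backend example from the comment above.
PipelineEdges buildExample() {
  PipelineEdges graph;
  graph.registerPipeline("numpy-to-linalg", {}, {"bufferization"});
  graph.registerPipeline("torch-to-linalg", {}, {"bufferization"});
  graph.registerPipeline("bufferization", {}, {});
  graph.registerPipeline("linalg-to-cpu", {"bufferization"}, {});
  graph.registerPipeline("linalg-to-gpu", {"bufferization"}, {});
  return graph;
}
```

Note that `bufferization` is registered with two empty lists: every edge that touches it is contributed by a frontend or backend pipeline, which is the decoupling the example argues for.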
Subgraphs are useful by themselves, but regarding encapsulating control flow into subgraphs, let's say we have the following pipeline:
Now, an (external) user wants to add
With jumps they can just do
And the rest of the pipeline will stay unchanged. With subgraphs they will have to extract existing