💪 Motivation
Currently, all KDP pipelines are essentially parallelized linked lists: each node load-balances its output across all later steps. Conditional branching is not supported yet, but it would be a useful feature to implement. It would allow far more complex pipeline structures and give pipeline designers much more flexibility when adapting existing workflows to KDP.
📖 Additional Details
We're going to need to add conditional branching to the edge definitions in a pipeline. There are probably a bunch of ways to go about this: some OK, most bad. I'd like to be thoughtful here so that we don't hack something together that breaks a lot of the KDP "core tenets," if you will.
The specification should likely go in the pipeline.spec, as the conditionals should be immutable. We should also require that every node be reachable, i.e. that there is a path to it from the root under some set of conditionals. The pipeline validator (yet to be linked to the operator) should also check for possible cycles and emit a warning, or possibly even block application of the pipeline unless a specific flag, has_cycles (or something to that effect), is set in the pipeline definition.
The goal here is to be as concise and readable as possible so as not to bloat the pipeline definition schema, while keeping it highly extensible and flexible for pipeline developers.
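The two validator checks described above (reachability from the root and cycle detection) could be sketched roughly as follows. This is only an illustration, not the KDP validator: the function names, the plain-dict edge representation, and the choice to treat every conditional edge as potentially taken are assumptions made for the example, and has_cycles is modelled as a simple keyword argument.

```python
from collections import deque


def unreachable_nodes(nodes, edges, root):
    """Return nodes with no path from root.

    Assumption: every conditional edge is treated as takeable, i.e. we only
    check that *some* set of conditionals could lead to each node.
    """
    adjacency = {n: [] for n in nodes}
    for edge in edges:
        adjacency[edge["source"]].append(edge["target"])
    seen = {root}
    queue = deque([root])
    while queue:
        current = queue.popleft()
        for nxt in adjacency[current]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return sorted(set(nodes) - seen)


def contains_cycle(nodes, edges):
    """Kahn's algorithm: a cycle exists if a topological sort cannot visit every node."""
    indegree = {n: 0 for n in nodes}
    adjacency = {n: [] for n in nodes}
    for edge in edges:
        adjacency[edge["source"]].append(edge["target"])
        indegree[edge["target"]] += 1
    queue = deque(n for n in nodes if indegree[n] == 0)
    visited = 0
    while queue:
        current = queue.popleft()
        visited += 1
        for nxt in adjacency[current]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)
    return visited != len(nodes)


def validate(nodes, edges, root, has_cycles=False):
    """Return (errors, warnings). A cycle only warns when has_cycles is set (hypothetical flag)."""
    errors, warnings = [], []
    missing = unreachable_nodes(nodes, edges, root)
    if missing:
        errors.append(f"unreachable nodes: {missing}")
    if contains_cycle(nodes, edges):
        target = warnings if has_cycles else errors
        target.append("pipeline contains a cycle")
    return errors, warnings


NODES = ["start", "middle", "last_0", "last_1"]
EDGES = [
    {"source": "start", "target": "middle"},
    {"source": "middle", "target": "last_0"},
    {"source": "middle", "target": "last_1"},
]
print(validate(NODES, EDGES, root="start"))
```

Whether a cycle should block the pipeline outright or only warn is still an open question in this proposal; the sketch simply blocks unless the flag is set.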
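Ideas for implementation
A valid use case might be a pipeline in which start feeds middle, and middle feeds either last_0 or last_1: input flows to last_0 in the nominal case and to last_1 on an error condition or some other result. Let's say that middle sets a boolean flag got_error at the top level of its JSON output.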
We could extend graph.edges to include the mapping in the following manner:
    edges:
      - source: start
        target: middle
      - source: middle
        target: last_0
        # lack of "conditions" could imply default branch
      - source: middle
        target: last_1
        conditions:
          - match_value: # support multiple types of conditions
              key: "got_error"
              val: "true"
          # array for multiple? or perhaps take an approach similar to elasticsearch boolean queries
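To make the intended routing semantics concrete, here is a rough sketch of how a node's output might be matched against such edges. Everything in it is an assumption for illustration: the select_targets and condition_matches names, the choice to AND multiple conditions together, the rule that edges without conditions act as a default taken only when no conditional edge matches, and the case-insensitive string comparison used to line up the YAML value "true" with a JSON boolean.

```python
def condition_matches(condition, output):
    """Evaluate one condition against a node's JSON output (already parsed to a dict)."""
    if "match_value" in condition:
        spec = condition["match_value"]
        if spec["key"] not in output:
            return False
        # Assumption: compare as lowercase strings so YAML "true" matches JSON true.
        return str(output[spec["key"]]).lower() == str(spec["val"]).lower()
    raise ValueError(f"unsupported condition type: {sorted(condition)}")


def select_targets(edges, source, output):
    """Pick the targets that should receive `output` from `source`.

    Assumptions: all conditions on an edge must hold (logical AND), and edges
    with no conditions form the default branch, used only when no conditional
    edge matches.
    """
    matched, default = [], []
    for edge in edges:
        if edge["source"] != source:
            continue
        conditions = edge.get("conditions")
        if not conditions:
            default.append(edge["target"])
        elif all(condition_matches(c, output) for c in conditions):
            matched.append(edge["target"])
    return matched or default


# The edges from the proposal above, written as plain Python data.
EDGES = [
    {"source": "start", "target": "middle"},
    {"source": "middle", "target": "last_0"},
    {
        "source": "middle",
        "target": "last_1",
        "conditions": [{"match_value": {"key": "got_error", "val": "true"}}],
    },
]

print(select_targets(EDGES, "middle", {"got_error": True}))
print(select_targets(EDGES, "middle", {"got_error": False}))
```

An Elasticsearch-style boolean query would replace the flat AND in select_targets with nested clauses; the flat list is used here only because it is the smallest thing that exercises the schema.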
As is evidenced above, there's still plenty to think about. This seems like a good start.