FR: Add plan_anchor field to PlanRel #725

Open
tokoko opened this issue Oct 11, 2024 · 0 comments

tokoko commented Oct 11, 2024

I'm working on a library that lets you build up Substrait plans using a dataframe API. Over the course of dataframe transformations, various unrelated plans need to be merged to construct the final Substrait plan. For the most part this is pretty straightforward, but extension functions and relation references need special handling during the merges.

Fortunately, functions in Substrait let you define "pointers" in a relaxed fashion with anchors. This means that as long as there is some global way of assigning a unique identifier to each function beforehand, merging plans becomes straightforward. The relationship between PlanRels and ReferenceRels is modeled differently: a ReferenceRel needs to be aware of the ordinal position of the target rel in the relations array. This makes merging plans very tricky. One has to either traverse the whole plan to adjust ordinal positions on each merge, or use some sort of placeholders and do a single-pass traversal at the end.
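To illustrate the first workaround (adjusting ordinals on every merge), here is a minimal sketch. It uses plain Python dataclasses as a toy stand-in for the protobuf messages, not the real Substrait bindings; `Rel`, `Plan`, `shift_ordinals`, and `merge_ordinal` are hypothetical names made up for this example.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class ReferenceRel:
    # Toy stand-in: index of the target rel in Plan.relations.
    subtree_ordinal: int


@dataclass
class Rel:
    # Hypothetical simplified relation node: a name plus its inputs.
    name: str
    inputs: List[Union["Rel", ReferenceRel]] = field(default_factory=list)


@dataclass
class Plan:
    relations: List[Rel] = field(default_factory=list)


def shift_ordinals(node, offset):
    """Return a copy of the tree with every reference ordinal moved by offset."""
    if isinstance(node, ReferenceRel):
        return ReferenceRel(node.subtree_ordinal + offset)
    return Rel(node.name, [shift_ordinals(i, offset) for i in node.inputs])


def merge_ordinal(a, b):
    """Append b's relations to a's; every reference inside b must be rewritten."""
    offset = len(a.relations)
    return Plan(a.relations + [shift_ordinals(r, offset) for r in b.relations])


plan_a = Plan([Rel("read_a"), Rel("filter", [ReferenceRel(0)])])
plan_b = Plan([Rel("read_b"), Rel("project", [ReferenceRel(0)])])
merged = merge_ordinal(plan_a, plan_b)
```

Every merge walks all of the second plan's trees, which is the cost described above.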

To make this simpler, Substrait could use the same anchor mechanism for PlanRels. An additional plan_anchor field could be introduced in the PlanRel message, and ReferenceRel would refer to that anchor instead of relying on ordinal positions.
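A rough sketch of how merging might look with such an anchor, again using toy dataclasses rather than the real Substrait messages. Only `plan_anchor` comes from this proposal; the `plan_reference` field and all class and function names here are assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class AnchoredReferenceRel:
    # Hypothetical field: holds the target's plan_anchor, not a position.
    plan_reference: int


@dataclass
class AnchoredRel:
    name: str
    inputs: List[Union["AnchoredRel", AnchoredReferenceRel]] = field(
        default_factory=list
    )


@dataclass
class AnchoredPlanRel:
    plan_anchor: int  # proposed field: globally unique id assigned up front
    rel: AnchoredRel


def merge_anchored(a, b):
    """Merge two relation lists without touching any reference."""
    merged = {p.plan_anchor: p for p in a}
    for p in b:
        existing = merged.setdefault(p.plan_anchor, p)
        if existing != p:
            raise ValueError(f"conflicting definitions for anchor {p.plan_anchor}")
    return list(merged.values())


def resolve(relations, ref):
    """Look up the relation that a reference points to by its anchor."""
    by_anchor = {p.plan_anchor: p for p in relations}
    return by_anchor[ref.plan_reference].rel


rels_a = [
    AnchoredPlanRel(101, AnchoredRel("read_a")),
    AnchoredPlanRel(102, AnchoredRel("filter", [AnchoredReferenceRel(101)])),
]
rels_b = [
    AnchoredPlanRel(201, AnchoredRel("read_b")),
    AnchoredPlanRel(202, AnchoredRel("project", [AnchoredReferenceRel(201)])),
]
merged_rels = merge_anchored(rels_a, rels_b)
```

Under these assumptions the merge is a keyed union, and references stay valid regardless of where a relation ends up in the array.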
