Add replaceWhere functionality #1957
Comments
I exposed the `predicate` parameter for the Rust engine writer, but it currently does nothing because the functionality is not built in Rust yet.
take
I'll give this a try
WriteBuilder uses
It would be great to do this using logical expressions rather than physical ones, much like @Blajda recently updated for merge. The good thing there is that we get some type coercion for free, which has been a hassle with expressions. In Python we will likely have to accept strings and do the parsing.
@roeap I think we can start allowing Arrow expressions as input, which we can serialize as Substrait and then deserialize with datafusion-substrait.
This would be a great goal, but I would say let's be consistent in that and make a deliberate API choice, i.e. not have Substrait supported in one method but not the other. Good news is that Substrait plans are of course logical plans :)
@roeap we should be able to add this to merge, update, delete and write, and then just add the conversion inside the pyo3 binding, so it's a Python-only feature.
@r3stl355 its #1720 had been up for a while before it got merged. @ion-elgreco - sure, to get started, and as you said, right now this could just be internal. Substrait is a nice feature for Rust as well, of course as an alternative path, since we are looking to integrate into DataFusion's internal planning.
# Description
First/naive implementation of `replaceWhere` for `write`. Code compiles and there is a test to verify the outcome. I would appreciate any feedback on improving the structure/implementation. For example, I copied part of the code from the `delete` operation because there is no way to call that code in `delete` directly from `write`. Should I look into extracting that code from `delete` to somewhere central? It also seems to work with partition columns.

# Related Issue(s)
#1957

# Documentation
Added a section in docs

---------

Signed-off-by: Nikolay Ulmasov <[email protected]>
Co-authored-by: Ion Koutsouris <[email protected]>
Description
PySpark has a cool `replaceWhere` function that lets you overwrite existing data in a Delta table that matches a predicate with new data. Here's an example of the `replaceWhere` functionality:

What do folks think about adding `replaceWhere` functionality to Python deltalake?

It's possible that the Rust `predicate` argument in `write_deltalake` already exposes this functionality.
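
To make the requested semantics concrete, here is a minimal pure-Python sketch of what "replace the rows matching a predicate with new data" means. It is a conceptual model only, not the deltalake or PySpark implementation: the `replace_where` helper, the list-of-dicts table, and the callable predicate are all hypothetical names introduced for illustration, and rejecting new rows that fall outside the predicate is an assumption about the desired behaviour.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


def replace_where(
    table: List[Row],
    new_rows: List[Row],
    predicate: Callable[[Row], bool],
) -> List[Row]:
    """Return a new table in which rows matching `predicate` are replaced by `new_rows`.

    Hypothetical helper: models the semantics only, not any library's API.
    """
    # Assumption: new data must itself satisfy the predicate, otherwise the
    # write would silently add rows outside the region being replaced.
    offending = [row for row in new_rows if not predicate(row)]
    if offending:
        raise ValueError(f"{len(offending)} new row(s) do not match the predicate")

    # Keep every existing row that does NOT match, then append the new rows.
    kept = [row for row in table if not predicate(row)]
    return kept + list(new_rows)


table = [
    {"id": 1, "country": "US"},
    {"id": 2, "country": "DE"},
    {"id": 3, "country": "US"},
]
new_us_rows = [{"id": 10, "country": "US"}]

result = replace_where(table, new_us_rows, lambda row: row["country"] == "US")
```

In this model the operation is a delete of the matching rows followed by an append, which is consistent with the pull request description above reusing code from the `delete` operation inside `write`.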