Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(python, rust): add drop column operation [BLOCKED] #2710

Draft
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

ion-elgreco
Copy link
Collaborator

Description

Adds a drop column operation, can be used to drop root fields and nested fields in a struct.

@ion-elgreco ion-elgreco changed the title feat(pyhon, rust): add drop column operation feat(python, rust): add drop column operation Jul 27, 2024
@github-actions github-actions bot added binding/python Issues for the Python package binding/rust Issues for the Rust crate labels Jul 27, 2024
@ion-elgreco ion-elgreco enabled auto-merge (squash) July 27, 2024 16:13
crates/core/src/operations/drop_column.rs Outdated Show resolved Hide resolved
Comment on lines +59 to +65
/// Specify if you want to raise if the specified column does not exist
pub fn with_raise_if_not_exists(mut self, raise: bool) -> Self {
self.raise_if_not_exists = raise;
self
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question out of curiosity: What would be the use case or reason to allow drop column commits for columns that are missing from the schema?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's say I drop 10 columns, and one is a typo, it could still be useful for the commit to pass.

I probably need to add another check though to check if the schema actually changed, otherwise the commits are pointless

@hntd187
Copy link
Collaborator

hntd187 commented Jul 29, 2024

Don't drop columns require column mapping otherwise you have to re-write all the datafiles?

@ion-elgreco
Copy link
Collaborator Author

ion-elgreco commented Jul 29, 2024

Don't drop columns require column mapping otherwise you have to re-write all the datafiles?

Why would it? Column mapping would help when we rename and don't want to rewrite but dropping is just a logical change

Maybe I'm missing something that column mapping would bring though.

@hntd187
Copy link
Collaborator

hntd187 commented Jul 29, 2024

https://github.com/delta-io/delta/blob/master/PROTOCOL.md#column-mapping says so right here in the protocol

@ion-elgreco
Copy link
Collaborator Author

https://github.com/delta-io/delta/blob/master/PROTOCOL.md#column-mapping says so right here in the protocol

It doesn't describe though why it's needed though?

Is it because you could potentially re-add the same column name but have completely different type or data?

@ion-elgreco
Copy link
Collaborator Author

@hntd187 added a check to see if column mapping is enabled. I'll put in DRAFT until we have actual column mapping support

@ion-elgreco ion-elgreco marked this pull request as draft July 30, 2024 14:50
auto-merge was automatically disabled July 30, 2024 14:50

Pull request was converted to draft

@ion-elgreco ion-elgreco changed the title feat(python, rust): add drop column operation feat(python, rust): add drop column operation [BLOCKED] Aug 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
binding/python Issues for the Python package binding/rust Issues for the Rust crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants