Feature Request: Comprehensive Export/Import Functionality for Table Management #623
Comments
Thanks for the kind words! What you propose makes sense, and I think for the most part the functionality is already there, mainly via the […]. The idea is that the […]. In fact, this is precisely how we use Seafowl in production ourselves: the metastore is a separate component, and Seafowl instances talk to it to learn about schemas, tables and object store locations when executing a query. One major drawback, however, is that it does not support writes, meaning the initial DML would need to be performed out-of-band for now.
Thank you for letting me know. I was thinking of a scenario where write tasks, which use more resources, are performed on multiple Seafowl instances, and the table information is then combined in a read-only Seafowl instance. I will try using clade, referring to the example you provided. What does out-of-band mean, and what methods are available? I'm currently creating an external table and then running aggregation queries to create the necessary tables.
By out-of-band I mean that you use something else to write the tables (or even Seafowl itself), but then you separately persist the metadata to your metastore and expose it to Seafowl instances via clade.
Thank you for explaining clade to me. It seems very useful for working with multiple read-only Seafowl instances.
Yes, that is correct. Until the clade interface is extended to perform writes as well, you're going to have to do it out-of-band.
Perhaps some custom service can get access to the PG/SQLite connection string and perform replication from it. Or you could use something more out-of-the-box, e.g. https://github.com/superfly/litefs or https://www.splitgraph.com/blog/deploying-serverless-seafowl
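For the SQLite case, the "custom service" idea could be as simple as a periodic snapshot of the catalog database. The following is only a rough sketch using Python's standard `sqlite3` backup API. It assumes the catalog lives in a plain SQLite database that the service can open; the `demo_catalog` table and the `s3://bucket/events` location are made up for the demo and are not Seafowl's actual catalog schema.

```python
import sqlite3


def snapshot_catalog(source: sqlite3.Connection, replica: sqlite3.Connection) -> None:
    """Copy the whole source database into the replica connection."""
    # Connection.backup produces a consistent copy of the source database.
    source.backup(replica)


# Demo with in-memory databases. "demo_catalog" is a hypothetical table,
# NOT Seafowl's real catalog schema.
writer = sqlite3.connect(":memory:")
writer.execute("CREATE TABLE demo_catalog (table_name TEXT, location TEXT)")
writer.execute("INSERT INTO demo_catalog VALUES ('events', 's3://bucket/events')")
writer.commit()

reader = sqlite3.connect(":memory:")
snapshot_catalog(writer, reader)
rows = reader.execute("SELECT table_name, location FROM demo_catalog").fetchall()
print(rows)
```

A snapshot like this replaces the replica wholesale, so it fits the "one writer, many read-only readers" layout discussed above rather than merging metadata from several writers.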
Feature Request: Comprehensive Export/Import Functionality for Datalake Table Management
Overview
First and foremost, thank you for your incredible work on the seafowl project. Your efforts in managing table information using SQLite are truly appreciated. To further enhance the project's capabilities and support continuous table creation and read-only service operations, I'd like to propose a new feature: comprehensive export and import functionality for table information.
Proposed Feature
The feature would consist of three main components:
Full Non-System Table Export/Import:
Individual Table Export/Import:
Schema-based Table Export/Import:
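To make the three granularities a bit more concrete, here is a minimal sketch of what such an export/import could look like against a SQLite-backed catalog. Everything in it is an assumption for illustration: the `demo_catalog` table, its columns, the JSON payload format, and the function names `export_tables` and `import_tables` are hypothetical, not part of Seafowl. The demo also has no notion of system tables, so "full" here simply means every row.

```python
import json
import sqlite3

COLUMNS = ("schema_name", "table_name", "location")


def new_catalog() -> sqlite3.Connection:
    """Create an in-memory stand-in for a catalog (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE demo_catalog ("
        "schema_name TEXT, table_name TEXT, location TEXT, "
        "PRIMARY KEY (schema_name, table_name))"
    )
    return conn


def export_tables(conn, schema=None, table=None) -> str:
    """Serialize catalog rows to JSON, optionally filtered by schema and/or table."""
    query = "SELECT schema_name, table_name, location FROM demo_catalog"
    clauses, params = [], []
    if schema is not None:
        clauses.append("schema_name = ?")
        params.append(schema)
    if table is not None:
        clauses.append("table_name = ?")
        params.append(table)
    if clauses:
        query += " WHERE " + " AND ".join(clauses)
    query += " ORDER BY schema_name, table_name"
    rows = conn.execute(query, params).fetchall()
    return json.dumps([dict(zip(COLUMNS, row)) for row in rows])


def import_tables(conn, payload: str) -> int:
    """Upsert exported rows into another catalog; returns the number of rows."""
    records = json.loads(payload)
    conn.executemany(
        "INSERT OR REPLACE INTO demo_catalog "
        "VALUES (:schema_name, :table_name, :location)",
        records,
    )
    conn.commit()
    return len(records)


source = new_catalog()
source.executemany(
    "INSERT INTO demo_catalog VALUES (?, ?, ?)",
    [
        ("public", "events", "s3://bucket/events"),
        ("public", "users", "s3://bucket/users"),
        ("staging", "events", "s3://bucket/staging_events"),
    ],
)
source.commit()

target = new_catalog()
imported_full = import_tables(target, export_tables(source))            # full export
one_schema = json.loads(export_tables(source, schema="public"))         # schema-based
one_table = json.loads(export_tables(source, schema="staging", table="events"))  # individual
```

Because the import is an upsert keyed on schema and table name, exports from several writer instances could be applied to one read-only catalog in any order, as long as the writers do not produce conflicting entries for the same table.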
Use Case
This feature would greatly benefit users who need to:
Potential Implementation
While I understand that the specifics of implementation would be up to the project maintainers, some initial thoughts include:
Benefits
Conclusion
I believe this feature would significantly enhance seafowl's utility and appeal to a broader range of users and use cases. I'm excited to hear your thoughts on this proposal and would be happy to provide any additional information or clarification if needed.
Thank you for considering this feature request, and for your continued dedication to the seafowl project.