Write to multiple delta tables from a single topic #161

Open
Kikkon opened this issue Dec 21, 2023 · 4 comments
Comments

@Kikkon
Kikkon commented Dec 21, 2023

Based on the design doc, I understand that if I want to write to multiple tables, I need multiple processes and topics. This means I would have to manage the state of multiple topics and processes.

However, since our company is relatively small, I think a single process would be sufficient for my needs.

So I would like to ask what the best practice is here. I think I could fork this repository and make some modifications, for example using a field in the message as a routing key to write to different tables. But that feels a bit hacky. I'm hoping for an approach more aligned with the current architecture. I'm happy to contribute back to the community if needed.
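To make the routing idea concrete, here is a minimal Python sketch of the logic only, not kafka-delta-ingest code. The field name `event_type`, the table URIs, and the function `route_messages` are all hypothetical, and it assumes every message is a JSON object.

```python
import json
from collections import defaultdict

# Hypothetical mapping from the routing field's value to a Delta table URI.
TABLE_BY_EVENT_TYPE = {
    "order": "s3://example-bucket/tables/orders",
    "payment": "s3://example-bucket/tables/payments",
}
# Hypothetical catch-all table for messages with a missing or unknown value.
DEFAULT_TABLE = "s3://example-bucket/tables/unrouted"


def route_messages(raw_messages, routing_field="event_type"):
    """Group decoded JSON messages into per-table buffers by one field."""
    buffers = defaultdict(list)
    for raw in raw_messages:
        record = json.loads(raw)  # assumes each message is a JSON object
        table = TABLE_BY_EVENT_TYPE.get(record.get(routing_field), DEFAULT_TABLE)
        buffers[table].append(record)
    return dict(buffers)
```

Each buffer would then be flushed to its own table by a separate writer. The sketch leaves out the hard part, which is tracking offsets and commits across several tables at once.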

@ThijsKoot
ThijsKoot commented Mar 14, 2024

We're interested in this as well at our company.

@mightyshazam
Collaborator

One way to keep it less hacky is to add the ability to filter messages after the transformation. This would still require you to run two processes, but it wouldn't require multiple topics.
The idea of one consumer delivering to multiple places is a layer of complexity that probably belongs above this project.
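A rough Python sketch of what filter-after-transform could look like, with one handler per process. Everything here is hypothetical (`transform`, `make_process`, the `country` and `region` fields) and is not an existing option in this project. With Kafka, each process would also need its own consumer group so that both see every message on the topic.

```python
import json


def transform(record):
    """Stand-in for the transform step: copy the record, add a derived field."""
    out = dict(record)
    out["region"] = str(record.get("country", "")).upper()
    return out


def make_process(keep):
    """Build one process's per-message handler: transform first, then filter."""
    def handle(raw):
        transformed = transform(json.loads(raw))
        # Returning None means this process skips the message.
        return transformed if keep(transformed) else None
    return handle


# Two hypothetical processes reading the same topic, one per target table.
eu_process = make_process(lambda r: r["region"] in {"DE", "NL"})
us_process = make_process(lambda r: r["region"] == "US")
```

Because the predicate runs on the transformed record, it can use derived fields such as `region`, which a filter applied before the transformation could not.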

@leon-oosterwijk-akunacapital

I have this working on a private fork. I'm trying to figure out how to share it.

@Yevhenii-Kozlovskyi1
Hi there! Has anyone had any luck implementing this approach, or run into any blockers with it? I am interested in any implementation.
Our team is considering changing our Delta Lake architecture for reasons of scaling, performance, isolation, and Delta Sharing.
One approach is to use a separate Delta table per input data characteristic, which means our kafka-delta-ingest application would need to determine, based on some message properties, which Delta table each incoming message should be written to.
