Thanks for this excellent library. I have successfully used it to transform 40m records from Postgres to DynamoDB (I see there is an open issue for this, so I will work on a PR). I am now working on a new ETL pipeline that takes a CSV, enriches each record with a call to a web service, and then writes a new CSV with a line per record. The file is huge (22m lines), so I need to make multiple service calls in parallel (~40) for this to be efficient. I think I need to use the cluster/worker model, but I can't work out how to use it. Would it be possible to add an example of how this should work? Many thanks.
I finally got this working without the cluster module. It could be better optimised, but it works well for our needs. This is the code, in case it helps anyone (I am using fast-csv for the CSV formatting):
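A minimal sketch along those lines, not the exact original snippet: it assumes a hypothetical `enrich()` helper standing in for the web service call, placeholder input/output file names, and a concurrency cap of 40 to match the parallelism mentioned above.

```js
const fs = require('fs');
const csv = require('fast-csv');

const CONCURRENCY = 40; // max web service calls in flight at once

// Hypothetical enrichment call -- swap in the real web service client here.
async function enrich(row) {
  // e.g. const res = await fetch(`https://example.com/lookup/${row.id}`);
  //      return { ...row, extra: await res.json() };
  return row;
}

const parser = csv.parse({ headers: true });
const formatter = csv.format({ headers: true });
formatter.pipe(fs.createWriteStream('output.csv')); // placeholder file name

let inFlight = 0;
let parsingDone = false;

parser.on('data', (row) => {
  inFlight += 1;
  // Apply backpressure: stop reading while the pool of calls is full.
  if (inFlight >= CONCURRENCY) parser.pause();

  enrich(row)
    .then((enriched) => formatter.write(enriched))
    .catch((err) => console.error('enrich failed', err))
    .finally(() => {
      inFlight -= 1;
      if (inFlight < CONCURRENCY) parser.resume();
      // Close the output once input is exhausted and all calls have settled.
      if (parsingDone && inFlight === 0) formatter.end();
    });
});

parser.on('end', () => {
  parsingDone = true;
  if (inFlight === 0) formatter.end();
});

fs.createReadStream('input.csv').pipe(parser); // placeholder file name
```

Pausing the parser whenever the pool is full keeps memory usage flat even on a 22m-line file, which is why this works without reaching for the cluster module at all.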