Modernizing the ETL of my largest project
- Transition all CSV/XLSX output to database tables
- Transition manual diagnostic checks and ETL to dbt + Dagster
- Clean and write only the newest data instead of reprocessing everything
-
- Python
- Docker
- Postgres
- Dagster
- AWS
- Power BI
- Vitals data loaded and cleaned
- Vaccination data loaded and cleaned
- dbt models reorganized
- Tests added to all scripts
- Weekly run set up
- Migrate local Parquet storage to AWS S3
- Migrate local Postgres to AWS RDS
- Set up config to run on an ECS cluster
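
The goal of cleaning and writing only the newest data can be sketched roughly as below. This is a minimal illustration, not the project's actual code: the `vitals` table, its columns, and the `load_incremental` function are hypothetical names, and the standard-library `sqlite3` module stands in for Postgres so the example runs on its own. It assumes dates are stored as ISO strings, which sort correctly as plain text.

```python
import sqlite3


def load_incremental(conn, rows):
    """Insert only rows dated after the latest date already in the table.

    rows: iterable of (patient_id, recorded_on, heart_rate) tuples, where
    recorded_on is an ISO date string such as "2024-01-08" (assumed format).
    Returns the number of rows written.
    """
    # Hypothetical table layout, created on first run.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS vitals ("
        "patient_id INTEGER, recorded_on TEXT, heart_rate INTEGER)"
    )
    # High-water mark: the newest date already loaded (None if the table is empty).
    (watermark,) = conn.execute("SELECT MAX(recorded_on) FROM vitals").fetchone()
    # Keep only rows strictly newer than the high-water mark.
    new_rows = [r for r in rows if watermark is None or r[1] > watermark]
    conn.executemany("INSERT INTO vitals VALUES (?, ?, ?)", new_rows)
    conn.commit()
    return len(new_rows)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    first_batch = [(1, "2024-01-01", 70), (2, "2024-01-02", 65)]
    print(load_incremental(conn, first_batch))
    # A later extract repeats the old rows and adds one new one.
    second_batch = first_batch + [(1, "2024-01-08", 72)]
    print(load_incremental(conn, second_batch))
```

A strict "newer than" comparison like this would skip late-arriving rows that share the watermark date, so a real pipeline would likely need a more careful key (for example a load timestamp or a per-row identifier).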