A data aggregation and visualization tool for health metrics for US Counties.
This project integrates data from BioBot, Verily, and other sources to provide a comprehensive view of health metrics for US counties.
The site is currently hosted at County Health.
In order to run this project locally for development purposes, you will need to have the following installed:
The following steps will get a copy of the project running on your local machine for development and testing. They assume that the requirements listed above are installed. A Python virtual environment is recommended.
# Clone the repository
git clone https://github.com/jogoodma/county-health.git
# Install Python dependencies
cd county-health
pip install -r packages/pipelines/requirements.txt
# Install Node dependencies
cd packages/site && pnpm install
# Run the data pipeline to update your local data.
cd ../../
make update
# Run the site locally
cd packages/site && pnpm run dev
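Before running the steps above, it can help to confirm that the command-line tools they call are available. The following is a minimal sketch, not part of the repository: `missing_tools` is a hypothetical helper, and the tool list (`git`, `pip`, `pnpm`, `make`) is taken only from the commands shown above, so adjust it if your setup differs.

```shell
# Hypothetical helper (not part of the repository): print each of the
# given commands that cannot be found on PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    # `command -v` is the POSIX way to test whether a command is available.
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
    fi
  done
}

# Tools invoked by the setup steps above (an assumption based on those steps).
missing=$(missing_tools git pip pnpm make)
if [ -n "$missing" ]; then
  # Left unquoted on purpose so the names print on a single line.
  echo "Missing tools:" $missing
else
  echo "All required tools found."
fi
```

The check only verifies that each command exists; it does not verify versions.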
.
├── data
│   └── db - Parquet files for DuckDB
└── packages
    ├── pipeline - Pipeline code for integrating datasets (Dagster)
    └── site - Static site builder code (Astro)
This project was inspired by the blog post "Build a poor man’s data lake from scratch with DuckDB" by Pete Hunt and Sandy Ryza.
Libraries used with love include:
Many thanks to the community for their work on these projects.