A repository for VAE-based simulation of latent cell characteristics.

simonlevine/singlecell-sim-by-VAE

02-{5,7}12 Final Project: Simulating Latent Single-Cell Gene Expression of COVID-19 Patients with Variational Autoencoder and Implications for Augmenting Classification Models

Abstract

We present a Variational Autoencoder (VAE)-based simulation of single-cell gene expression in peripheral blood mononuclear cells (PBMCs) from healthy donors and COVID-19 patients. To demonstrate the utility of this simulation, we build a synthetic dataset seeded from rare cell types. We then train a classifier of ventilation severity and find that augmenting the training data with simulated cells slightly reduces predictive performance. However, phylogenetic tree analysis indicates that the simulated rare-cell data does not differ dramatically from the original distribution, as desired. Despite these mixed results, we consider our simulation a solid baseline and a promising direction for aiding single-cell research on COVID-19.
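As a rough illustration of what "seeding" a synthetic dataset from rare cells could look like, the sketch below encodes a few seed cells, samples around each latent posterior, and decodes the samples. It is a toy under stated assumptions, not this repository's implementation: the random linear `encode`/`decode` maps stand in for a trained VAE, and `simulate_from_seeds`, the dimensions, and the Poisson toy data are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

n_genes, n_latent = 50, 8  # toy sizes, not the project's real dimensions

# Hypothetical stand-ins for a trained encoder and decoder (random linear maps).
W_mu = rng.normal(scale=0.1, size=(n_genes, n_latent))
W_logvar = rng.normal(scale=0.1, size=(n_genes, n_latent))
W_dec = rng.normal(scale=0.1, size=(n_latent, n_genes))


def encode(x):
    """Map expression profiles to the mean and log-variance of q(z | x)."""
    return x @ W_mu, x @ W_logvar


def decode(z):
    """Map latent points back to positive expression values."""
    return np.log1p(np.exp(z @ W_dec))  # softplus keeps outputs positive


def simulate_from_seeds(seed_cells, n_per_seed, rng):
    """Draw n_per_seed synthetic cells around each seed cell's latent posterior."""
    mu, logvar = encode(seed_cells)
    mu = np.repeat(mu, n_per_seed, axis=0)
    sigma = np.exp(0.5 * np.repeat(logvar, n_per_seed, axis=0))
    z = mu + sigma * rng.standard_normal(mu.shape)  # reparameterization step
    return decode(z)


# Toy "rare cell" count profiles: 5 seed cells, 20 synthetic cells per seed.
rare_cells = rng.poisson(1.0, size=(5, n_genes)).astype(float)
synthetic = simulate_from_seeds(rare_cells, n_per_seed=20, rng=rng)
print(synthetic.shape)  # (100, 50)
```

In a real pipeline the stand-in maps would be replaced by the trained model's encoder and decoder, and the synthetic cells would be appended to the classifier's training set.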

Setup

Clone the repository along with its submodules

git clone --recursive https://github.com/simonlevine/singlecell-sim-by-VAE.git

Install dependencies

python -m venv venv
./venv/bin/pip install -r requirements.txt
pip install --user dvc

Grab the data

source .envrc && make download_data

Reproduce

source .envrc && dvc repro
