We aim to functionally characterize approximately 100,000 coding variants across Mendelian disease genes, addressing the significant gap in understanding the impact of human genomic variations. By analyzing the phenotypic impacts of these variants, we seek to elucidate genotype-phenotype relationships in inherited disorders. We will create a searchable database detailing these variant effects, accessible through the IGVF consortium, which will contribute to public health by aiding in the diagnosis and treatment of Mendelian disorders.
GDrive folder (internal): link
This repo contains the analysis scripts and notebooks for the VarChAMP project.
The data is stored in a separate repo, 2021_09_01_VarChAMP-data, which is added as a submodule to this repo.
Profiles from all the plates are in 2021_09_01_VarChAMP-data/profiles.
All levels of profiles downstream of the aggregation step in the pycytominer workflow are in that folder.
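As a quick sanity check on that folder, the sketch below counts the files it contains by extension. The helper name `count_profiles_by_ext` is hypothetical, and the folder layout and file types are assumptions, since neither is documented here. Seeing mostly pointer files (for example `.dvc`) would suggest that the data has not been pulled yet.

```shell
# Hypothetical helper: summarize files under a profiles folder by extension.
# The default path comes from this README; the folder contents are an assumption.
count_profiles_by_ext() {
    local dir="${1:-2021_09_01_VarChAMP-data/profiles}"
    if [ ! -d "$dir" ]; then
        echo "missing: $dir (has the submodule been initialized?)" >&2
        return 1
    fi
    # Print "<count> <extension>" for every file extension found.
    find "$dir" -type f -name '*.*' \
        | sed 's/.*\.//' \
        | sort \
        | uniq -c \
        | awk '{print $1, $2}'
}
```

Run it from the root of this repo as `count_profiles_by_ext`, or pass another folder as the first argument.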
- Fork the repo
- Clone the repo
    git clone git@github.com:<YOUR USER NAME>/2021_09_01_VarChAMP.git
- Download the contents of the submodule
    git submodule update --init --recursive
    cd 2021_09_01_VarChAMP-data
    dvc pull
    git lfs pull
- Install the conda environment within each folder before running the notebooks. We use mamba to manage the computational environment. To install mamba, see instructions. After installing mamba, run the following to create and activate the environment:
    # First, install the conda environment
    mamba env create --force --file environment.yml
    # If you had already installed this environment and now want to update it
    mamba env update --file environment.yml --prune
    # Then, activate the environment and you're all set!
    environment_name=$(grep "name:" environment.yml | awk '{print $2}')
    mamba activate $environment_name
- Run the notebooks
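The environment-name lookup in the install step matches any line containing `name:`, so it could return more than one value if the file has other such lines. A slightly stricter variant is sketched below. The helper name `env_name_from_yaml` is hypothetical, and it assumes `environment.yml` has a single unindented `name:` key, as conda environment files normally do.

```shell
# Hypothetical helper: read the top-level "name:" key from a conda environment file.
env_name_from_yaml() {
    local file="${1:-environment.yml}"
    # Match only an unindented "name:" line and stop at the first hit.
    awk '/^name:/ {print $2; exit}' "$file"
}

# Usage (assumes mamba is installed and environment.yml is in the current folder):
# mamba activate "$(env_name_from_yaml environment.yml)"
```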