This usecase describe how to run MiXeR analysis (http://github.com/precimed/mixer) on synthetic data, generated with simu_linux (http://github.com/precimed/simu). All commands below assume that $SIF
and $SINGULARITY_BIND
environmental variables are defined as described in Getting started section of the main README file.
- Generate two pairs of traits for MiXeR analysis, covering two scenarios: complete polygenic overlap (
shared
), non-overlaping causal variants (unique
), and partly overlaping causal variants (partial
):
singularity shell --home $PWD:/home $SIF/gwas.sif
simu_linux --qt --bfile /REF/hapgen/chr21 --seed 123 --causal-n 100 100 --trait1-sigsq 1 0 --trait2-sigsq 0 1 --num-components 2 --out unique --num-traits 2
simu_linux --qt --bfile /REF/hapgen/chr21 --seed 123 --causal-n 100 --trait1-sigsq 1 --trait2-sigsq 1 --num-components 1 --out shared --num-traits 2
simu_linux --qt --bfile /REF/hapgen/chr21 --seed 123 --causal-n 50 50 50 --trait1-sigsq 1 1 0 --trait2-sigsq 1 0 1 --num-components 3 --out partial --num-traits 2
plink2 --bfile /REF/hapgen/chr21 --glm --pheno shared.pheno --pheno-name trait1 --out shared.1
plink2 --bfile /REF/hapgen/chr21 --glm --pheno shared.pheno --pheno-name trait2 --out shared.2
plink2 --bfile /REF/hapgen/chr21 --glm --pheno unique.pheno --pheno-name trait1 --out unique.1
plink2 --bfile /REF/hapgen/chr21 --glm --pheno unique.pheno --pheno-name trait2 --out unique.2
plink2 --bfile /REF/hapgen/chr21 --glm --pheno partial.pheno --pheno-name trait1 --out partial.1
plink2 --bfile /REF/hapgen/chr21 --glm --pheno partial.pheno --pheno-name trait2 --out partial.2
- Convert the output of
plink2
to a format compatible withMiXeR
:
singularity exec --home $PWD:/home $SIF/python3.sif python
import pandas as pd
for fname, out in [('{x}.{y}.trait{y}.glm.linear'.format(x=x,y=y), '{}.{}.sumstats'.format(x,y)) for x in ['unique', 'shared', 'partial'] for y in ['1', '2']]:
pd.read_csv(fname, delim_whitespace=True)[['ID', 'REF', 'ALT', 'OBS_CT', 'T_STAT']].rename(columns={'ID':'SNP', 'REF':'A1', 'ALT':'A2', 'OBS_CT':'N', 'T_STAT':'Z'}).to_csv(out,index=False, sep='\t')
- Run univariate MiXeR analysis (each analysis should take up to 10 minutes on a standard laptop, in this demo example)
singularity shell --home $PWD:/home $SIF/python3.sif
export MIXER_COMMON_ARGS="--chr2use 21 --ld-file /REF/ldsc/1000G_EUR_Phase3_plink/[email protected] --bim-file /REF/ldsc/1000G_EUR_Phase3_plink/[email protected] --extract /REF/ldsc/1000G_EUR_Phase3_plink/1000G.EUR.QC.prune_maf0p05_rand2M_r2p8.rep1.snps"
python /tools/mixer/precimed/mixer.py fit1 $MIXER_COMMON_ARGS --trait1-file unique.1.sumstats --out unique.1
# now you have two options
# Option 1. Run the above command for the other 5 traits:
python /tools/mixer/precimed/mixer.py fit1 $MIXER_COMMON_ARGS --trait1-file unique.2.sumstats --out unique.2
python /tools/mixer/precimed/mixer.py fit1 $MIXER_COMMON_ARGS --trait1-file shared.1.sumstats --out shared.1
python /tools/mixer/precimed/mixer.py fit1 $MIXER_COMMON_ARGS --trait1-file shared.2.sumstats --out shared.2
python /tools/mixer/precimed/mixer.py fit1 $MIXER_COMMON_ARGS --trait1-file partial.1.sumstats --out partial.1
python /tools/mixer/precimed/mixer.py fit1 $MIXER_COMMON_ARGS --trait1-file partial.2.sumstats --out partial.2
# Option 2. Do a bit of cheating, and copy copy the .json files, enforcing exacly the same genetic architecture
# ("pi", "sig2beta" and "sig2zero" parameters) in all six traits. This shouldn't make much difference as long as
# univariate estimates are fairly accurate.
cp unique.1.json unique.2.json && cp unique.1.json shared.1.json && cp unique.1.json shared.2.json && cp unique.1.json partial.1.json && cp unique.1.json partial.2.json
# Now proceed with bivariate analyses
python /tools/mixer/precimed/mixer.py fit2 $MIXER_COMMON_ARGS --trait1-file unique.1.sumstats --trait2-file unique.2.sumstats --trait1-params unique.1.json --trait2-params unique.2.json --out unique
python /tools/mixer/precimed/mixer.py fit2 $MIXER_COMMON_ARGS --trait1-file shared.1.sumstats --trait2-file shared.2.sumstats --trait1-params shared.1.json --trait2-params shared.2.json --out shared
python /tools/mixer/precimed/mixer.py fit2 $MIXER_COMMON_ARGS --trait1-file partial.1.sumstats --trait2-file partial.2.sumstats --trait1-params partial.1.json --trait2-params partial.2.json --out partial
python /tools/mixer/precimed/mixer_figures.py two --json shared.json --out shared
python /tools/mixer/precimed/mixer_figures.py two --json unique.json --out unique
python /tools/mixer/precimed/mixer_figures.py two --json partial.json --out partial
-
The results are available in
shared.png
,unique.png
andpartial.png
figures. -
You could also use MIXER_SIMU.job script to run this on a cluster 20 times, each with a random subsets of SNPs as defined by
1000G.EUR.QC.prune_maf0p05_rand2M_r2p8.rep[1-20].snps
files, and use variation in parameter estimates amoung these runs this to infer uncertainty of parameter estimates. AdjustMIXER_SIMU.job
to match configuration of your cluster. Then trigger analysis by runningsbatch MIXER_SIMU.job
command. Once you get all results, combine them across 20 runs, and produce figures as follows:
singularity shell --home $PWD:/home $SIF/python3.sif
python /tools/mixer/precimed/mixer_figures.py combine --json [email protected] --out unique.fit
python /tools/mixer/precimed/mixer_figures.py combine --json [email protected] --out unique.test
python /tools/mixer/precimed/mixer_figures.py combine --json [email protected] --out shared.fit
python /tools/mixer/precimed/mixer_figures.py combine --json [email protected] --out shared.test
python /tools/mixer/precimed/mixer_figures.py combine --json [email protected] --out partial.fit
python /tools/mixer/precimed/mixer_figures.py combine --json [email protected] --out partial.test
python /tools/mixer/precimed/mixer_figures.py two --statistic mean std --json-fit unique.fit.json --json-test unique.test.json --out mixer_simu_unique
python /tools/mixer/precimed/mixer_figures.py two --statistic mean std --json-fit shared.fit.json --json-test shared.test.json --out mixer_simu_shared
python /tools/mixer/precimed/mixer_figures.py two --statistic mean std --json-fit partial.fit.json --json-test partial.test.json --out mixer_simu_partial
After processing the resulting figures shoud look like this:
To include density plots to the resulting figure, add --trait1-file
and --trait2-file
arguments like this:
python /tools/mixer/precimed/mixer_figures.py two --statistic mean std --json-fit unique.fit.json --json-test unique.test.json --out mixer_simu_unique --trait1-file unique.1.sumstats --trait2-file unique.2.sumstats