GCJMackenzie edited this page Dec 8, 2021 · 13 revisions

Test data replacement/addition chart

Germline:

| New file name | Source file name | Replacing file | Description | Repo location | Source file origin |
| --- | --- | --- | --- | --- | --- |
| test2.recal | test.recal | New File | Recalibration table output from VariantRecalibrator, used by ApplyVQSR | data/genomics/homo_sapiens/illumina/gatk/variantrecalibrator/ | https://github.com/GCJMackenzie/test_data/tree/master/vrecals_base |
| test2.recal.idx | test.recal.idx | New File | Recalibration table index output from VariantRecalibrator, used by ApplyVQSR | data/genomics/homo_sapiens/illumina/gatk/variantrecalibrator/ | https://github.com/GCJMackenzie/test_data/tree/master/vrecals_base |
| test2.tranches | test.tranches | New File | Recalibration table tranches output from VariantRecalibrator, used by ApplyVQSR | data/genomics/homo_sapiens/illumina/gatk/variantrecalibrator/ | https://github.com/GCJMackenzie/test_data/tree/master/vrecals_base |
| test2_allele_specific.recal | test_allele_specific.recal | New File | Allele specific Recalibration table output from VariantRecalibrator, used by ApplyVQSR | data/genomics/homo_sapiens/illumina/gatk/variantrecalibrator/ | https://github.com/GCJMackenzie/test_data/blob/master/vrecals_as/ |
| test2_allele_specific.recal.idx | test_allele_specific.recal.idx | New File | Allele specific Recalibration table index output from VariantRecalibrator, used by ApplyVQSR | data/genomics/homo_sapiens/illumina/gatk/variantrecalibrator/ | https://github.com/GCJMackenzie/test_data/blob/master/vrecals_as/ |
| test2_allele_specific.tranches | test_allele_specific.tranches | New File | Allele specific Recalibration table tranches output from VariantRecalibrator, used by ApplyVQSR | data/genomics/homo_sapiens/illumina/gatk/variantrecalibrator/ | https://github.com/GCJMackenzie/test_data/blob/master/vrecals_as/ |
| test2_germline_1.fq.gz | normal_0.000+disease_1.000_1.fq.gz | New File | Synthetic raw reads file used to generate disease test data for HaplotypeCaller | data/genomics/homo_sapiens/illumina/fastq/ | https://github.com/lescai-teaching/datasets_class/tree/master/germline_calling/reads/ |
| test2_germline_2.fq.gz | normal_0.000+disease_1.000_2.fq.gz | New File | Synthetic raw reads file used to generate disease test data for HaplotypeCaller | data/genomics/homo_sapiens/illumina/fastq/ | https://github.com/lescai-teaching/datasets_class/tree/master/germline_calling/reads/ |
| test_germline_1.fq.gz | normal_1.000+disease_0.000_1.fq.gz | New File | Synthetic raw reads file used to generate normal test data for HaplotypeCaller | data/genomics/homo_sapiens/illumina/fastq/ | https://github.com/lescai-teaching/datasets_class/tree/master/germline_calling/reads/ |
| test_germline_2.fq.gz | normal_1.000+disease_0.000_2.fq.gz | New File | Synthetic raw reads file used to generate normal test data for HaplotypeCaller | data/genomics/homo_sapiens/illumina/fastq/ | https://github.com/lescai-teaching/datasets_class/tree/master/germline_calling/reads/ |
| test2_haplotc.vcf.gz | HaplotypeCaller_disease_103.vcf.gz | New File | vcf output from HaplotypeCaller using germline disease reads | data/genomics/homo_sapiens/illumina/gatk/haplotypecaller_calls/ | https://github.com/lescai-teaching/datasets_class/tree/master/germline_calling/variants |
| test2_haplotc.vcf.gz.tbi | HaplotypeCaller_disease_103.vcf.gz.tbi | New File | vcf.tbi output from HaplotypeCaller using germline disease reads | data/genomics/homo_sapiens/illumina/gatk/haplotypecaller_calls/ | https://github.com/lescai-teaching/datasets_class/tree/master/germline_calling/variants |
| test_haplotc.vcf.gz | HaplotypeCaller_normal.vcf.gz | New File | vcf output from HaplotypeCaller using germline normal reads | data/genomics/homo_sapiens/illumina/gatk/haplotypecaller_calls/ | https://github.com/lescai-teaching/datasets_class/tree/master/germline_calling/variants |
| test_haplotc.vcf.gz.tbi | HaplotypeCaller_normal.vcf.gz.tbi | New File | vcf.tbi output from HaplotypeCaller using germline normal reads | data/genomics/homo_sapiens/illumina/gatk/haplotypecaller_calls/ | https://github.com/lescai-teaching/datasets_class/tree/master/germline_calling/variants |
| test2_haplotc.ann.vcf.gz | HaplotypeCaller_disease_103_snpEff.ann.vcf.gz | New File | vcf output from HaplotypeCaller using germline disease reads annotated using snpEff | data/genomics/homo_sapiens/illumina/gatk/haplotypecaller_calls/ | https://github.com/lescai-teaching/datasets_class/tree/master/germline_calling/variants |
| test2_haplotc.ann.vcf.gz.tbi | HaplotypeCaller_disease_103_snpEff.ann.vcf.gz.tbi | New File | vcf.tbi output from HaplotypeCaller using germline disease reads annotated using snpEff | data/genomics/homo_sapiens/illumina/gatk/haplotypecaller_calls/ | https://github.com/lescai-teaching/datasets_class/tree/master/germline_calling/variants |
| test_haplotc.ann.vcf.gz | HaplotypeCaller_normal_snpEff.ann.vcf.gz | New File | vcf output from HaplotypeCaller using germline normal reads annotated using snpEff | data/genomics/homo_sapiens/illumina/gatk/haplotypecaller_calls/ | https://github.com/lescai-teaching/datasets_class/tree/master/germline_calling/variants |
| test_haplotc.ann.vcf.gz.tbi | HaplotypeCaller_normal_snpEff.ann.vcf.gz.tbi | New File | vcf.tbi output from HaplotypeCaller using germline normal reads annotated using snpEff | data/genomics/homo_sapiens/illumina/gatk/haplotypecaller_calls/ | https://github.com/lescai-teaching/datasets_class/tree/master/germline_calling/variants |
| test.g.vcf.gz | test.g.vcf.gz | New File | output from haplotypecaller run in GVCF mode using germline normal sample | data/genomics/homo_sapiens/illumina/gatk/haplotypecaller_calls/ | Local run of gatk_joint_germline_variant_calling subworkflow |
| test.g.vcf.gz.tbi | test.g.vcf.gz.tbi | New File | output index from haplotypecaller run in GVCF mode using germline normal sample | data/genomics/homo_sapiens/illumina/gatk/haplotypecaller_calls/ | Local run of gatk_joint_germline_variant_calling subworkflow |
| test2.g.vcf.gz | test2.g.vcf.gz | New File | output from haplotypecaller run in GVCF mode using germline disease sample | data/genomics/homo_sapiens/illumina/gatk/haplotypecaller_calls/ | Local run of gatk_joint_germline_variant_calling subworkflow |
| test2.g.vcf.gz.tbi | test2.g.vcf.gz.tbi | New File | output index from haplotypecaller run in GVCF mode using germline disease sample | data/genomics/homo_sapiens/illumina/gatk/haplotypecaller_calls/ | Local run of gatk_joint_germline_variant_calling subworkflow |

Somatic:

Cram file equivalents were also added for each of the following bam files.

| New file name | Source file name | Replacing file | Description | Repo location | Source file origin |
| --- | --- | --- | --- | --- | --- |
| test2.paired_end.recalibrated.sorted.bam | tumour.recal.bam | test2.paired_end.recalibrated.sorted.bam | recalibrated bam file of tumor reads | data/genomics/homo_sapiens/illumina/bam/ | https://github.com/GCJMackenzie/datasets_class/tree/master/somatic_calling/alignments |
| test.paired_end.recalibrated.sorted.bam | normal.recal.bam | test.paired_end.recalibrated.sorted.bam | recalibrated bam file of normal reads | data/genomics/homo_sapiens/illumina/bam/ | https://github.com/GCJMackenzie/datasets_class/tree/master/somatic_calling/alignments |
| test2.paired_end.recalibrated.sorted.bam.bai | tumour.recal.bam.bai | test2.paired_end.recalibrated.sorted.bam.bai | recalibrated bam index of tumor reads | data/genomics/homo_sapiens/illumina/bam/ | https://github.com/GCJMackenzie/datasets_class/tree/master/somatic_calling/alignments |
| test.paired_end.recalibrated.sorted.bam.bai | normal.recal.bam.bai | test.paired_end.recalibrated.sorted.bam.bai | recalibrated bam index of normal reads | data/genomics/homo_sapiens/illumina/bam/ | https://github.com/GCJMackenzie/datasets_class/tree/master/somatic_calling/alignments |
| test2.paired_end.markduplicates.sorted.bam | tumour.md.bam | test2.paired_end.markduplicates.sorted.bam | bam file of tumor reads with duplicates marked | data/genomics/homo_sapiens/illumina/bam/ | https://github.com/GCJMackenzie/datasets_class/tree/master/somatic_calling/alignments |
| test.paired_end.markduplicates.sorted.bam | normal.md.bam | test.paired_end.markduplicates.sorted.bam | bam file of normal reads with duplicates marked | data/genomics/homo_sapiens/illumina/bam/ | https://github.com/GCJMackenzie/datasets_class/tree/master/somatic_calling/alignments |
| test2.paired_end.markduplicates.sorted.bam.bai | tumour.md.bam.bai | test2.paired_end.markduplicates.sorted.bam.bai | bam index of tumor reads with duplicates marked | data/genomics/homo_sapiens/illumina/bam/ | https://github.com/GCJMackenzie/datasets_class/tree/master/somatic_calling/alignments |
| test.paired_end.markduplicates.sorted.bam.bai | normal.md.bam.bai | test.paired_end.markduplicates.sorted.bam.bai | bam index of normal reads with duplicates marked | data/genomics/homo_sapiens/illumina/bam/ | https://github.com/GCJMackenzie/datasets_class/tree/master/somatic_calling/alignments |
| test_1.fastq.gz | normal_1.000+disease_0.000_1.fq.gz | test_1.fastq.gz | Synthetic raw reads file used to generate normal test data for mutect2 | data/genomics/homo_sapiens/illumina/fastq/ | https://github.com/GCJMackenzie/datasets_class/tree/master/somatic_calling/reads |
| test_2.fastq.gz | normal_1.000+disease_0.000_2.fq.gz | test_2.fastq.gz | Synthetic raw reads file used to generate normal test data for mutect2 | data/genomics/homo_sapiens/illumina/fastq/ | https://github.com/GCJMackenzie/datasets_class/tree/master/somatic_calling/reads |
| test2_1.fastq.gz | normal_0.700+disease_0.300_1.fq.gz | test2_1.fastq.gz | Synthetic raw reads file used to generate tumor test data for mutect2 | data/genomics/homo_sapiens/illumina/fastq/ | https://github.com/GCJMackenzie/datasets_class/tree/master/somatic_calling/reads |
| test2_2.fastq.gz | normal_0.700+disease_0.300_2.fq.gz | test2_2.fastq.gz | Synthetic raw reads file used to generate tumor test data for mutect2 | data/genomics/homo_sapiens/illumina/fastq/ | https://github.com/GCJMackenzie/datasets_class/tree/master/somatic_calling/reads |
| test_pon.vcf.gz | test1.vcf.gz | New File | variant calls of normal sample run through mutect2 in panel of normals mode | data/genomics/homo_sapiens/illumina/gatk/pon_mutect2_calls/ | Local run of gatk_create_som_pon subworkflow |
| test_pon.vcf.gz.tbi | test1.vcf.gz.tbi | New File | variant calls tbi of normal sample run through mutect2 in panel of normals mode | data/genomics/homo_sapiens/illumina/gatk/pon_mutect2_calls/ | Local run of gatk_create_som_pon subworkflow |
| test_pon.vcf.gz.stats | test1.vcf.gz.stats | New File | variant calls stats file of normal sample run through mutect2 in panel of normals mode | data/genomics/homo_sapiens/illumina/gatk/pon_mutect2_calls/ | Local run of gatk_create_som_pon subworkflow |
| test2_pon.vcf.gz | test2.vcf.gz | New File | variant calls of tumor sample run through mutect2 in panel of normals mode | data/genomics/homo_sapiens/illumina/gatk/pon_mutect2_calls/ | Local run of gatk_create_som_pon subworkflow |
| test2_pon.vcf.gz.tbi | test2.vcf.gz.tbi | New File | variant calls index of tumor sample run through mutect2 in panel of normals mode | data/genomics/homo_sapiens/illumina/gatk/pon_mutect2_calls/ | Local run of gatk_create_som_pon subworkflow |
| test2_pon.vcf.gz.stats | test2.vcf.gz.stats | New File | variant calls stats file of tumor sample run through mutect2 in panel of normals mode | data/genomics/homo_sapiens/illumina/gatk/pon_mutect2_calls/ | Local run of gatk_create_som_pon subworkflow |
| test_pon_genomicsdb.tar.gz | test_panel/ | test_genomicsdb.tar.gz | genomicsdb workspace made from test_pon and test2_pon vcf files, used to test createsomaticpanelofnormals and genotypegvcfs; this is only useful for tests and is not valid as a true panel of normals because it contains a tumor sample | data/genomics/homo_sapiens/illumina/gatk/ | Local run of gatk_create_som_pon subworkflow |
| test.pileups.table | test_normal.pileups.table | test.pileups.table | pileups table for normal sample | data/genomics/homo_sapiens/illumina/gatk/ | Local run of gatk_tumor_normal_somatic_variant_calling subworkflow |
| test2.pileups.table | test_tumor.pileups.table | test2.pileups.table | pileups table for tumor sample | data/genomics/homo_sapiens/illumina/gatk/ | Local run of gatk_tumor_normal_somatic_variant_calling subworkflow |
| test_test2_paired_mutect2_calls.artifact-prior.tar.gz | test.tar.gz | test_test2_paired_mutect2_calls.artifact-prior.tar.gz | artifacts prior from learnreadorientationmodel being run on paired tumor normal f1r2 | data/genomics/homo_sapiens/illumina/gatk/ | Local run of gatk_tumor_normal_somatic_variant_calling subworkflow |
| test_test2_paired.contamination.table | test.contamination.table | test_test2_paired.contamination.table | contamination table of paired tumor normal data | data/genomics/homo_sapiens/illumina/gatk/ | Local run of gatk_tumor_normal_somatic_variant_calling subworkflow |
| test_test2_paired.segmentation.table | test.segmentation.table | test_test2_paired.segmentation.table | segmentation data of paired tumor normal data | data/genomics/homo_sapiens/illumina/gatk/ | Local run of gatk_tumor_normal_somatic_variant_calling subworkflow |
| test_test2_paired_mutect2_calls.vcf.gz | test.vcf.gz | test_test2_paired_mutect2_calls.vcf.gz | variant calls for tumor normal paired samples output from mutect2 | data/genomics/homo_sapiens/illumina/gatk/paired_mutect2_calls/ | Local run of gatk_tumor_normal_somatic_variant_calling subworkflow |
| test_test2_paired_mutect2_calls.vcf.gz.tbi | test.vcf.gz.tbi | test_test2_paired_mutect2_calls.vcf.gz.tbi | variant calls tbi for tumor normal paired samples output from mutect2 | data/genomics/homo_sapiens/illumina/gatk/paired_mutect2_calls/ | Local run of gatk_tumor_normal_somatic_variant_calling subworkflow |
| test_test2_paired_mutect2_calls.vcf.gz.stats | test.vcf.gz.stats | test_test2_paired_mutect2_calls.vcf.gz.stats | variant calls stats file for tumor normal paired samples output from mutect2 | data/genomics/homo_sapiens/illumina/gatk/paired_mutect2_calls/ | Local run of gatk_tumor_normal_somatic_variant_calling subworkflow |
| test_test2_paired_mutect2_calls.f1r2.tar.gz | test.f1r2.tar.gz | test_test2_paired_mutect2_calls.f1r2.tar.gz | | data/genomics/homo_sapiens/illumina/gatk/paired_mutect2_calls/ | Local run of gatk_tumor_normal_somatic_variant_calling subworkflow |
| test_test2_paired_filtered_mutect2_calls.vcf.gz | test_filtered.vcf.gz | New File | tumor normal vcf file after being passed through filtermutectcalls | data/genomics/homo_sapiens/illumina/gatk/paired_mutect2_calls/ | Local run of gatk_tumor_normal_somatic_variant_calling subworkflow |
| test_test2_paired_filtered_mutect2_calls.vcf.gz.tbi | test_filtered.vcf.gz.tbi | New File | tbi file for the tumor normal vcf file after being passed through filtermutectcalls | data/genomics/homo_sapiens/illumina/gatk/paired_mutect2_calls/ | Local run of gatk_tumor_normal_somatic_variant_calling subworkflow |
| test_test2_paired_filtered_mutect2_calls.vcf.gz.filteringStats.tsv | test_filtered.vcf.gz.filteringStats.tsv | New File | filtering stats file for the tumor normal vcf file after being passed through filtermutectcalls | data/genomics/homo_sapiens/illumina/gatk/paired_mutect2_calls/ | Local run of gatk_tumor_normal_somatic_variant_calling subworkflow |
| mitochon_standin.recalibrated.sorted.bam | test.paired_end.recalibrated.sorted.bam | New File | since we have no mitochondria test data, I have kept this old bam file as a stand in | data/genomics/homo_sapiens/illumina/bam/ | Old test-datasets recalibrated bam |
| mitochon_standin.recalibrated.sorted.bam.bai | test.paired_end.recalibrated.sorted.bam.bai | New File | since we have no mitochondria test data, I have kept this old bam file index as a stand in | data/genomics/homo_sapiens/illumina/bam/ | Old test-datasets recalibrated bam |

Reference:

| New file name | Source file name | Replacing file | Description | Repo location | Source file origin |
| --- | --- | --- | --- | --- | --- |
| genome.fasta | Homo_sapiens_assembly38_chr21.fasta | New File | fasta file containing sequence data from chr21 of hg assembly 38 | data/genomics/homo_sapiens/genome/chr21/sequence/ | https://github.com/lescai-teaching/datasets_class/tree/master/reference/sequence |
| genome.fasta.fai | Homo_sapiens_assembly38_chr21.fasta.fai | New File | index/dictionary file | data/genomics/homo_sapiens/genome/chr21/sequence/ | https://github.com/lescai-teaching/datasets_class/tree/master/reference/sequence |
| genome.dict | Homo_sapiens_assembly38_chr21.dict | New File | index/dictionary file | data/genomics/homo_sapiens/genome/chr21/sequence/ | https://github.com/lescai-teaching/datasets_class/tree/master/reference/sequence |
| genome.fasta.pac | Homo_sapiens_assembly38_chr21.fasta.pac | New File | index/dictionary file | data/genomics/homo_sapiens/genome/chr21/sequence/ | https://github.com/lescai-teaching/datasets_class/tree/master/reference/sequence |
| genome.fasta.bwt | Homo_sapiens_assembly38_chr21.fasta.bwt | New File | index/dictionary file | data/genomics/homo_sapiens/genome/chr21/sequence/ | https://github.com/lescai-teaching/datasets_class/tree/master/reference/sequence |
| genome.fasta.amb | Homo_sapiens_assembly38_chr21.fasta.amb | New File | index/dictionary file | data/genomics/homo_sapiens/genome/chr21/sequence/ | https://github.com/lescai-teaching/datasets_class/tree/master/reference/sequence |
| genome.fasta.ann | Homo_sapiens_assembly38_chr21.fasta.ann | New File | index/dictionary file | data/genomics/homo_sapiens/genome/chr21/sequence/ | https://github.com/lescai-teaching/datasets_class/tree/master/reference/sequence |
| genome.fasta.sa | Homo_sapiens_assembly38_chr21.fasta.sa | New File | index/dictionary file | data/genomics/homo_sapiens/genome/chr21/sequence/ | https://github.com/lescai-teaching/datasets_class/tree/master/reference/sequence |
| genome.1.bt2 | Homo_sapiens_assembly38_chr21.1.bt2 | New File | index/dictionary file | data/genomics/homo_sapiens/genome/chr21/sequence/ | https://github.com/lescai-teaching/datasets_class/tree/master/reference/sequence |
| genome.2.bt2 | Homo_sapiens_assembly38_chr21.2.bt2 | New File | index/dictionary file | data/genomics/homo_sapiens/genome/chr21/sequence/ | https://github.com/lescai-teaching/datasets_class/tree/master/reference/sequence |
| genome.3.bt2 | Homo_sapiens_assembly38_chr21.3.bt2 | New File | index/dictionary file | data/genomics/homo_sapiens/genome/chr21/sequence/ | https://github.com/lescai-teaching/datasets_class/tree/master/reference/sequence |
| genome.4.bt2 | Homo_sapiens_assembly38_chr21.4.bt2 | New File | index/dictionary file | data/genomics/homo_sapiens/genome/chr21/sequence/ | https://github.com/lescai-teaching/datasets_class/tree/master/reference/sequence |
| genome.rev.1.bt2 | Homo_sapiens_assembly38_chr21.rev.1.bt2 | New File | index/dictionary file | data/genomics/homo_sapiens/genome/chr21/sequence/ | https://github.com/lescai-teaching/datasets_class/tree/master/reference/sequence |
| genome.rev.2.bt2 | Homo_sapiens_assembly38_chr21.rev.2.bt2 | New File | index/dictionary file | data/genomics/homo_sapiens/genome/chr21/sequence/ | https://github.com/lescai-teaching/datasets_class/tree/master/reference/sequence |
| genome.interval_list | hg38_chr21.interval_list | New File | intervals list file containing 1 interval for all of chr21 | data/genomics/homo_sapiens/genome/chr21/sequence/ | written manually |
| germlineresources/ | gatkbundle/ | New Files | subdirectory of 9 germline resources for use with gatk tools (and anything else that uses germline resources in vcf format) | data/genomics/homo_sapiens/genome/chr21/gatkbundle/ | https://github.com/lescai-teaching/datasets_class/tree/master/reference/gatkbundle |

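
Each row of the chart pairs a source file name with a new file name and a repo location. As a minimal sketch of how such a mapping could be turned into a copy plan, the Python below builds (source, destination) path pairs for two rows taken from the chart. The `Row` type, the `plan_copies` helper, and the `downloads` and `test-datasets` directory names are hypothetical and only serve the illustration; nothing here describes how the files were actually moved.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class Row:
    """One chart row: new file name, source file name, repo location."""
    new_name: str
    source_name: str
    repo_location: str


# Two rows copied from the chart; other rows would follow the same pattern.
ROWS = [
    Row("test2.recal", "test.recal",
        "data/genomics/homo_sapiens/illumina/gatk/variantrecalibrator/"),
    Row("genome.fasta", "Homo_sapiens_assembly38_chr21.fasta",
        "data/genomics/homo_sapiens/genome/chr21/sequence/"),
]


def plan_copies(rows, source_dir, repo_root):
    """Return (source, destination) path pairs without touching the disk.

    source_dir and repo_root are assumed local directories (hypothetical).
    """
    plan = []
    for row in rows:
        src = PurePosixPath(source_dir) / row.source_name
        dst = PurePosixPath(repo_root) / row.repo_location / row.new_name
        plan.append((str(src), str(dst)))
    return plan


if __name__ == "__main__":
    for src, dst in plan_copies(ROWS, "downloads", "test-datasets"):
        print(f"{src} -> {dst}")
```

Keeping the plan as plain path pairs means it can be reviewed before anything is copied or renamed.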