ChimeraMiner

A pipeline for detecting chimeric sequences in NGS sequencing data.

Reference Genome Preparation

On my server, the reference genome folder is:

/home/luna/Desktop/database/homo_bwa

On your own analysis platform, replace this path with your own directory.

This folder contains 26 FASTA files (chr{1..22, X, Y, MT}.fa and hsa.fa), each indexed with bwa index.

In addition, hsa.fa contains the sequences of all chromosomes (chr{1..22, X, Y, MT}).

To index each FASTA file:

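# Index each chromosome FASTA with bwa and samtools; output is appended to the log files
for chr in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 X Y MT
do
	bwa index -a bwtsw "chr${chr}.fa" &>> bwa-index.log
	samtools faidx "chr${chr}.fa" &>> samtool-index.log
done
# Index the combined genome FASTA as well
bwa index -a bwtsw hsa.fa &>> bwa-index.log
samtools faidx hsa.fa &>> samtool-index.log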

Please make sure your chromosomes are named "chr1 chr2 chr3 ... chrX chrY chrMT".
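The naming requirement can be checked before indexing. The following is a minimal sketch, not part of ChimeraMiner: the function name unexpected_chr_names is hypothetical, and it assumes the sequence name is the first word of each FASTA header line.

```shell
# List sequence names in a FASTA file that do NOT follow the
# chr1..chr22, chrX, chrY, chrMT convention.
# Prints nothing when every name is acceptable.
unexpected_chr_names() {
    # FASTA header lines start with '>'; the name is the first word after it.
    grep '^>' "$1" \
        | sed -e 's/^>//' -e 's/[[:space:]].*$//' \
        | grep -Ev '^chr([1-9]|1[0-9]|2[0-2]|X|Y|MT)$' || true
}
```

For example, running unexpected_chr_names hsa.fa on a genome that uses "chrM" or bare "1" names would print those names, which then need to be renamed before running the pipeline.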

Perl Modules Preparation

This pipeline requires the following Perl modules; install them before running it.

Getopt::Long
Cwd qw(abs_path)
File::Basename
File::Spec

I recommend using "cpanm" to install Perl modules.
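To see whether anything actually needs installing, each module can be loaded with perl -M. This is a small sketch; the function name check_perl_modules is hypothetical and not part of the pipeline. The four modules listed above ship with standard Perl distributions, so the check usually passes without installing anything.

```shell
# Report which of the required Perl modules can be loaded by the current perl.
# Returns 0 when all are available, 1 otherwise.
check_perl_modules() {
    local missing=0 module
    for module in Getopt::Long Cwd File::Basename File::Spec; do
        if perl -M"$module" -e 1 2>/dev/null; then
            echo "ok       $module"
        else
            echo "MISSING  $module"
            missing=1
        fi
    done
    return "$missing"
}
```

If a module is reported as missing, it can be installed by passing its name to cpanm, for example: cpanm File::Spec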

Test

The Test folder contains an example. It demonstrates that all the scripts run and shows how to use the pipeline. In this folder, run workstep.sh first; it generates the BAM file and the chimera files. When all the jobs in workstep.sh have finished, run filterstep.sh, which processes the chimera files and counts the chimeras.

First, align reads to reference

Use bwa mem to map the reads to hg19, then generate a list file containing "SampleID BAMFile" pairs, one sample per line:

# Note: $dir is not defined in this snippet; set it to your output directory before running
ref=/home/luna/Desktop/database/homo_bwa/hsa.fa
bwa mem -t 20 -k 30 -R '@RG\tID:Test\tLB:Test\tSM:Test\tPL:ILLUMINA' $ref test_R1.fq.gz test_R2.fq.gz | awk '$6 !~ /H/' | samtools view -Sb -t ${ref}.fai -o $dir/Test/Test.dh.bam -
echo -e "Test\t$dir/Test/Test.dh.bam" > bam.lst
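Because the list file drives the later steps, it may be worth checking it before continuing. The sketch below is an illustration only: validate_bam_list is a hypothetical helper, and it assumes the two columns are separated by a tab, as in the echo -e command above.

```shell
# Check that every line of a list file has exactly two tab-separated columns
# (SampleID, BAM path) and that each listed BAM file exists.
# Returns 0 when the list is valid, 1 otherwise; problems are reported on stderr.
validate_bam_list() {
    local list="$1" status=0 lineno=0 sample bam extra tab
    tab=$(printf '\t')
    while IFS="$tab" read -r sample bam extra; do
        lineno=$((lineno + 1))
        if [ -z "$sample" ] || [ -z "$bam" ] || [ -n "$extra" ]; then
            echo "line $lineno: expected 'SampleID<TAB>BAMFile'" >&2
            status=1
        elif [ ! -f "$bam" ]; then
            echo "line $lineno: BAM file not found: $bam" >&2
            status=1
        fi
    done < "$list"
    return "$status"
}
```

A typical call would be: validate_bam_list bam.lst && echo "bam.lst looks fine"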

Generate_Shell_Finder.pl

Uses bam.lst as the input file and generates a shell script for running ChimeraMiner.

perl Generate_Shell_Finder.pl -i bam.lst -o runFinder.Test.sh -L 20 -r $ref

runFinder.Test.sh

Run the shell script, which performs the following steps:

Insertion.SRExtract.ReConFastq.pl searches for insertion chimeras, extracts soft-clipped reads as candidate single-ended chimeric reads (direct and inverted), and reconstructs paired-end FASTQ files (two per chromosome) for the candidate chimeras.
The per-chromosome FASTQ files are then aligned to the corresponding chromosome reference.
SearchOverlapSEchimera.pl searches for the overlapping sequence between the two adjacent segments of each candidate single-ended chimeric read; this search is carried out independently for each chromosome.
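To illustrate what "soft-clipped reads" means in practice: in SAM format the CIGAR string is the sixth column, and a soft clip appears there as an "S" operation. The sketch below shows only that idea; soft_clipped_records is a hypothetical helper and is not how Insertion.SRExtract.ReConFastq.pl is implemented.

```shell
# Print the SAM alignment records whose CIGAR string (column 6) contains a
# soft clip ('S'). Reads SAM text on stdin; header lines ('@...') are skipped.
soft_clipped_records() {
    awk -F '\t' '!/^@/ && $6 ~ /S/'
}
```

For a quick look at how many candidate reads a BAM file holds, one could run: samtools view Test.dh.bam | soft_clipped_records | wc -l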

ChimerasDownstream.pl

This script performs the downstream analysis of the chimeras:

Merge the chimeras of each chromosome into a single file.
Transform the raw chimera format into a format that is easier to read.
Extract the direct chimeras and inverted chimeras into separate files.
Count the chimera types ('++', '--', '+-', '-+') and the numbers of direct and inverted chimeras.
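The counting step can be pictured with a few lines of awk. This is a sketch under assumptions: the real output format of ChimerasDownstream.pl is not described here, so the input layout (a tab-separated file with the strands of the two segments in two columns chosen by the caller) and the function name count_strand_pairs are hypothetical.

```shell
# Count the four strand combinations in a tab-separated file.
# $1: input file; $2 and $3: 1-based columns holding the strand ('+' or '-')
# of the first and second segment (hypothetical layout).
# Prints one line per type in the fixed order ++, --, +-, -+.
count_strand_pairs() {
    awk -F '\t' -v a="$2" -v b="$3" '
        { n[$a $b]++ }
        END {
            split("++ -- +- -+", types, " ")
            for (i = 1; i <= 4; i++) printf "%s\t%d\n", types[i], n[types[i]]
        }' "$1"
}
```

With a file whose strands sit in columns 2 and 3, the call would be: count_strand_pairs chimeras.tsv 2 3 (chimeras.tsv is a placeholder name).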

the usage of bwa

See the bwa documentation for details.

Contact:

We will be pleased to address any questions or concerns you may have about ChimeraMiner: [email protected]

Citing ChimeraMiner

If you use ChimeraMiner in your work, please cite:

Lu, N.; Li, J.; Bi, C.; Guo, J.; Tao, Y.; Luan, K.; Tu, J.; Lu, Z. ChimeraMiner: An Improved Chimeric Read Detection Pipeline and Its Application in Single Cell Sequencing. Int. J. Mol. Sci. 2019, 20, 1953.
