Parsing RNA-seq data with Unique Molecular Identifiers (UMIs)
-
Sequencing adapter trimming
-
Trimming the UMI barcode, the G that follows it, and the 3' poly(A) tail
UMI_trim.pl yoursample.fastq.gz yoursample
-
Read mapping to genome
-
Converting BAM to BED
bamToBed -bed12 -i yoursample.bam | awk -vOFS='\t' '{split($4,a,/=/); $4=a[2]; print $0}' | gzip > yoursample.bed.gz
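The awk step in the command above replaces column 4 (the read name) with the part that follows the first `=` in it. The sketch below runs the same awk program on a single made-up record to show the effect. It assumes the read names look like `readname=UMI` after trimming. That layout is inferred from the command and is not stated in this README. The record itself (`read1=ACGTAC`) is hypothetical, and only six columns are shown for brevity, whereas `-bed12` output has twelve.

```shell
# Made-up BED record; the name column holds "read1=ACGTAC" (assumed layout).
line=$(printf 'chr1\t100\t150\tread1=ACGTAC\t255\t+')

# Same awk program as in the command above: split the name on "=",
# keep the second piece, and print the record tab-separated.
out=$(printf '%s\n' "$line" | awk -vOFS='\t' '{split($4,a,/=/); $4=a[2]; print $0}')

printf '%s\n' "$out"
```

After this step the name column of the example record contains only `ACGTAC`. Note that a read name without `=` would end up with an empty name column, so it may be worth checking a few lines of the BED file before continuing.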
-
UMI collapse
UMI_collapse.pl yoursample.bed.gz yoursample.UMI_collapsed.bed.gz
-
UMI deduplication
UMI_dedup.pl yoursample.UMI_collapsed.bed.gz yoursample.UMI_dedup.bed.gz
Contact: xi.wang (at) dkfz-heidelberg.de