RSeQC module tin.py creates RefseqID instead of ENSEMBLIDs of transcripts #1442

AnissaE · 2024-10-28T08:28:08Z

Description of the bug

Hi all,
I ran the pipeline to analyze bulk RNA-seq data, including the TIN module from RSeQC, to remove samples with a low median TIN (thanks for integrating it into the pipeline, it is definitely helpful!). I ran nf-core/rnaseq (see the command line below) and, after 21 hours, got an .xls output file for each sample (see the attached .xls output for one sample). In the geneID column of the .xls file, however, there are RefSeq entries along with transcript names such as rna1, rnaX, etc. (see the output of head -20 1117877.markdup.sorted.tin.xls below). After some googling, I found out that the transcript names in the output are expected to be Ensembl IDs. Is there an additional argument to change the transcript names, or did a problem occur during processing? Below are the command I used and the output (.xls file). The pipeline finished successfully without any error; I labelled this issue as a bug by default.

Thanks for your help,
Anissa.
1117877.markdup.sorted.tin.xls
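
For context on the filtering goal described above, here is a minimal sketch of how a per-sample median TIN could be recomputed from the rows of a tin.xls file. The function name `median_tin` and the `min_tin` cutoff are hypothetical, and dropping rows with a TIN of 0.0 is an assumption made for illustration, not necessarily what tin.py does in its own summary. The sample rows are copied from the head output in this report.

```python
import statistics


def median_tin(lines, min_tin=0.0):
    """Return the median of the TIN column, or None if no row qualifies.

    Rows whose TIN is <= min_tin are skipped (an assumption for this sketch).
    Fields are split on any whitespace, so tabs and spaces both work.
    """
    values = []
    for line in lines:
        fields = line.split()
        # Skip blank or malformed lines and the header row.
        if len(fields) != 5 or fields[0] == "geneID":
            continue
        tin = float(fields[4])
        if tin > min_tin:
            values.append(tin)
    return statistics.median(values) if values else None


sample = """geneID chrom tx_start tx_end TIN
rna0 chr1 11873 14409 0.0
rna1 chr1 14361 29370 52.880286053751696
rna9 chr1 120711 133748 15.569611580163949
rna10 chr1 134772 140566 17.925981206462087
"""

print(median_tin(sample.splitlines()))
```

Passing a negative `min_tin` keeps the zero-TIN rows in the calculation, which makes it easy to compare both conventions on the same file.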

Command used and terminal output

~/nextflow run nf-core/rnaseq -profile singularity -r 3.6 --max_cpus ${THREADS} --max_memory ${MAX_MEM} --aligner star_salmon --input $SAMPLESHEET --outdir "${FILES}/results_TIN" --genome GRCh38 --gencode --gtf ${REF_DIR}/gencode.v39.annotation.gtf.gz -work-dir "${FILES}/results_TIN/work" --rseqc_modules 'bam_stat,inner_distance,infer_experiment,junction_annotation,junction_saturation,read_distribution,read_duplication,tin'

head -20 1117877.markdup.sorted.tin.xls
geneID chrom tx_start tx_end TIN
rna0 chr1 11873 14409 0.0
rna1 chr1 14361 29370 52.880286053751696
rna3 chr1 17368 17391 0.0
rna2 chr1 17368 17436 0.0
rna4 chr1 17408 17431 0.0
rna5 chr1 30365 30503 0.0
rna6 chr1 30437 30458 0.0
rna7 chr1 34610 36081 0.0
NM_001005484.1 chr1 69090 70008 0.0
rna9 chr1 120711 133748 15.569611580163949
rna10 chr1 134772 140566 17.925981206462087
rna13 chr1 142436 146418 28.571428571428534
rna11 chr1 142436 174392 18.35966859786253
rna12 chr1 142436 174392 20.947542906746364
rna14 chr1 142436 143602 0.0
rna15 chr1 146469 174392 22.22074678696301
rna16 chr1 146469 174392 23.745214873731445
rna17 chr1 149039 174392 18.276476252845494
rna18 chr1 153506 174392 18.220218733785686
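
To quantify how the identifiers in the geneID column are distributed, a small sketch like the following can tally them by naming style. The category names and regular expressions are assumptions chosen for illustration (an `ENST` prefix for Ensembl transcripts, `NM_`/`NR_`/`XM_`/`XR_` prefixes for RefSeq transcripts, and the `rnaN` style seen above); they are not part of RSeQC or nf-core/rnaseq.

```python
import re
from collections import Counter

# Hypothetical patterns for the three identifier styles discussed in this report.
PATTERNS = [
    ("ensembl", re.compile(r"^ENST\d+(\.\d+)?$")),
    ("refseq", re.compile(r"^[NX][MR]_\d+(\.\d+)?$")),
    ("rna_numbered", re.compile(r"^rna\d+$")),
]


def classify(identifier):
    """Return the name of the first matching style, or 'other'."""
    for name, pattern in PATTERNS:
        if pattern.match(identifier):
            return name
    return "other"


def count_id_styles(lines):
    """Count identifier styles in the first column, skipping the header."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if not fields or fields[0] == "geneID":
            continue
        counts[classify(fields[0])] += 1
    return counts


sample = """geneID chrom tx_start tx_end TIN
rna0 chr1 11873 14409 0.0
rna1 chr1 14361 29370 52.880286053751696
rna3 chr1 17368 17391 0.0
NM_001005484.1 chr1 69090 70008 0.0
"""

print(count_id_styles(sample.splitlines()))
```

Running the same tally over the first column of the annotation file actually used for the TIN step would show whether the two sets of identifiers follow the same style.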

Relevant files

No response

System information

Nextflow version 23.10.1 build 5891
System: Linux 4.18.0-553.22.1.el8_10.x86_64
Container engine: singularity
nf-core/rnaseq version 3.6

AnissaE added the "bug" (Something isn't working) label on Oct 28, 2024
