Error message - BAM file shares no contigs with GTF - Knight eQTL rnaseq call #571

Open
Chunmingl opened this issue Apr 20, 2023 · 13 comments

Comments

@Chunmingl
Contributor

In the rnaseqc_call step, 8 BAM files failed with the error message "BAM file shares no contigs with GTF",
which prevents these 8 samples from moving on to the next step.

Here is the error message for one of the BAM files:
[tb755bc6d14ec7dca]: Executing script in Singularity returns an error (exitcode=11, stderr=/mnt/vast/hpc/csg/cl4215/ROSMAP/knight/output/rnaseq/PA00003164.rnaseqc.gene_tpm.gct.stderr). The script has been saved to /home/cl4215/.sos/f09861f58d65b381/singularity_run_20397.sh. To reproduce the error please run: singularity exec /mnt/vast/hpc/csg/snuc_pseudo_bulk/eight_celltypes_analysis/SuSiE/containers/rna_quantification.sif /bin/bash /home/cl4215/.sos/f09861f58d65b381/singularity_run_20397.sh

Below is the submitted command:

nohup sos run ~/githubrepo/xqtl-pipeline/code/molecular_phenotypes/calling/RNA_calling.ipynb rnaseqc_call \
    --cwd /mnt/vast/hpc/csg/cl4215/ROSMAP/knight/output/rnaseq \
    --samples /mnt/vast/hpc/csg/cl4215/ROSMAP/knight/xqtl_protocol_data.fastqlist \
    --data-dir /mnt/vast/hpc/csg/cl4215/ROSMAP/knight \
    --gtf /mnt/vast/hpc/csg/cl4215/mrmash/reference_data/Homo_sapiens.GRCh38.103.chr.reformatted.collapse_only.gene.gtf \
    --container /mnt/vast/hpc/csg/snuc_pseudo_bulk/eight_celltypes_analysis/SuSiE/containers/rna_quantification.sif  \
    --reference-fasta /mnt/vast/hpc/csg/cl4215/mrmash/reference_data/GRCh38_full_analysis_set_plus_decoy_hla.noALT_noHLA_noDecoy.fasta \
    --bam_list /mnt/vast/hpc/csg/cl4215/ROSMAP/knight/output/rnaseq/xqtl_protocol_data_bam_list \
    -c /mnt/vast/hpc/csg/molecular_phenotype_calling/csg.yml -j 10 -q csg2
@gaow
Contributor

gaow commented Apr 21, 2023

@Chunmingl I see some tips online that might help; did you try those? One other obvious thing to check is whether your BAM files are intact, that is, whether the BAM files in question are much smaller than the ones that work.

@Chunmingl
Contributor Author

The BAM files were intact, and the QC files looked OK. However, many more samples returned the same error after rerunning from the rnaseqc_call step. @hsun3163

tail /mnt/vast/hpc/csg/cl4215/ROSMAP/knight/output/rnaseq/test/PA00000961.rnaseqc.gene_tpm.gct.stderr

BAM file shares no contigs with GTF
BAM file shares no contigs with GTF
BAM file shares no contigs with GTF
BAM file shares no contigs with GTF
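
For context, this message generally means that none of the contig names in the BAM header (the `@SQ` lines) appear in the first column of the GTF, for example `chr1` versus `1`, or that the BAM has no usable header at all. Below is a minimal sketch of that comparison. The helper names are hypothetical, the header text is assumed to come from something like `samtools view -H`, and the toy inputs are made up; it is a quick diagnostic, not RNA-SeQC's own implementation.

```python
def bam_header_contigs(header_text):
    """Collect contig names from the @SQ lines of a SAM/BAM header dump."""
    contigs = set()
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        for field in line.split("\t")[1:]:
            if field.startswith("SN:"):
                contigs.add(field[3:])
    return contigs


def gtf_contigs(gtf_lines):
    """Collect contig names from column 1 of GTF records, skipping comments."""
    contigs = set()
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue
        contigs.add(line.split("\t", 1)[0])
    return contigs


def shared_contigs(header_text, gtf_lines):
    """Contig names present in both the BAM header and the GTF."""
    return bam_header_contigs(header_text) & gtf_contigs(gtf_lines)


# Toy inputs for illustration only (not taken from the real files).
header = (
    "@HD\tVN:1.4\tSO:coordinate\n"
    "@SQ\tSN:chr1\tLN:248956422\n"
    "@SQ\tSN:chr2\tLN:242193529\n"
)
gtf = ['chr1\tensembl\tgene\t11869\t14409\t.\t+\t.\tgene_id "ENSG00000223972";\n']

print(sorted(shared_contigs(header, gtf)))  # contigs found in both inputs
print(sorted(shared_contigs("", gtf)))      # no header at all, so nothing is shared
```

An empty result for a sample whose neighbours work fine points at the BAM file itself rather than at the GTF.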

@hsun3163
Collaborator

As it turns out, the coordinate-sorted BAM file is empty (0 bytes in the listing below). Can you point me to the analysis notebook that documents all your analysis, as we discussed?

hs3163@csglogin:/mnt/vast/hpc/csg/cl4215/ROSMAP/knight/output/rnaseq/test$ ls -lah  /mnt/vast/hpc/csg/cl4215/ROSMAP/knight/output/rnaseq/test/PA00000961*
-rw-r--r-- 1 cl4215 hgrcgrid_statgen    0 May 23 15:12 /mnt/vast/hpc/csg/cl4215/ROSMAP/knight/output/rnaseq/test/PA00000961.Aligned.sortedByCoord.out.bam
-rw-r--r-- 1 cl4215 hgrcgrid_statgen 3.6G Jun  3 18:38 /mnt/vast/hpc/csg/cl4215/ROSMAP/knight/output/rnaseq/test/PA00000961.Aligned.toTranscriptome.out.bam
-rw-r--r-- 1 cl4215 hgrcgrid_statgen  144 May 28 10:27 /mnt/vast/hpc/csg/cl4215/ROSMAP/knight/output/rnaseq/test/PA00000961.rnaseqc.gene_tpm.gct.stderr
-rw-r--r-- 1 cl4215 hgrcgrid_statgen    0 May 24 13:18 /mnt/vast/hpc/csg/cl4215/ROSMAP/knight/output/rnaseq/test/PA00000961.rnaseqc.gene_tpm.gct.stdout
-rw-r--r-- 1 cl4215 hgrcgrid_statgen 6.3M Jun  4 02:13 /mnt/vast/hpc/csg/cl4215/ROSMAP/knight/output/rnaseq/test/PA00000961.rsem.genes.results
-rw-r--r-- 1 cl4215 hgrcgrid_statgen  15M Jun  4 02:13 /mnt/vast/hpc/csg/cl4215/ROSMAP/knight/output/rnaseq/test/PA00000961.rsem.isoforms.results
-rw-r--r-- 1 cl4215 hgrcgrid_statgen   73 Jun  4 01:58 /mnt/vast/hpc/csg/cl4215/ROSMAP/knight/output/rnaseq/test/PA00000961.rsem.isoforms.stderr
-rw-r--r-- 1 cl4215 hgrcgrid_statgen 506K Jun  4 02:13 /mnt/vast/hpc/csg/cl4215/ROSMAP/knight/output/rnaseq/test/PA00000961.rsem.isoforms.stdout
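
A quick way to catch this for every sample at once is to scan for missing or zero-byte BAM files before running rnaseqc_call. A minimal sketch follows; `find_bad_bams` is a hypothetical helper, and the demo runs on throwaway files with made-up names rather than on the real output directory.

```python
import tempfile
from pathlib import Path


def find_bad_bams(paths):
    """Return (path, reason) pairs for BAM files that are missing or empty."""
    bad = []
    for p in map(Path, paths):
        if not p.is_file():
            bad.append((str(p), "missing"))
        elif p.stat().st_size == 0:
            bad.append((str(p), "empty"))
    return bad


# Self-contained demo on temporary files (names are illustrative only).
with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "A.Aligned.sortedByCoord.out.bam"
    empty = Path(tmp) / "B.Aligned.sortedByCoord.out.bam"
    good.write_bytes(b"placeholder bytes, not a real BAM")
    empty.touch()
    for path, reason in find_bad_bams([good, empty, Path(tmp) / "C.bam"]):
        print(reason, Path(path).name)
```

On the real data, the input list could come from a glob such as `*.Aligned.sortedByCoord.out.bam` in the output directory, assuming the file naming shown in the listing above.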

@Chunmingl
Contributor Author

I submitted the jobs through bash scripts. Here are some of the scripts I have previously run:

/home/cl4215/githubrepo/mrmash_ROSMAP/Knight/knight_rnaseq3.sh
/home/cl4215/githubrepo/mrmash_ROSMAP/Knight/knight_rnaseq4.sh

@hsun3163
Collaborator

hsun3163 commented Jun 13, 2023

Please do document the analysis in the notebook going forward; it is hard to keep track of bash scripts.

That being said, how was

/mnt/vast/hpc/csg/cl4215/ROSMAP/knight/output/rnaseq/xqtl_protocol_data_bam_list

generated?

It doesn't seem to be the correct output of the STAR_output step for the Knight data.

@Chunmingl
Contributor Author

For the STAR_output step: I ran it multiple times (including a test run with a smaller sample size). Most of the time it ended with no error message, but the job did not appear to have completed.

tail /mnt/vast/hpc/csg/cl4215/ROSMAP/knight/errout/knight_staroutput3.log

INFO: Waiting for the completion of 2 tasks.
INFO: Waiting for the completion of 2 tasks.
INFO: Waiting for the completion of 2 tasks.
INFO: Waiting for the completion of 2 tasks.
INFO: Waiting for the completion of 2 tasks.
INFO: Waiting for the completion of 2 tasks.
INFO: Waiting for the completion of 2 tasks.
INFO: Waiting for the completion of 2 tasks.
INFO: Waiting for the completion of 2 tasks.
INFO: Waiting for the completion of 2 tasks. 

@hsun3163
Collaborator

Given the complexity of the issue, again, please prepare a notebook documenting all the commands you have run and the log file associated with each command. Otherwise it is impossible to look into.

@Chunmingl
Contributor Author

Here is the summarized notebook of scripts and log files
/home/cl4215/githubrepo/mrmash_ROSMAP/Knight/knight_contigs_GTF.ipynb

@hsun3163
Collaborator

Can you fork the fungi-QTL-analysis repo and send a PR to upload this notebook (preferably along with other notebooks that are relevant to the xQTL project)? I don't really have access to notebooks that are not in my home directory, due to the way JupyterLab works.

@Chunmingl
Contributor Author

The PR is sent.

@hsun3163
Collaborator

hsun3163 commented Jun 15, 2023

Apparently STAR failed for the two samples because the jobs hit the walltime limit. Can you rerun the failed samples in a new cwd with a larger walltime, by setting --walltime in the sos command?

hs3163@csglogin:/mnt/vast/hpc/csg/cl4215/ROSMAP/knight/output/test$ qacct -j 5473097
==============================================================
qname        csg.q
hostname     node48
group        cl4215
owner        cl4215
project      NONE
department   defaultdepartment
jobname      job_t3c81fbcdec02417a
jobnumber    5473097
taskid       undefined
account      sge
priority     0
qsub_time    Sat May 27 17:00:29 2023
start_time   Sat May 27 17:26:43 2023
end_time     Sat May 27 22:26:44 2023
granted_pe   orte
slots        8
failed       37  : qmaster enforced h_rt, h_cpu, or h_vmem limit
exit_status  137                  (Killed)
ru_wallclock 18001s
ru_utime     0.175s
ru_stime     0.034s
ru_maxrss    7.410KB
ru_ixrss     0.000B
ru_ismrss    0.000B
ru_idrss     0.000B
ru_isrss     0.000B
ru_minflt    8013
ru_majflt    0
ru_nswap     0
ru_inblock   0
ru_oublock   16
ru_msgsnd    0
ru_msgrcv    0
ru_nsignals  0
ru_nvcsw     5076
ru_nivcsw    6
cpu          141516.840s
mem          4653.239TBs
io           492.403GB
iow          0.000s
maxvmem      35.705GB
arid         undefined
ar_sub_time  undefined
category     -u cl4215 -q csg.q -l h_rt=18000,h_vmem=40G -pe orte 8
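
The walltime diagnosis can be read off two fields in this record: `ru_wallclock` (18001s) is just over the `h_rt=18000` limit requested in `category`, while `maxvmem` (35.705GB) stays below `h_vmem=40G`. A small sketch that automates the walltime comparison for a saved `qacct -j` record is below. The function names are hypothetical, and the parser only assumes the `key value` layout shown above.

```python
import re


def parse_qacct(text):
    """Turn the 'key   value' lines of a qacct record into a dict of strings."""
    record = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        # Separator lines ('=====') have a single token and are skipped.
        if len(parts) == 2 and not parts[0].startswith("="):
            record[parts[0]] = parts[1].strip()
    return record


def hit_walltime(record):
    """True if the measured wallclock reached the requested h_rt limit."""
    limit = re.search(r"h_rt=(\d+)", record.get("category", ""))
    used = re.match(r"(\d+)", record.get("ru_wallclock", ""))
    if not (limit and used):
        return False  # not enough information to decide
    return int(used.group(1)) >= int(limit.group(1))


# Excerpt of the record above.
sample = """\
failed       37  : qmaster enforced h_rt, h_cpu, or h_vmem limit
exit_status  137                  (Killed)
ru_wallclock 18001s
category     -u cl4215 -q csg.q -l h_rt=18000,h_vmem=40G -pe orte 8
"""
rec = parse_qacct(sample)
print("walltime limit reached:", hit_walltime(rec))
```

Running this over the records of all failed task IDs would show which samples need a larger `--walltime` and which failed for another reason.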

@Chunmingl
Contributor Author

I received an error message
/mnt/vast/hpc/csg/cl4215/ROSMAP/knight/errout/knight_staroutput7.log

ERROR: [picard_qc (picard_qc)]: [picard_qc]: Failed to execute process
"bash(fr"""set -e\ntouch {_output[0]:n}.CollectMultipleMetrics...["cord_bam"].zap()\n\n"
name 'job_size' is not defined
[STAR_output]: Exits with 1 pending step (STAR_output)

the submitted script can be accessed in the recent pull request:
fungen-xqtl-analysis/analysis/Wang_Columbia/knight/eQTL/knight_STAR_Output7.sh

@hsun3163
Collaborator

Someone has had this issue before, and it was solved by updating their sos. Can you try again after updating your sos?
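
To see which version is installed before and after upgrading, something like the following can be run in the same environment that launches `sos run`. This assumes SoS was installed from the PyPI distribution named `sos`. The thread does not say which version contains the fix, so the sketch only reports the version.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# 'sos' is assumed to be the distribution name of the SoS workflow engine.
version = installed_version("sos")
print("sos is not installed" if version is None else f"sos {version}")
```

How to upgrade depends on how sos was installed (pip, conda, or a module on the cluster), so check that first.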
