Error executing process > 'mod:dss' #27

Open
SilviaMariaMacri opened this issue Jun 19, 2024 · 12 comments

@SilviaMariaMacri

Operating System

Other Linux (please specify below)

Other Linux

Red Hat Enterprise Linux release 8.6

Workflow Version

v.1.2.1

Workflow Execution

Command line (Cluster)

Other workflow execution

No response

EPI2ME Version

No response

CLI command run

/hpcshare/genomics/ASL_ONC/NextFlow_RunningDir/nextflow-23.10.0-all run epi2me-labs/wf-somatic-variation -profile singularity -resume -process.executor pbspro -process.memory 256.GB -work-dir /archive/s2/genomics/onco_nanopore/test_som_var/work -with-timeline --snv --sv --mod --sample_name OHU0002HI --bam_normal /archive/s2/genomics/onco_nanopore/HUM_OHU_OHU0002HTNDN/OHU0002HTNDN_dx0_dx-1_new.bam --bam_tumor /archive/s2/genomics/onco_nanopore/HUM_OHU_OHU0002ITTDN/OHU0002ITTDN_dx0_dx-1_new.bam --ref /archive/s1/sconsRequirements/databases/reference/resources_broad_hg38_v0_Homo_sapiens_assembly38.fasta --out_dir /archive/s2/genomics/onco_nanopore/test_som_var --basecaller_cfg [email protected] --phase_normal --classify_insert --force_strand --normal_min_coverage 0 --tumor_min_coverage 0 --haplotype_filter_threads 32 --severus_threads 32 --dss_threads 4 --modkit_threads 32 -process.cpus 32 -process.queue fatnodes

Workflow Execution - CLI Execution Profile

singularity

What happened?

The pipeline failed at its last step, mod:dss.

Replicating the issue as suggested by the error message (running `bash .command.run` in the work directory) printed more information:

System errno 22 unmapping file: Invalid argument
Error in fread("normal.bed", sep = "\t", header = T) :
Opened 15.96GB (17139453993 bytes) file ok but could not memory map it. This is a 64bit process. There is probably not enough contiguous virtual memory available.
Execution halted

Relevant log output

Error executing process > 'mod:dss (3)'

Caused by:
  Process `mod:dss (3)` terminated with an error exit status (137)

Command executed:

  #!/usr/bin/env Rscript
  library(DSS)
  require(bsseq)
  require(data.table)
  # Disable scientific notation
  options(scipen=999)
  
  # Import data
  tumor = fread("tumor.bed", sep = '	', header = T)
  normal = fread("normal.bed", sep = '	', header = T)
  # Create BSobject
  BSobj = makeBSseqData( list(tumor, normal),
      c("Tumor", "Normal") )
  # DML testing
  dmlTest = DMLtest(BSobj, 
      group1=c("Tumor"), 
      group2=c("Normal"),
      equal.disp = FALSE,
      smoothing=TRUE,
      smoothing.span=500,
      ncores=4)
  # Compute DMLs
  dmls = callDML(dmlTest,
      delta=0.25,
      p.threshold=0.001)
  # Compute DMRs
  dmrs = callDMR(dmlTest,
      delta=0.25,
      p.threshold=0.001,
      minlen=100,
      minCG=5,
      dis.merge=1500,
      pct.sig=0.5)
  # Write output files
  write.table(dmls, 'OHU0002HI.6mA_+.dml.tsv', sep='\t', quote=F, col.names=T, row.names=F)
  write.table(dmrs, 'OHU0002HI.6mA_+.dmr.tsv', sep='\t', quote=F, col.names=T, row.names=F)

Command exit status:
  137

Command output:
  (empty)

Command error:
  
      anyMissing, rowMedians
  
  
  Attaching package: 'MatrixGenerics'
  
  The following objects are masked from 'package:matrixStats':
  
      colAlls, colAnyNAs, colAnys, colAvgsPerRowSet, colCollapse,
      colCounts, colCummaxs, colCummins, colCumprods, colCumsums,
      colDiffs, colIQRDiffs, colIQRs, colLogSumExps, colMadDiffs,
      colMads, colMaxs, colMeans2, colMedians, colMins, colOrderStats,
      colProds, colQuantiles, colRanges, colRanks, colSdDiffs, colSds,
      colSums2, colTabulates, colVarDiffs, colVars, colWeightedMads,
      colWeightedMeans, colWeightedMedians, colWeightedSds,
      colWeightedVars, rowAlls, rowAnyNAs, rowAnys, rowAvgsPerColSet,
      rowCollapse, rowCounts, rowCummaxs, rowCummins, rowCumprods,
      rowCumsums, rowDiffs, rowIQRDiffs, rowIQRs, rowLogSumExps,
      rowMadDiffs, rowMads, rowMaxs, rowMeans2, rowMedians, rowMins,
      rowOrderStats, rowProds, rowQuantiles, rowRanges, rowRanks,
      rowSdDiffs, rowSds, rowSums2, rowTabulates, rowVarDiffs, rowVars,
      rowWeightedMads, rowWeightedMeans, rowWeightedMedians,
      rowWeightedSds, rowWeightedVars
  
  The following object is masked from 'package:Biobase':
  
      rowMedians
  
  Loading required package: parallel
  Loading required package: data.table
  
  Attaching package: 'data.table'
  
  The following object is masked from 'package:SummarizedExperiment':
  
      shift
  
  The following object is masked from 'package:GenomicRanges':
  
      shift
  
  The following object is masked from 'package:IRanges':
  
      shift
  
  The following objects are masked from 'package:S4Vectors':
  
      first, second
  
  .command.run: line 164:    35 Killed                  /usr/bin/env Rscript .command.sh

Work dir:
  /archive/s2/genomics/onco_nanopore/test_som_var/work/20/a4581d28e28dd29ec5e3e0e78d757f

Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`

Application activity log entry

No response

Were you able to successfully run the latest version of the workflow with the demo data?

no

Other demo data information

no attempt
@RenzoTale88
Contributor

@SilviaMariaMacri the process is running out of memory (exit status 137). You can try reducing the number of threads for the DSS process to --dss_threads 2, which should reduce the amount of memory required. For example:
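
Resuming with a lower thread count (all other options as in your original command; the placeholder is illustrative):

nextflow run epi2me-labs/wf-somatic-variation -resume --dss_threads 2 <other options as before>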

@SilviaMariaMacri
Author

@RenzoTale88 thank you for your reply.
I reduced --dss_threads first to 2 and then to 1, but I got the same error.

@RenzoTale88
Contributor

Then you can try increasing the memory provided to the DSS process. Simply save the following block of code in a separate file:

process {
    withName: dss {
        memory = X.GB
    }
}

Where X is the amount of memory, in GB, that the process should use. Save the file as a custom configuration with a name ending in .config and provide it to nextflow with the -c option:

nextflow run epi2me-labs/wf-somatic-variation -c <path to custom config file> <options here>
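
For instance, with a hypothetical file named dss_mem.config (the file name and the 64 GB value are only illustrative; size the memory to your data):

process {
    withName: dss {
        memory = 64.GB
    }
}

nextflow run epi2me-labs/wf-somatic-variation -c dss_mem.config <options here>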

@RenzoTale88
Contributor

@SilviaMariaMacri did you try providing a custom configuration file as mentioned above?

@SilviaMariaMacri
Author

@RenzoTale88
yes, but after setting the memory to 256 GB it kept giving me the same error. I then manually edited the .command.run file to set the memory to 380 GB and launched the job outside the pipeline; it completed successfully after almost 70 hours of running time.
The pipeline is now running with the new memory setting, and I expect it to finish without errors since the standalone job did.
What do you think is the reason for such a long running time and such high memory use? Can it be avoided?

@RenzoTale88
Contributor

@SilviaMariaMacri it is quite difficult to say. The DSS process, as the name suggests, relies on the DSS R package to identify differentially modified regions/loci. Its memory use depends on the size of the dataset and the number of cores used for the analysis, which makes it difficult to predict for every use case.

@SilviaMariaMacri
Author

Hi @RenzoTale88,

I'm using two whole-genome sequencing BAM files produced with dorado, with two modification types called (5mC_5hmC and 6mA). The BAM files are 87 GB and 120 GB for the normal and tumor tissue respectively.
Six mod:dss processes are submitted to PBS: three of them finish successfully, while the fourth reaches the maximum time limit of 100 hours (each job is configured with 1 CPU and 750 GB of memory).
So by increasing the number of CPUs I get a memory error, and with only 1 CPU I get a time-limit error.

Are there any plans to address this, perhaps by splitting the input into more than one file (i.e. one per chromosome) and launching a separate job for each, along the lines of the sketch below? Alternatively, do you have any suggestions for my case?
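
(A rough, hypothetical sketch of what I mean, adapting the DSS script from the log above. It assumes tumor.bed/normal.bed keep the chr/pos/N/X columns that makeBSseqData expects, and that recomputing a genome-wide FDR after concatenating the per-chromosome tests is acceptable; the output file names are placeholders.)

#!/usr/bin/env Rscript
# Hypothetical per-chromosome variant of the workflow's DSS step
library(DSS)
require(bsseq)
require(data.table)
# Disable scientific notation
options(scipen=999)

tumor = fread("tumor.bed", sep = '\t', header = T)
normal = fread("normal.bed", sep = '\t', header = T)

# Test one chromosome at a time to cap peak memory
results = list()
for (chrom in intersect(unique(tumor$chr), unique(normal$chr))) {
    BSobj = makeBSseqData(
        list(tumor[chr == chrom], normal[chr == chrom]),
        c("Tumor", "Normal"))
    results[[chrom]] = DMLtest(BSobj,
        group1=c("Tumor"),
        group2=c("Normal"),
        equal.disp = FALSE,
        smoothing=TRUE,
        smoothing.span=500,
        ncores=1)
    rm(BSobj); gc()  # release memory before the next chromosome
}

# DMLtest returns a data frame, so the per-chromosome results can be
# concatenated; the fdr column was computed per chromosome, so recompute
# it genome-wide (my assumption about how it should be handled)
dmlTest = do.call(rbind, results)
dmlTest$fdr = p.adjust(dmlTest$pval, method = "fdr")

dmls = callDML(dmlTest, delta=0.25, p.threshold=0.001)
dmrs = callDMR(dmlTest, delta=0.25, p.threshold=0.001,
    minlen=100, minCG=5, dis.merge=1500, pct.sig=0.5)
write.table(dmls, 'per_chrom.dml.tsv', sep='\t', quote=F, col.names=T, row.names=F)
write.table(dmrs, 'per_chrom.dmr.tsv', sep='\t', quote=F, col.names=T, row.names=F)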

Thanks

@RenzoTale88
Contributor

Hi @SilviaMariaMacri sorry to hear this is giving you issues. Do you have access to the logs of the failing processes (i.e. do you have access to the work directory)? That might help us figure out what is going wrong.

@SilviaMariaMacri
Author

Thank you for your answer @RenzoTale88.
Yes, here are two log files (with exit statuses 143 and 130), but I can't get much information from them:
.command.log_exitcode130.log
.command.log_exitcode143.log

@RenzoTale88
Contributor

@SilviaMariaMacri thanks for sharing. I'll see if there is a way to reduce the memory usage of the process and will keep you updated on any progress. Thanks in advance for your patience!

@SilviaMariaMacri
Author

Hi @RenzoTale88,
do you have any updates on this? Thank you

@RenzoTale88
Contributor

@SilviaMariaMacri sorry for the long silence. We have been running a number of tests to figure out how to improve the situation, and are still working on a longer-term solution for the memory issue.
In the meantime, we released v1.3.1, which adds the option --diff_mod; setting it to false disables DSS. This should allow the workflow to run to completion and emit the outputs, which you can then analyse manually. For example:
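
nextflow run epi2me-labs/wf-somatic-variation --mod --diff_mod false <other options as before>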
We realise this is not a full solution, and I apologise for the inconvenience.

Andrea
