First test with Dolomite Nadia data #101
You seem to have set it up and run it the right way; for some reason, however, umi-tools thinks there are no barcodes in your sample:
umi-tools estimates the number of barcodes from this file. You can also run umi-tools manually by activating the conda environment, to play around with the parameters and so on:
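For reference, a manual run might look roughly like the sketch below. This is an illustration only: the environment name is hypothetical, and the barcode pattern assumes the usual Drop-seq/Nadia layout of a 12 bp cell barcode followed by an 8 bp UMI.

```bash
# Activate the pipeline's conda environment (name is hypothetical)
conda activate dropseqpipe

# Estimate cell barcodes from R1;
# CCCCCCCCCCCCNNNNNNNN = 12 bp cell barcode + 8 bp UMI
umi_tools whitelist \
    --stdin trimmed_repaired_R1.fastq.gz \
    --bc-pattern=CCCCCCCCCCCCNNNNNNNN \
    --log2stderr > whitelist.txt
```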
Thanks @seb-mueller, I can confirm that trimmed_repaired_R1.fastq.gz is completely empty. I looked through the logs to try to work out which upstream command creates these files, and I see this command:
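(The command itself didn't survive the copy here. Given the `repaired` file names and the bbmap step mentioned below, it is presumably bbmap's repair.sh; a typical invocation, with file names inferred from this thread, would look something like the following, though the pipeline's exact flags may differ.)

```bash
repair.sh in1=trimmed_R1.fastq.gz in2=trimmed_R2.fastq.gz \
    out1=trimmed_repaired_R1.fastq.gz out2=trimmed_repaired_R2.fastq.gz
```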
The trimmed_R*.fastq.gz files do seem to be created by cutadapt in the early steps, with what look like roughly correct results, but they seem to be gone by the time the run completes.
I suspect that there is a problem with my fastq format, which I've seen before when using bbMap. My fastq entry names have the following format, with both a trailing "/2" in the read name and a leading "2" in the second field.
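(The example itself didn't copy over; illustratively, a header of that shape might look like the first line below, with the instrument fields purely made up, versus the more common Illumina style on the second line, which carries the mate number only in the second field.)

```text
@HWI-D00525:1:C7JBLANXX:3:1101:1246:2125/2 2:N:0:ACAGTG
@HWI-D00525:1:C7JBLANXX:3:1101:1246:2125 2:N:0:ACAGTG
```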
Used this command:
and am now getting non-empty files, and the analysis is progressing to the alignment stage.
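(The exact command is missing from the thread; as a purely illustrative sketch, stripping the trailing /1 or /2 from read names could be done with GNU sed, whose `1~4` address selects every fourth line, i.e. the headers:)

```bash
# Rewrite only header lines, dropping the /1 or /2 mate suffix
zcat reads_R2.fastq.gz | sed -E '1~4 s|/[12]||' | gzip > reads_fixed_R2.fastq.gz
```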
That's impressive bug-fixing work, Richard! Since you have seen this before (and it might therefore occur for others), this might be material for the troubleshooting section. Do you have a link or similar that elaborates on this problem? Also, bbmap might benefit from handling those files, so we could maybe send them a bug report.
I suspect that not many others are getting read names like ours. I think we have that mix of formats for backwards-compatibility reasons in our pipeline, and I ran into a similar error when using a different bbMap tool last year. From my understanding of the standard fastq formats, we are the outlier. That said, Brian Bushnell over at bbMap might have some text stipulating the fastq formats bbmap can handle. My run after fixing the read names got really close to the end, but still appears to have hit an error (let me know if you'd prefer a separate issue for this):
Activating conda environment:
Indeed, if this drags out we might transfer this to its own issue (not sure how this is handled elsewhere, though).
I haven't had this error before, but I suspect it has to do with how the working directory is set. Normally I call snakemake from the cloned repository (containing the Snakefile) and specify the working dir using
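(The flag itself didn't copy over, but snakemake's standard option for this is `--directory`/`-d`; a typical call might look like:)

```bash
# Run from the cloned dropSeqPipe repo, pointing snakemake at the experiment's working dir
snakemake --use-conda --cores 8 --directory /path/to/my_experiment
```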
Lastly, it might help if you set:
I get a similar error running Drop-seq data (Macosko 2015). I downloaded the files from GEO and SRA and did some filtering:
SRR1873277_S1_L001_R1_001.fastq
SRR1873277_S1_L001_R2_001.fastq
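(For anyone reproducing this: a common way to pull the run from SRA is with sra-tools, as sketched below; renaming the output to the bcl2fastq-style names above would be a separate manual step.)

```bash
prefetch SRR1873277
fasterq-dump --split-files SRR1873277   # writes SRR1873277_1.fastq and SRR1873277_2.fastq
```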
I get this issue running version 0.5 or the current development version. My understanding is that:
This hangs on that step. After seeing this, I think the FASTQ file may be the problem: I seem to be missing the second field from the headers. Is this required?
In my case, I was able to fix it by adding the second field to the header.
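(The exact snippet is missing here; a hypothetical one-liner that appends a minimal second field to every header line would be:)

```bash
# Append a placeholder second field ("1:N:0:1") to each FASTQ header (every 4th line)
awk 'NR % 4 == 1 { $0 = $0 " 1:N:0:1" } { print }' \
    SRR1873277_S1_L001_R1_001.fastq > R1_fixed.fastq
```

For R2 the leading mate number would be 2 rather than 1.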
With this fix in place, I hope this helps anyone running into similar problems, but it may be worth checking for this problem in the pipeline (or removing the requirement for compatible headers if they are not needed).
Update: this gives a "Found 0 cell barcodes in file" error.
I'm guessing this is because the barcodes are parsed from the header rather than from the first 8 bases of R1. In this case, the code to fix the files is:
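(That snippet is also missing; a purely hypothetical reconstruction, copying each read's first 8 bases into the header's index field, might look like:)

```bash
# For each R1 record: take the first 8 bases of the sequence (assumed cell barcode)
# and write them into the trailing colon-separated index field of the header.
paste - - - - < R1_fixed.fastq | awk -F'\t' -v OFS='\n' '{
    bc = substr($2, 1, 8)               # first 8 bases of the read
    sub(/:0:[^ \t]*$/, ":0:" bc, $1)    # swap them into the index field
    print $1, $2, $3, $4
}' > R1_barcoded.fastq
```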
Update 2:
Hi folks,
I'm trying dropSeqPipe for the first time on some test Nadia data we generated. I am using the example Nadia template config, and I have 4 samples, each with paired-end 125 bp reads.
When I start snakemake it seems that fastqc completes successfully, and most of the cutadapt jobs are successful, but things go awry when my first bbmap/whitelist/umi_tools commands are run.
I'll try running a single sample to see if I can isolate the error, but I'd be interested to hear if you have any advice.
thanks
Richard
Here's a trace...