fastp not removing all Illumina universal adapter sequences as indicated by FastQC #558
Comments
I have a similar issue, but with Nextera adapters. fastp reports no contamination, while FastQC reports Nextera adapter content of up to 10% toward the read ends. Even when I supply the Nextera FASTA file (the one provided by Trimmomatic), virtually no trimming happens. Trimmomatic with This isn't a perfect comparison, since I think fastp's default minimum window quality is 20, not 25, but still. Something seems off here. I'm using v0.23.2.
Same problem. Any suggestion is welcome. Thanks!
I switched back to fastqc/trimmomatic/fastqc. I'm removing fastp from my workflows. There are also a couple of concerning GitHub issues about reproducibility. I like the tool but I can't use it if these things aren't resolved.
Hi there. I ran into a similar problem and worked out an explanation that at least holds for my data. A possible reason fastp does not recognize and remove the adapter while FastQC detects it is that the inserts are shorter than 150 bp, so the adapter FastQC detects in R1.fastq.gz is actually the reverse complement of the R2 adapter. When a library fragment is shorter than 150 bp, the sequencer keeps reading past the end of the insert and into the adapter of the opposite strand. So, to remove the adapter from R1 with fastp, specify it in the fastp command with "-a reversed_and_complementary_adapter_sequence_of_Read2"; to remove the adapter from R2, use the reverse complement of the R1 adapter. My guess is that FastQC can detect the widely used adapters in either orientation while fastp cannot, meaning fastp auto-detects them only as the literal sequences it is given. I would suggest trying fastp with the adapter sequence of the other strand. Alternatively, extract a few reads and inspect them manually to find where the adapter starts and which adapter it actually is. Please let me know if anything is unclear or if this works for you. Thanks!
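To make the suggestion above concrete, here is a minimal Python sketch for computing the reverse complement of an adapter sequence before passing it to `-a`. The helper name `reverse_complement` is hypothetical, and the example sequence `AGATCGGAAGAGC` (the prefix shared by the Illumina TruSeq adapters) is only an assumption, since the thread does not say which adapter was used in the library.

```python
# Translation table mapping each DNA base to its complement (N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


# Assumed example: the common prefix of the Illumina TruSeq adapters.
adapter_prefix = "AGATCGGAAGAGC"
print(reverse_complement(adapter_prefix))  # prints GCTCTTCCGATCT
```

Whether the forward or the reverse-complemented form is the one present in your reads depends on the library, so it is worth searching a few raw reads for both before choosing what to give fastp.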
Hi, I recently ran fastp on an Illumina dataset with the following command:
fastp -i SRR18278237.fastq.gz -o SRR18278237.fastp.gz -z 9 -l 15 -w 16 --dedup --dup_calc_accuracy 6 -x -3 --cut_mean_quality 20 -j SRR18278237.fastp.json -h SRR18278237.fastp.html
I expected that this command would remove the Illumina universal adapter sequences from the reads. However, after running FastQC on the output files, I'm still seeing a significant adapter content in the FastQC report, specifically towards the end of the reads (please see attached screenshot).
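One way to cross-check the FastQC plot independently is to count how many reads in the trimmed file still contain the adapter and where it starts. The sketch below is only an illustration under a few assumptions: the function name `adapter_positions` is hypothetical, the file is a standard 4-line-per-record FASTQ, and the default k-mer `AGATCGGAAGAG` is the 12-base sequence FastQC uses for its "Illumina Universal Adapter" track. It does exact matching only, so it will miss adapters containing sequencing errors.

```python
import gzip
from collections import Counter


def adapter_positions(fastq_path: str, kmer: str = "AGATCGGAAGAG") -> tuple:
    """Return (total_reads, Counter of first match position of `kmer` per read)."""
    total = 0
    positions = Counter()
    # Transparently handle gzip-compressed and plain-text FASTQ files.
    opener = gzip.open if fastq_path.endswith(".gz") else open
    with opener(fastq_path, "rt") as handle:
        for line_no, line in enumerate(handle):
            if line_no % 4 != 1:  # the sequence is the 2nd line of each record
                continue
            total += 1
            pos = line.find(kmer)  # exact match only; -1 means not found
            if pos != -1:
                positions[pos] += 1
    return total, positions
```

For the output of the command above, this could be called as `total, positions = adapter_positions("SRR18278237.fastp.gz")`; the fraction `sum(positions.values()) / total` then gives a rough estimate of how many reads still carry the adapter.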
Could you please help me understand the following:
I have attached the JSON and HTML reports from fastp for your reference. I would greatly appreciate any insights or suggestions you might have to resolve this issue. Thank you for your assistance and for developing such a useful tool.
Best regards,
Xiaowen
Uploading SRR18278237 (1).fastp.zip…