When input cfdna fastq data is masked (human reads are masked) #222

Open
arpit20328 opened this issue Aug 30, 2024 · 3 comments

Comments

@arpit20328

Hi @colindaven @sannareddyk @B1T0 @Colorstorm @BioNij

Wochenende ran well when we input cfDNA FASTQ profiles containing human reads alongside pathogen reads.

We wanted to know how Wochenende responds when we input the same cfDNA FASTQ profiles with the human-origin reads masked.

I masked the human reads with the tool https://github.com/ncbi/sra-human-scrubber

Does masking make a difference if we use the same reference database, and does it change how the results should be interpreted?

I will calculate relative abundance as number of assigned reads / seq_length.
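
To make the calculation concrete, here is a minimal sketch of that ratio in Python. The function names, reference names, and counts are hypothetical and only illustrate the formula; this is not Wochenende's reporting code, and the optional rescaling to fractions is my own assumption about how the values might be compared across samples.

```python
def length_normalized_abundance(assigned_reads, seq_lengths):
    """Return assigned reads divided by reference length, per reference.

    Both arguments are dicts keyed by reference name (hypothetical inputs).
    """
    abundance = {}
    for ref, reads in assigned_reads.items():
        length = seq_lengths[ref]
        if length <= 0:
            raise ValueError(f"non-positive sequence length for {ref}")
        abundance[ref] = reads / length
    return abundance


def as_fractions(abundance):
    """Rescale the per-reference values so they sum to 1 (assumed extra step)."""
    total = sum(abundance.values())
    if total == 0:
        return {ref: 0.0 for ref in abundance}
    return {ref: value / total for ref, value in abundance.items()}


# Made-up example values, not real results.
reads = {"ref_A": 200, "ref_B": 50}
lengths = {"ref_A": 4_000_000, "ref_B": 2_000_000}

per_base = length_normalized_abundance(reads, lengths)
shares = as_fractions(per_base)
```

Because the ratio uses only reads assigned to each non-human reference, masking human reads should leave it unchanged as long as no microbial reads are removed by the scrubber.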

@colindaven
Contributor

Hi @arpit20328

glad you're getting good results out of the tool.

If you mask human reads then you will get fewer mappings to the human genome. This assumes, of course, that no microbial reads are mistaken for human and masked.

Fewer or no mappings to the human genome will not cause any major problems. It will, of course, break the bacteria per human cell normalization in the reporting step (which we recommend using for normalization), since that relies on human read mappings. Otherwise you should be fine.

You are free to use any other form of abundance calculation, but we only support and encourage the normalization in the reporting subdir.

cheers

Maybe @irosenboom has played with masking human reads before?

@arpit20328
Author

Thanks @colindaven. Great tool, by the way.

@irosenboom
Contributor

Hi @arpit20328 ,
I am glad that Wochenende runs well on your input fastq files.

I agree with @colindaven that masking the human reads will only break the bacteria per human cell normalization. The other steps work perfectly fine and the pipeline will run even faster.
