suitable for CDS and/or contig taxonomic assignment? #28
Comments
Hi Mike, thanks for your interest in KMCP. You can try splitting CDS/contigs when using
stats
Another example of Klebsiella pneumoniae CDS:
Awesome, thanks for plotting out a clear path! I look forward to trying it out :)
Hey there, @shenwei356 :)
Thanks again, not only for more wonderful software, but for excellent documentation as usual!
I'm looking for a better way to taxonomically classify predicted coding sequences and contigs from metagenomic assemblies (I currently use CAT with NCBI's nr).
I really want to use GTDB for bacteria/archaea and also be able to add euks from NCBI, so your infrastructure enabling that sort of thing is really appealing to me 🙏
I see in issue #27 you note that KMCP is not suitable for long reads. Is your thinking similar for assembled contigs too?
And would you expect to have the same thoughts about taxonomically classifying predicted coding sequences (which might average around 800-1000 bases)?
If only one of those would be possible, I can imagine it might be reasonable to use it to infer the other. E.g., if contigs are do-able, then assigning each CDS the taxonomy of its source contig. And if CDSs are do-able, employing some consensus approach to assign to the contig the taxonomy of its CDSs.
Sorry if I've missed that you've covered this elsewhere already, and thanks for any of your thoughts!
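To make the two inference directions above concrete, here is a minimal sketch in Python. It is not a KMCP feature: the function names, the `min_fraction` threshold, and the toy lineages are all assumptions made for illustration. Each CDS is assumed to carry a lineage written as a tuple of taxon names ordered from the highest rank downward, with an empty tuple meaning "unclassified".

```python
from collections import Counter


def consensus_lineage(cds_lineages, min_fraction=0.5):
    """Majority-vote consensus over the lineages of one contig's CDSs.

    Walks the ranks from the top down and keeps a taxon only if strictly
    more than `min_fraction` of the classified CDSs agree on it (and on
    every rank above it). Returns the deepest supported lineage prefix.
    """
    classified = [lin for lin in cds_lineages if lin]
    if not classified:
        return ()

    consensus = []
    candidates = classified
    depth = 0
    while True:
        # Count taxa at this rank among CDSs that still match the prefix.
        counts = Counter(lin[depth] for lin in candidates if len(lin) > depth)
        if not counts:
            break
        taxon, n = counts.most_common(1)[0]
        if n / len(classified) <= min_fraction:
            break
        consensus.append(taxon)
        candidates = [
            lin for lin in candidates if len(lin) > depth and lin[depth] == taxon
        ]
        depth += 1
    return tuple(consensus)


def inherit_from_contig(cds_to_contig, contig_lineage):
    """Give every CDS the lineage of its source contig (empty if unknown)."""
    return {
        cds: contig_lineage.get(contig, ())
        for cds, contig in cds_to_contig.items()
    }


# Toy data (hypothetical, abbreviated lineages): three classified CDSs
# and one unclassified CDS on the same contig.
cds_lineages = [
    ("Bacteria", "Pseudomonadota", "Klebsiella", "Klebsiella pneumoniae"),
    ("Bacteria", "Pseudomonadota", "Klebsiella", "Klebsiella pneumoniae"),
    ("Bacteria", "Pseudomonadota", "Escherichia", "Escherichia coli"),
    (),
]
contig_tax = consensus_lineage(cds_lineages)
strict_tax = consensus_lineage(cds_lineages, min_fraction=0.7)
```

With the default threshold, two of the three classified CDSs are enough to carry the contig down to species; with `min_fraction=0.7` the vote stops at the last rank all three agree on. Requiring a strict majority keeps the winner at each rank unambiguous, at the cost of returning a shallower assignment for chimeric or poorly classified contigs.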