Be able to run multiple bin QC metrics #713

Open
harper357 opened this issue Nov 7, 2024 · 7 comments
Labels
enhancement New feature or request

Comments

@harper357
Contributor

Description of feature

I posted this in the Slack, but I think it would be great if it were possible to run multiple bin QC tools (BUSCO, GUNC, CheckM2, etc.) and get all of that data in the MultiQC report.

I noticed #707, which adds some new options, but I don't know whether it also allows running more than one of them at once.

@harper357 harper357 added the enhancement New feature or request label Nov 7, 2024
@jfy133
Member

jfy133 commented Nov 7, 2024

I think conceptually this makes sense. However, it's not trivial, as the output of each tool goes into downstream processes and possibly a custom script (not written by myself). It might take a bit of time to get to it.

@harper357
Contributor Author

Personally, even if it is just reported in the MultiQC report and not used for anything downstream I think it would be a nice thing to have.

Once the mentioned PR is merged, I can try to work up a new PR to try it out.

Or maybe I should try getting work hours to write up a whole benchmarking pipeline so it's easier to compare new software. That seems rather ambitious though.

@jfy133
Member

jfy133 commented Nov 8, 2024

So I had a chance to quickly look through the code:

  • One area where it is currently either/or for BUSCO/CheckM is filtering contigs by completeness/contamination before they go into GTDB-Tk, but that could be replaced by a parameter saying which of the two outputs should be used for the filtering
  • CheckM(1) is required for GUNC (BUSCO doesn't work), but that doesn't block a user from also running BUSCO
  • I was mistaken regarding the downstream custom script (BIN_SUMMARY); that should be fine, as the script seems to allow inserting all of them

So actually this might be relatively easy to do (conceptually), in terms of running everything :)

However only BUSCO is in MultiQC, so adding CheckM/GUNC etc. into MultiQC is a whole other task 😬
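
To make the first bullet above a bit more concrete, here is a minimal Python sketch of the "which output should be used for filtering" idea. It is not the pipeline's actual code: `BinQC`, `select_bins`, the `filter_tool` argument, the thresholds and all the numbers are hypothetical, and how each tool arrives at a "contamination" value is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class BinQC:
    """Per-bin QC values as reported by one tool (hypothetical structure)."""
    completeness: float
    contamination: float


def select_bins(qc_by_tool, filter_tool, min_completeness=50.0, max_contamination=10.0):
    """Return the sorted names of bins that pass the thresholds.

    All tools may have been run; `filter_tool` only decides whose
    values are used for this particular filtering step.
    """
    if filter_tool not in qc_by_tool:
        raise ValueError(f"No QC results available for tool: {filter_tool!r}")
    return sorted(
        name
        for name, qc in qc_by_tool[filter_tool].items()
        if qc.completeness >= min_completeness
        and qc.contamination <= max_contamination
    )


# Made-up example values, purely for illustration.
qc_by_tool = {
    "busco": {"bin.1": BinQC(95.0, 1.2), "bin.2": BinQC(40.0, 0.5)},
    "checkm": {"bin.1": BinQC(92.3, 2.0), "bin.2": BinQC(55.0, 4.0)},
}

passing_busco = select_bins(qc_by_tool, "busco")
passing_checkm = select_bins(qc_by_tool, "checkm")
```

The point of the sketch is only that running several tools and choosing one of them for filtering are independent decisions, so the results of the other tools remain available for reporting.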

@harper357
Contributor Author

Oh, hmm. Well maybe that is a better use of my time first then.

Thanks for looking into this though!

@jfy133
Member

jfy133 commented Nov 11, 2024

> Oh, hmm. Well maybe that is a better use of my time first then.

Which bit?

@harper357
Contributor Author

Sorry, I meant writing some MultiQC plugins for CheckM, CheckM2 and GUNC.
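
As a rough idea of the first step such a plugin would need, the sketch below reads a tab-separated per-bin QC table into a dictionary keyed by bin name. It deliberately does not use the MultiQC API, and the column names (`Name`, `Completeness`, `Contamination`) and the example rows are assumptions for illustration that would need checking against each tool's real output.

```python
import csv
import io


def parse_qc_table(handle, name_col="Name", metric_cols=("Completeness", "Contamination")):
    """Parse a tab-separated QC table into {bin_name: {metric: value}}.

    The default column names are assumptions, not a confirmed file format.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    results = {}
    for row in reader:
        results[row[name_col]] = {col: float(row[col]) for col in metric_cols}
    return results


# Made-up table contents, purely for illustration.
example = (
    "Name\tCompleteness\tContamination\n"
    "bin.1\t95.2\t1.1\n"
    "bin.2\t60.0\t8.4\n"
)

parsed = parse_qc_table(io.StringIO(example))
```

A dictionary shaped like this, one entry per bin with numeric metrics, is the kind of data a report table could be built from; the actual integration with MultiQC is a separate step not shown here.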

@jfy133
Member

jfy133 commented Nov 11, 2024

Yes, that would be good! Divide and conquer (at least once I'm not travelling or have sick kids as I now have 😵‍💫🤦🏼‍♀️)


2 participants