-
Indeed, I think having this feature supported natively by NF would bring some important benefits:
My understanding is that the code is implemented in Groovy, so it should be straightforward to support it in the NF runtime. My only concern is that you may need to iterate quickly on the validation feature to add new features or improve functionality. This is why I think it could be useful to ship the validation as a Nextflow plugin. That way, the validation plugin could have its own release cycle without having to wait for the NF release cadence. Thoughts?
-
That would be nice. The JSON Schema validation would then be able to move forward with support for new JSON Schema versions.
-
Awesome stuff, thanks @apeltzer! I'll try to fill in some details here and some pitfalls to be aware of in what we've done so far.

History

Original kudos actually goes back to @sven1103 and @msmootsgi way back in #866 - we were starting to get carried away with having funky comment strings in nextflow config files and other custom formats and @msmootsgi pushed for us to use JSON Schema instead. Great call, saved us from creating another new standard and has worked beautifully 👍🏻

For those unfamiliar with it, JSON Schema is a standardised format / vocabulary for writing a document (schema) that describes what a piece of structured data should look like. Official site is here: https://json-schema.org/ and excellent documentation is here: https://json-schema.org/understanding-json-schema/

Initially we started playing with writing a metaschema - that is, a schema that describes how our schema should be created, which in turn describes how the input data should look. After a long pause and then a lot of head scratching I took a step back and decided that this was overkill and returned to vanilla JSON Schema documents (almost, see below) using the Draft-07 spec (the latest when I wrote this code). The @nf-core pipeline template now comes with a file called

The schema builder

Writing these big and complex JSON Schema documents by hand is no fun, and asking all pipeline developers to do it without making mistakes was never going to work. So the first thing that we did was to build a tool for the developers to build the schema. This is used by running the command
This has worked really well so far and we now have a lot of pipelines with complete and valid schema files, where no-one has had to touch a line of JSON by hand.

Compromises and weird stuff

This builder code went through a lot of iterations and changes. We ended up with a few strange things:
The schema launcher

Once we had a builder and some pipeline schemas had been written, we created a launch interface. This isn't super relevant here so I'll go light on the details, but it basically works in the same way as the builder above - you do

Documentation

Once you have this document as simple JSON outside of Nextflow, you can do all sorts. The website parameter documentation is now built with it if found, and command line help is also coming soon. One place to write the docs and it can be used everywhere.

Validation

Ok, the final piece of the puzzle was to use these nice schema validation rules to actually validate some input data. The finishing touches are being put to this for the pipeline template (which will be synced out to all @nf-core pipelines) now in nf-core/tools#852

@KevinMenden did an awesome job getting this to work with a Groovy library that handles the complex JSON Schema validation, and also bundling this into a zip file in the template so that the pipelines will work offline without needing to download any dependencies. This ability to work without an internet connection is super important. This is my main concern with a plugin approach as described above.

Whilst we were there, we also added a check for parameters with the same name as any core Nextflow option (it will now become an @nf-core guideline that these must be avoided). And we have some warnings about unexpected params, though these don't halt execution by themselves as they could be quite common.

Conclusion

Ok, essay done. We would love for this all to become a Nextflow standard instead of just nf-core, that was always the plan. We will roll out our own approach above as it's basically done already, but if core Nextflow functionality can be added for validation that would be amazing and we can remove that code again.
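To make the three checks described in the validation section concrete (schema validation, clashes with core Nextflow options, and non-fatal warnings for unexpected params), here is a minimal sketch in Python. It is not the Groovy code from nf-core/tools#852: the schema contents, the `CORE_OPTIONS` subset and the `check_params` function are all hypothetical, and only a tiny slice of Draft-07 (`required`, `type`, `enum`) is handled rather than the full spec.

```python
# Hypothetical names throughout: SCHEMA, CORE_OPTIONS and check_params are
# illustrative only and are not taken from nf-core/tools or Nextflow.

SCHEMA = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "type": "object",
    "properties": {
        "input": {"type": "string"},
        "max_cpus": {"type": "integer"},
        "aligner": {"type": "string", "enum": ["bwa", "bowtie2"]},
    },
    "required": ["input"],
}

# Assumed subset of core Nextflow option names that params should not reuse.
CORE_OPTIONS = {"profile", "resume", "work-dir"}

# Mapping from JSON Schema type names to Python types (subset only).
JSON_TYPES = {"string": str, "integer": int, "number": (int, float), "boolean": bool}


def check_params(params, schema=SCHEMA):
    """Return (errors, warnings) for a dict of pipeline parameters."""
    errors, warnings = [], []
    props = schema.get("properties", {})

    for name in schema.get("required", []):
        if name not in params:
            errors.append(f"missing required parameter: {name}")

    for name, value in params.items():
        if name in CORE_OPTIONS:
            errors.append(f"--{name} clashes with a core Nextflow option")
            continue
        if name not in props:
            # Unexpected params only warn; they do not halt execution.
            warnings.append(f"unexpected parameter: {name}")
            continue
        rule = props[name]
        expected = JSON_TYPES.get(rule.get("type"))
        # bool is a subclass of int in Python, so reject it for non-boolean types.
        bad_type = expected is not None and (
            not isinstance(value, expected)
            or (isinstance(value, bool) and rule.get("type") != "boolean")
        )
        if bad_type:
            errors.append(f"{name}: expected {rule['type']}, got {type(value).__name__}")
        elif "enum" in rule and value not in rule["enum"]:
            errors.append(f"{name}: {value!r} is not one of {rule['enum']}")
    return errors, warnings
```

Calling `check_params({"input": "samples.csv", "profile": "test", "foo": 1})` would report the `profile` clash as an error and `foo` as a warning, which mirrors the split between halting and non-halting checks described above.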
-
Wow, we have enough material for a paper! 😅 Now, I think the goal of this discussion is to implement a common interface for the JSON Schema validation so that it can be used by any NF pipeline, not only the nf-core pipelines, and, above all, not to replace all the amazing nf-core tool features, i.e. the builder, launcher and docs. My understanding is that the only technical problem is the following:
This is a very good point. Currently, the plugin mechanism pulls the plugin archive (a zip) from the internet when it cannot find it locally. Currently, it looks in the It could be enough to extend it to also search for plugins in the pipeline directory
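The lookup order suggested here (local plugin locations first, then the pipeline directory, and only then a download) could be sketched roughly as follows. This is Python for illustration only, not Nextflow's actual plugin loader; `find_plugin`, the directory layout and the plugin id are all hypothetical.

```python
from pathlib import Path
from typing import Iterable, Optional


def find_plugin(plugin_id: str, search_dirs: Iterable[Path]) -> Optional[Path]:
    """Return the first unpacked directory or zip matching plugin_id, else None.

    Directories are checked in the order given, so earlier entries win.
    """
    for base in search_dirs:
        for candidate in (base / plugin_id, base / f"{plugin_id}.zip"):
            if candidate.exists():
                return candidate
    # Nothing found locally: the caller decides whether to download.
    return None
```

A caller would pass the usual local plugin location followed by something like `pipeline_dir / "plugins"` (an assumed layout), and fall back to the network only when `find_plugin` returns `None`. That keeps pipelines that bundle the plugin archive usable offline.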
-
👍 Excited for this feature!
-
Hi team,
As no one had taken up the baton from this thread https://twitter.com/tallphil/status/1360938099925213184, I thought I'd give it a start so that others can chime in.
nf-core recently started work, mostly done by @KevinMenden, @drpatelh, @ewels and others, on validating input parameters for pipelines. They created a way to validate parameters supplied to a pipeline against the already established JSON schema that nf-core pipelines inherently use for creating the launch feature on the webpage.
An example of including it in the latest nf-core template on the dev branch can be found here: nf-core/tools#852
Paolo suggested that this could be something that Nextflow itself could potentially support as a core feature. There are several things that this currently does:
--profile, though I'm not sure what happens with a --profile foobar -profile test,docker 🤔 )
--help from the JSON schema (=> no need to keep this manually in sync, far fewer hardcoded parts in the pipeline & works generically between pipelines)

We also have further functionality in there, but these seem to be rather nf-core specific (e.g. the nf-core headers and e-mail feature are also in separate lib/ code pieces), so maybe not as suitable as the above to be of potential interest for core nextflow features.

Just wanted to give this a kick off and bring in the people really involved, I was just a fortunate front-line & happy prototype user 🤗
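As a rough illustration of the point about generating --help from the JSON schema, the sketch below renders help lines straight from a schema's `properties`, so the text never has to be kept in sync by hand. It is Python rather than the Groovy used in the template, and `render_help` plus the example schema are hypothetical, not the nf-core implementation.

```python
# Hypothetical example schema; real pipeline schemas are larger and grouped.
EXAMPLE_SCHEMA = {
    "properties": {
        "input": {"type": "string", "description": "Path to samplesheet"},
        "max_cpus": {"type": "integer", "description": "CPU limit", "default": 16},
    }
}


def render_help(schema):
    """Build one help line per parameter from a JSON Schema 'properties' map."""
    lines = []
    for name, rule in schema.get("properties", {}).items():
        default = f" (default: {rule['default']})" if "default" in rule else ""
        lines.append(
            f"--{name:<12} [{rule.get('type', 'any')}] "
            f"{rule.get('description', '')}{default}"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_help(EXAMPLE_SCHEMA))
```

Because the function only reads `type`, `description` and `default`, the same code works for any pipeline whose schema follows that shape, which is the "works generically between pipelines" benefit mentioned above.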