Propose Enhancements from Sapporo WES 2.0.0 #220
Comments
Hey @suecharo, thank you so much for contributing these ideas! Changes made to support real-world use cases are highly valuable, as they tend to represent a real gap in the specification.

1. Fix ServiceInfo - default_workflow_engine_parameters Type

Ah yes, I can see we totally missed this in the implementation of #182, or at least combined the engine and the version into a single string. It also looks like we do not communicate the

In regards to your specific proposal, the changes to the schema would constitute a breaking change and would necessitate a bump to WDL

I would modify what you have proposed to be non-breaking. Maybe something like:

{
  "default_engine_parameters": [
    { "name": "--cpu", "value": "4", "engine": "{engine_id/version}" }
  ]
}

2. Add sort_order and state Query Parameters to GET /runs

This is logical to me (and something we have also implemented on our own WES engine), although I would tweak a few of the recommendations.

We have gone through various iterations of a field like this on our API, and have settled on a specification that looks like the following:

This is more complex, definitely. It filled the use case where we wanted to sort by multiple columns in multiple different directions. If only the

Would you be open to a format like that? We could simplify for now to be:

3. Enhancing POST /runs to Support Download Workflow Attachments

A few questions around this, especially given that the original

4. Enable Downloading of Run Outputs

I definitely understand where the need to download outputs came from (we encountered a similar problem).

This makes sense and I think is relatively easy to implement.

Is this output downloading the

I have similar concerns as above, but this one may be more permissible, especially if it was serviced by a DRS API instead.
Hello, @patmagee.

1. Fix ServiceInfo - default_workflow_engine_parameters Type

Thank you for the suggestion on making this change non-breaking. I hadn't considered adding an engine field (though I understand we could handle this with AdditionalProperties).

workflow-execution-service-schemas/openapi/workflow_execution_service.openapi.yaml, line 867 in 3a832ab

Anyway, I just wanted to point out that you might have forgotten to update

2. Add sort_order and state Query Parameters to GET /runs

The format you proposed,

looks great. An example would be something like:

In addition to sorting, I also implemented filtering. Here's the definition I came up with:

It would be great if you could consider this as well.

3. Enhancing POST /runs to Support Download Workflow Attachments

For workflows like CWL, where remote URLs are included in the

However, this behavior is entirely engine-dependent, and since Sapporo aims to support multiple engines, we frequently used workflow attachments to handle this. For workflows like CWL, where remote URLs can be included in the

Currently, we only assume public URIs. That said, we're exploring OpenID Connect (OIDC) with WES and S3 storage. In short, we're testing a scenario where an access token obtained from an OIDC provider (e.g., Keycloak) is used to authenticate WES requests via an Authorization header, and the same token is used to access OIDC-integrated S3 storage (e.g., MinIO). Even if we integrate with a DRS layer, it's challenging to handle everything through just a workflow_url. Having a mechanism to enumerate workflow attachments in an object format like this could still be beneficial.

4. Enable Downloading of Run Outputs

Yes, I understand the difficulties in officially adopting this into the WES spec (but it does become necessary when implementing WES, doesn't it?).

Yes, we are zipping the actual files under the outputs directory and sending them as a stream. I understand your concerns here as well. We're hoping that as the DRS layer matures, these issues will be addressed. For example, after execution, the outputs could be uploaded to S3 (DRS), and
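The zip-and-send behavior described in this comment can be sketched with the Python standard library. This is only an illustration under stated assumptions, not Sapporo's actual code: the helper name `zip_outputs_dir` and the idea of one outputs directory per run are hypothetical, and the archive is buffered in memory for brevity rather than streamed chunk by chunk.

```python
import io
import zipfile
from pathlib import Path


def zip_outputs_dir(outputs_dir: Path) -> bytes:
    """Pack every regular file under outputs_dir into a ZIP archive.

    Hypothetical helper: the name and signature are assumptions made
    for this sketch, not part of WES or Sapporo.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, mode="w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(outputs_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to the outputs directory so the archive
                # does not expose the server's absolute filesystem layout.
                archive.write(path, arcname=path.relative_to(outputs_dir).as_posix())
    return buffer.getvalue()
```

A real server would likely write the archive incrementally to the response instead of holding it in memory, since run outputs can be large.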
We have completely rewritten Sapporo, a WES implementation, and released it as version 2.0.0. This new Sapporo is based on WES 1.1.0 and is implemented using Python's FastAPI. We first defined types based on WES 1.1.0, as seen in sapporo/schemas.py, and built the functionalities from there. Our goal was to avoid extending the original WES whenever possible. However, there were some necessary extensions that we implemented, which are summarized in this issue. We would be delighted if these extensions could be integrated into the original WES.
1. Fix ServiceInfo - default_workflow_engine_parameters Type

Current type: List[DefaultWorkflowEngineParameter]
Proposed type: Dict[str, List[DefaultWorkflowEngineParameter]]

Since WES 1.1.0 now supports multiple workflow engines within a single WES instance, this extension was necessary. The expected structure is as follows (example):
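As a rough illustration of the proposed type (not the example from the original issue), the mapping could be modeled as below. The engine keys `engine_a` and `engine_b`, the parameter values, and the `parameters_for` helper are all made up for this sketch; the three field names are, as far as I understand it, those of the WES `DefaultWorkflowEngineParameter` schema.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class DefaultWorkflowEngineParameter:
    # Field names assumed to mirror the WES schema of the same name.
    name: Optional[str] = None
    type: Optional[str] = None
    default_value: Optional[str] = None


# Proposed shape: one parameter list per workflow engine.
ParametersByEngine = Dict[str, List[DefaultWorkflowEngineParameter]]

# Placeholder engine names and values, for illustration only.
EXAMPLE: ParametersByEngine = {
    "engine_a": [
        DefaultWorkflowEngineParameter(name="--cpu", type="int", default_value="4"),
    ],
    "engine_b": [],
}


def parameters_for(engine: str, table: ParametersByEngine) -> List[DefaultWorkflowEngineParameter]:
    """Return the defaults for one engine, or an empty list if it is unknown."""
    return table.get(engine, [])
```

Keying by engine lets a client look up defaults for the engine it intends to use without filtering a flat list.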
2. Add sort_order and state Query Parameters to GET /runs

To enhance the GET /runs endpoint, we propose introducing query parameters for sorting and filtering runs:

- sort_order query parameter to sort runs based on start_time in either ascending or descending order. The value should be "asc" or "desc", with the default being "desc".
- state query parameter to filter runs based on their state, such as "COMPLETE", "RUNNING", etc.

3. Enhancing POST /runs to Support Download Workflow Attachments

Currently, workflow_attachment only supports attaching files via form-data. To improve its flexibility and functionality, we suggest the following enhancements:

- workflow_attachment_obj that includes a download URL. This would allow WES to download the attachment file at runtime and stage the file. An example object structure would be:

4. Enable Downloading of Run Outputs

We propose adding endpoints to download run outputs in various formats:

- GET /runs/{run_id}/outputs should return the outputs (outputs field of the response from GET /runs/{run_id}) in JSON format.
- GET /runs/{run_id}/outputs?download=true should return the outputs as a ZIP file.
- GET /runs/{run_id}/outputs/{path_to_file} should allow downloading a specific output file.
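To make proposal 2 concrete, the `sort_order` and `state` query parameters on GET /runs could behave roughly as in the sketch below. It is a minimal sketch assuming an in-memory list of runs; the `RunSummary` dataclass is a simplified stand-in and `list_runs` is a hypothetical function, neither taken from the WES specification nor from Sapporo.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class RunSummary:
    # Simplified stand-in for a run record (assumption for this sketch).
    run_id: str
    state: str
    start_time: datetime


def list_runs(
    runs: List[RunSummary],
    sort_order: str = "desc",
    state: Optional[str] = None,
) -> List[RunSummary]:
    """Filter runs by state (if given), then sort them by start_time."""
    if sort_order not in ("asc", "desc"):
        raise ValueError('sort_order must be "asc" or "desc"')
    # state=None means "no filter", matching an omitted query parameter.
    selected = [run for run in runs if state is None or run.state == state]
    return sorted(
        selected,
        key=lambda run: run.start_time,
        reverse=(sort_order == "desc"),
    )
```

For example, `list_runs(runs, sort_order="asc", state="COMPLETE")` would return only completed runs, oldest first, while calling `list_runs(runs)` with no arguments keeps every run and applies the proposed default of newest first.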