How to Submit Data (API)
Data can be submitted to the API programmatically, for example from a script.
You can explore the data submission API through the Swagger API Docs.
To submit data to the API, you need a personal access token. To retrieve your token:
- Log in to the Virus-Seq portal and open the Profile and Token page from the user navigation bar at the top right.
- Click the Copy button next to your personal token.
!! NOTE !!
- Your access token is associated with your user credentials and should NEVER be shared with anyone.
- Your access token is valid for only 24 hours.
After preparing your data files, use your personal token to send a submission request to the API.
For example, using curl:
curl --location --request POST 'https://muse.virusseq-dataportal.ca/submissions' \
--header 'Authorization: Bearer <token goes here>' \
--form 'files=@"/path/to/fasta/file-or-files/L00212401.fasta"' \
--form 'files=@"/path/to/metadata/file/metadata.tsv"'
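The same request can also be assembled from a script. The sketch below is one possible approach and is not part of the portal's tooling: it builds the curl argument list shown above, adding one --form entry per file. The function name build_submit_command is hypothetical; the URL, header, and files field name are taken from the curl example.

```python
from typing import List, Sequence

SUBMISSIONS_URL = "https://muse.virusseq-dataportal.ca/submissions"


def build_submit_command(token: str, fasta_paths: Sequence[str], tsv_path: str) -> List[str]:
    """Return the curl argument list for one submission (hypothetical helper)."""
    if not fasta_paths:
        raise ValueError("at least one FASTA file is required")
    cmd = [
        "curl", "--location", "--request", "POST", SUBMISSIONS_URL,
        "--header", f"Authorization: Bearer {token}",
    ]
    # One --form entry per file, all under the same "files" field name,
    # with the single metadata TSV listed last.
    for path in [*fasta_paths, tsv_path]:
        cmd += ["--form", f'files=@"{path}"']
    return cmd
```

Passing the returned list to subprocess.run(cmd, check=True, capture_output=True, text=True) would send the request and make the response body available as stdout. Keep in mind that command-line arguments can be visible to other users on a shared machine, so this may not be appropriate everywhere given that the token must never be shared.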
If your files were formatted correctly, you will receive a submission id in response:
{
  "submissionId": "a941f97f-6408-4886-b9ca-d852606e3072"
}
If there was an issue with the format of the files, or if the number of viral genomes in the metadata TSV does not match the number of viral genomes submitted in the FASTA files, you will receive an error. For example:
{
  "status": "BAD_REQUEST",
  "message": "Headers are incorrect!",
  "errorInfo": {
    "unknownHeaders": [],
    "missingHeaders": [
      "GISAID accession",
      "diagnostic pcr Ct value null reason"
    ]
  }
}
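In a script, the two kinds of response can be told apart by checking for the submissionId field. The following is a minimal sketch based only on the two example bodies shown above; the helper name is hypothetical, and other errors may well return an errorInfo object with a different shape, so every field is read defensively.

```python
import json
from typing import Any, Dict


def interpret_submission_response(body: str) -> Dict[str, Any]:
    """Classify a response body as accepted or rejected (hypothetical helper)."""
    payload = json.loads(body)
    if "submissionId" in payload:
        return {"accepted": True, "submission_id": payload["submissionId"]}
    # Assumption: error bodies look like the BAD_REQUEST example above.
    info = payload.get("errorInfo") or {}
    return {
        "accepted": False,
        "status": payload.get("status"),
        "message": payload.get("message"),
        "missing_headers": info.get("missingHeaders", []),
        "unknown_headers": info.get("unknownHeaders", []),
    }
```

For the error example above, the result would list the two missing headers, which tells you which columns to add to the TSV before resubmitting.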
Troubleshoot the issues with the files until the upload proceeds. Common things to check include:
- Make sure all the required headers are present. The latest example TSV can be found here.
- Make sure the samples listed in the metadata file match the samples in the provided FASTA file(s).
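Both checks can be approximated locally before uploading. The sketch below is an illustration under several assumptions: the list of required headers is supplied by you (for example, copied from the example TSV), .gz files are assumed to be gzip-compressed FASTA, and samples are only counted, not matched by name, because the mapping between TSV rows and FASTA headers is not described here. The function names are hypothetical.

```python
import csv
import gzip
from typing import Dict, Iterable, Sequence


def _open_text(path: str):
    """Open a plain or gzip-compressed file as text."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt", encoding="utf-8")
    return open(path, "r", encoding="utf-8")


def count_fasta_records(paths: Iterable[str]) -> int:
    """Count sequences by counting header lines, which start with '>'."""
    total = 0
    for path in paths:
        with _open_text(path) as handle:
            total += sum(1 for line in handle if line.startswith(">"))
    return total


def preflight(tsv_path: str, fasta_paths: Sequence[str],
              required_headers: Sequence[str]) -> Dict[str, object]:
    """Compare TSV headers and row count against the FASTA record count."""
    with open(tsv_path, "r", encoding="utf-8", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        headers = next(reader, [])
        # Ignore completely empty rows, such as a trailing blank line.
        rows = sum(1 for row in reader if any(cell.strip() for cell in row))
    fasta_records = count_fasta_records(fasta_paths)
    return {
        "missing_headers": [h for h in required_headers if h not in headers],
        "tsv_rows": rows,
        "fasta_records": fasta_records,
        "counts_match": rows == fasta_records,
    }
```

A passing local check does not guarantee that the API will accept the files; it only catches the two problems named above early.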
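Once you have a submission id, you can check the status of each upload in that submission by querying the uploads endpoint:
curl --location --request GET 'https://muse.virusseq-dataportal.ca/uploads?page=0&size=100&sortDirection=DESC&sortField=createdAt&submissionId={submission id goes here}' \
--header 'Authorization: Bearer <token goes here>'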
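The query string in the request above can be built programmatically, which makes it easier to substitute the submission id and to request further pages. This is a sketch using only the Python standard library; uploads_url and fetch_uploads are hypothetical names, and the idea that a later page is reached by increasing the page parameter is an assumption drawn from the parameter names, not confirmed behaviour.

```python
import json
import urllib.request
from urllib.parse import urlencode

UPLOADS_URL = "https://muse.virusseq-dataportal.ca/uploads"


def uploads_url(submission_id: str, page: int = 0, size: int = 100) -> str:
    """Build the uploads query URL with the same parameters as the curl example."""
    query = urlencode({
        "page": page,
        "size": size,
        "sortDirection": "DESC",
        "sortField": "createdAt",
        "submissionId": submission_id,
    })
    return f"{UPLOADS_URL}?{query}"


def fetch_uploads(token: str, submission_id: str, page: int = 0) -> dict:
    """Send the GET request and return the decoded JSON body."""
    request = urllib.request.Request(
        uploads_url(submission_id, page=page),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```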
Each viral genome included in the submission is returned as an object in an array. Each object includes:
- a status
- an error message, if any
- a unique id called analysisId
The example below shows an ERROR payload (something was wrong with the submitted data) followed by a successful COMPLETE upload.
"data": [
  {
    "submissionId": "be938d36-3614-410c-baae-b514daf1c4ab",
    "studyId": "DRGN-INTL",
    "submitterSampleId": "DRGN_45596",
    "status": "ERROR",
    "originalFilePair": [
      "DRGNtest.fasta",
      "DRGN_metadata.tsv"
    ],
    "analysisId": null,
    "error": "400 BAD_REQUEST - [SubmitService::schema.violation] - #/host/host_age_unit: years is not a valid enum value"
  },
  {
    "submissionId": "be938d36-3614-410c-baae-b514daf1c4ab",
    "studyId": "DRGN-INTL",
    "submitterSampleId": "DRGN_45601",
    "status": "COMPLETE",
    "originalFilePair": [
      "DRGNtest.fasta",
      "DRGN_metadata.tsv"
    ],
    "analysisId": "610db281-393f-4c29-8db2-81393fcc29b0",
    "error": null
  }
]
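For a submission with many genomes, it can help to summarize the payloads rather than read them one by one. The sketch below assumes the response has been parsed into a dictionary whose data key holds the array shown above (the enclosing object is not shown in the example, so this is an assumption), and the helper name is hypothetical.

```python
from collections import Counter
from typing import Any, Dict, List, Optional, Tuple


def summarize_uploads(response: Dict[str, Any]) -> Tuple[Counter, List[Tuple[Optional[str], Optional[str]]]]:
    """Count uploads per status and collect (sample id, error) for failures."""
    uploads = response.get("data", [])
    counts = Counter(item.get("status") for item in uploads)
    failures = [
        (item.get("submitterSampleId"), item.get("error"))
        for item in uploads
        if item.get("status") == "ERROR"
    ]
    return counts, failures
```

Applied to the example above, this would report one ERROR and one COMPLETE upload, and pair DRGN_45596 with its host_age_unit error message so that the corresponding TSV row can be corrected.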
- If your token has expired, your submission will fail.
- Only one TSV can be uploaded per submission, together with one or more .fasta, .fa, or .gz files.
- Please limit individual submissions to 5000 samples or fewer.
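If you have more than 5000 samples, one way to stay within the limit is to split them into batches and submit each batch separately. The helper below is a generic sketch with a hypothetical name; it only groups items. Writing one TSV per batch and the FASTA file(s) containing the same samples is left to your own tooling.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")

MAX_SAMPLES_PER_SUBMISSION = 5000


def batches(samples: Sequence[T], limit: int = MAX_SAMPLES_PER_SUBMISSION) -> Iterator[List[T]]:
    """Yield consecutive groups of at most `limit` samples, preserving order."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    for start in range(0, len(samples), limit):
        yield list(samples[start:start + limit])
```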