It would also be useful to have a column that indicates whether the genotype is complete or partial. Some users have uploaded only chromosome 1 or Y.
Perhaps also a column indicating which format the file is in. Some users have uploaded a PDF scan of their paper genotype document.
Both would be cool. For 2), I think we'd just need to check the format during upload and reject everything where the type is not text or zip. 1) could be trickier, as in principle there's no "completeness" given the varying nature of the input data. But we could at least count how many chromosomes are represented?
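A rough sketch of the chromosome count, just to make the idea concrete. It assumes a tab-separated text file with rsid, chromosome, position and genotype columns and `#` comment lines; the function name `count_chromosomes` and the sample rows are made up for illustration and are not part of our code base.

```python
import io


def count_chromosomes(lines):
    """Return the set of chromosome labels seen in a genotype file.

    Assumes tab-separated rows of rsid, chromosome, position, genotype,
    with comment lines starting with '#'. Other formats would need
    their own handling.
    """
    chromosomes = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.split("\t")
        if len(fields) < 4:
            continue  # skip rows that do not match the assumed layout
        chromosomes.add(fields[1])
    return chromosomes


# Made-up rows standing in for an upload that covers only chromosomes 1 and Y.
sample = io.StringIO(
    "# rsid\tchromosome\tposition\tgenotype\n"
    "rs0000001\t1\t100\tAA\n"
    "rs0000002\t1\t200\tAG\n"
    "rs0000003\tY\t300\tA\n"
)
found = count_chromosomes(sample)
print(len(found), sorted(found))
```

The count only says which chromosomes show up in a file, not whether the file is "complete" in any stronger sense.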
Don't we have the format in the database already? And if we have other formats than the ones we can handle, we should probably delete them or mark them as invalid, when the parsing fails.
> if we have other formats than the ones we can handle, we should probably delete them or mark them as invalid, when the parsing fails.
Yes, that's what I tried to say. Our parsing routines seem to be too eager to accept data right now. For the files we already have we can check which ones don't fit. 👍
After sleeping on this: I don't think it makes much sense for us to check how complete a genotyping file is. For example, the tipsy tests only test 5 SNPs or so, yet they are complete in the sense that they tested everything they wanted to test.
The format problem should rather be tackled during the upload -> sometimes our parsers are too lenient in accepting data. Should be its own issue, see #371
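For the upload-time check, something along these lines might work as a first filter. This is only a sketch: `sniff_upload` is a hypothetical helper, and treating "decodes as UTF-8" as "is text" is an assumption that would still need the real parsers behind it.

```python
import io
import zipfile


def sniff_upload(data: bytes) -> str:
    """Classify raw upload bytes as 'zip', 'text' or 'rejected'.

    Hypothetical helper: accepts only zip archives and UTF-8 text,
    and rejects PDFs explicitly because a PDF header is plain ASCII.
    """
    if data.startswith(b"%PDF"):
        return "rejected"  # scanned paper reports end up here
    if zipfile.is_zipfile(io.BytesIO(data)):
        return "zip"
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "rejected"  # binary content that is not a zip archive
    # Empty or whitespace-only uploads are not useful either.
    return "text" if data.strip() else "rejected"


print(sniff_upload(b"rs0000001\t1\t100\tAA\n"))
print(sniff_upload(b"%PDF-1.4\n"))
```

Files that pass this filter could still fail in the parser, so marking them as invalid when parsing fails would remain necessary.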
The first two suggestions in this issue came in via email.