Give "completeness" of genotyping file and file format. #358

gedankenstuecke · 2017-05-24T23:01:47Z

Further suggestions that came via email:

It would also be useful to have a column that indicates if the genotype is complete or partial. There are some users that have uploaded only chromosome 1 or Y.

Perhaps also a column indicating which format it is in. Some users have uploaded a PDF (scan) of the paper genotype document.

Both would be cool. For the format column, I think we'd just need to check the format during upload and reject everything where the type is not text or zip. The completeness column could be trickier, as in principle there's no "completeness" due to the varying nature of the input data. But we could at least count how many chromosomes are represented?
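A rough sketch of what counting the represented chromosomes could look like. It assumes a 23andMe-style tab-separated layout (rsid, chromosome, position, genotype, with `#` comment lines); the function name `chromosomes_represented` is hypothetical and this is not taken from the existing parsers.

```python
def chromosomes_represented(lines):
    """Return the set of chromosome labels found in a genotyping file.

    Assumes tab-separated rows of rsid, chromosome, position, genotype.
    Comment lines, blank lines, and rows with too few columns are skipped.
    """
    seen = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 4:
            continue  # not a genotype row under the assumed layout
        chrom = fields[1].strip().upper()
        if chrom == "CHROMOSOME":
            continue  # header row that was not commented out
        seen.add(chrom)
    return seen
```

A file that only covers chromosome 1 or Y would then yield a set with a single entry, which could be shown in a column without claiming anything about "completeness".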

tsujigiri · 2017-05-25T14:06:49Z

Don't we have the format in the database already? And if we have other formats than the ones we can handle, we should probably delete them or mark them as invalid, when the parsing fails.

gedankenstuecke · 2017-05-25T14:31:59Z

> if we have other formats than the ones we can handle, we should probably delete them or mark them as invalid, when the parsing fails.

Yes, that's what I tried to say. Our parsing routines seem to be too eager to accept data right now. For the files we already have we can check which ones don't fit. 👍

gedankenstuecke · 2017-06-01T12:35:55Z

After sleeping on this: I don't think it makes much sense for us to check how complete a genotyping file is. For example, the tipsy tests only cover 5 SNPs or so, yet they are complete in the sense that they tested everything they set out to test.

The format problem should rather be tackled during the upload, since our parsers are sometimes too lenient in accepting data. That should be its own issue, see #371
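For illustration, a minimal sketch of a stricter check at upload time, along the lines of "reject everything where the type is not text or zip". The function name `sniff_upload` and the return labels are made up for this example, and the real check would live in the upload code rather than in a standalone script.

```python
import io
import zipfile

PDF_MAGIC = b"%PDF-"  # PDF files start with this signature


def sniff_upload(data: bytes) -> str:
    """Classify raw upload bytes as 'zip', 'text', or 'rejected'."""
    if not data:
        return "rejected"  # empty upload
    if data.startswith(PDF_MAGIC):
        return "rejected"  # scanned paper reports end up here
    if zipfile.is_zipfile(io.BytesIO(data)):
        return "zip"
    if b"\x00" in data:
        return "rejected"  # NUL bytes suggest a binary file
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "rejected"
    return "text"
```

Passing this check only means the file is plausibly text or zip; the parser would still have to mark the genotype as invalid if it cannot read any rows from it.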

philippbayer · 2017-06-01T12:41:04Z

👍

gedankenstuecke added feature mozsprint labels May 24, 2017

gedankenstuecke closed this as completed Jun 1, 2017
