Differentiate between errors and restricted datasets #67

Open
bschilder opened this issue Nov 17, 2021 · 5 comments
Labels
enhancement New feature or request help wanted Extra attention is needed

Comments

@bschilder
Collaborator

bschilder commented Nov 17, 2021

Some OpenGWAS datasets don't show any data links on their website, I think because they're restricted-use (at least currently), e.g. https://gwas.mrcieu.ac.uk/datasets/finn-b-G6_ALZHEIMER/

Currently, MungeSumstats gives a message saying that this might be due to an incorrect ID or URL. However, it would be nice to let the user know that the dataset is simply restricted and won't be downloaded.

We could either do this:

  • at the beginning of the import_sumstats step, or
  • at the end of the find_sumstats step, with a message telling users that GWAS X, Y, Z were dropped from the results due to lack of accessibility (default). If so, we can add an arg to show all results regardless of accessibility (non-default).
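A rough sketch of the second option, written in Python only to illustrate the logic (MungeSumstats itself is R). The `accessible` field, the `show_all` argument, and both function names are hypothetical; how accessibility would actually be determined is left open here.

```python
from typing import Dict, List, Tuple


def filter_accessible(
    results: List[Dict], show_all: bool = False
) -> Tuple[List[Dict], List[str]]:
    """Split search results into kept rows and the IDs dropped as inaccessible.

    `accessible` is a hypothetical per-row flag; rows without it are
    treated as inaccessible.
    """
    if show_all:
        # Non-default: return everything, drop nothing.
        return list(results), []
    kept = [r for r in results if r.get("accessible", False)]
    dropped = [r["id"] for r in results if not r.get("accessible", False)]
    return kept, dropped


def dropped_message(dropped: List[str]) -> str:
    """Build the user-facing note about dropped datasets (empty if none)."""
    if not dropped:
        return ""
    return (
        f"{len(dropped)} GWAS dropped from the results due to lack of "
        f"accessibility: {', '.join(dropped)}"
    )
```

With `show_all=False` (the default) the caller would print `dropped_message(dropped)` after the search; with `show_all=True` nothing is filtered and no message is needed.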

On that note, I think there is a way you can provide your OpenGWAS access token and get access to additional GWAS you've been authorized for. We just need to pass up the arg to supply the token to import_sumstats.

@bschilder
Collaborator Author

Just checked if the OpenGWAS metadata can be used for this. They give a column called "group_name", which at first glance seems to be what I want.

But then I realised it always seems to be set to "public", even for the example above, which we know is missing.
Screenshot 2021-11-17 at 15 57 15

@bschilder bschilder added the enhancement New feature or request label Nov 17, 2021
@bschilder
Collaborator Author

bschilder commented Nov 17, 2021

Regarding the access token, I just remembered that we don't use OpenGWAS's code for downloading VCFs, because ieugwasr/gwasvcf only let you query a small subset of a given VCF, not download the entire VCF (at least last time I checked).
That's why I wrote all the downloader functions and construct the VCF URLs inside MungeSumstats.

So I'm not sure whether there's currently an API-accessible means of downloading private VCFs from OpenGWAS. Something to discuss with their team.
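Since the downloader builds the VCF URLs itself, one possible way to tell "restricted" apart from "wrong ID" would be to look at the HTTP status of the failed request. The sketch below is Python for illustration only, and it assumes the server actually returns different status codes for the two cases, which has not been verified in this thread. The function names and messages are hypothetical.

```python
def classify_download_status(status_code: int) -> str:
    """Map an HTTP status code to a coarse outcome label.

    Assumption: restricted datasets answer with 401/403 and unknown IDs
    with 404. This is standard HTTP usage, not confirmed OpenGWAS behaviour.
    """
    if 200 <= status_code < 300:
        return "ok"
    if status_code in (401, 403):
        return "restricted"
    if status_code == 404:
        return "not_found"
    return "error"


def user_message(dataset_id: str, status_code: int) -> str:
    """Return a message that separates restricted datasets from real errors."""
    outcome = classify_download_status(status_code)
    if outcome == "ok":
        return f"{dataset_id}: download available."
    if outcome == "restricted":
        return f"{dataset_id}: dataset appears to be restricted and will not be downloaded."
    if outcome == "not_found":
        return f"{dataset_id}: not found; this might be due to an incorrect ID or URL."
    return f"{dataset_id}: download failed with HTTP status {status_code}."
```

If the server turns out to return the same status for both cases, this approach would not work and the information would have to come from the metadata instead.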

@bschilder bschilder added the help wanted Extra attention is needed label Nov 17, 2021
@NathanSkene

NathanSkene commented Nov 18, 2021 via email

@Al-Murphy
Owner

My understanding was we couldn't use it since the package isn't on CRAN/Bioconductor. @bschilder can you confirm?

@bschilder
Collaborator Author

Yeah, there were two reasons:

  1. Exactly as @Al-Murphy mentioned, these packages are not distributed via CRAN/Bioc, so we can't make them deps for MungeSumstats (a Bioc restriction). The way I got the metadata search functions to work was by copying and pasting the relevant code.
  2. Neither of the packages allows you to download whole VCF files, which is what we need for MungeSumstats.

3 participants