Error in step #3 of projecting PC scores on new data #496
Here's the line-by-line debugging; the problem seems to stem from big_parallelize(). What would you like me to provide next?
|
Thanks for reporting. Does this work for you?
library(bigsnpr)
bedfile <- download_1000G("tmp-data")
obj.bed.aric <- obj.bed <- bed(bedfile)
bed_projectPCA(obj.bed, obj.bed.aric, k = 20, ind.row.new = rows_along(obj.bed.aric),
strand_flip = TRUE, join_by_pos = TRUE, verbose = TRUE, ncores = nb_cores()) |
It does! With the following object as a result:
List of 3
$ obj.svd.ref:List of 7
..$ d : num [1:20] 8238 5427 2966 2589 1288 ...
..$ u : num [1:2490, 1:20] -0.013 -0.0129 -0.013 -0.0128 -0.0129 ...
..$ v : num [1:378519, 1:20] 2.57e-03 -2.84e-05 2.64e-03 -1.46e-03 -1.19e-03 ...
..$ niter : num 7
..$ nops : num 234
..$ center: num [1:378519] 0.0691 0.0538 0.0703 0.3827 0.1522 ...
..$ scale : num [1:378519] 0.258 0.229 0.26 0.556 0.375 ...
..- attr(*, "class")= chr "big_SVD"
..- attr(*, "subset")= int [1:378519] 1 3 8 9 10 12 14 16 18 19 ...
..- attr(*, "lrldr")='data.frame': 5 obs. of 3 variables:
.. ..$ Chr : num [1:5] 6 6 6 15 17
.. ..$ Start: num [1:5] 30468791 33260215 33684313 25677831 1863804
.. ..$ Stop : num [1:5] 32800427 33512731 36288421 25823561 2359952
$ simple_proj: num [1:2490, 1:20] -107 -106 -107 -105 -106 ...
$ OADP_proj : num [1:2490, 1:20] -107 -107 -108 -106 -107 ...
Any suggestions on next steps? |
What do you get for |
Here it is |
FYI, when I also do the following with my obj.bed.aric (after following QC, removing related, restrict to HapMap3 etc.), it works with no issues. Is the issue something to do with subsetting datasets to the common SNPs between the reference and target?
|
You're using I'm not sure I understood everything you tried to do (QC, etc). Could you try only one of those at once to see which one solves the problem? We could then have a better guess on finding what's going on (I currently have no idea). |
No, it was just a test. I still receive the original error if I use 1000G as the reference and ARIC as the target dataset. QC stuff I mentioned was just going through what you have done on 1000G on your PCA vignette, nothing more than that. In any case, I still receive the error when the reference and target datasets are different. |
But you don't when using only ARIC? |
Yes, if I use either 1000G OR ARIC for both reference and target dataset within |
Would you be able to share the |
After you requested the bim file, I figured out the problem. Reference dataset |
Variant names should not matter.
library(bigsnpr)
bedfile <- download_1000G("tmp-data")
obj.bed <- bed(bedfile)
bigsnp <- snp_attach(snp_readBed2(bedfile, tempfile(), ind.col = 1:1000))
bigsnp$map$marker.ID <-
with(bigsnp$map, paste(chromosome, physical.pos, allele1, allele2, sep = ":"))
obj.bed.aric <- bed(snp_writeBed(bigsnp, tempfile(fileext = ".bed")))
bed_projectPCA(obj.bed, obj.bed.aric, k = 20, ind.row.new = rows_along(obj.bed.aric),
strand_flip = TRUE, join_by_pos = TRUE, verbose = TRUE, ncores = nb_cores()) |
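As a rough illustration of why variant names should not matter when `join_by_pos = TRUE`, here is a minimal base R sketch of position-based matching. The toy data frames and the `match_by_pos()` helper are hypothetical and only mimic the idea; this is not how bigsnpr implements its matching.

```r
# Toy variant tables with the same column names as a bigSNP map (values are made up).
ref <- data.frame(
  chromosome   = c(1, 1, 2),
  marker.ID    = c("rs1", "rs2", "rs3"),
  physical.pos = c(100, 200, 300),
  allele1      = c("A", "C", "G"),
  allele2      = c("G", "T", "A"),
  stringsAsFactors = FALSE
)

# Same variants, but renamed as chr:pos:a1:a2, as in the snippet above.
target <- ref
target$marker.ID <-
  with(target, paste(chromosome, physical.pos, allele1, allele2, sep = ":"))

# Hypothetical helper: join on chromosome, position and alleles, ignoring names.
match_by_pos <- function(ref, target) {
  merge(ref, target,
        by = c("chromosome", "physical.pos", "allele1", "allele2"),
        suffixes = c(".ref", ".target"))
}

matched   <- match_by_pos(ref, target)
n_by_pos  <- nrow(matched)                           # variants matched by position
n_by_name <- sum(ref$marker.ID %in% target$marker.ID) # variants matched by name
print(c(by_pos = n_by_pos, by_name = n_by_name))
```

In this toy case every variant matches by position even though no marker ID is shared, which is consistent with the expectation that renaming variants alone should not break the projection.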
Interesting, that is the only thing I changed. I will look into it a bit more to see whether I am missing something. |
If we can understand what was going on, that would be great. |
Any update on this? |
Hello,
I receive the following error when trying to project a new dataset onto the 1KG reference panel.
[Step 1/3] Matching variants of reference with target data..
1,233,111 variants to be matched.
6 ambiguous SNPs have been removed.
1,228,268 variants have been matched; 0 were flipped and 317,288 were reversed.
[Step 2/3] Computing (auto) SVD of reference..
Phase of clumping (on MAC) at r^2 > 0.2.. keep 302930 variants.
Discarding 0 variant with MAC < 10.
Iteration 1:
Computing SVD..
0 outlier variant detected..
Converged!
[Step 3/3] Projecting PC scores on new data..
Error in { : task 1 failed - "'...' used in an incorrect context"
Have you encountered this issue before (or the specific error I am receiving at the end)? Thank you for your help!
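For readers unfamiliar with the message itself: the `task 1 failed - "..."` wrapper is how foreach reports an error raised inside a parallel worker, and the inner message is a generic R error raised when `...` is evaluated in a function that has no `...` in scope. The base R sketch below only reproduces that generic message; `bad_forward()` and `good_forward()` are hypothetical names, and this does not identify where the error arises inside bigsnpr.

```r
# A function that uses `...` without declaring it among its own arguments.
bad_forward <- function(x) {
  list(x, ...)
}

# Capture the error message instead of stopping.
msg <- tryCatch(bad_forward(1), error = function(e) conditionMessage(e))
print(msg)

# Declaring `...` in the signature makes the forwarding valid.
good_forward <- function(x, ...) {
  sum(x, ...)
}
total <- good_forward(1, 2, 3)
print(total)
```

Rerunning the failing call with `ncores = 1` and then calling `traceback()` may help locate the function involved, assuming the sequential path surfaces the same error.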