Error in labRcpp(ncol(lr)) : negative length vectors are not allowed #13
Comments
Hey! Thanks for your interest in propr! Hm, my first guess is that you have run out of RAM. Rcpp will try to make an object of size n.features * n.features, which takes up a lot of space. Usually this gives an error like "Error: cannot allocate vector of size 3.8 Gb". When I call "labRcpp(47000)" directly, I get the same error as you. Perhaps see here -- alyssafrazee/polyester#41 (comment) -- indeed, this may be a generic way that R says "no more memory!"
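For a rough sense of scale, here is a back-of-the-envelope sketch of the memory such an object would need. It assumes a dense n.features * n.features matrix of 8-byte doubles; the helper name `dense_matrix_bytes` is made up for illustration and is not part of propr.

```cpp
#include <cstdint>

// Hypothetical helper: bytes needed for a dense n x n matrix,
// assuming 8 bytes per cell (an R numeric / C++ double).
constexpr std::int64_t dense_matrix_bytes(std::int64_t n,
                                          std::int64_t bytes_per_cell = 8) {
    return n * n * bytes_per_cell;
}
```

Under those assumptions, `dense_matrix_bytes(47000)` comes to 17,672,000,000 bytes, i.e. roughly 17.7 GB.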
If you have some programming skills, I have a very basic script that divides the propr analysis into a few chunks, then glues those chunks back together at the end. It was intended for parallelization, but could be used to reduce RAM overhead too. I think you'd just need to replace the foreach() call with a basic for-loop.
@tpq Thanks for replying. I doubt this is a memory issue. I've been running this on a machine with 0.5TB of RAM, and memory consumption never went past 100GB.
Hm. I'm not sure if I can fix this quickly, but I will have a deeper think after the weekend.
@tpq Yes, I thought this might be an overflow somewhere in the cpp code, though 47k aren't enough to overflow a 32 bit signed integer, if we are looking at
Thanks for taking care of this issue. Update: I was wrong. While the result should fit into a 32-bit int, intermediate values can overflow, making the end result negative.
@tpq It seems one can get the same error message when trying to create a vector whose length exceeds a 31-bit number: https://stackoverflow.com/a/48676389/3846213 and https://stackoverflow.com/a/5234293/3846213.
@tpq Sorry for spamming so hard. Here I've tested the out-of-memory assumption by writing a function that should have worked with ncol = 47000. The function does nothing but allocate an integer vector. Assuming 64-bit integers, we are dealing with a mere ~8GB of RAM. Yet, we still get the same error. I am not exactly sure why this is happening, though. Given ncol = 47000, ncol * (ncol - 1) / 2 is about half of the 2^31 - 1 vector-length limit.
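The arithmetic behind this can be checked with a short sketch. It is not propr's code; the function names are hypothetical and it only illustrates the expression ncol * (ncol - 1) / 2 discussed above. The final count fits in a 32-bit signed int, but the intermediate product ncol * (ncol - 1) does not once ncol reaches 47000, which is consistent with 46000 features still working.

```cpp
#include <cstdint>
#include <limits>

// Number of unordered feature pairs, n * (n - 1) / 2, computed in 64 bits
// so the intermediate product cannot overflow for realistic n.
constexpr std::int64_t pair_count(std::int64_t n) {
    return n * (n - 1) / 2;
}

// True when the intermediate product n * (n - 1) exceeds INT32_MAX,
// i.e. when the same expression evaluated in 32-bit int would overflow.
constexpr bool product_overflows_int32(std::int64_t n) {
    return n * (n - 1) > std::numeric_limits<std::int32_t>::max();
}

// 32-bit-only variant: halve whichever factor is even before multiplying,
// so no intermediate value is larger than the final result.
constexpr std::int32_t pair_count_int32(std::int32_t n) {
    return (n % 2 == 0) ? (n / 2) * (n - 1) : n * ((n - 1) / 2);
}
```

For n = 47000 the product is 2,208,953,000, just above 2^31 - 1 = 2,147,483,647, while the pair count itself is 1,104,476,500. For n = 46000 the product is 2,115,954,000, which still fits. Reordering the division, as in `pair_count_int32`, is one way to stay within 32 bits; widening to a 64-bit type is another.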
No need to apologize -- thanks heaps for the reproducible example!! I'm actually a bit of a C++ newb, but I'm thinking that if we could replace with long int, it might work. It looks like it helps with the test function. I can try something similar with
@tpq I went over the codebase and refactored all integer arithmetic operations to avoid overflows in intermediate results. This allows us to avoid 64-bit integers. You should still add a warning in |
I have retracted my PR, because the overflow fix brought out another problem with integer division (#15). At this point the only sensible thing I can think of is to switch the argument type in |
@tpq I am under the impression that making the package compatible with the number of features I'm interested in will be a lot more complicated than changing a couple of 32-bit arguments to 64 bits. I've found other places that can overflow. For example,
Here |
I assume it should be possible to define index as a
I'll try to have a closer look at this soon -- sorry, it's been a busy week! |
@tpq I guess there are many other places like this one. And I'm not sure how this will play out in the end. I've already tried changing all ints to |
@tpq My understanding is that R has a very complicated relationship with 64-bit integers. There is no native 64-bit integer type, and R integer vectors are indexed by 32-bit signed ints, which limits their effective size (you can still have larger vectors, but there are indexing problems). I believe that's the reason why a simple swap of integer type doesn't work at the package level.
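One workaround that is sometimes used at the R/C++ boundary, mentioned here only as an assumption and not as something propr does, is to carry large counts as doubles, since a double represents every integer up to 2^53 exactly. The sketch below checks that property; the names `kMaxExactDouble` and `round_trips` are made up for illustration.

```cpp
#include <cstdint>

// 2^53: every integer with magnitude up to this value is exactly
// representable in an IEEE 754 double.
constexpr std::int64_t kMaxExactDouble = std::int64_t{1} << 53;

// Hypothetical helper: true if v survives a round trip through double.
// Intended for values well inside the int64 range.
constexpr bool round_trips(std::int64_t v) {
    return static_cast<std::int64_t>(static_cast<double>(v)) == v;
}
```

A product such as 47000 * 46999 = 2,208,953,000 round-trips without loss, whereas 2^53 + 1 does not, so this approach has its own (much higher) ceiling.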
Greetings. I'm experiencing what seems to be a size-dependent bug in propr. Here is a reproducible example:
The traceback doesn't show anything particularly useful:
Reducing the number of features to 46000 (and anything below that) "solves" the issue.