Debugging and Continued Stability of RAREsim v2 #6
Comments
Hi @JessMurphy, sorry for the delay in response. Our workload is fairly full right now, so we may not be able to help on an ongoing basis at this time. However, we may be able to provide a few consultations. At the very least, we can sit down with you in an initial meeting to see exactly what your needs are, in detail. Would you mind finding a time in our booking tool here where we can chat: https://outlook.office365.com/owa/calendar/[email protected]/bookings/
Attached are data and example code to generate the three errors discussed above.
Hi Jess, we have some insights on these problems. A side note, I'd recommend formatting all the Python with Black and the C with something equivalent. @falquaddoomi and @d33bs can feel free to comment with more insights.
Problem 1
Skipped for now since it seems you at least have a work-around in place. I'll continue to investigate, but this problem might require more knowledge about the science and thus require a sit-down. I'm not a biologist or data scientist really, so I don't know what this comment in the bash script means:
Problem 2
I don't think this is actually a problem, I think the warning is correct.
Looking in the legend file, it is indeed 19029 long (minus the first header row):
Then let's try to trace back the matrix row count of 19027:
TL;DR the number of rows/cols in the matrix is read directly from the first 8 bytes of the matrix file. So, no error in the C code. I think it is a problem with whatever generated the matrix file. Or perhaps the matrix really was 19027 rows and this is just a human mistake?
Side note
Here:
if M.num_rows() != len(legend):
    # TODO: This check has a bug in it somewhere. Likely in the C code
    # raise DifferingLengths(f"Lengths of legend {len(legend)} and hap {M.num_rows()} files do not match")
    print(f"WARNING: Lengths of legend {len(legend)} and hap {M.num_rows()} files do not match")
You can uncomment that
Problem 3
This boils down to a list being sorted in
TL;DR I think the intent is to get all the lists (of matrix indices, presumably) in
if z:
    # from itertools import chain
    all_kept_rows = list(merge(all_kept_rows, sorted(chain(*R.values()))))
    # OR
    all_kept_rows = list(merge(all_kept_rows, sorted([item for sublist in R.values() for item in sublist])))
I believe the
Faisal suggested that maybe the
Dave brings up that older versions of Python treated
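To make the flatten-then-merge idea from Problem 3 concrete, here is a minimal standalone sketch. It assumes that `merge` in the snippet above is `heapq.merge` (which expects every input to already be sorted) and that `R` maps keys to lists of integer row indices; the function name `merge_kept_rows` and the sample data are hypothetical and only for illustration.

```python
from heapq import merge
from itertools import chain


def merge_kept_rows(all_kept_rows, R):
    """Merge every row index held in R's lists into the sorted all_kept_rows."""
    # Flatten the per-key lists into one iterable, then sort it, because
    # heapq.merge only produces sorted output when each input is sorted.
    extra = sorted(chain.from_iterable(R.values()))
    return list(merge(all_kept_rows, extra))


# Hypothetical sample data: an already-sorted list plus a dict of unsorted lists.
all_kept_rows = [1, 4, 9]
R = {0: [7, 2], 1: [], 2: [5]}

merged = merge_kept_rows(all_kept_rows, R)
print(merged)
```

`chain.from_iterable(R.values())` is equivalent to `chain(*R.values())` but avoids unpacking the whole dict into arguments. If `all_kept_rows` itself is not guaranteed to be sorted, sorting the concatenation of both lists would be the safer choice.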
Thanks, Vince! Yes, Problem 1 is not that urgent, but I would be more than happy to meet to try to further explain it. For Problem 2, both the legend file and initial haplotype file (.gz) have 19029 rows but when the initial haplotype file is converted into a sparse matrix (.sm) using convert.py it somehow only has 19027 rows (when it should have 19029). So, we think it is an issue with the sparse function called in the convert.py script. And I will look further into Problem 3.
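One way to check the counts described here independently of RAREsim is to count the lines of the gzipped haplotype file and read the row/column header from the converted matrix. This is only a sketch: the thread says the dimensions come from the first 8 bytes of the matrix file, but the layout used below (two little-endian unsigned 32-bit integers, rows first) is an assumption, and `count_lines` and `read_header` are hypothetical helper names. Remember that the legend file has one header row to subtract before comparing.

```python
import gzip
import os
import struct
import tempfile


def count_lines(path):
    """Count lines in a plain or gzip-compressed text file."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rb") as fh:
        return sum(1 for _ in fh)


def read_header(path, byte_order="<"):
    """Read (rows, cols) from the first 8 bytes of a matrix file.

    ASSUMPTION: two unsigned 32-bit integers, rows then cols, little-endian.
    """
    with open(path, "rb") as fh:
        raw = fh.read(8)
    if len(raw) != 8:
        raise ValueError("file is too short to contain an 8-byte header")
    return struct.unpack(byte_order + "II", raw)


# Demo on throwaway files so the sketch runs without the real data.
with tempfile.TemporaryDirectory() as tmp:
    haps = os.path.join(tmp, "demo.haps.gz")
    with gzip.open(haps, "wb") as fh:
        fh.write(b"0 1 0\n1 0 0\n0 0 0\n")
    sm = os.path.join(tmp, "demo.sm")
    with open(sm, "wb") as fh:
        fh.write(struct.pack("<II", 3, 3))
    n_lines = count_lines(haps)
    header = read_header(sm)

print(n_lines, header)
```

If the line count of the `.haps.gz` file and the header row count disagree on the real files, the mismatch is introduced during conversion rather than by the legend check.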
Ah okay, I didn't notice that that
So, at least in the specific case of the
In
// if (m->rows < row + 1)
//     m->rows = row + 1;
and in
if ( (int)buffer[i] == NEWLINE ) {
    // ...
    M->rows += 1;
    continue;
}
Please test this out with various different files to make sure it works reliably. I'm a little perplexed at why the row count was conditionally incremented the way it was, in that separate function. It seems like the parsing could be done in a simpler way with less code repetition.
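The difference between the two row-counting strategies can be illustrated in Python. This is a hedged sketch of the suspected mechanism, not a port of the C code: it assumes each line of the haplotype file is a whitespace-separated row of `0`/`1` tokens, and the function names are hypothetical. Counting rows as "highest row index that received an entry, plus one" misses trailing rows with no nonzero entries, whereas counting newlines does not.

```python
def rows_by_max_index(lines):
    """Row count as (highest row index holding a nonzero entry) + 1."""
    rows = 0
    for row, line in enumerate(lines):
        # A row only bumps the count when it contributes at least one entry.
        if any(token != "0" for token in line.split()):
            rows = max(rows, row + 1)
    return rows


def rows_by_newline(lines):
    """Row count as the number of lines, regardless of their content."""
    return sum(1 for _ in lines)


# Hypothetical data: the last two rows contain no nonzero entries.
demo = ["0 1 0", "1 0 0", "0 0 0", "0 0 0"]

by_index = rows_by_max_index(demo)
by_newline = rows_by_newline(demo)
print(by_index, by_newline)
```

On this sample the first strategy reports two rows fewer than the second, which is the same kind of gap as 19027 versus 19029, though whether that is the actual cause should be confirmed on the real files.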
Thanks! So would you be able to update the code with the potential fixes for Problems 2 and 3 so we can test it out? No one on our team currently knows C, and I have minimal experience with Python.
For the z flag, in the PR I forgot to do
For the sparse matrix length difference, are you still running that
If I first generate the
python3 ${WD}/raresim/convert.py \
    -i ${WD}/example_code/chr19.block37.NFE.sim3.controls.haps.gz \
    -o ${WD}/example_code/chr19.block37.NFE.sim3.controls.haps.sm
...it generates a
I'll probably need more details to be able to help, I didn't know about a singularity or a server running this code. Is that the ultimate intended environment for this to run in? The readme doesn't mention it. Can you verify the server has the latest version of the C code, with the added
I'm running it locally with Python v3.11.2, and getting the following output:
Log
(base) Vincents-MacBook-Pro:raresim vincerubinetti$ python3 setup.py install
/Users/vincerubinetti/Desktop/raresim/raresim/setup.py:7: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
import pkg_resources
/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/setuptools/_distutils/cmd.py:66: SetuptoolsDeprecationWarning: setup.py install is deprecated.
!!
********************************************************************************
Please avoid running ``setup.py`` directly.
Instead, use pypa/build, pypa/installer or other
standards-based tools.
See https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html for details.
********************************************************************************
!!
self.initialize_options()
/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/setuptools/_distutils/cmd.py:66: EasyInstallDeprecationWarning: easy_install command is deprecated.
!!
********************************************************************************
Please avoid running ``setup.py`` and ``easy_install``.
Instead, use pypa/build, pypa/installer or other
standards-based tools.
See https://github.com/pypa/setuptools/issues/917 for details.
********************************************************************************
!!
self.initialize_options()
Compiling rareSim.pyx because it changed.
[1/1] Cythonizing rareSim.pyx
/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/Cython/Compiler/Main.py:381: FutureWarning: Cython directive 'language_level' not set, using '3str' for now (Py3). This has changed from earlier releases! File: /Users/vincerubinetti/Desktop/raresim/raresim/rareSim.pyx
tree = Parsing.p_module(s, pxd, full_module_name)
warning: rsdec.pxd:11:28: Non-trivial type declarators in shared declaration (e.g. mix of pointers and values). Each pointer declaration should be on its own line.
warning: rareSim.pyx:9:30: Unknown type declaration 'void' in annotation, ignoring
warning: rareSim.pyx:49:31: Unknown type declaration 'void' in annotation, ignoring
warning: rareSim.pyx:62:32: Unknown type declaration 'void' in annotation, ignoring
lib/raresim/src/lists.c:223:18: warning: comparison of integers of different signs: 'int' and 'uint32_t' (aka 'unsigned int') [-Wsign-compare]
for (i =0; i < rows; ++i) {
~ ^ ~~~~
lib/raresim/src/lists.c:238:19: warning: comparison of integers of different signs: 'int' and 'uint32_t' (aka 'unsigned int') [-Wsign-compare]
for (i = 0; i < (*m)->size; ++i) {
~ ^ ~~~~~~~~~~
lib/raresim/src/lists.c:276:14: warning: unused variable 'ret' [-Wunused-variable]
uint32_t ret = uint32_t_array_add(m->data[row], val);
^
lib/raresim/src/lists.c:264:30: warning: comparison of integers of different signs: 'int' and 'uint32_t' (aka 'unsigned int') [-Wsign-compare]
for (i = old_size; i < m->size; ++i) {
~ ^ ~~~~~~~
lib/raresim/src/lists.c:327:19: warning: comparison of integers of different signs: 'int' and 'uint32_t' (aka 'unsigned int') [-Wsign-compare]
for (i = 0; i < m->rows; ++i) {
~ ^ ~~~~~~~
lib/raresim/src/lists.c:338:19: warning: comparison of integers of different signs: 'int' and 'uint32_t' (aka 'unsigned int') [-Wsign-compare]
for (i = 0; i < m->rows; ++i) {
~ ^ ~~~~~~~
lib/raresim/src/lists.c:384:19: warning: comparison of integers of different signs: 'int' and 'uint32_t' (aka 'unsigned int') [-Wsign-compare]
for (i = 0; i < m->rows; ++i) {
~ ^ ~~~~~~~
lib/raresim/src/lists.c:513:11: warning: unused variable 'r' [-Wunused-variable]
char *r = strcpy(last_3, file_name + strlen(file_name) - 3);
^
lib/raresim/src/lists.c:641:17: warning: unused variable 'r' [-Wunused-variable]
int r = uint32_t_sparse_matrix_add(M, *row, *col);
^
lib/raresim/src/lists.c:659:14: warning: unused variable 'ret' [-Wunused-variable]
uint32_t ret = uint32_t_sparse_matrix_write(m, fp);
^
10 warnings generated.
lib/raresim/src/lists.c:223:18: warning: comparison of integers of different signs: 'int' and 'uint32_t' (aka 'unsigned int') [-Wsign-compare]
for (i =0; i < rows; ++i) {
~ ^ ~~~~
lib/raresim/src/lists.c:238:19: warning: comparison of integers of different signs: 'int' and 'uint32_t' (aka 'unsigned int') [-Wsign-compare]
for (i = 0; i < (*m)->size; ++i) {
~ ^ ~~~~~~~~~~
lib/raresim/src/lists.c:276:14: warning: unused variable 'ret' [-Wunused-variable]
uint32_t ret = uint32_t_array_add(m->data[row], val);
^
lib/raresim/src/lists.c:264:30: warning: comparison of integers of different signs: 'int' and 'uint32_t' (aka 'unsigned int') [-Wsign-compare]
for (i = old_size; i < m->size; ++i) {
~ ^ ~~~~~~~
lib/raresim/src/lists.c:327:19: warning: comparison of integers of different signs: 'int' and 'uint32_t' (aka 'unsigned int') [-Wsign-compare]
for (i = 0; i < m->rows; ++i) {
~ ^ ~~~~~~~
lib/raresim/src/lists.c:338:19: warning: comparison of integers of different signs: 'int' and 'uint32_t' (aka 'unsigned int') [-Wsign-compare]
for (i = 0; i < m->rows; ++i) {
~ ^ ~~~~~~~
lib/raresim/src/lists.c:384:19: warning: comparison of integers of different signs: 'int' and 'uint32_t' (aka 'unsigned int') [-Wsign-compare]
for (i = 0; i < m->rows; ++i) {
~ ^ ~~~~~~~
lib/raresim/src/lists.c:513:11: warning: unused variable 'r' [-Wunused-variable]
char *r = strcpy(last_3, file_name + strlen(file_name) - 3);
^
lib/raresim/src/lists.c:641:17: warning: unused variable 'r' [-Wunused-variable]
int r = uint32_t_sparse_matrix_add(M, *row, *col);
^
lib/raresim/src/lists.c:659:14: warning: unused variable 'ret' [-Wunused-variable]
uint32_t ret = uint32_t_sparse_matrix_write(m, fp);
^
10 warnings generated.
lib/zlib-1.2.11/adler32.c:63:15: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
uLong ZEXPORT adler32_z(adler, buf, len)
^
lib/zlib-1.2.11/adler32.c:134:15: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
uLong ZEXPORT adler32(adler, buf, len)
^
lib/zlib-1.2.11/adler32.c:143:13: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
local uLong adler32_combine_(adler1, adler2, len2)
^
lib/zlib-1.2.11/adler32.c:172:15: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
uLong ZEXPORT adler32_combine(adler1, adler2, len2)
^
lib/zlib-1.2.11/adler32.c:180:15: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
uLong ZEXPORT adler32_combine64(adler1, adler2, len2)
^
5 warnings generated.
lib/zlib-1.2.11/adler32.c:63:15: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
uLong ZEXPORT adler32_z(adler, buf, len)
^
lib/zlib-1.2.11/adler32.c:134:15: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
uLong ZEXPORT adler32(adler, buf, len)
^
lib/zlib-1.2.11/adler32.c:143:13: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
local uLong adler32_combine_(adler1, adler2, len2)
^
lib/zlib-1.2.11/adler32.c:172:15: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
uLong ZEXPORT adler32_combine(adler1, adler2, len2)
^
lib/zlib-1.2.11/adler32.c:180:15: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
uLong ZEXPORT adler32_combine64(adler1, adler2, len2)
^
5 warnings generated.
lib/zlib-1.2.11/compress.c:22:13: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
int ZEXPORT compress2 (dest, destLen, source, sourceLen, level)
^
lib/zlib-1.2.11/compress.c:68:13: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
int ZEXPORT compress (dest, destLen, source, sourceLen)
^
lib/zlib-1.2.11/compress.c:81:15: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
uLong ZEXPORT compressBound (sourceLen)
^
3 warnings generated.
lib/zlib-1.2.11/compress.c:22:13: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
int ZEXPORT compress2 (dest, destLen, source, sourceLen, level)
^
lib/zlib-1.2.11/compress.c:68:13: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
int ZEXPORT compress (dest, destLen, source, sourceLen)
^
lib/zlib-1.2.11/compress.c:81:15: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
uLong ZEXPORT compressBound (sourceLen)
^
3 warnings generated.
lib/zlib-1.2.11/crc32.c:202:23: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
unsigned long ZEXPORT crc32_z(crc, buf, len)
^
lib/zlib-1.2.11/crc32.c:237:23: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
unsigned long ZEXPORT crc32(crc, buf, len)
^
lib/zlib-1.2.11/crc32.c:266:21: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
local unsigned long crc32_little(crc, buf, len)
^
lib/zlib-1.2.11/crc32.c:306:21: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
local unsigned long crc32_big(crc, buf, len)
^
lib/zlib-1.2.11/crc32.c:344:21: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
local unsigned long gf2_matrix_times(mat, vec)
^
lib/zlib-1.2.11/crc32.c:361:12: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
local void gf2_matrix_square(square, mat)
^
lib/zlib-1.2.11/crc32.c:372:13: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
local uLong crc32_combine_(crc1, crc2, len2)
^
lib/zlib-1.2.11/crc32.c:428:15: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
uLong ZEXPORT crc32_combine(crc1, crc2, len2)
^
lib/zlib-1.2.11/crc32.c:436:15: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
uLong ZEXPORT crc32_combine64(crc1, crc2, len2)
^
9 warnings generated.
lib/zlib-1.2.11/crc32.c:202:23: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
unsigned long ZEXPORT crc32_z(crc, buf, len)
^
lib/zlib-1.2.11/crc32.c:237:23: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
unsigned long ZEXPORT crc32(crc, buf, len)
^
lib/zlib-1.2.11/crc32.c:266:21: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
local unsigned long crc32_little(crc, buf, len)
^
lib/zlib-1.2.11/crc32.c:306:21: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
local unsigned long crc32_big(crc, buf, len)
^
lib/zlib-1.2.11/crc32.c:344:21: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
local unsigned long gf2_matrix_times(mat, vec)
^
lib/zlib-1.2.11/crc32.c:361:12: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
local void gf2_matrix_square(square, mat)
^
lib/zlib-1.2.11/crc32.c:372:13: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
local uLong crc32_combine_(crc1, crc2, len2)
^
lib/zlib-1.2.11/crc32.c:428:15: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
uLong ZEXPORT crc32_combine(crc1, crc2, len2)
^
lib/zlib-1.2.11/crc32.c:436:15: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
uLong ZEXPORT crc32_combine64(crc1, crc2, len2)
^
9 warnings generated.
lib/zlib-1.2.11/deflate.c:201:12: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
local void slide_hash(s)
^
lib/zlib-1.2.11/deflate.c:228:13: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
int ZEXPORT deflateInit_(strm, level, version, stream_size)
^
lib/zlib-1.2.11/deflate.c:240:13: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
int ZEXPORT deflateInit2_(strm, level, method, windowBits, memLevel, strategy,
^
lib/zlib-1.2.11/deflate.c:353:11: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
local int deflateStateCheck (strm)
^
lib/zlib-1.2.11/deflate.c:376:13: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
int ZEXPORT deflateSetDictionary (strm, dictionary, dictLength)
^
lib/zlib-1.2.11/deflate.c:445:13: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
int ZEXPORT deflateGetDictionary (strm, dictionary, dictLength)
^
lib/zlib-1.2.11/deflate.c:467:13: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
int ZEXPORT deflateResetKeep (strm)
^
lib/zlib-1.2.11/deflate.c:505:13: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
int ZEXPORT deflateReset (strm)
^
lib/zlib-1.2.11/deflate.c:517:13: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
int ZEXPORT deflateSetHeader (strm, head)
^
lib/zlib-1.2.11/deflate.c:528:13: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
int ZEXPORT deflatePending (strm, pending, bits)
^
lib/zlib-1.2.11/deflate.c:542:13: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
int ZEXPORT deflatePrime (strm, bits, value)
^
lib/zlib-1.2.11/deflate.c:568:13: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
int ZEXPORT deflateParams(strm, level, strategy)
^
lib/zlib-1.2.11/deflate.c:617:13: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
int ZEXPORT deflateTune(strm, good_length, max_lazy, nice_length, max_chain)
^
lib/zlib-1.2.11/deflate.c:652:15: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
uLong ZEXPORT deflateBound(strm, sourceLen)
^
lib/zlib-1.2.11/deflate.c:716:12: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
local void putShortMSB (s, b)
^
lib/zlib-1.2.11/deflate.c:730:12: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
local void flush_pending(strm)
^
lib/zlib-1.2.11/deflate.c:763:13: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
int ZEXPORT deflate (strm, flush)
^
lib/zlib-1.2.11/deflate.c:1076:13: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
int ZEXPORT deflateEnd (strm)
^
lib/zlib-1.2.11/deflate.c:1102:13: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
int ZEXPORT deflateCopy (dest, source)
^
lib/zlib-1.2.11/deflate.c:1164:16: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
local unsigned read_buf(strm, buf, size)
^
lib/zlib-1.2.11/deflate.c:1194:12: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
local void lm_init (s)
^
lib/zlib-1.2.11/deflate.c:1236:12: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
local uInt longest_match(s, cur_match)
^
lib/zlib-1.2.11/deflate.c:1482:12: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
local void fill_window(s)
^
lib/zlib-1.2.11/deflate.c:1643:19: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
local block_state deflate_stored(s, flush)
^
lib/zlib-1.2.11/deflate.c:1824:19: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
local block_state deflate_fast(s, flush)
^
lib/zlib-1.2.11/deflate.c:1926:19: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
local block_state deflate_slow(s, flush)
^
lib/zlib-1.2.11/deflate.c:2057:19: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
local block_state deflate_rle(s, flush)
^
lib/zlib-1.2.11/deflate.c:2130:19: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
local block_state deflate_huff(s, flush)
^
28 warnings generated.
lib/zlib-1.2.11/deflate.c:201:12: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
local void slide_hash(s)
^
lib/zlib-1.2.11/deflate.c:228:13: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
int ZEXPORT deflateInit_(strm, level, version, stream_size)
^
lib/zlib-1.2.11/deflate.c:240:13: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
int ZEXPORT deflateInit2_(strm, level, method, windowBits, memLevel, strategy,
^
lib/zlib-1.2.11/deflate.c:353:11: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
local int deflateStateCheck (strm)
^
lib/zlib-1.2.11/deflate.c:376:13: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
int ZEXPORT deflateSetDictionary (strm, dictionary, dictLength)
^
lib/zlib-1.2.11/deflate.c:445:13: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
int ZEXPORT deflateGetDictionary (strm, dictionary, dictLength)
^
lib/zlib-1.2.11/deflate.c:467:13: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
int ZEXPORT deflateResetKeep (strm)
^
lib/zlib-1.2.11/deflate.c:505:13: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
int ZEXPORT deflateReset (strm)
^
lib/zlib-1.2.11/deflate.c:517:13: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
int ZEXPORT deflateSetHeader (strm, head)
^
lib/zlib-1.2.11/deflate.c:528:13: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
int ZEXPORT deflatePending (strm, pending, bits)
^
lib/zlib-1.2.11/deflate.c:542:13: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
int ZEXPORT deflatePrime (strm, bits, value)
^
lib/zlib-1.2.11/deflate.c:568:13: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
int ZEXPORT deflateParams(strm, level, strategy)
^
lib/zlib-1.2.11/deflate.c:617:13: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
int ZEXPORT deflateTune(strm, good_length, max_lazy, nice_length, max_chain)
^
lib/zlib-1.2.11/deflate.c:652:15: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
uLong ZEXPORT deflateBound(strm, sourceLen)
^
lib/zlib-1.2.11/deflate.c:716:12: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
local void putShortMSB (s, b)
^
lib/zlib-1.2.11/deflate.c:730:12: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
local void flush_pending(strm)
^
lib/zlib-1.2.11/deflate.c:763:13: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
int ZEXPORT deflate (strm, flush)
^
lib/zlib-1.2.11/deflate.c:1076:13: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
int ZEXPORT deflateEnd (strm)
^
lib/zlib-1.2.11/deflate.c:1102:13: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
int ZEXPORT deflateCopy (dest, source)
^
lib/zlib-1.2.11/deflate.c:1164:16: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
local unsigned read_buf(strm, buf, size)
^
lib/zlib-1.2.11/deflate.c:1194:12: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
local void lm_init (s)
^
lib/zlib-1.2.11/deflate.c:1236:12: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
local uInt longest_match(s, cur_match)
^
lib/zlib-1.2.11/deflate.c:1482:12: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
local void fill_window(s)
^
lib/zlib-1.2.11/deflate.c:1643:19: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
local block_state deflate_stored(s, flush)
^
lib/zlib-1.2.11/deflate.c:1824:19: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
local block_state deflate_fast(s, flush)
^
lib/zlib-1.2.11/deflate.c:1926:19: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
local block_state deflate_slow(s, flush)
^
lib/zlib-1.2.11/deflate.c:2057:19: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
local block_state deflate_rle(s, flush)
^
lib/zlib-1.2.11/deflate.c:2130:19: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
local block_state deflate_huff(s, flush)
^
28 warnings generated.
lib/zlib-1.2.11/gzclose.c:11:13: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
int ZEXPORT gzclose(file)
^
1 warning generated.
lib/zlib-1.2.11/gzclose.c:11:13: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
int ZEXPORT gzclose(file)
^
1 warning generated.
lib/zlib-1.2.11/gzlib.c:75:12: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
local void gz_reset(state)
^
lib/zlib-1.2.11/gzlib.c:91:14: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
local gzFile gz_open(path, fd, mode)
^
lib/zlib-1.2.11/gzlib.c:252:9: error: call to undeclared function 'lseek'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
LSEEK(state->fd, 0, SEEK_END); /* so gzoffset() is correct */
^
lib/zlib-1.2.11/gzlib.c:14:17: note: expanded from macro 'LSEEK'
# define LSEEK lseek
^
lib/zlib-1.2.11/gzlib.c:252:9: note: did you mean 'fseek'?
lib/zlib-1.2.11/gzlib.c:14:17: note: expanded from macro 'LSEEK'
# define LSEEK lseek
^
/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/stdio.h:154:6: note: 'fseek' declared here
int fseek(FILE *, long, int);
^
lib/zlib-1.2.11/gzlib.c:258:24: error: call to undeclared function 'lseek'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
state->start = LSEEK(state->fd, 0, SEEK_CUR);
^
lib/zlib-1.2.11/gzlib.c:14:17: note: expanded from macro 'LSEEK'
# define LSEEK lseek
^
lib/zlib-1.2.11/gzlib.c:270:16: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
gzFile ZEXPORT gzopen(path, mode)
^
lib/zlib-1.2.11/gzlib.c:278:16: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
gzFile ZEXPORT gzopen64(path, mode)
^
lib/zlib-1.2.11/gzlib.c:286:16: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
gzFile ZEXPORT gzdopen(fd, mode)
^
lib/zlib-1.2.11/gzlib.c:316:13: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
int ZEXPORT gzbuffer(file, size)
^
lib/zlib-1.2.11/gzlib.c:343:13: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
int ZEXPORT gzrewind(file)
^
lib/zlib-1.2.11/gzlib.c:359:9: error: call to undeclared function 'lseek'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
if (LSEEK(state->fd, state->start, SEEK_SET) == -1)
^
lib/zlib-1.2.11/gzlib.c:14:17: note: expanded from macro 'LSEEK'
# define LSEEK lseek
^
lib/zlib-1.2.11/gzlib.c:366:19: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
z_off64_t ZEXPORT gzseek64(file, offset, whence)
^
lib/zlib-1.2.11/gzlib.c:400:15: error: call to undeclared function 'lseek'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
ret = LSEEK(state->fd, offset - state->x.have, SEEK_CUR);
^
lib/zlib-1.2.11/gzlib.c:14:17: note: expanded from macro 'LSEEK'
# define LSEEK lseek
^
lib/zlib-1.2.11/gzlib.c:443:17: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
z_off_t ZEXPORT gzseek(file, offset, whence)
^
lib/zlib-1.2.11/gzlib.c:455:19: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
z_off64_t ZEXPORT gztell64(file)
^
lib/zlib-1.2.11/gzlib.c:472:17: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
z_off_t ZEXPORT gztell(file)
^
lib/zlib-1.2.11/gzlib.c:482:19: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
z_off64_t ZEXPORT gzoffset64(file)
^
lib/zlib-1.2.11/gzlib.c:496:14: error: call to undeclared function 'lseek'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
offset = LSEEK(state->fd, 0, SEEK_CUR);
^
lib/zlib-1.2.11/gzlib.c:14:17: note: expanded from macro 'LSEEK'
# define LSEEK lseek
^
lib/zlib-1.2.11/gzlib.c:505:17: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
z_off_t ZEXPORT gzoffset(file)
^
lib/zlib-1.2.11/gzlib.c:515:13: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
int ZEXPORT gzeof(file)
^
lib/zlib-1.2.11/gzlib.c:532:22: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
const char * ZEXPORT gzerror(file, errnum)
^
lib/zlib-1.2.11/gzlib.c:553:14: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
void ZEXPORT gzclearerr(file)
^
lib/zlib-1.2.11/gzlib.c:579:20: warning: a function definition without a prototype is deprecated in all versions of C and is not supported in C2x [-Wdeprecated-non-prototype]
void ZLIB_INTERNAL gz_error(state, err, msg)
^
17 warnings and 5 errors generated.
error: command '/usr/bin/clang' failed with exit code 1
(base) Vincents-MacBook-Pro:raresim vincerubinetti$
Note that you're seeing the I'm guessing the issue is that the C code never finishes compiling because of those errors. If I add a test |
Well I don't know why this changed over the last few weeks. But I found a hack online, which is to just turn off that C error so the compile can succeed:
export CFLAGS='-Wno-implicit-function-declaration' && python3 setup.py install
This makes it work for me, locally. Probably won't fix your other errors. |
So yes, raresim will probably be run on a computing server or cluster because genetic data is usually too large to be stored locally. But I'm currently just cloning the updated code into my home drive (and deleting the previous files) and it does have the latest version of the C code. However, I think a previous version of raresim was added to the singularity container so I wonder if that could be messing things up (though I'm specifically referencing the raresim folder in my home drive and just using the singularity to call python3). I did run your hack, which got rid of warnings 14, 252, 89, and 661 from the setup.py output (everything else looked the exact same), but the |
If there are still other errors being shown, they might be blocking the build from finishing. Try adding a
As for the differences between running locally and running on the server, we'd probably need to have another sit-down with screen-sharing to troubleshoot this, and I'd probably need help from my colleagues. I think the code is fairly brittle and sensitive to version and environment differences. Python is not my main language, but from what I've heard, that applies to a lot of Python code... and then adding compiled C (with included external libraries) into the mix makes it extra complicated. Please schedule something with the booking page:
Just FYI, it's possible that an actual robust fix (to make it reliably run on the server or any environment) could take a significant effort, and we have a limit on what we can do without making an MOU ("official" agreement for work/time commitment between our supervisor and your lab). |
Assuming I added the print statement in the right place (see below), it did not print when running
|
In thinking about this, discussing with the software team, and taking a brief look at the code I wanted to mention that using a packaging and environment management tool could possibly assist with reproducibility when it comes to how the work is deployed + tested. There's a helpful guide at pyOpenSci.org on Python Packaging Tools which might influence the decision-making in this space. Feel free to ignore this comment if it's way off-base or unhelpful. |
Hey @JessMurphy, sorry for the delay. So, I got access to your compute cluster and was able to get your
Changes to pruning_code.sh
I had to make some tweaks to the
#!/bin/bash
# FA: the `set` command applies bash settings; the following settings make it easier to debug the script:
# -e: abort on any program returning an error, i.e. a code != 0
# -o pipefail: also abort on programs that error in pipes, e.g. `broken_program | working_program` will fail when it attempts to run `broken_program`
# -x: echo lines that are run with a `+` in front of them
set -e -o pipefail -x
pop=NFE
nsim=20000
pcase=100
pconf=90
rep=3
# change the file path to where the example_code folder is stored
# (and where the raresim folder will be stored)
# FA: changed this to taking the WD as the first parameter so i could test it in my home directory on clas-compute
# (it defaults to the value it was hardcoded to before if you don't specify the first parameter)
WD=${1:-/home/math/murphjes}
cd ${WD}
# clone raresim from Github
# FA: i first remove the raresim folder, if it exists, so we know we're starting with a fresh copy
rm -rf raresim || echo "No existing raresim folder found, continuing..."
git clone https://github.com/RMBarnard/raresim.git
cd raresim/
# FA: i added `--user` here to install the package into a folder that's writeable by a regular user
python3 setup.py install --user
cd ..
# prune functional and synonymous variants down to pcase %
python3 ${WD}/raresim/sim.py \
-m ${WD}/example_code/chr19.block37.${pop}.sim${rep}.controls.haps.sm \
--functional_bins ${WD}/example_code/MAC_bin_estimates_${nsim}_${pop}_fun_${pcase}.txt \
--synonymous_bins ${WD}/example_code/MAC_bin_estimates_${nsim}_${pop}_syn_${pcase}.txt \
-l ${WD}/example_code/chr19.block37.${pop}.sim${rep}.copy.legend \
-L ${WD}/example_code/chr19.block37.${pop}.sim${rep}.${pcase}fun.${pcase}syn.legend \
-H ${WD}/example_code/chr19.block37.${pop}.sim${rep}.controls.${pcase}fun.${pcase}syn.haps.gz
# produces WARNING: Lengths of legend 19029 and hap 19027 files do not match
# convert the resulting -H haplotype file to a sparse matrix
python3 ${WD}/raresim/convert.py \
-i ${WD}/example_code/chr19.block37.${pop}.sim${rep}.controls.${pcase}fun.${pcase}syn.haps.gz \
-o ${WD}/example_code/chr19.block37.${pop}.sim${rep}.controls.${pcase}fun.${pcase}syn.haps.sm
# prune functional and synonymous variants down again to pconf % (sometimes doesn't work because of the MAC bins)
python3 ${WD}/raresim/sim.py \
-m ${WD}/example_code/chr19.block37.${pop}.sim${rep}.controls.${pcase}fun.${pcase}syn.haps.sm \
--functional_bins ${WD}/example_code/MAC_bin_estimates_${nsim}_${pop}_fun_${pconf}.txt \
--synonymous_bins ${WD}/example_code/MAC_bin_estimates_${nsim}_${pop}_syn_${pconf}.txt \
-l ${WD}/example_code/chr19.block37.${pop}.sim${rep}.${pcase}fun.${pcase}syn.legend \
-L ${WD}/example_code/chr19.block37.${pop}.sim${rep}.${pconf}fun.${pconf}syn.legend \
-H ${WD}/example_code/chr19.block37.${pop}.sim${rep}.controls.${pconf}fun.${pconf}syn.haps.gz || \
echo "* NOTE: Expected error (code: $?), continuing..."
# FA: since we know the above line is supposed to fail, i added `|| echo "* NOTE: Expected error..."` so that the command as a whole succeeds and we can continue, despite `set -e -o pipefail` being enabled
# produces the following error if the number of observed functional variants (0) is less than the number
# of expected functional variants (0.36) for the [201,400] MAC bin from the first pruning step above
# (may need to rerun the first pruning step to reproduce the error)
#Traceback (most recent call last):
# File "/home/math/murphjes/raresim/sim.py", line 100, in <module>
# if __name__ == '__main__': main()
# File "/home/math/murphjes/raresim/sim.py", line 69, in main
# print_frequency_distribution(bins, bin_h, func_split, fun_only, syn_only)
# File "/home/math/murphjes/raresim/header.py", line 304, in print_frequency_distribution
# print_bin(bin_h['fun'], bins['fun'])
# File "/home/math/murphjes/raresim/header.py", line 142, in print_bin
# + str(len(bin_h[bin_id])))
#KeyError: 6
# we circumvented this error by combining the last two MAC bins into a [21,400] bin
# prune functional and synonymous variants down again to pconf % but don't remove the rows of zeros (doesn't work because of -z flag)
python3 ${WD}/raresim/sim.py \
-m ${WD}/example_code/chr19.block37.${pop}.sim${rep}.controls.${pcase}fun.${pcase}syn.haps.sm \
--functional_bins ${WD}/example_code/MAC_bin_estimates_${nsim}_${pop}_fun_${pconf}_6bins.txt \
--synonymous_bins ${WD}/example_code/MAC_bin_estimates_${nsim}_${pop}_syn_${pconf}_6bins.txt \
-l ${WD}/example_code/chr19.block37.${pop}.sim${rep}.${pcase}fun.${pcase}syn.legend \
-L ${WD}/example_code/chr19.block37.${pop}.sim${rep}.${pconf}fun.${pconf}syn.legend \
-H ${WD}/example_code/chr19.block37.${pop}.sim${rep}.controls.${pconf}fun.${pconf}syn.haps.gz \
-z
# produces the following error
#Traceback (most recent call last):
# File "/home/math/murphjes/raresim/sim.py", line 100, in <module>
# if __name__ == '__main__': main()
# File "/home/math/murphjes/raresim/sim.py", line 90, in main
# all_kept_rows = get_all_kept_rows(bin_h, R, func_split, fun_only, syn_only, args.z, args.keep_protected, legend)
# File "/home/math/murphjes/raresim/header.py", line 338, in get_all_kept_rows
# all_kept_rows = list(merge(all_kept_rows, sorted(R)))
# File "/usr/lib/python3.10/heapq.py", line 353, in merge
# _heapify(h)
#TypeError: '<' not supported between instances of 'int' and 'str'
Opening a shell into the container
(The following should be done after SSHing into your cluster, i.e. on clas-compute or alderaan)
I find it a lot easier to debug things if I can get an interactive shell into the container. I'd start in your home directory, with the contents of the zip you sent, I don't know if it's necessary, but you can get into your
singularity shell --writable-tmpfs /storage/singularity/mixtures.sif
The Once you run the above command (and after a short wait as the image is launched as a container), you'll be at a prompt that looks like:
but it's otherwise a normal bash shell. From that shell, you can run Hope that helps, and let me know if you have issues or questions, and of course if it ends up working for you. |
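The two tracebacks quoted in the script comments above can be reproduced in isolation. The sketch below is not raresim's code: `count_in_bin` and `merge_kept_rows` are hypothetical stand-ins, and it assumes that `bin_h` maps a bin id to a list of row indices and that some row ids reach `heapq.merge` as strings while others are integers.

```python
from heapq import merge


def count_in_bin(bin_h, bin_id):
    """Number of observed variants in a bin; 0 when the bin has no entry.

    Indexing with bin_h[bin_id] raises KeyError for a bin with no observed
    variants, which is the shape of the first traceback (KeyError: 6).
    """
    return len(bin_h.get(bin_id, []))


def merge_kept_rows(all_kept_rows, extra_rows):
    """Merge two collections of row ids after casting every id to int.

    heapq.merge compares items with '<', so mixing int and str ids raises
    the TypeError shown in the second traceback.
    """
    left = sorted(int(r) for r in all_kept_rows)
    right = sorted(int(r) for r in extra_rows)
    return list(merge(left, right))


if __name__ == "__main__":
    # Hypothetical data: bin 6 has no observed variants.
    bin_h = {0: [3, 8], 1: [5]}
    print(count_in_bin(bin_h, 6))
    print(merge_kept_rows([1, 4, 9], ["2", "7"]))
```

Whether silently treating a missing bin as empty, or casting ids to integers, is the right behavior for the science is a separate question; the sketch only shows where the two exceptions come from.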
Sorry for my delayed response, but thanks @falquaddoomi! The differing lengths warning is no longer produced and the sparse matrix is of the correct size. And yes, the last error isn't produced, but it's not doing what it's supposed to be doing, so I need to follow up with @vincerubinetti and provide examples. |
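One way to catch this kind of length mismatch before running `sim.py` is to count the legend's data rows and compare that against the matrix row count. This is a minimal sketch, assuming the legend is a plain-text file with a single header row (as described earlier in the thread). `legend_row_count` and `lengths_match` are hypothetical helpers, not raresim functions, and the matrix row count is passed in as a plain integer rather than read from the `.sm` file.

```python
def legend_row_count(path):
    """Count non-empty data rows in a legend file, skipping the header row."""
    with open(path) as handle:
        next(handle, None)  # skip the header row
        return sum(1 for line in handle if line.strip())


def lengths_match(legend_path, matrix_rows):
    """Return True when the legend has exactly one data row per matrix row."""
    return legend_row_count(legend_path) == matrix_rows
```

With the numbers from the warning, a legend with 19029 data rows checked against a matrix with 19027 rows would return `False`.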
Hey @vincerubinetti, attached is an updated example_code folder with included output. The current output when using the z flag is Please let me know if you have any questions or if it would be easier to meet to further explain. |
Hi Jessica, I haven’t had a chance to look at this yet, no. I’m out sick today and most likely will be out several days this week. If you’d like to schedule a sit down, that would be helpful. It’d be good to have all three of the SET team members there so we can all help, so if you could schedule through the booking page that would be great. Generally speaking, it’s probably going to be hard (for me at least) to find what needs fixing just by looking at what the expected output is. At some point someone has to understand what each line of code is supposed to be doing, both from a biological and programming standpoint. It also seems like this would've been broken long before I fixed the error message for the z flag, unless there was some undocumented way of using the z flag (passing in differently formatted data) that I'm unaware of. |
No worries and I hope you feel better! Yes, the z flag definitely wasn't working properly even before you fixed the error message. I was able to get ahold of Ryan, the maintainer of raresim, and it sounds like he might have some time to look into it. But if he doesn't or he runs into issues, I'll reach back out and schedule a meeting. Thanks! |
Group
The Hendricks Research Group
Contact info
Audrey Hendricks, PI, [email protected]
Jessica Murphy, Lead Contact, [email protected]
Ryan Bernard, Lead Developer, [email protected]
Megan Null, Senior Author, [email protected]
Links to code
https://github.com/RMBarnard/raresim/tree/main
Workflow
RAREsim is a Python interface for performing scalable rare variant simulations. It is essentially a few Python scripts backed by some C code that implements the data structures. Git and GitHub have been used to track the different versions of the package, and it is currently being used on a computing cluster in a Linux environment.
Work description
We need assistance debugging a few specific errors:
We would also like continued stability and maintenance of RAREsim.
All data we are using is publicly available so there is no PHI.
Timeline
We are currently running simulations and writing 2-3 papers based on the use of this software. So, we hope the debugging can be done as soon as possible and an initial plan for long-term maintenance can be developed within the next couple of months.
Funding
No response