CouchDB Error #5129

Open
job-isabai opened this issue Jul 11, 2024 · 7 comments

Comments

@job-isabai

Hello,
I am running CouchDB in a Docker container. CouchDB crashes after encountering the error below:
`[error] 2024-07-11T07:47:49.418872Z [email protected] <0.13560.3> -------- rexi_server: from: [email protected](<0.13559.3>) mfa: fabric_rpc:all_docs/3 error:badarg [{erlang,binary_to_term,[<<131,0,104,2,100,0,7,107,112,95,110,111,100,108,0,0,0,3,104,2,109,0,0,0,58,99,114,101,97,116,101,100,58,109,101,100,105,99,45,112,117,114,103,101,100,45,114,111,108,101,45,100,99,54,97,101,102,50,102,53,98,98,97,100,49,55,97,53,49,100,102,51,99,98,102,53,101,101,97,49,48,53,97,104,3,98,72,173,49,215,104,3,97,2,97,0,104,3,100,0,9,115,105,122,101,95,105,110,102,111,97,90,97,6,98,0,0,1,77,104,2,109,0,0,0,34,117,112,100,97,116,101,100,58,109,101,100,105,99,45,117,115,101,114,45,97,109,97,100,111,117,45,107,111,110,45,109,101,116,97,104,3,98,72,194,14,52,104,3,97,48,97,0,104,3,100,0,9,115,105,122,101,95,105,110,102,111,98,0,0,29,91,97,188,98,0,0,147,170,104,2,109,0,0,0,38,117,112,100,97,116,101,100,58,109,101,100,105,99,45,117,115,101,114,45,121,111,117,99,101,102,95,100,97,104,109,97,110,101,45,109,101,116,97,104,3,98,72,191,21,174,104,3,97,23,97,0,104,3,100,0,9,115,105,122,101,95,105,110,102,111,98,0,0,27,63,97,90,98,0,0,93,0,106>>],[{error_info,#{module => erl_erts_errors}}]},{couch_compress,decompress,1,[{file,"src/couch_compress.erl"},{line,65}]},{couch_file,pread_term,2,[{file,"src/couch_file.erl"},{line,156}]},{couch_btree,get_node,2,[{file,"src/couch_btree.erl"},{line,474}]},{couch_btree,stream_node,8,[{file,"src/couch_btree.erl"},{line,1069}]},{couch_btree,fold,4,[{file,"src/couch_btree.erl"},{line,242}]},{couch_bt_engine,fold_docs_int,5,[{file,"src/couch_bt_engine.erl"},{line,1129}]},{couch_mrview,get_total_rows,2,[{file,"src/couch_mrview.erl"},{line,704}]}]
[error] 2024-07-11T07:47:49.421107Z [email protected] <0.13556.3> -------- could not load validation funs {{badmatch,{error,{badarg,nil,[{erlang,binary_to_term,[<<131,0,104,2,100,0,7,107,112,95,110,111,100,108,0,0,0,3,104,2,109,0,0,0,58,99,114,101,97,116,101,100,58,109,101,100,105,99,45,112,117,114,103,101,100,45,114,111,108,101,45,100,99,54,97,101,102,50,102,53,98,98,97,100,49,55,97,53,49,100,102,51,99,98,102,53,101,101,97,49,48,53,97,104,3,98,72,173,49,215,104,3,97,2,97,0,104,3,100,0,9,115,105,122,101,95,105,110,102,111,97,90,97,6,98,0,0,1,77,104,2,109,0,0,0,34,117,112,100,97,116,101,100,58,109,101,100,105,99,45,117,115,101,114,45,97,109,97,100,111,117,45,107,111,110,45,109,101,116,97,104,3,98,72,194,14,52,104,3,97,48,97,0,104,3,100,0,9,115,105,122,101,95,105,110,102,111,98,0,0,29,91,97,188,98,0,0,147,170,104,2,109,0,0,0,38,117,112,100,97,116,101,100,58,109,101,100,105,99,45,117,115,101,114,45,121,111,117,99,101,102,95,100,97,104,109,97,110,101,45,109,101,116,97,104,3,98,72,191,21,174,104,3,97,23,97,0,104,3,100,0,9,115,105,122,101,95,105,110,102,111,98,0,0,27,63,97,90,98,0,0,93,0,106>>],[{error_info,#{module => erl_erts_errors}}]},{couch_compress,decompress,1,[{file,"src/couch_compress.erl"},{line,65}]},{couch_file,pread_term,2,[{file,"src/couch_file.erl"},{line,156}]},{couch_btree,get_node,2,[{file,"src/couch_btree.erl"},{line,474}]},{couch_btree,stream_node,8,[{file,"src/couch_btree.erl"},{line,1069}]},{couch_btree,fold,4,[{file,"src/couch_btree.erl"},{line,242}]},{couch_bt_engine,fold_docs_int,5,[{file,"src/couch_bt_engine.erl"},{line,1129}]},{couch_mrview,get_total_rows,2,[{file,"src/couch_mrview.erl"},{line,704}]}]}}},[{ddoc_cache_entry_validation_funs,recover,1,[{file,"src/ddoc_cache_entry_validation_funs.erl"},{line,29}]},{ddoc_cache_entry,do_open,1,[{file,"src/ddoc_cache_entry.erl"},{line,275}]}]}
[error] 2024-07-11T07:47:49.421587Z [email protected] emulator -------- Error in process <0.13557.3> on node '[email protected]' with exit value:
{{badmatch,{error,{badarg,nil,[{erlang,binary_to_term,[<<131,0,104,2,100,0,7,107,112,95,110,111,100,108,0,0,0,3,104,2,109,0,0,0,58,99,114,101,97,116,101,100,58,109,101,100,105,99,45,112,117,114,103,101,100,45,114,111,108,101,45,100,99,54,97,101,102,50,102,53,98,98,97,100,49,55,97,53,49,100,102,51,99,98,102,53,101,101,97,49,48,53,97,104,3,98,72,173,49,215,104,3,97,2,97,0,104,3,100,0,9,115,105,122,101,95,105,110,102,111,97,90,97,6,98,0,0,1,77,104,2,109,0,0,0,34,117,112,100,97,116,101,100,58,109,101,100,105,99,45,117,115,101,114,45,97,109,97,100,111,117,45,107,111,110,45,109,101,116,97,104,3,98,72,194,14,52,104,3,97,48,97,0,104,3,100,0,9,115,105,122,101,95,105,110,102,111,98,0,0,29,91,97,188,98,0,0,147,170,104,2,109,0,0,0,38,117,112,100,97,116,101,100,58,109,101,100,105,99,45,117,115,101,114,45,121,111,117,99,101,102,95,100,97,104,109,97,110,101,45,109,101,116,97,104,3,98,72,191,21,174,104,3,97,23,97,0,104,3,100,0,9,115,105,122,101,95,105,110,102,111,98,0,0,27,63,97,90,98,0,0,93,0,106>>],[{error_info,#{module => erl_erts_errors}}]},{couch_compress,decompress,1,[{file,"src/couch_compress.erl"},{line,65}]},{couch_file,pread_term,2,[{file,"src/couch_file.erl"},{line,156}]},{couch_btree,get_node,2,[{file,"src/couch_btree.erl"},{line,474}]},{couch_btree,stream_node,8,[{file,"src/couch_btree.erl"},{line,1069}]},{couch_btree,fold,4,[{file,"src/couch_btree.erl"},{line,242}]},{couch_bt_engine,fold_docs_int,5,[{file,"src/couch_bt_engine.erl"},{line,1129}]},{couch_mrview,get_total_rows,2,[{file,"src/couch_mrview.erl"},{line,704}]}]}}},[{ddoc_cache_entry_validation_funs,recover,1,[{file,"src/ddoc_cache_entry_validation_funs.erl"},{line,29}]},{ddoc_cache_entry,do_open,1,[{file,"src/ddoc_cache_entry.erl"},{line,275}]}]}

[error] 2024-07-11T07:47:49.421804Z [email protected] emulator -------- Error in process <0.13557.3> on node '[email protected]' with exit value:
{{badmatch,{error,{badarg,nil,[{erlang,binary_to_term,[<<131,0,104,2,100,0,7,107,112,95,110,111,100,108,0,0,0,3,104,2,109,0,0,0,58,99,114,101,97,116,101,100,58,109,101,100,105,99,45,112,117,114,103,101,100,45,114,111,108,101,45,100,99,54,97,101,102,50,102,53,98,98,97,100,49,55,97,53,49,100,102,51,99,98,102,53,101,101,97,49,48,53,97,104,3,98,72,173,49,215,104,3,97,2,97,0,104,3,100,0,9,115,105,122,101,95,105,110,102,111,97,90,97,6,98,0,0,1,77,104,2,109,0,0,0,34,117,112,100,97,116,101,100,58,109,101,100,105,99,45,117,115,101,114,45,97,109,97,100,111,117,45,107,111,110,45,109,101,116,97,104,3,98,72,194,14,52,104,3,97,48,97,0,104,3,100,0,9,115,105,122,101,95,105,110,102,111,98,0,0,29,91,97,188,98,0,0,147,170,104,2,109,0,0,0,38,117,112,100,97,116,101,100,58,109,101,100,105,99,45,117,115,101,114,45,121,111,117,99,101,102,95,100,97,104,109,97,110,101,45,109,101,116,97,104,3,98,72,191,21,174,104,3,97,23,97,0,104,3,100,0,9,115,105,122,101,95,105,110,102,111,98,0,0,27,63,97,90,98,0,0,93,0,106>>],[{error_info,#{module => erl_erts_errors}}]},{couch_compress,decompress,1,[{file,"src/couch_compress.erl"},{line,65}]},{couch_file,pread_term,2,[{file,"src/couch_file.erl"},{line,156}]},{couch_btree,get_node,2,[{file,"src/couch_btree.erl"},{line,474}]},{couch_btree,stream_node,8,[{file,"src/couch_btree.erl"},{line,1069}]},{couch_btree,fold,4,[{file,"src/couch_btree.erl"},{line,242}]},{couch_bt_engine,fold_docs_int,5,[{file,"src/couch_bt_engine.erl"},{line,1129}]},{couch_mrview,get_total_rows,2,[{file,"src/couch_mrview.erl"},{line,704}]}]}}},[{ddoc_cache_entry_validation_funs,recover,1,[{file,"src/ddoc_cache_entry_validation_funs.erl"},{line,29}]},{ddoc_cache_entry,do_open,1,[{file,"src/ddoc_cache_entry.erl"},{line,275}]}]}
[error] 2024-07-11T07:59:28.760469Z [email protected] <0.17943.0> -------- rexi_server: from: [email protected](<0.17939.0>) mfa: fabric_rpc:all_docs/3 error:badarg [{erlang,binary_to_term,[<<131,0,104,2,100,0,7,107,112,95,110,111,100,108,0,0,0,3,104,2,109,0,0,0,58,99,114,101,97,116,101,100,58,109,101,100,105,99,45,112,117,114,103,101,100,45,114,111,108,101,45,100,99,54,97,101,102,50,102,53,98,98,97,100,49,55,97,53,49,100,102,51,99,98,102,53,101,101,97,49,48,53,97,104,3,98,72,173,49,215,104,3,97,2,97,0,104,3,100,0,9,115,105,122,101,95,105,110,102,111,97,90,97,6,98,0,0,1,77,104,2,109,0,0,0,34,117,112,100,97,116,101,100,58,109,101,100,105,99,45,117,115,101,114,45,97,109,97,100,111,117,45,107,111,110,45,109,101,116,97,104,3,98,72,194,14,52,104,3,97,48,97,0,104,3,100,0,9,115,105,122,101,95,105,110,102,111,98,0,0,29,91,97,188,98,0,0,147,170,104,2,109,0,0,0,38,117,112,100,97,116,101,100,58,109,101,100,105,99,45,117,115,101,114,45,121,111,117,99,101,102,95,100,97,104,109,97,110,101,45,109,101,116,97,104,3,98,72,191,21,174,104,3,97,23,97,0,104,3,100,0,9,115,105,122,101,95,105,110,102,111,98,0,0,27,63,97,90,98,0,0,93,0,106>>],[{error_info,#{module => erl_erts_errors}}]},{couch_compress,decompress,1,[{file,"src/couch_compress.erl"},{line,65}]},{couch_file,pread_term,2,[{file,"src/couch_file.erl"},{line,156}]},{couch_btree,get_node,2,[{file,"src/couch_btree.erl"},{line,474}]},{couch_btree,stream_node,8,[{file,"src/couch_btree.erl"},{line,1069}]},{couch_btree,fold,4,[{file,"src/couch_btree.erl"},{line,242}]},{couch_bt_engine,fold_docs_int,5,[{file,"src/couch_bt_engine.erl"},{line,1129}]},{couch_mrview,get_total_rows,2,[{file,"src/couch_mrview.erl"},{line,704}]}]
[error] 2024-07-11T07:59:28.778328Z [email protected] emulator -------- Error in process <0.17937.0> on node '[email protected]' with exit value:
{{badmatch,{error,{badarg,nil,[{erlang,binary_to_term,[<<131,0,104,2,100,0,7,107,112,95,110,111,100,108,0,0,0,3,104,2,109,0,0,0,58,99,114,101,97,116,101,100,58,109,101,100,105,99,45,112,117,114,103,101,100,45,114,111,108,101,45,100,99,54,97,101,102,50,102,53,98,98,97,100,49,55,97,53,49,100,102,51,99,98,102,53,101,101,97,49,48,53,97,104,3,98,72,173,49,215,104,3,97,2,97,0,104,3,100,0,9,115,105,122,101,95,105,110,102,111,97,90,97,6,98,0,0,1,77,104,2,109,0,0,0,34,117,112,100,97,116,101,100,58,109,101,100,105,99,45,117,115,101,114,45,97,109,97,100,111,117,45,107,111,110,45,109,101,116,97,104,3,98,72,194,14,52,104,3,97,48,97,0,104,3,100,0,9,115,105,122,101,95,105,110,102,111,98,0,0,29,91,97,188,98,0,0,147,170,104,2,109,0,0,0,38,117,112,100,97,116,101,100,58,109,101,100,105,99,45,117,115,101,114,45,121,111,117,99,101,102,95,100,97,104,109,97,110,101,45,109,101,116,97,104,3,98,72,191,21,174,104,3,97,23,97,0,104,3,100,0,9,115,105,122,101,95,105,110,102,111,98,0,0,27,63,97,90,98,0,0,93,0,106>>],[{error_info,#{module => erl_erts_errors}}]},{couch_compress,decompress,1,[{file,"src/couch_compress.erl"},{line,65}]},{couch_file,pread_term,2,[{file,"src/couch_file.erl"},{line,156}]},{couch_btree,get_node,2,[{file,"src/couch_btree.erl"},{line,474}]},{couch_btree,stream_node,8,[{file,"src/couch_btree.erl"},{line,1069}]},{couch_btree,fold,4,[{file,"src/couch_btree.erl"},{line,242}]},{couch_bt_engine,fold_docs_int,5,[{file,"src/couch_bt_engine.erl"},{line,1129}]},{couch_mrview,get_total_rows,2,[{file,"src/couch_mrview.erl"},{line,704}]}]}}},[{ddoc_cache_entry_validation_funs,recover,1,[{file,"src/ddoc_cache_entry_validation_funs.erl"},{line,29}]},{ddoc_cache_entry,do_open,1,[{file,"src/ddoc_cache_entry.erl"},{line,275}]}]}

[error] 2024-07-11T07:59:28.780745Z [email protected] emulator -------- Error in process <0.17937.0> on node '[email protected]' with exit value:
{{badmatch,{error,{badarg,nil,[{erlang,binary_to_term,[<<131,0,104,2,100,0,7,107,112,95,110,111,100,108,0,0,0,3,104,2,109,0,0,0,58,99,114,101,97,116,101,100,58,109,101,100,105,99,45,112,117,114,103,101,100,45,114,111,108,101,45,100,99,54,97,101,102,50,102,53,98,98,97,100,49,55,97,53,49,100,102,51,99,98,102,53,101,101,97,49,48,53,97,104,3,98,72,173,49,215,104,3,97,2,97,0,104,3,100,0,9,115,105,122,101,95,105,110,102,111,97,90,97,6,98,0,0,1,77,104,2,109,0,0,0,34,117,112,100,97,116,101,100,58,109,101,100,105,99,45,117,115,101,114,45,97,109,97,100,111,117,45,107,111,110,45,109,101,116,97,104,3,98,72,194,14,52,104,3,97,48,97,0,104,3,100,0,9,115,105,122,101,95,105,110,102,111,98,0,0,29,91,97,188,98,0,0,147,170,104,2,109,0,0,0,38,117,112,100,97,116,101,100,58,109,101,100,105,99,45,117,115,101,114,45,121,111,117,99,101,102,95,100,97,104,109,97,110,101,45,109,101,116,97,104,3,98,72,191,21,174,104,3,97,23,97,0,104,3,100,0,9,115,105,122,101,95,105,110,102,111,98,0,0,27,63,97,90,98,0,0,93,0,106>>],[{error_info,#{module => erl_erts_errors}}]},{couch_compress,decompress,1,[{file,"src/couch_compress.erl"},{line,65}]},{couch_file,pread_term,2,[{file,"src/couch_file.erl"},{line,156}]},{couch_btree,get_node,2,[{file,"src/couch_btree.erl"},{line,474}]},{couch_btree,stream_node,8,[{file,"src/couch_btree.erl"},{line,1069}]},{couch_btree,fold,4,[{file,"src/couch_btree.erl"},{line,242}]},{couch_bt_engine,fold_docs_int,5,[{file,"src/couch_bt_engine.erl"},{line,1129}]},{couch_mrview,get_total_rows,2,[{file,"src/couch_mrview.erl"},{line,704}]}]}}},[{ddoc_cache_entry_validation_funs,recover,1,[{file,"src/ddoc_cache_entry_validation_funs.erl"},{line,29}]},{ddoc_cache_entry,do_open,1,[{file,"src/ddoc_cache_entry.erl"},{line,275}]}]}

[error] 2024-07-11T07:59:28.797215Z [email protected] <0.17935.0> -------- could not load validation funs {{badmatch,{error,{badarg,nil,[{erlang,binary_to_term,[<<131,0,104,2,100,0,7,107,112,95,110,111,100,108,0,0,0,3,104,2,109,0,0,0,58,99,114,101,97,116,101,100,58,109,101,100,105,99,45,112,117,114,103,101,100,45,114,111,108,101,45,100,99,54,97,101,102,50,102,53,98,98,97,100,49,55,97,53,49,100,102,51,99,98,102,53,101,101,97,49,48,53,97,104,3,98,72,173,49,215,104,3,97,2,97,0,104,3,100,0,9,115,105,122,101,95,105,110,102,111,97,90,97,6,98,0,0,1,77,104,2,109,0,0,0,34,117,112,100,97,116,101,100,58,109,101,100,105,99,45,117,115,101,114,45,97,109,97,100,111,117,45,107,111,110,45,109,101,116,97,104,3,98,72,194,14,52,104,3,97,48,97,0,104,3,100,0,9,115,105,122,101,95,105,110,102,111,98,0,0,29,91,97,188,98,0,0,147,170,104,2,109,0,0,0,38,117,112,100,97,116,101,100,58,109,101,100,105,99,45,117,115,101,114,45,121,111,117,99,101,102,95,100,97,104,109,97,110,101,45,109,101,116,97,104,3,98,72,191,21,174,104,3,97,23,97,0,104,3,100,0,9,115,105,122,101,95,105,110,102,111,98,0,0,27,63,97,90,98,0,0,93,0,106>>],[{error_info,#{module => erl_erts_errors}}]},{couch_compress,decompress,1,[{file,"src/couch_compress.erl"},{line,65}]},{couch_file,pread_term,2,[{file,"src/couch_file.erl"},{line,156}]},{couch_btree,get_node,2,[{file,"src/couch_btree.erl"},{line,474}]},{couch_btree,stream_node,8,[{file,"src/couch_btree.erl"},{line,1069}]},{couch_btree,fold,4,[{file,"src/couch_btree.erl"},{line,242}]},{couch_bt_engine,fold_docs_int,5,[{file,"src/couch_bt_engine.erl"},{line,1129}]},{couch_mrview,get_total_rows,2,[{file,"src/couch_mrview.erl"},{line,704}]}]}}},[{ddoc_cache_entry_validation_funs,recover,1,[{file,"src/ddoc_cache_entry_validation_funs.erl"},{line,29}]},{ddoc_cache_entry,do_open,1,[{file,"src/ddoc_cache_entry.erl"},{line,275}]}]}`

Please help.

@rnewson
Member

rnewson commented Jul 11, 2024

hi, this looks like data corruption to me. The binary in question is a kp_node but it is somehow truncated or otherwise invalid.

@nickva
Contributor

nickva commented Jul 11, 2024

Agree with @rnewson.

<<131,0,104,2,100,0,7,107,112,95...

The first byte 131 looked like a proper initial marker of an uncompressed term.

It's not followed by 80, so it's not compressed

-define(SNAPPY_PREFIX, 1).
% Term prefixes documented at:
% http://www.erlang.org/doc/apps/erts/erl_ext_dist.html
-define(TERM_PREFIX, 131).
-define(COMPRESSED_TERM_PREFIX, 131, 80).
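For readers without an Erlang shell to hand, the prefix checks implied by these defines can be sketched in Python. This is only an illustration: `classify_prefix` and the `KNOWN_TAGS` table are hypothetical names, the table lists just a small subset of the tags from the external term format document linked above, and none of this is CouchDB's actual code.

```python
# Prefix values as given in the defines quoted above.
SNAPPY_PREFIX = 1
TERM_PREFIX = 131
COMPRESSED_TAG = 80

# A small, deliberately incomplete subset of external term format tags.
KNOWN_TAGS = {
    97: "SMALL_INTEGER_EXT",
    98: "INTEGER_EXT",
    100: "ATOM_EXT",
    104: "SMALL_TUPLE_EXT",
    106: "NIL_EXT",
    108: "LIST_EXT",
    109: "BINARY_EXT",
}


def classify_prefix(data: bytes) -> str:
    """Describe what the first two bytes of a stored binary suggest."""
    if len(data) < 2:
        return "too short"
    if data[0] == SNAPPY_PREFIX:
        return "snappy-compressed"
    if data[0] != TERM_PREFIX:
        return "not an external term"
    if data[1] == COMPRESSED_TAG:
        return "compressed term"
    name = KNOWN_TAGS.get(data[1])
    if name is not None:
        return "uncompressed term starting with " + name
    return "unexpected tag %d after version byte" % data[1]


# First bytes of the binary from the error log.
logged = bytes([131, 0, 104, 2, 100, 0, 7, 107, 112, 95])
# Output of erlang:term_to_binary({a, b}) for comparison.
healthy = bytes([131, 104, 2, 100, 0, 1, 97, 100, 0, 1, 98])

print(classify_prefix(logged))
print(classify_prefix(healthy))
```

The logged prefix falls through to the last branch because 0 is not in the tag table, while the `{a, b}` binary is reported as an uncompressed small tuple.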

0 following 131 seems odd on first look at https://www.erlang.org/doc/apps/erts/erl_ext_dist.html#introduction. The next 104,2 looks like a proper small tuple

Which is probably what we might expect in a kp node.

But turning a tuple into a binary doesn't show a 0 after 131

> erlang:term_to_binary({a, b}).
<<131,104,2,100,0,1,97,100,0,1,98>>
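The same comparison can be made by hand-decoding the start of each binary. The sketch below assumes the layouts from the external term format document: tag 104 (small tuple) is followed by a one-byte arity, and tag 100 (atom) by a two-byte big-endian length and the name. `decode_small_tuple_head` is a hypothetical helper written for this thread, not part of any library.

```python
import struct

SMALL_TUPLE_EXT = 104
ATOM_EXT = 100


def decode_small_tuple_head(data: bytes, offset: int):
    """Return (arity, first_atom_name) for a small tuple starting at offset.

    first_atom_name is None when the first element is not an ATOM_EXT.
    """
    if data[offset] != SMALL_TUPLE_EXT:
        raise ValueError("no small tuple tag at offset %d" % offset)
    arity = data[offset + 1]
    pos = offset + 2
    if data[pos] != ATOM_EXT:
        return arity, None
    (length,) = struct.unpack(">H", data[pos + 1:pos + 3])
    name = data[pos + 3:pos + 3 + length].decode("latin-1")
    return arity, name


# {a, b} as produced by term_to_binary: the tuple starts right after 131.
healthy = bytes([131, 104, 2, 100, 0, 1, 97, 100, 0, 1, 98])
print(decode_small_tuple_head(healthy, 1))

# Start of the logged binary: the tuple tag only appears after the odd 0.
logged = bytes([131, 0, 104, 2, 100, 0, 7, 107, 112, 95, 110, 111, 100,
                108, 0, 0, 0, 3])
print(decode_small_tuple_head(logged, 2))
```

Run on the prefix as logged, and skipping the stray 0, the tuple has arity 2 but the 7-byte atom reads `kp_nodl` rather than `kp_node`, so the bytes around the atom do not line up as a clean encoding either.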

@job-isabai

What versions of CouchDB, Erlang, and OS are you running, and on what architecture? I wonder if you backed up or restored the data at any point, or if there is any way to reproduce the issue.

@job-isabai
Author

Hello,
Thanks for the feedback.
I am using CouchDB via Community Health Toolkit (CHT).
The Docker image can be found here: https://staging.dev.medicmobile.org/_couch/builds_4/medic:medic:4.5.0/docker-compose/cht-couchdb.yml but I can't tell the CouchDB version from it.
This error emerged after a system upgrade that required all views to be indexed before migration to the new version. Indexing takes place automatically in the backend, but there were a couple of crashes during migration, which led me to revert to a backed-up version (whole image and files of the VM).
Afterwards, the upgrade was successful: all views were indexed and the system started running on the new version. After a couple of hours this error started popping up, which resulted in the CouchDB container restarting unexpectedly. My database size is more than 2GB and growing.

@rnewson
Member

rnewson commented Jul 12, 2024

That image contains CouchDB 3.3.2.

@rnewson
Member

rnewson commented Jul 12, 2024

So I think this is data corruption somehow. You'll need to try earlier backups until something works, but we're very curious as to how this might have happened. If you have the details of the storage subsystem (filesystem, disks, any virtualisation between CouchDB and the storage device, and any relevant settings on reordering or fsyncing) we'd love to hear them.

@job-isabai
Author

Actually, reverting to earlier backups might not be an option for me, since it has been a month and I might lose the current state of the database. Is there a means of repairing the corrupted data? Can I adjust the configuration to make the CouchDB container error tolerant, to prevent failures and restarts?
I am running Docker on an Ubuntu VM, where everything is stored on the local disk.

@rnewson
Member

rnewson commented Jul 12, 2024

CouchDB is built as a "crash only" system, meaning that the couchdb process is always ready to be killed, there's no shutdown code, no need to call sync manually, etc. When a document is written, and the 200 OK returned, CouchDB has already done everything it can to persist the data to disk (including fsync() calls). At startup CouchDB will read from the end of each file looking for the latest valid header.
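To make the "read from the end looking for the latest valid header" idea concrete, here is a toy sketch of an append-only file. It is not CouchDB's on-disk format: the 64-byte block size, the marker byte, the length-plus-CRC32 layout, and the function names are all assumptions chosen for illustration.

```python
import struct
import zlib

BLOCK = 64             # toy block size (assumption)
HEADER_MARK = b"\x01"  # toy marker for "a header starts here" (assumption)
PREFIX_LEN = 9         # 1 marker byte + 4-byte length + 4-byte CRC32


def append_header(buf: bytearray, payload: bytes) -> None:
    """Pad to the next block boundary, then append a checksummed header."""
    buf.extend(b"\x00" * ((-len(buf)) % BLOCK))
    buf.extend(HEADER_MARK)
    buf.extend(struct.pack(">II", len(payload), zlib.crc32(payload)))
    buf.extend(payload)


def find_latest_header(buf):
    """Scan block boundaries from the end; return (offset, payload) or None."""
    pos = (len(buf) // BLOCK) * BLOCK
    while pos >= 0:
        prefix = buf[pos:pos + PREFIX_LEN]
        if len(prefix) == PREFIX_LEN and prefix[:1] == HEADER_MARK:
            length, crc = struct.unpack(">II", prefix[1:])
            start = pos + PREFIX_LEN
            payload = buf[start:start + length]
            # A torn or corrupted header fails one of these checks.
            if len(payload) == length and zlib.crc32(payload) == crc:
                return pos, bytes(payload)
        pos -= BLOCK
    return None


f = bytearray()
append_header(f, b"v1")
f.extend(b"data" * 10)   # ordinary appended data between headers
append_header(f, b"v2")
print(find_latest_header(f))        # newest header wins
print(find_latest_header(f[:-1]))   # torn final write: falls back to the older header
```

In this toy model a process killed mid-write simply loses the unfinished header and the reader picks up the previous one, which is the property that makes a separate shutdown step unnecessary.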

Without knowing how the files were corrupted it is hard to know what to recommend and, unfortunately, there are no tools we publish to repair a corrupted .couch file. At best we might be able to build an Erlang script that would attempt to extract the document bodies inside the .couch files, though those would be shorn of a number of details (the doc id being the most significant, as it is stored in a different location to the body and only the btree index, which is the corrupted part, would be able to find it).

Have you perhaps replicated this database elsewhere recently? That could be another source of backup.
