segfault in ld-musl-x86_64.so.1 #127

Open
KomtGoed opened this issue Feb 24, 2024 · 3 comments · May be fixed by #131

Comments

@KomtGoed

Hi,

I am using Autoheal together with Watchtower.

Watchtower is scheduled to run at 05:00. Since the day I added Autoheal to the herd, I have been getting the following error messages in the logs:

Feb 24 05:00:19 server kernel: [981482.676478] sh[388687]: segfault at 7ffcb2888eb8 ip 00007f56ada70318 sp 00007ffcb2888eb0 error 6 in ld-musl-x86_64.so.1[7f56ada35000+4c000] likely on CPU 0 (core 0, socket 0)
Feb 24 05:00:19 server kernel: [981482.677822] Code: 41 5d 41 5e 41 5f c3 41 57 31 c0 b9 0a 00 00 00 41 56 41 55 49 89 fd 41 54 55 83 cd ff 53 48 81 ec 58 01 00 00 48 8d 7c 24 38 <48> 89 74 24 08 4c 8d 7c 24 38 f3 ab 48 8d 44 24 20 4d 89 f8 31 ff

This is what happens around that time in the Watchtower logs:
time="2024-02-24T05:00:19+01:00" level=info msg="Stopping /autoheal (574bcbae8d52) with SIGTERM"
time="2024-02-24T05:00:20+01:00" level=debug msg="Removing container 574bcbae8d52"
time="2024-02-24T05:00:20+01:00" level=info msg="Creating /autoheal"
time="2024-02-24T05:00:20+01:00" level=debug msg="Starting container /autoheal (b7a538093001)"
time="2024-02-24T05:00:21+01:00" level=info msg="Removing image 408724c998f0"

Is it correct to assume that Autoheal is actually causing the segfault? If so, what can I do about it?

Thanks!

@KomtGoed
Author

KomtGoed commented Mar 9, 2024

I can reliably reproduce the segfault by stopping the autoheal container through the Portainer UI.

I did some searching and came across the links below.
lovell/sharp#2570
Is Alpine up to date in the latest Autoheal container?

https://serverfault.com/questions/1097705/segmentation-fault-of-bacula-on-apline-linux
Is MUSL up to date?

https://users.rust-lang.org/t/segfault-when-linking-against-shared-libraries-on-alpine/98629
Is Rust compiled/linked as mentioned?

Or could it be some other setting in my Docker environment?

Thanks for any help!

@rahmanrafi

Ran into this issue as well (Ubuntu 22.04.4 LTS/Linux 5.15.0-101-generic x86_64).

It seems to be caused by line 128 in docker-entrypoint. My guess is that it's a race condition caused by kill $$; term_handler, which kills the shell's own process and then tries to call term_handler. If you modify this to kill the background process instead (kill $!; term_handler), stopping the container shouldn't produce any segfault errors in syslog, and I can confirm this is the case on my machine.
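
For illustration, here is a minimal sketch of the trap pattern described above. It is not the actual docker-entrypoint: the worker function, the 30-second sleep, and the body of term_handler are assumptions made up for this example. It only shows the idea of signalling the background job ($!) from the TERM trap instead of the shell itself ($$), and then checks that the loop exits cleanly.

```sh
#!/bin/sh
# Hypothetical stand-in for an entrypoint loop; names and timings are
# illustrative assumptions, not taken from the real docker-entrypoint.

worker() {
  term_handler() {
    exit 143  # 128 + 15 (SIGTERM)
  }

  # Stop the background sleep ($!) rather than re-signalling this shell ($$).
  trap 'kill "$!" 2>/dev/null; term_handler' TERM

  while true; do
    sleep 30 &
    wait "$!"   # wait is interruptible, so the trap runs promptly
  done
}

worker &                 # run the loop in a background subshell
worker_pid=$!
sleep 1                  # give the subshell time to install its trap

kill -TERM "$worker_pid" # roughly what stopping the container delivers
status=0
wait "$worker_pid" || status=$?
echo "worker exited with status $status"
```

If the handler behaves as intended, the subshell stops its pending sleep and exits through term_handler, so the reported status is the exit code chosen there rather than a crash.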

@rahmanrafi linked a pull request on Mar 30, 2024 that will close this issue
@KomtGoed
Author

Nicely spotted, thanks!
