ARM64: unbound 1.19 up but unhealthy #5649
Comments
Hi! Can you check /var/log/healthcheck.log inside the unbound container to see which check fails? |
Hi Niklas,
results in:
That's strange, because...
results in:
Can I just ignore it? Thank you! |
OK, I don't get this then :D Normally the script exits with 0 if everything is fine and with 1 if something is broken. Since the script is generating this message, it should exit with 0. Can you bring the stack down and up again? And if it shows up as unhealthy again, check the log file again? |
OK, done, but now I get...
and...
results in:
I will reboot the server to see if it comes up again. Edit: even after a reboot, unbound doesn't start anymore because the container is unhealthy. Thanks! |
I have the same issue with an on-prem machine. Unbound is unhealthy. The healthcheck does not show any errors. But docker inspect shows the following: [ In my case the unhealthy unbound prevents other containers from starting. I mitigated the issue by reverting to unbound 1.18 |
Here are also my results of docker inspect: root@m:/opt/mailcow-dockerized # docker inspect ff3aaab746e2 |
Ah yes, that is very helpful! The healthcheck timeout value is too short. Docker is set to await an answer from the healthcheck within 10s. It seems to take longer on some machines. That is an easy fix and will be fixed in 2024-01a |
Fixed with 2024-01a (just released) |
Sorry Niklas, but the problem remains. At least for me. After the fix I now get the following error:
That's strange! There is no firewall in place. |
It cannot ping 1.1.1.1, 8.8.8.8 and 9.9.9.9 that's all the first check does. |
Hi, I am also experiencing issue with the unbound health check, looks like it takes just over 30s in my case (however I'm on
I created a compose override as a workaround which seemed to have worked:
|
I have the same problem. In my case, my server just cannot ping 9.9.9.9; all packets are lost. So I deleted it from data/Dockerfiles/unbound/healthcheck.sh, but that didn't seem to change anything: it still tries to ping 9.9.9.9 and I still get the unhealthy error.
➜ mailcow-dockerized git:(master) ✗ time docker exec -it mailcowdockerized-unbound-mailcow-1 /healthcheck.sh
PING 1.1.1.1 (1.1.1.1): 56 data bytes
--- 1.1.1.1 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.669/1.615/2.283 ms
PING 8.8.8.8 (8.8.8.8): 56 data bytes
--- 8.8.8.8 ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0.967/0.992/1.028 ms
PING 9.9.9.9 (9.9.9.9): 56 data bytes
--- 9.9.9.9 ping statistics ---
3 packets transmitted, 0 packets received, 100% packet loss
docker exec -it mailcowdockerized-unbound-mailcow-1 /healthcheck.sh 0.03s user 0.01s system 0% cpu 16.140 total |
If you delete the check, you need to rebuild the container. Or look into why you cannot ping 9.9.9.9. |
You mean docker compose down and up again? I did, but that doesn't seem to be the "rebuild" you mentioned. What should I do? |
A rebuild is done with docker build. But this is only a workaround for your problem, which you should definitely analyse. |
Thanks for your reply and for all your contributions to the open source community. I have switched to another open source mail server and it works. |
Changing the health check timeout to even 1m doesn't help. I still get the same error: |
If your server is located in a region where you must use a proxy to access github.com (such as China), then increasing the health check timeout is useless. In this case, you can only turn off the health check. |
The server is in the EU, and everything had been working until the 2024 update... |
Could you check what the healthcheck logs say? They are located at /var/log/healthcheck.log inside the container. Use docker compose exec unbound-mailcow cat /var/log/healthcheck.log |
Yes, that may be the case, but then something is wrong with your system that was hidden before. These healthchecks are simple network checks such as ping, netcat and DNS resolving, and if those don't work inside the main DNS container, it shouldn't be ignored. If you are based in China, for example, as @KagurazakaNyaa says, that's a whole different story and completely reasonable (actually I forgot to think about that) |
It seems like the issue on my side is I can't ping 1.1.1.1 but there's no problem with pinging other DNS (8.8.8.8)
|
Looks like it yes. Maybe try to see why you can't ping 1.1.1.1 |
Has this been changed recently (2024) in the unbound container? What address was used to ping before? |
BTW, in my case, it cannot ping 9.9.9.9 in the origin server. |
None. That is the "problem" you are facing. We implemented this to make sure the container works as we want it to. At first we only created some simple DNS tests, but these were not failproof enough, so we added ping and port checks too. |
My main problem is that I don't have any DNS blocking rules in firewall (opnsense), server can ping 8.8.8.8 and 9.9.9.9 but not 1.1.1.1 |
I ran a test with unbound 1.19.1
root@gaia:/opt/mailcow-dockerized# docker inspect 919ac36911e4 [..]
The healthcheck.log shows no error:
root@gaia:/opt/mailcow-dockerized# docker compose exec unbound-mailcow cat /var/log/healthcheck.log
Manual execution of the healthcheck script also shows no errors...
root@gaia:/opt/mailcow-dockerized# docker exec -it mailcowdockerized-unbound-mailcow-1 /healthcheck.sh --- 1.1.1.1 ping statistics --- --- 8.8.8.8 ping statistics --- --- 9.9.9.9 ping statistics --- |
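For readers who want only the health record rather than the full docker inspect dump, here is a minimal sketch. It relies on the `.State.Health` field that Docker keeps for containers with a healthcheck (status, failing streak, and a log of recent runs with exit code and output). The function name `show_health` and the `DOCKER` override variable are hypothetical and only there for illustration.

```shell
#!/bin/sh
# Sketch: print only the health record Docker stores for a container.
# DOCKER is a hypothetical override for the docker binary; it defaults to "docker".
show_health() {
  container="$1"
  # .State.Health holds Status, FailingStreak and a Log of recent healthcheck runs
  "${DOCKER:-docker}" inspect --format '{{json .State.Health}}' "$container"
}
```

Usage would look something like `show_health mailcowdockerized-unbound-mailcow-1`, with the container name taken from the commands earlier in this thread.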
Extending the timeout to 1m with a docker compose override file (as shown in the comment from @KarolKozlowski) works for me too. |
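The override file itself is not reproduced in this thread, so the following is only a sketch of what such a workaround might look like. It assumes the service is named `unbound-mailcow` (as in the docker compose exec command above) and that Compose merges the `healthcheck` mapping from the override with the one in docker-compose.yml. The helper name `write_unbound_override` is hypothetical.

```shell
#!/bin/sh
# Sketch: write a compose override that raises the unbound healthcheck timeout.
# Assumptions: service name "unbound-mailcow"; Compose merges this mapping with
# the healthcheck already defined in docker-compose.yml.
write_unbound_override() {
  target_dir="${1:-.}"
  timeout="${2:-1m}"
  file="${target_dir}/docker-compose.override.yml"
  # Refuse to clobber an override file that already exists
  if [ -e "$file" ]; then
    echo "refusing to overwrite $file" >&2
    return 1
  fi
  cat > "$file" <<EOF
services:
  unbound-mailcow:
    healthcheck:
      timeout: ${timeout}
EOF
}
```

After something like `write_unbound_override /opt/mailcow-dockerized 1m`, the stack would need a `docker compose up -d` to pick up the change. As noted elsewhere in this thread, this only hides a slow check; it does not explain why the check is slow.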
Maybe this helps someone. I managed to pinpoint the issue to DNS: my primary server was inaccessible from within the docker network. There are 2 solutions (for me):
Health check run time is now down to 6 seconds:
|
I had the idea to add a hidden variable, like UNBOUND_HEALTHCHECK_THRESHOLD, which controls the check timeout and can be modified according to the user's needs. The default value would still be 30s. What do you think about it? |
Seems OK. Is it also possible to customize the DNS address for the unbound container?
|
Unbound itself resolves the DNS queries, so you can't change that per se. However, maybe take a look at this doc article regarding unbound DNS resolving ports in the firewall: https://docs.mailcow.email/getstarted/prerequisite-system/#important-for-hetzner-firewalls |
Thanks, but it doesn't help in my case. I restored mailcow from 2023 and, for the time being, I will stop doing updates... |
This functionality can be achieved by compose overrides, so in my opinion it is redundant (might require documenting it though). I think it would benefit everyone if we knew what is causing the check to fail. In my case all checks were successful, but took too long to respond. The delay was caused by long DNS queries (fail-over to secondary server) which technically should not impair the functionality of the service, but is rather an anomaly that should be investigated. What do you think? |
If the healthcheck takes too long, you have to adjust the timeout manually or analyse why your system behaves this way in general. A standardized mailcow installation can easily complete the healthcheck in less than 10s. I do agree that the variable would be redundant and that the override should be documented. |
Increasing the timeout or changing DNS doesn't work in China, since the GFW blocks github.com and sometimes even hub.docker.com, so allowing users to skip this check is the only solution. |
I'm on a Hetzner VM with the same issue. Switching "dns-nameservers" in the network interface config under Debian to IPv4 fixed the problem. https://docs.hetzner.com/cloud/servers/static-configuration/ |
What is the full set of unbound checks? I managed to establish a connection to the required DNS:
But I don't understand the last test failure; all commands are issued in the unbound container:
when the dig command produces a result:
Why does it need to check connectivity to the mailcow.email host at port 80? |
Hi, it looks to me like docker-compose.yml has "condition: service_healthy" as a prerequisite for unbound, and that condition is not met at startup of docker-compose, hence the error. See https://community.mailcow.email/d/2977-container-unbound-unhealthy/18 I'm no Docker expert, so I just changed it to service_started for myself. |
TL;DR: also verify ping from your mailcow host, OUTSIDE the Docker containers. I was having this problem, and found my firewall was blocking 1.1.1.1. Presumably the blocker rule was in one of the dynamic block lists, for example anti-spam, etc. I permitted ping from my mailcow host to 1.1.1.1 and then the unbound check passed immediately. |
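A minimal sketch of that host-side check, run outside the containers. It pings the three addresses the first healthcheck step is said to use in this thread (1.1.1.1, 8.8.8.8, 9.9.9.9) and reports each one separately, so a single blocked target stands out. The function name `check_ping_targets` and the `PING_CMD` override are hypothetical; this is not the healthcheck script itself.

```shell
#!/bin/sh
# Sketch: ping each target from the host and report which ones fail.
# PING_CMD is a hypothetical override for the ping binary; it defaults to "ping".
check_ping_targets() {
  ping_cmd="${PING_CMD:-ping}"
  failed=0
  for ip in 1.1.1.1 8.8.8.8 9.9.9.9; do
    # Three echo requests per target, matching the output shown earlier in the thread
    if "$ping_cmd" -c 3 "$ip" >/dev/null 2>&1; then
      echo "OK   $ip"
    else
      echo "FAIL $ip"
      failed=$((failed + 1))
    fi
  done
  # Exit status is the number of unreachable targets (0 means all reachable)
  return "$failed"
}
```

Running `check_ping_targets` on the mailcow host and then comparing with `docker compose exec unbound-mailcow ping -c 3 1.1.1.1` should show whether the block sits in the host firewall or only affects the Docker network.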
Hello all, I had the same issue and fixed it by adding outbound rules to my firewall:
Note I am using Hetzner to host my docker. Hope this helps |
I know I'm re-treading some ground here, but I offer this for the benefit of others coming here for answers. For me, the The For now, to achieve a working state, I've had to resort to skipping the health check entirely (as per #5652), even though 2 of 3 ping tests are successful. mailcow.conf (line 256) |
I have just upgraded to the latest mailcow version, disabling unbound has fixed my send/receive email issues. |
Contribution guidelines
I've found a bug and checked that ...
Description
Logs:
Steps to reproduce:
Which branch are you using?
master
Operating System:
Debian 12
Server/VM specifications:
8GB, 4 cores
Is Apparmor, SELinux or similar active?
no
Virtualization technology:
KVM
Docker version:
24.0.7
docker-compose version or docker compose version:
v2.21.0
mailcow version:
2024-01
Reverse proxy:
no reverse proxy
Logs of git diff:
Logs of iptables -L -vn:
Logs of ip6tables -L -vn:
Logs of iptables -L -vn -t nat:
Logs of ip6tables -L -vn -t nat:
DNS check: