confd does not re-resolve SRV records after startup #859

Open
cdanis opened this issue Aug 3, 2022 · 0 comments

cdanis commented Aug 3, 2022

It appears that confd resolves etcd SRV records only at startup and never re-resolves them afterwards. This makes the SRV support quite dangerous to use if you ever intend to change the contents of the record.

We had many confds across our fleet 'stuck' attempting to look up the previous, decommissioned members of our etcd cluster:
Aug 03 13:43:57 deploy1002 confd[32274]: 2022-08-03T13:43:57Z deploy1002 /usr/bin/confd[32274]: ERROR client: etcd cluster is unavailable or misconfigured; error #0: dial tcp: lookup conf1004.eqiad.wmnet on 10.3.0.1:53: no such host
Aug 03 13:43:57 deploy1002 confd[32274]: ; error #1: dial tcp: lookup conf1006.eqiad.wmnet on 10.3.0.1:53: no such host
Aug 03 13:43:57 deploy1002 confd[32274]: ; error #2: dial tcp: lookup conf1005.eqiad.wmnet on 10.3.0.1:53: no such host

Meanwhile, the SRV record had looked like this in our DNS for at least a week:
_etcd._tcp.eqiad.wmnet has SRV record 0 1 4001 conf1008.eqiad.wmnet.
_etcd._tcp.eqiad.wmnet has SRV record 0 1 4001 conf1009.eqiad.wmnet.
_etcd._tcp.eqiad.wmnet has SRV record 0 1 4001 conf1007.eqiad.wmnet.

This is with version:
confd 0.16.0 (Git SHA: , Go Version: go1.11.6)
