Unresolved address; Host Details : local host is: "hdfs-namenode-0.hdfs-namenode.default.svc.cluster.local"; destination host is: (unknown):0; #53

Open
ALiBaBa-Jimmy opened this issue Jul 2, 2018 · 5 comments

Comments

@ALiBaBa-Jimmy

When I use helm to install the namenode according to your docs, the following errors appear in the log and the namenode keeps restarting:

java.io.IOException: Failed on local exception: java.net.SocketException: Unresolved address; Host Details : local host is: "hdfs-namenode-0.hdfs-namenode.default.svc.cluster.local"; destination host is: (unknown):0;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:773)
at org.apache.hadoop.ipc.Server.bind(Server.java:425)
at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:574)
at org.apache.hadoop.ipc.Server.<init>(Server.java:2215)
at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:938)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server.<init>(ProtobufRpcEngine.java:534)
at org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:509)
at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:783)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.<init>(NameNodeRpcServer.java:344)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createRpcServer(NameNode.java:673)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:646)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:811)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:795)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1488)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1554)
Caused by: java.net.SocketException: Unresolved address
at sun.nio.ch.Net.translateToSocketException(Net.java:131)
at sun.nio.ch.Net.translateException(Net.java:157)
at sun.nio.ch.Net.translateException(Net.java:163)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:76)
at org.apache.hadoop.ipc.Server.bind(Server.java:408)
... 13 more
Caused by: java.nio.channels.UnresolvedAddressException
at sun.nio.ch.Net.checkAddress(Net.java:101)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:218)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
... 14 more

hdfs-namenode-0 0/1 CrashLoopBackOff 22 9h 10.196.36.165 10.196.36.165
hdfs-namenode-1 0/1 CrashLoopBackOff 22 9h 10.196.36.162 10.196.36.162

This is the /etc/hosts file on my node machine:
[wangdanfeng5@A01-R20-I36-165-0964488 ~]$ cat /etc/hosts

#127.0.0.1 A01-R20-I36-165-0964488.JD.LOCAL localhost.localdomain localhost
127.0.0.1 localhost.localdomain localhost
10.196.36.162 A01-R20-I36-162-0964483.JD.LOCAL
10.196.36.165 A01-R20-I36-165-0964488.JD.LOCAL

Could you give me some advice about this?

@ALiBaBa-Jimmy
Author

@kimoonkim

@kimoonkim
Member

Hi @ALiBaBa-Jimmy, thanks for trying out k8s HDFS and sorry about the trouble you went through.

This seems like a kube-dns issue. Do you know whether kube-dns is healthy in your cluster? If you're not sure, you may want to try the steps in https://kubernetes.io/docs/tasks/debug-application-cluster/debug-service/#does-the-service-work-by-dns

Also, can you post the exact command line you used to launch the helm chart?

Thanks.
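
One quick way to test the DNS theory is to try resolving the name from the stack trace inside the pod. The following is only a minimal sketch, assuming a Python interpreter is available in the container (the chart's image may not ship one); `can_resolve` is a hypothetical helper name, not part of the chart.

```python
import socket

def can_resolve(hostname):
    """Return True if the name resolves to at least one address."""
    try:
        return len(socket.getaddrinfo(hostname, None)) > 0
    except socket.gaierror:
        # Raised when the resolver cannot map the name to an address.
        return False

# Example call, using the pod FQDN from the error message above:
# can_resolve("hdfs-namenode-0.hdfs-namenode.default.svc.cluster.local")
```

If this returns False for the pod's own FQDN, the namenode's attempt to bind to that name would be expected to fail with the "Unresolved address" error shown in the report.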

@nenggangpan

@kimoonkim I hit exactly the same issue, and I am pretty sure my DNS is set up correctly. My k8s version is 1.12, and the DNS is CoreDNS.

@cosmin-ionita

I have the exact same issue

@grmaltby

I encountered this same issue. In my case it was caused by the cluster domain name not being the default/common "cluster.local", which appears to be what the file charts/hdfs-k8s/templates/_helpers.tpl expects. My "fix" replaced one hardcoded value with another:

--- a/charts/hdfs-k8s/templates/_helpers.tpl
+++ b/charts/hdfs-k8s/templates/_helpers.tpl
@@ -163,7 +163,7 @@ The HDFS config file should specify FQDN of services. Otherwise, Kerberos
 login may fail.
 */}}
 {{- define "svc-domain" -}}
-{{- printf "%s.svc.cluster.local" .Release.Namespace -}}
+{{- printf "%s.svc.aiscluster.local" .Release.Namespace -}}
 {{- end -}}

The incorrect value ends up in the hdfs-config configmap and in the default custom core-site.xml it delivers (which run.sh unconditionally copies over /etc/hadoop/core-site.xml, overwriting any other tweaks made there).
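
To find out which cluster domain a given cluster actually uses, one option is to look at the `search` line of a pod's /etc/resolv.conf. The sketch below assumes the usual Kubernetes layout, where the search list contains an entry of the form `<namespace>.svc.<cluster-domain>`. The function name `cluster_domain_from_resolv_conf` and the sample file contents (including the nameserver address) are made up for illustration, chosen to match the `aiscluster.local` domain in the diff above.

```python
def cluster_domain_from_resolv_conf(text, namespace):
    """Infer the cluster domain from the contents of a pod's resolv.conf.

    Looks for a search entry of the form "<namespace>.svc.<cluster-domain>"
    and returns the "<cluster-domain>" part, or None if no entry matches.
    """
    prefix = namespace + ".svc."
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0] != "search":
            continue
        for domain in fields[1:]:
            if domain.startswith(prefix):
                return domain[len(prefix):].rstrip(".")
    return None

# Illustrative resolv.conf contents (assumed, not taken from a real pod).
sample = """nameserver 10.96.0.10
search default.svc.aiscluster.local svc.aiscluster.local aiscluster.local
options ndots:5
"""

print(cluster_domain_from_resolv_conf(sample, "default"))  # aiscluster.local
```

If the value printed for a real pod is anything other than `cluster.local`, the hardcoded `svc.cluster.local` suffix in `_helpers.tpl` will produce names that do not resolve in that cluster.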
