Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1623129

Summary: When using --public-hostname `oc cluster up` fails with error 'no route to host' and log 'Unable to create storage backend'
Product: OpenShift Container Platform
Reporter: Xingxing Xia <xxia>
Component: Master
Assignee: Michal Fojtik <mfojtik>
Status: CLOSED WONTFIX
QA Contact: Xingxing Xia <xxia>
Severity: medium
Docs Contact:
Priority: medium
Version: 3.11.0
CC: aos-bugs, bparees, hongli, jokerman, mmccomas, wzheng
Target Milestone: ---
Target Release: 3.11.z
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-03-07 11:24:50 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
docker logs of container k8s_api_master-api-localhost_kube-system... (flags: none)
docker logs of container origin (flags: none)

Description Xingxing Xia 2018-08-28 14:19:49 UTC
Created attachment 1479279 [details]
docker logs of container k8s_api_master-api-localhost_kube-system...

Description of problem:
When using --public-hostname `oc cluster up` fails with error 'no route to host' and log 'Unable to create storage backend'.
WITHOUT --public-hostname, it succeeds.

Version-Release number of selected component (if applicable):
oc v3.11.0-0.24.0

How reproducible:
Always

Steps to Reproduce:
1. oc cluster up --public-hostname=10.8.241.46 --image='brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/ose-${component}:${version}' --loglevel 6

2. Check docker logs:

Actual results:
1. Failed with messages:
...
I0828 09:54:08.673178    5549 run_self_hosted.go:154] started kubelet in container "d30800259a107c01b5dfec95c2869a30130148ddcaaabc4f45eab29b0bf74705"
I0828 09:54:08.674040    5549 loader.go:359] Config loaded from file /root/openshift.local.clusterup/kube-apiserver/admin.kubeconfig
I0828 09:54:08.674331    5549 run_self_hosted.go:179] Waiting for the kube-apiserver to be ready ...
I0828 09:54:08.677234    5549 round_trippers.go:405] GET https://10.8.241.46:8443/healthz?timeout=32s  in 2 milliseconds
I0828 09:54:08.677316    5549 run_self_hosted.go:536] Server isn't healthy yet.  Waiting a little while. Get https://10.8.241.46:8443/healthz?timeout=32s: dial tcp 10.8.241.46:8443: connect: no route to host
...
ial tcp 10.8.241.46:8443: connect: no route to host
E0828 09:59:09.681611    5549 run_self_hosted.go:550] API server error: Get https://10.8.241.46:8443/healthz?timeout=32s: dial tcp 10.8.241.46:8443: connect: no route to host ()
Error: timed out waiting for the condition

2. [root@preserved-cluster-up-ui-long-term-use ~]$ docker ps -a
54f9e07b4e50        brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/ose-hypershift@sha256:ff53a10c5c88cf3afe88cf2c3ae1c91f7ce1befa950ca2699e079e8ef89727df      "/bin/bash -c '#!/bin"   13 seconds ago      Exited (255) 1 seconds ago                       k8s_api_master-api-localhost_kube-system_e7c4b8d03187d573a119b33f189e26f1_5
...
d30800259a10        brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/ose-node:v3.11                                                                              "hyperkube kubelet --"   3 minutes ago       Up 3 minutes                            origin

[root@preserved-cluster-up-ui-long-term-use ~]$ docker logs 54f9e07b4e50
...
F0828 13:58:00.745427       1 storage_decorator.go:57] Unable to create storage backend: config (&{etcd3 kubernetes.io [https://10.8.241.46:4001] /etc/origin/master/master.etcd-client.key /etc/origin/master/master.etcd-client.crt /etc/origin/master/ca.crt true false 1000 0xc4216fa3f0 <nil> 5m0s 1m0s}), err (context deadline exceeded)

[root@preserved-cluster-up-ui-long-term-use ~]$ docker logs d30800259a10
E0828 14:10:52.394510    6011 reflector.go:136] k8s.io/kubernetes/pkg/kubelet/kubelet.go:464: Failed to list *v1.Node: Get https://localhost:8443/api/v1/nodes?fieldSelector=metadata.name%3Dlocalhost&limit=500&resourceVersion=0: dial tcp [::1]:8443: connect: connection refused
E0828 14:10:52.395663    6011 reflector.go:136] k8s.io/kubernetes/pkg/kubelet/kubelet.go:455: Failed to list *v1.Service: Get https://localhost:8443/api/v1/services?limit=500&resourceVersion=0: dial tcp [::1]:8443: connect: connection refused

Expected results:
1. `oc cluster up` succeeds.

Additional info:
For more logs, see the attachments.

Comment 1 Xingxing Xia 2018-08-28 14:22:53 UTC
Created attachment 1479280 [details]
docker logs of container origin

Comment 2 Xingxing Xia 2018-08-28 15:03:28 UTC
Additional info:
In the command above, --public-hostname=10.8.241.46 is the IP of the host running `oc cluster up`. That host did not have this issue in 3.10, and the necessary configuration has been done on it (including https://github.com/openshift/origin/blob/master/docs/cluster_up_down.md#linux ). I also tried another host, shared in bug 1622106; that host reproduced the same issue too.

Comment 3 Xingxing Xia 2018-08-30 07:59:31 UTC
Without --public-hostname, the `oc cluster up` server listens on https://127.0.0.1:8443, so clients on other hosts cannot connect to it, which makes `oc cluster up` unusable for such users. Therefore this issue needs to be resolved in the 3.11.0 release.
PS: I also checked the LATEST 3.10 oc (v3.10.35) and found it has the same issue.

Comment 6 Xingxing Xia 2018-09-14 14:42:37 UTC
The host 10.8.250.232 was created by following https://github.com/openshift/origin/blob/master/docs/cluster_up_down.md and had the above problem.
After following the doc, I later tried:
# iptables -I INPUT 1 -p tcp --dport 8443 -j ACCEPT
This resolves the first error "dial tcp xxx:8443: connect: no route to host", but the cluster still cannot come up: the k8s_api_master-api-localhost_kube-system... container log shows the error "dial tcp xxx:4001: connect: no route to host". Then I also tried:
# iptables -I INPUT 1 -p tcp --dport 4001 -j ACCEPT
After that, `oc cluster up --public-hostname=xxx --image='brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/ose-${component}:${version}' ` succeeds!

It seems the firewall-cmd setting part of the doc https://github.com/openshift/origin/blob/master/docs/cluster_up_down.md is not sufficient. If so, the doc needs an update.
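For reference, on a host managed by firewalld a persistent equivalent of the two iptables workaround rules above might look like the following sketch (this is an assumption about the local setup, not part of the report; the iptables rules in the comment are what was actually verified):

```shell
# Open the API server port (8443) and the etcd client port (4001)
# that `oc cluster up --public-hostname=...` needs to reach on the
# host IP instead of loopback.
firewall-cmd --permanent --add-port=8443/tcp
firewall-cmd --permanent --add-port=4001/tcp
# Reload so the permanent rules take effect in the running firewall.
firewall-cmd --reload
```

Unlike `iptables -I INPUT ...`, the `--permanent` rules survive a firewalld reload or reboot, which may be why the doc's firewall-cmd section was expected to cover this.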