Description of problem:
From the help info:
  --stats-port=1936: If the underlying router implementation can provide statistics this is a hint to expose it on this port. Specify 0 if you want to turn off exposing the statistics.

When stats-port is set to 0, the router cannot run and fails with the error:
  Readiness probe failed: Get http://localhost:1936/healthz: dial tcp [::1]:1936: getsockopt: connection refused

Version-Release number of selected component (if applicable):
openshift v3.6.126.1
kubernetes v1.6.1+5115d708d7
etcd 3.2.0

How reproducible:
always

Steps to Reproduce:
1. oadm router router2 --stats-port=0
2. oc describe pod routerxxx
3.

Actual results:
step 2:
      ROUTER_EXTERNAL_HOST_PARTITION_PATH:
      ROUTER_EXTERNAL_HOST_PASSWORD:
      ROUTER_EXTERNAL_HOST_PRIVKEY:        /etc/secret-volume/router.pem
      ROUTER_EXTERNAL_HOST_USERNAME:
      ROUTER_EXTERNAL_HOST_VXLAN_GW_CIDR:
      ROUTER_LISTEN_ADDR:                  0.0.0.0:0
      ROUTER_METRICS_TYPE:                 haproxy
      ROUTER_SERVICE_HTTPS_PORT:           443
      ROUTER_SERVICE_HTTP_PORT:            80
      ROUTER_SERVICE_NAME:                 router2
      ROUTER_SERVICE_NAMESPACE:            default
      ROUTER_SUBDOMAIN:
      STATS_PASSWORD:                      RYWpLfG6SR
      STATS_PORT:                          0
      STATS_USERNAME:                      admin
    Mounts:
      /etc/pki/tls/private from server-certificate (ro)
      /var/run/secrets/kubernetes.io/serviceaccount from router-token-2vntm (ro)
Conditions:
  Type          Status
  Initialized   True
  Ready         False
  PodScheduled  True
Volumes:
  server-certificate:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  router2-certs
    Optional:    false
  router-token-2vntm:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  router-token-2vntm
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     <none>
Events:
  FirstSeen  LastSeen  Count  From  SubObjectPath  Type  Reason  Message
  ---------  --------  -----  ----  -------------  --------  ------  -------
  7m  6m  3   default-scheduler    Warning  FailedScheduling  No nodes are available that match all of the following predicates:: PodFitsHostPorts (1).
  6m  6m  1   default-scheduler    Normal   Scheduled  Successfully assigned router2-2-2jwx5 to host-8-174-59.host.centralci.eng.rdu2.redhat.com
  6m  1m  7   kubelet, host-8-174-59.host.centralci.eng.rdu2.redhat.com  spec.containers{router}  Normal   Pulled     Container image "brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/ose-haproxy-router:v3.6.126.1" already present on machine
  6m  1m  7   kubelet, host-8-174-59.host.centralci.eng.rdu2.redhat.com  spec.containers{router}  Normal   Created    Created container
  6m  1m  7   kubelet, host-8-174-59.host.centralci.eng.rdu2.redhat.com  spec.containers{router}  Normal   Started    Started container
  6m  1m  15  kubelet, host-8-174-59.host.centralci.eng.rdu2.redhat.com  spec.containers{router}  Warning  Unhealthy  Readiness probe failed: Get http://localhost:1936/healthz: dial tcp [::1]:1936: getsockopt: connection refused
  6m  1m  15  kubelet, host-8-174-59.host.centralci.eng.rdu2.redhat.com  spec.containers{router}  Warning  Unhealthy  Liveness probe failed: Get http://localhost:1936/healthz: dial tcp [::1]:1936: getsockopt: connection refused
  6m  1m  7   kubelet, host-8-174-59.host.centralci.eng.rdu2.redhat.com  spec.containers{router}  Normal   Killing    Killing container with id docker://router:pod "router2-2-2jwx5_default(2298d32b-5c78-11e7-9c5a-fa163ed597dc)" container "router" is unhealthy, it will be killed and re-created.
  4m  5s  18  kubelet, host-8-174-59.host.centralci.eng.rdu2.redhat.com  spec.containers{router}  Warning  BackOff    Back-off restarting failed container
  4m  5s  18  kubelet, host-8-174-59.host.centralci.eng.rdu2.redhat.com    Warning  FailedSync  Error syncing pod

Expected results:
The router should be running.

Additional info:
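The events above show the probes still targeting port 1936 while STATS_PORT is 0. As a rough diagnostic sketch (not part of OpenShift; the function names are hypothetical and the message format is assumed to match the event text quoted in this report), the mismatch can be detected from the event messages like this:

```python
import re

# Matches the probe failure text seen in the events, e.g.
# "Readiness probe failed: Get http://localhost:1936/healthz: ..."
PROBE_URL = re.compile(r"probe failed: Get http://[^/:\s]+:(\d+)/")


def probe_ports(event_messages):
    """Collect the ports that failing HTTP probes tried to reach."""
    ports = set()
    for message in event_messages:
        match = PROBE_URL.search(message)
        if match:
            ports.add(int(match.group(1)))
    return ports


def mismatched_probe_ports(event_messages, stats_port):
    """Return probe ports that differ from the configured STATS_PORT."""
    return sorted(p for p in probe_ports(event_messages) if p != stats_port)


messages = [
    "Readiness probe failed: Get http://localhost:1936/healthz: "
    "dial tcp [::1]:1936: getsockopt: connection refused",
    "Liveness probe failed: Get http://localhost:1936/healthz: "
    "dial tcp [::1]:1936: getsockopt: connection refused",
    "Back-off restarting failed container",
]
print(mismatched_probe_ports(messages, stats_port=0))  # prints [1936]
```

A non-empty result means the probes are checking a port other than the one the router was configured with, which is the situation in this report.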
ROUTER_METRICS_TYPE=haproxy and statsPort=0 are not supported together. I'll fix that, not a release blocker.
Per Clayton: The template router command line validation should reject these options.
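For illustration only, the kind of check proposed here could look like the following sketch. It is written in Python rather than the router's own Go, the function name and error strings are hypothetical, and it is not the code that was eventually merged:

```python
from typing import Optional


def validate_router_options(metrics_type: str, stats_port: int) -> Optional[str]:
    """Return an error message for an unsupported combination, else None."""
    if not 0 <= stats_port <= 65535:
        return f"stats-port {stats_port} is not a valid port"
    # Assumption taken from this report: haproxy metrics and a stats port
    # of 0 are not supported together, so reject the pair up front.
    if metrics_type == "haproxy" and stats_port == 0:
        return "stats-port must not be 0 when ROUTER_METRICS_TYPE=haproxy"
    return None


if __name__ == "__main__":
    for metrics_type, port in [("haproxy", 0), ("haproxy", 1936), ("", 0)]:
        print(metrics_type or "<unset>", port, validate_router_options(metrics_type, port))
```

Rejecting the combination at startup would surface the problem as a clear error message instead of a pod that restarts on failed probes.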
PR 16621 https://github.com/openshift/origin/pull/16621
Docs PR 5446 https://github.com/openshift/openshift-docs/pull/5446
Closed docs PR 5446. Revised PR 16621 to provide a valid listening port.
Commits pushed to master at https://github.com/openshift/origin

https://github.com/openshift/origin/commit/54ec92533ad37ae99887e9794b60da0391e529e3
Router stats-port=0 error

stats-port=0 properly disables statistics.

Fixes bug: 1466133
https://bugzilla.redhat.com/show_bug.cgi?id=1466133

https://github.com/openshift/origin/commit/0c514c9ec63f59a138c0a05ea2e011b0b7496953
Merge pull request #16621 from pecameron/bz1466133

Automatic merge from submit-queue.

Router stats-port=0 error

When type=haproxy-router, stats-port must not be 0.

Fixes bug: 1466133
https://bugzilla.redhat.com/show_bug.cgi?id=1466133
Tested this bug on v3.7.0-0.143.2; it can still be reproduced with the same error:

urstable
Node-Selectors:  <none>
Tolerations:     <none>
Events:
  FirstSeen  LastSeen  Count  From  SubObjectPath  Type  Reason  Message
  ---------  --------  -----  ----  -------------  --------  ------  -------
  8m  8m  2   default-scheduler    Warning  FailedScheduling  No nodes are available that match all of the following predicates:: PodFitsHostPorts (1).
  8m  8m  1   default-scheduler    Normal   Scheduled  Successfully assigned rotuer2-2-z6gtz to ip-172-18-13-227.ec2.internal
  8m  8m  1   kubelet, ip-172-18-13-227.ec2.internal    Normal   SuccessfulMountVolume  MountVolume.SetUp succeeded for volume "server-certificate"
  8m  8m  1   kubelet, ip-172-18-13-227.ec2.internal    Normal   SuccessfulMountVolume  MountVolume.SetUp succeeded for volume "router-token-cp848"
  8m  7m  3   kubelet, ip-172-18-13-227.ec2.internal  spec.containers{router}  Normal   Pulled     Container image "registry.ops.openshift.com/openshift3/ose-haproxy-router:v3.7.0-0.143.2" already present on machine
  7m  7m  2   kubelet, ip-172-18-13-227.ec2.internal  spec.containers{router}  Normal   Killing    Killing container with id docker://router:pod "rotuer2-2-z6gtz_default(8461a823-ae4a-11e7-ab8a-0e432c832c92)" container "router" is unhealthy, it will be killed and re-created.
  8m  6m  3   kubelet, ip-172-18-13-227.ec2.internal  spec.containers{router}  Normal   Created    Created container
  8m  6m  3   kubelet, ip-172-18-13-227.ec2.internal  spec.containers{router}  Normal   Started    Started container
  7m  6m  6   kubelet, ip-172-18-13-227.ec2.internal  spec.containers{router}  Warning  Unhealthy  Liveness probe failed: Get http://localhost:1936/healthz: dial tcp [::1]:1936: getsockopt: connection refused
  7m  6m  6   kubelet, ip-172-18-13-227.ec2.internal  spec.containers{router}  Warning  Unhealthy  Readiness probe failed: Get http://localhost:1936/healthz: dial tcp [::1]:1936: getsockopt: connection refused
  5m  2m  10  kubelet, ip-172-18-13-227.ec2.internal    Warning  FailedSync  Error syncing pod

[root@ip-172-18-7-84 ~]# oc version
oc v3.7.0-0.143.2
kubernetes v1.7.0+80709908fd
features: Basic-Auth GSSAPI Kerberos SPNEGO
Please verify that the router pod with commit:
https://github.com/openshift/origin/commit/54ec92533ad37ae99887e9794b60da0391e529e3
is running. We need to track down what is going on with this.

This works on my cluster.

        - name: STATS_PASSWORD
          value: sLzdR6SgDJ
        - name: STATS_PORT
          value: "0"
        - name: STATS_USERNAME
          value: admin

# oc rsh router-ab-17-sxwgb env | grep -e STAT -e LISTEN
STATS_PASSWORD=sLzdR6SgDJ
STATS_PORT=0
STATS_USERNAME=admin

# oc logs router-ab-17-sxwgb
I1011 12:37:50.742015       1 template.go:246] Starting template router (v3.7.0-alpha.1+5d7f1b8-859-dirty)
I1011 12:37:51.346813       1 router.go:441] Router reloaded:
 - Checking http://localhost:80 ...
 - Health check ok : 0 retry attempt(s).
I1011 12:37:51.346849       1 router.go:230] Router is including routes in all namespaces
E1011 12:37:51.549196       1 router_controller.go:174] route route-secure already exposes www.example.com and is older
E1011 12:37:51.576546       1 router_controller.go:174] route route-secure already exposes www.example.com and is older
I1011 12:37:51.610080       1 router.go:441] Router reloaded:
 - Checking http://localhost:80 ...
 - Health check ok : 0 retry attempt(s).
Changed to MODIFIED. PR 16621 merged Sat Oct 7 16:59:05 2017 -0700.
In OSE v3.7.0-0.145.0
Thanks for your comment. I will retest this bug once v3.7.0-0.145.0 is out.
Verified this bug on v3.7.0-0.153.0, it works well.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2017:3188