Description of problem:
When deploying prometheus (the same issue appears to be present in master), https://prometheus-openshift-metrics.<cluster fqdn>/targets shows "getsockopt: no route to host" when trying to scrape the /metrics endpoint on the OpenShift hosted routers. IPTables on nodes where the hosted router is deployed are not updated to expose this port.

Version-Release number of selected component (if applicable):
Seen in release-3.7; no fundamental changes were evident in master that might change this.

How reproducible:
Consistent

Steps to Reproduce:
1. Deploy openshift-ansible with prometheus and a hosted router
2. Check prometheus target status

Actual results:
http://<node ip>:1936/metrics DOWN instance="<node ip>:1936" ...
Get http://<node ip>:1936/metrics: dial tcp <node ip>:1936: getsockopt: no route to host

Expected results:
All "kubernetes-service-endpoints" scrape targets are green.

Additional info:
Initial PR proposed: https://github.com/openshift/openshift-ansible/pull/6636/files
Some concerns were raised about the way the firewall module interacts with the hosts where the hosted router actually runs; feedback is needed on how the deployment team wants this executed, or on any proposed alternative.
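A quick way to confirm that the node firewall is the culprit (a diagnostic sketch; `<node ip>` is a placeholder for an actual router node IP, as above):

```shell
# From the node running Prometheus (or from inside the Prometheus pod),
# try to reach the router stats port directly; with the port blocked
# this fails with "no route to host" rather than timing out:
curl -sv --connect-timeout 5 http://<node ip>:1936/metrics

# On the router node itself, check whether any local firewall rule
# accepts inbound traffic to port 1936:
iptables-save | grep -- '--dport 1936'
```

If the curl succeeds when run on the router node itself but fails from elsewhere, and the grep shows no ACCEPT rule for 1936, the scrape failure is a firewall issue rather than a routing or router problem.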
*** Bug 1589023 has been marked as a duplicate of this bug. ***
*** Bug 1625510 has been marked as a duplicate of this bug. ***
The upstream issue was closed, but this is not correct. I still can't access all routers in a multi-infrastructure-node setup. Prometheus can only access one router -- my guess: the one that's running on the same node as Prometheus.
Hello Team,

Any updates on this issue?

Regards,
Kedar
I don't see an easy way to open the router metrics port (1936) during install for only the router nodes, since the node firewall configuration mostly takes place before anything is done with the routers. Also, even if we could do that, I'm not sure how it would work post-install: if, for example, you wanted to move a router to a different node, you'd still need to manually open that port on the new node.

So I've created a PR against 3.10 to optionally open that port on all nodes during install:
https://github.com/openshift/openshift-ansible/pull/11052
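For clusters that cannot pick up the openshift-ansible change, the same effect can be approximated by hand. A minimal sketch of a manual workaround, assuming the stock OS_FIREWALL_ALLOW chain that OpenShift manages on each node (run as root on every node that may host a router; the rule is not persistent across firewall reloads unless it is also added to the node's firewall configuration):

```shell
# Allow new inbound TCP connections to the router stats/metrics port (1936)
# via OpenShift's OS_FIREWALL_ALLOW chain:
iptables -A OS_FIREWALL_ALLOW -p tcp -m state --state NEW -m tcp --dport 1936 -j ACCEPT
```

This is the same rule the installer ends up creating (see the iptables-save output in the verification comment below in spirit: an ACCEPT for --dport 1936 in OS_FIREWALL_ALLOW).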
Tested with:
openshift-ansible-3.10.110-1.git.0.1e03ab3.el7.noarch.rpm
openshift-ansible-docs-3.10.110-1.git.0.1e03ab3.el7.noarch.rpm
openshift-ansible-playbooks-3.10.110-1.git.0.1e03ab3.el7.noarch.rpm
openshift-ansible-roles-3.10.110-1.git.0.1e03ab3.el7.noarch.rpm
openshift-ansible-test-3.10.110-1.git.0.1e03ab3.el7.noarch.rpm

Port 1936 is opened on all nodes after install:

# iptables-save | grep 1936
-A KUBE-SEP-DFSWOTRTOQBRAYA4 -s 10.0.77.74/32 -m comment --comment "default/router:1936-tcp" -j KUBE-MARK-MASQ
-A KUBE-SEP-DFSWOTRTOQBRAYA4 -p tcp -m comment --comment "default/router:1936-tcp" -m tcp -j DNAT --to-destination 10.0.77.74:1936
-A KUBE-SERVICES ! -s 10.128.0.0/14 -d 172.30.139.20/32 -p tcp -m comment --comment "default/router:1936-tcp cluster IP" -m tcp --dport 1936 -j KUBE-MARK-MASQ
-A KUBE-SERVICES -d 172.30.139.20/32 -p tcp -m comment --comment "default/router:1936-tcp cluster IP" -m tcp --dport 1936 -j KUBE-SVC-4JCRTMMYZAAYMIJ2
-A KUBE-SVC-4JCRTMMYZAAYMIJ2 -m comment --comment "default/router:1936-tcp" -j KUBE-SEP-DFSWOTRTOQBRAYA4
-A OS_FIREWALL_ALLOW -p tcp -m state --state NEW -m tcp --dport 1936 -j ACCEPT
Created attachment 1533903 [details] router targets are up
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0328