Description of problem:
Have noticed that pods connected to the pod network will sometimes be unable to reach the internal Ingress VIP hosted by keepalived.
For example, the openshift-console pods will get "connection refused" when trying to reach the OAuth endpoint to validate logins.
We run a test from the fluentd pod on each node, trying to hit the console endpoint, and some nodes show they cannot connect (a sketch of the loop is included after the output below).
Test apps VIP for cld-paas-d-eusw1b-2-d9mlc-worker-storage-q5hfg
* Rebuilt URL to: https://console-openshift-console.apps.cld-paas-d-eusw1b-2.phx.aexp.com/
* Uses proxy env variable NO_PROXY == '.aexp.com,.cluster.local,.svc,10.10.60.0/23,127.0.0.1,172.28.128.0/17,192.168.0.0/16,api-int.cld-paas-d-eusw1b-2.phx.aexp.com,localhost'
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
* Trying 10.10.60.19...
* TCP_NODELAY set
* connect to 10.10.60.19 port 443 failed: Connection refused
* Failed to connect to console-openshift-console.apps.cld-paas-d-eusw1b-2.phx.aexp.com port 443: Connection refused
* Closing connection 0
curl: (7) Failed to connect to console-openshift-console.apps.cld-paas-d-eusw1b-2.phx.aexp.com port 443: Connection refused
command terminated with exit code 7
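For reference, the per-node check is roughly the following. This is a sketch, not the exact script: the logging namespace (openshift-logging) and the fluentd pod label (component=fluentd) are assumed from a stock cluster-logging deployment and may need adjusting.

  # Run the console-route curl from the fluentd pod on each node
  # (namespace and label are assumptions; substitute your own if they differ).
  for pod in $(oc -n openshift-logging get pods -l component=fluentd -o name); do
    node=$(oc -n openshift-logging get "$pod" -o jsonpath='{.spec.nodeName}')
    echo "Test apps VIP for $node"
    oc -n openshift-logging exec "$pod" -- \
      curl -skv --max-time 10 https://console-openshift-console.apps.cld-paas-d-eusw1b-2.phx.aexp.com/ \
      || echo "FAILED on $node"
  done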
Have found that bouncing the ovnkube-node pod for the node the affected pod is running on will clear things up, but the issue will sometimes reappear.
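The bounce itself is just deleting the ovnkube-node pod scheduled on the affected node so the daemonset recreates it; something like the following (pod label and namespace per the stock OVN-Kubernetes deployment, <affected-node> is a placeholder):

  # Delete the ovnkube-node pod on the affected node;
  # the daemonset immediately recreates it.
  oc -n openshift-ovn-kubernetes delete pod \
    -l app=ovnkube-node \
    --field-selector spec.nodeName=<affected-node>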
Version-Release number of selected component (if applicable):
Steps to Reproduce:
Intermittently, pods are unable to connect to the internal VIP. Restarting the ovnkube-node container will sometimes clear it, but the issue reasserts itself, and the workaround is increasingly losing effectiveness.
The affected pods remain able to connect to other pods over the pod network; only the VIP path fails.
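To illustrate the split from a pod on an affected node, the two checks below can be run side by side. The destination pod IP and port are placeholders; the VIP (10.10.60.19) is taken from the curl trace above.

  # Pod-to-pod over the pod network: succeeds.
  curl -v --max-time 5 http://<another-pod-ip>:8080/
  # The internal Ingress VIP: connection refused, as in the trace above.
  curl -kv --max-time 5 https://10.10.60.19:443/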
There are sosreports and must-gathers attached to the ticket. AMEX has been removed from BZ 2055251, and this BZ has been opened for the customer for further investigation of the issue.
*** This bug has been marked as a duplicate of bug 2022042 ***