Bug 2056735 - Pods on pod network lose ability to connect to internal Ingress VIP until ovnkube-node is restarted
Summary: Pods on pod network lose ability to connect to internal Ingress VIP until ovn...
Keywords:
Status: CLOSED DUPLICATE of bug 2022042
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.8
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Mohamed Mahmoud
QA Contact: Anurag saxena
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2022-02-22 00:57 UTC by milti leonard
Modified: 2023-09-15 01:19 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-04-05 12:16:57 UTC
Target Upstream Version:
Embargoed:



Description milti leonard 2022-02-22 00:57:46 UTC
Description of problem:
We have noticed that pods connected to the pod network are sometimes unable to reach the internal Ingress VIP hosted by keepalived.
For example, the openshift-console pods get “connection refused” when trying to reach the OAuth endpoint to validate logins.
We run a test from the fluentd pod on each node that tries to hit the console endpoint, and on some nodes it cannot connect.
Test apps VIP for cld-paas-d-eusw1b-2-d9mlc-worker-storage-q5hfg
* Rebuilt URL to: https://console-openshift-console.apps.cld-paas-d-eusw1b-2.phx.aexp.com/
* Uses proxy env variable NO_PROXY == ‘.aexp.com,.cluster.local,.svc,10.10.60.0/23,127.0.0.1,172.28.128.0/17,192.168.0.0/16,api-int.cld-paas-d-eusw1b-2.phx.aexp.com,localhost’
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 10.10.60.19...
* TCP_NODELAY set
* connect to 10.10.60.19 port 443 failed: Connection refused
* Failed to connect to console-openshift-console.apps.cld-paas-d-eusw1b-2.phx.aexp.com port 443: Connection refused
* Closing connection 0
curl: (7) Failed to connect to console-openshift-console.apps.cld-paas-d-eusw1b-2.phx.aexp.com port 443: Connection refused
command terminated with exit code 7
We have found that bouncing the ovnkube-node for the node the pod is running on clears things up, but the issue sometimes appears again.
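
For anyone wanting to repeat the per-node check without curl, here is a minimal sketch of a TCP probe that separates "connection refused" from a timeout. It is not the test that produced the output above; the function name `probe` and its defaults are hypothetical, and the host to check has to be passed in by the caller.

```python
import errno
import socket
import sys


def probe(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Attempt one TCP connect and classify the outcome (hypothetical helper)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "ok"
    except ConnectionRefusedError:
        # The symptom reported in this bug: the peer actively rejects the SYN.
        return "refused"
    except socket.timeout:
        # No answer at all within the timeout (packets dropped somewhere).
        return "timeout"
    except OSError as exc:
        # Anything else, e.g. DNS failure or no route to host.
        return f"error: {errno.errorcode.get(exc.errno, exc.errno)}"


if __name__ == "__main__":
    if len(sys.argv) < 2:
        sys.exit("usage: probe.py HOST [PORT]")
    target_port = int(sys.argv[2]) if len(sys.argv) > 2 else 443
    print(sys.argv[1], target_port, probe(sys.argv[1], target_port))
```

Running this from a pod on each node against the apps hostname would show which nodes get "refused" versus "timeout", which may help narrow down where the connection is rejected.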

Version-Release number of selected component (if applicable):
OCPv4.8.2

How reproducible:
N/A

Steps to Reproduce:
1.
2.
3.

Actual results:
Intermittently, pods are unable to connect to the internal VIP. At times restarting the ovnkube-node container works, but the issue reasserts itself and the workaround is increasingly losing effectiveness.

Expected results:
Pods are able to connect over the pod network.

Additional info:
There are sosreports and gathers attached to the ticket. AMEX has been removed from BZ2055251, and this BZ was opened for the customer for further investigation of the issue.

Comment 7 Mohamed Mahmoud 2022-04-05 12:16:57 UTC

*** This bug has been marked as a duplicate of bug 2022042 ***

Comment 8 Red Hat Bugzilla 2023-09-15 01:19:42 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days

