Bug 1637389

Summary: Connectivity problems from pod to pod : Network is unreachable
Product: OpenShift Container Platform Reporter: Viacheslav Zak <vizak>
Component: Networking Assignee: Ben Bennett <bbennett>
Status: CLOSED INSUFFICIENT_DATA QA Contact: zhaozhanqi <zzhao>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 3.9.0 CC: andre.esser, aos-bugs, bbennett, maupadhy, sponnaga, trankin, vizak
Target Milestone: --- Keywords: NeedsTestCase, OpsBlocker, Reopened
Target Release: 3.9.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2019-06-27 02:28:10 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Viacheslav Zak 2018-10-09 07:52:46 UTC
Description of problem:

Intermittent problem with network connection loss.
The problems do *not* occur and have *never*
occurred when accessing the mongo or postgres pods from the lb
pod.

The connectivity problems occur only when the customer accesses the mongo or
postgres pods from the app pod. Because of these connectivity problems,
the app pod fails after a short time, so the customer has to redeploy
it first and then run the tests before it fails and gets restarted.

Version-Release number of selected component (if applicable):
3.9.41

How reproducible:
The customer sent us a video showing the problem:
https://www.dropbox.com/s/ivpowk4iewetc1n/RedHatSupportCase02168013.mp4

Steps to Reproduce:
1. While the app-a application is down, connecting repeatedly
from the lb-0 pod to postgres-1, which works as expected
2. Deploying the app-a application 
3. Waiting for app-a to start up, then connecting repeatedly
from app-a to postgres-1. Here about half the connect
attempts trigger 'Network is unreachable' errors in a
seemingly random pattern 
4. Connecting again from lb-0 while app-a is still running,
no problems observed here 
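
The repeated connection attempts in steps 1, 3 and 4 could be scripted so that the failure pattern is counted rather than observed by eye. The following is a minimal sketch, not the customer's actual test: the function name probe, the attempt count and the timeout are illustrative assumptions, and it only opens and closes a TCP connection without speaking the postgres protocol.

```python
import errno
import socket
from collections import Counter


def probe(host, port, attempts=20, timeout=2.0, connect=socket.create_connection):
    """Attempt `attempts` TCP connections and tally the outcomes.

    `connect` is injectable so the tally logic can be exercised without
    a real network; by default it is socket.create_connection.
    """
    results = Counter()
    for _ in range(attempts):
        try:
            conn = connect((host, port), timeout)
            conn.close()
            results["ok"] += 1
        except OSError as exc:
            # ENETUNREACH corresponds to 'Network is unreachable'.
            # Errors without a known errno (e.g. timeouts) are tallied
            # under their exception class name instead.
            name = errno.errorcode.get(exc.errno, type(exc).__name__)
            results[name] += 1
    return results
```

Calling probe("postgres-1", 5432) from inside app-a and again from lb-0 would give comparable tallies for the two pods. Port 5432 is the postgres default and is an assumption here; the report does not state the port in use.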

Actual results:
App 

Expected results:


Additional info:

Comment 11 Ben Bennett 2018-11-02 14:22:34 UTC
Can you please find the pod IP address for the postgres db and connect to that from the application pod and see if that makes a difference?  (Assuming that postgres-1 was the service name).
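
One way to run the comparison requested above is to measure the failure rate against the service name and against the pod IP side by side, from inside the application pod. This is a rough sketch under stated assumptions: the function names failure_rate and compare are hypothetical, and the pod IP has to be looked up separately because it is not given in this report.

```python
import socket


def failure_rate(host, port, attempts=20, timeout=2.0, connect=socket.create_connection):
    """Return the fraction of TCP connection attempts that raised OSError."""
    failures = 0
    for _ in range(attempts):
        try:
            connect((host, port), timeout).close()
        except OSError:
            failures += 1
    return failures / attempts


def compare(service_name, pod_ip, port, **kwargs):
    """Probe the service name and the pod IP with identical settings."""
    return {
        "service": failure_rate(service_name, port, **kwargs),
        "pod_ip": failure_rate(pod_ip, port, **kwargs),
    }
```

If the pod IP shows a failure rate near zero while the service name still fails about half the time, that would point at the service path rather than pod-to-pod routing. If both fail at a similar rate, the service layer is less likely to be the cause.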

Comment 14 zhaozhanqi 2019-06-27 02:28:10 UTC
Closing this bug since there are not enough logs and there has been no fix for a long time.

Please feel free to reopen it if this issue is reproduced. Thanks.

Comment 15 Red Hat Bugzilla 2023-09-14 04:39:37 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days