Bug 1860200 - local (/etc/hosts) name resolution failing intermittently in application pods
Summary: local (/etc/hosts) name resolution failing intermittently in application pods
Keywords:
Status: CLOSED DUPLICATE of bug 1860201
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Containers
Version: 3.11.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Tom Sweeney
QA Contact: Sunil Choudhary
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-07-24 01:18 UTC by Anand Paladugu
Modified: 2020-07-24 18:52 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-07-24 18:52:32 UTC
Target Upstream Version:
Embargoed:


Description Anand Paladugu 2020-07-24 01:18:11 UTC
Description of problem:


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. From inside the pod, run a curl command against the endpoint in a loop.
2.
3.

Actual results:

curl fails intermittently.

Expected results:

curl should succeed every time.

Additional info:

The customer is seeing intermittent pod egress connectivity issues in an OCP 3.11 cluster that uses an egress router in a proxied environment.

Use case: The application pod connects to an external endpoint URL (via a proxy) to upload files.

The proxy has multiple interfaces. It is defined as 148.171.179.249 in the corporate DNS and as 192.168.219.120 in /etc/hosts in the app pod. The app pod must connect to the proxy at 192.168.219.120 to establish a session with the endpoint URL; otherwise, the endpoint blocks the connection.

The app pod's /etc/nsswitch.conf has "hosts: files dns".
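
For reference, the lookup order that "hosts: files dns" is expected to give can be sketched in a few lines of Python. This is only a simulation of the ordering, not glibc's actual implementation. The hostname proxy.example.com and the function names are hypothetical (the report does not name the proxy host); the two addresses are the ones quoted above.

```python
def parse_hosts(text):
    """Parse /etc/hosts-style text into {name: ip}; the first entry for a name wins."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding blanks
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        for name in names:
            table.setdefault(name.lower(), ip)
    return table


def resolve(name, hosts_text, dns_lookup):
    """Mimic 'hosts: files dns': consult the hosts table first, then fall back to DNS."""
    ip = parse_hosts(hosts_text).get(name.lower())
    if ip is not None:
        return ip, "files"
    return dns_lookup(name), "dns"


if __name__ == "__main__":
    # Hypothetical hostname; stand-in for the corporate DNS answer.
    corporate_dns = {"proxy.example.com": "148.171.179.249"}.get
    with_proxy_line = "127.0.0.1 localhost\n192.168.219.120 proxy.example.com\n"
    without_proxy_line = "127.0.0.1 localhost\n"  # proxy line missing from the read
    print(resolve("proxy.example.com", with_proxy_line, corporate_dns))
    print(resolve("proxy.example.com", without_proxy_line, corporate_dns))
```

If the proxy line is absent from what the resolver reads, the lookup falls through to DNS and returns the corporate address, which matches the failure mode described in this report.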

Issue: 10% of the app pod's requests to the endpoint URL are failing.

Observations:

1. A TCP dump shows that the proxy name is sometimes resolved through upstream DNS (as if the local /etc/hosts lookup had failed). The pod then connects to 148.171.179.249 and subsequently fails to connect to the endpoint URL.

2. An strace of curl in the pod shows that /etc/hosts is not read the same way every time. Sometimes the contents read from /etc/hosts do not include the proxy line, which results in DNS resolution and the subsequent failures.

3. No resolution-related errors are seen in the node sosreport.
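
One way to measure how often resolution goes wrong, independent of curl, is a small probe that resolves the proxy name repeatedly and tallies the answers. The following is a minimal sketch, assuming Python is available in the pod; the function names are hypothetical, and the default resolver relies on Python's socket.getaddrinfo, which goes through the system resolver on glibc-based images.

```python
import socket
import time
from collections import Counter


def system_resolver(hostname):
    """Resolve an IPv4 address through the system resolver (libc getaddrinfo)."""
    infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET)
    return infos[0][4][0]


def probe(hostname, expected_ip, attempts, resolver=system_resolver, delay=0.0):
    """Resolve hostname repeatedly and count expected, unexpected, and failed answers."""
    counts = Counter()
    for _ in range(attempts):
        try:
            ip = resolver(hostname)
        except OSError:  # socket.gaierror is a subclass of OSError
            counts["error"] += 1
        else:
            key = "expected" if ip == expected_ip else "unexpected:" + ip
            counts[key] += 1
        if delay:
            time.sleep(delay)
    return counts
```

For example, probe("proxy.example.com", "192.168.219.120", 1000, delay=0.05) (with the real proxy hostname substituted for the placeholder) would show whether a fraction of lookups return the corporate DNS address instead of the /etc/hosts entry.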



The sosreport, tcpdump, and strace outputs are available in the case.

Comment 1 Tom Sweeney 2020-07-24 18:52:32 UTC
This appears to be a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1860201.  As such, I'm going to close this one.  If I'm mistaken, or information that should have been entered into the BZ is missing, please feel free to reopen this BZ and update it.

*** This bug has been marked as a duplicate of bug 1860201 ***

