Summary: | haproxy pod on one of masters in CrashLoopBackOff after deploy of OpenShift BareMetall ipv6 | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Amit Ugol <augol> |
Component: | Machine Config Operator | Assignee: | Yossi Boaron <yboaron> |
Status: | CLOSED ERRATA | QA Contact: | Aleksandra Malykhin <amalykhi> |
Severity: | high | Docs Contact: | |
Priority: | unspecified | ||
Version: | 4.5 | CC: | asegurap, bperkins, lshilin, miabbott, mnguyen, rbartal, shardy, yboaron |
Target Milestone: | --- | Keywords: | Triaged |
Target Release: | 4.5.z | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: |
Cause:
In baremetal platform, an infra-dns container runs in each host to support node names resolution and other internal DNS records. to complete the picture, an NM script updates host's resolv.conf to point to the infra-dns container.
Additionally, when pods created they inherit their DNS configuration file (/etc/resolv.conf) from the host.
In this case, the HAProxy pod was created before NM scripts update the host's resolv.conf.
Consequence:
HAProxy pod repeatedly failed because the api-int internal DNS record is not resolvable.
Fix:
Verify that resov.conf of HAProxy pod is identical to host's resolv.conf file.
Result:
HAProxy container runs with no error.
|
Story Points: | --- |
Clone Of: | 1849432 | Environment: | |
Last Closed: | 2020-10-19 14:54:24 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Bug Depends On: | 1849432 | ||
Bug Blocks: |
Comment 2
Micah Abbott
2020-10-05 13:34:48 UTC
OCP 4.5: 4.5.0-0.nightly-2020-10-05-204452 Steps to reproduce: 1. Change resolv.conf file on one of the masters (I addede the line "nameserver 127.0.0.1") [core@master-0-0 ~]$ sudo vi /etc/resolv.conf 2. Verify that the haproxy-monitor pod was restarted: [core@master-0-0 ~]$ sudo crictl ps | grep haproxy-monitor f5f09a971e61f 9fae0d9500dcd3c705e17d6c9c7afc41adb7713de390ae7cc751e5408201798e 3 seconds ago Running haproxy-monitor 4 4c9f0b437e7c9 3. Be sure that the haproxy-monitor pod was restarted because the "failed liveness probe": [core@master-0-0 ~]$ journalctl -u kubelet | grep "liveness probe" Oct 06 14:24:55 master-0-0 hyperkube[2343]: I1006 14:24:55.565294 2343 event.go:278] Event(v1.ObjectReference{Kind:"Pod", Namespace:"openshift-kni-infra", Name:"haproxy-master-0-0", UID:"ccda7847ab5cefa71f1f91518c07131c", APIVersion:"v1", ResourceVersion:"", FieldPath:"spec.containers{haproxy-monitor}"}): type: 'Normal' reason: 'Killing' Container haproxy-monitor failed liveness probe, will be restarted 4. Verify that container resolv.conf file is the same as on the master node (first line was added on the step 1): [core@master-0-0 ~]$ sudo crictl exec -it 2a94810301fa6 /bin/sh sh-4.2# cat /etc/resolv.conf # Generated by KNI resolv prepender NM dispatcher script search ocp-edge-cluster-0.qe.lab.redhat.com nameserver 192.168.123.112 nameserver 127.0.0.1 nameserver 192.168.123.1 5. The pod was restarted and in the Running state: [kni@provisionhost-0-0 ~]$ oc get pods -n openshift-kni-infra | grep haproxy-master-0-0 haproxy-master-0-0 2/2 Running 5 78m Before backport: When resolv.conf file changed on one of the masters (step 1) nothing happens. Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Container Platform 4.5.15 bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:4228 |