Description of problem:
Adding a nameserver to a node's NetworkManager resolv.conf does not add the nameserver to the Corefile if the keepalived container was restarted very recently.

Version-Release number of selected component (if applicable):
4.9.0-0.nightly-2021-08-14-065522

Steps to Reproduce:
On any master node:
1. Get the keepalived container's ID: sudo crictl ps --name keepalived
2. Stop the container: sudo crictl stop *CONTAINER ID*
3. Modify /var/run/NetworkManager/resolv.conf by adding a nameserver (example: 'nameserver 8.8.8.8').
4. Check whether the nameserver you added appears in the Corefile: cat /etc/coredns/Corefile

Actual results:
[core@master-0-1 ~]$ cat /var/run/NetworkManager/resolv.conf
# Generated by NetworkManager
search ocp-edge-cluster-0.qe.lab.redhat.com
nameserver fe80::5054:ff:fe08:ccbe%br-ex
nameserver fd2e:6f44:5dd8::1
nameserver 8.8.8.8
[core@master-0-1 ~]$ cat /etc/coredns/Corefile
. {
    errors
    health :18080
    forward . fe80::5054:ff:fe08:ccbe%br-ex fd2e:6f44:5dd8::1 {
        policy sequential
    }

Expected results:
[core@master-0-1 ~]$ cat /etc/coredns/Corefile
. {
    errors
    health :18080
    forward . fe80::5054:ff:fe08:ccbe%br-ex fd2e:6f44:5dd8::1 8.8.8.8 {
        policy sequential
    }

Additional info:
1) The same problem can occur when removing a nameserver: the entry is not removed from the Corefile.
2) At times the sync does happen under the above conditions, but it takes several minutes.
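The manual comparison in step 4 could also be scripted. The following is only a minimal sketch, not part of the product: the function names are hypothetical, and it assumes the Corefile keeps all upstreams on a single "forward ." line, as in the output above. It takes the two files' contents as strings and reports which resolv.conf nameservers are missing from the forward list.

```python
def resolv_nameservers(resolv_text):
    """Return nameserver addresses in the order they appear in resolv.conf text."""
    servers = []
    for line in resolv_text.splitlines():
        fields = line.split()
        # Comment lines start with '#', so their first field is never 'nameserver'.
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def corefile_forwarders(corefile_text):
    """Return the upstreams on the first 'forward .' line of Corefile text."""
    for line in corefile_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "forward" and fields[1] == ".":
            # Drop the '{' that opens the plugin's option block.
            return [f for f in fields[2:] if f != "{"]
    return []


def missing_from_corefile(resolv_text, corefile_text):
    """Return resolv.conf nameservers that are absent from the forward line."""
    forwarders = set(corefile_forwarders(corefile_text))
    return [s for s in resolv_nameservers(resolv_text) if s not in forwarders]
```

On a node, the inputs would be the contents of /var/run/NetworkManager/resolv.conf and /etc/coredns/Corefile; an empty result means the two files agree on the nameservers that were added. It does not detect the removal case from "Additional info", where the Corefile still lists a nameserver that is no longer in resolv.conf.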
Please provide a must-gather, if possible, along with information about the deployment.
Hi Ben, Are you or your team still looking at this bug?
Sorry, I think we missed this one because it wasn't in the baremetal subcomponent. I'm going to move it so we catch it in our triage meeting tomorrow.
Issue is resolved.

Expected results:
The Corefile is synced with resolv.conf: the added nameserver appears in its "forward" section, and the sync takes only a few seconds.

Version-Release number of selected component (if applicable), verified on:
4.10.0-0.ci-2021-12-15-195801

Actual results:
I added "8.8.8.6" as a nameserver:
[core@master-0-0 ~]$ date
Thu Dec 16 15:20:46 UTC 2021
[core@master-0-0 ~]$ sudo vi /var/run/NetworkManager/resolv.conf
[core@master-0-0 ~]$ cat vi /var/run/NetworkManager/resolv.conf
cat: vi: No such file or directory
# Generated by NetworkManager
search ocp-edge-cluster-0.qe.lab.redhat.com
nameserver fe80::5054:ff:fe62:929f%br-ex
nameserver fd2e:6f44:5dd8::1
nameserver 8.8.8.6

Then I checked the Corefile to make sure the nameserver was added:
[core@master-0-0 ~]$ cat /etc/coredns/Corefile | grep forward
forward . fe80::5054:ff:fe62:929f%br-ex fd2e:6f44:5dd8::1 8.8.8.6 {
[core@master-0-0 ~]$ date
Thu Dec 16 15:21:18 UTC 2021

The sync took less than a minute. This should be backported ASAP.
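The timing above was measured by hand with two date calls. A small polling helper could measure the sync delay more precisely. This is only a sketch under stated assumptions: wait_until is a hypothetical helper, and check stands for any caller-supplied function that returns True once the Corefile contains the new nameserver.

```python
import time


def wait_until(check, timeout_s=60.0, interval_s=1.0):
    """Poll check() until it returns True.

    Returns the elapsed seconds on success, or None if timeout_s passes first.
    """
    start = time.monotonic()
    while True:
        if check():
            return time.monotonic() - start
        if time.monotonic() - start >= timeout_s:
            return None
        time.sleep(interval_s)
```

A None result corresponds to the original failure mode, in which the sync did not happen or took several minutes; the timeout would need to be chosen accordingly.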
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:0056