Bug 1801638 - On a DHCP6 lease renew, the node gets in NotReady state
Summary: On a DHCP6 lease renew, the node gets in NotReady state
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 4.3.z
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 4.3.z
Assignee: Antoni Segura Puimedon
QA Contact: Victor Voronkov
URL:
Whiteboard:
Depends On: 1801662
Blocks:
 
Reported: 2020-02-11 12:43 UTC by Juan Manuel Parrilla Madrid
Modified: 2020-03-10 23:54 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1801662
Environment:
Last Closed: 2020-03-10 23:53:52 UTC
Target Upstream Version:



Links
System ID Priority Status Summary Last Updated
Github openshift machine-config-operator pull 1470 None closed Bug 1801638: [release-4.3] baremetal: all resolvconf editing to NM dispatcher 2020-04-15 17:05:45 UTC
Red Hat Product Errata RHBA-2020:0676 None None None 2020-03-10 23:54:00 UTC

Description Juan Manuel Parrilla Madrid 2020-02-11 12:43:19 UTC
Description of problem:

When a master renews its DHCP6 lease, it loses some entries from resolv.conf: the nameserver (NS) entry and the search XXXX statement. As a result, the kubelet marks the node as NotReady, with all the consequences that follow.

Version-Release number of selected component (if applicable):


How reproducible:

Deploy an IPv6 disconnected bare-metal cluster using this build: 4.3.0-0.nightly-2020-02-06-120247-ipv6.6, then wait. It will happen eventually.

Actual results:

NetworkManager triggers the prepender script (/etc/NetworkManager/dispatcher.d/30-resolv-prepender), and the main entries are lost from resolv.conf.

Expected results:

resolv.conf should look like this:

# Generated by KNI resolv prepender NM dispatcher script
search xxx.xxx.xxx.xxx.xxx.redhat.com
nameserver fd35:919d:4042:2:c7ed:9a9f:a9ec:7  # <== VIP
nameserver fd35:919d:4042:2::1000             # <== dnsmasq
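
A minimal sketch of how the expected state could be checked on a node. The helper name check_resolv_conf is hypothetical and not part of the product. It only reports whether a resolv.conf-style file contains a search line and a nameserver entry for a given address.

```shell
#!/bin/bash
# Hypothetical helper (an assumption for illustration, not shipped anywhere):
# succeeds only if the file has a "search" line and a "nameserver <vip>" line.
check_resolv_conf() {
    local file=$1 vip=$2
    # A search line must be present.
    grep -q '^search ' "$file" || { echo "missing search line"; return 1; }
    # Compare the address field exactly, so trailing comments are ignored.
    awk -v ip="$vip" '$1 == "nameserver" && $2 == ip { found = 1 }
                      END { exit !found }' "$file" \
        || { echo "missing nameserver $vip"; return 1; }
    echo "ok"
}
```

On an affected master this could be run as: check_resolv_conf /etc/resolv.conf fd35:919d:4042:2:c7ed:9a9f:a9ec:7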


Additional info:

Build info: 4.3.0-0.nightly-2020-02-06-120247-ipv6.6

- 30-resolv-prepender:
===========
#!/bin/bash
IFACE=$1
STATUS=$2
# If $DHCP6_FQDN_FQDN contains a "-"
[[ "$DHCP6_FQDN_FQDN" =~ - ]] && hostname $DHCP6_FQDN_FQDN
case "$STATUS" in
    up|down|dhcp4-change|dhcp6-change)
    logger -s "NM resolv-prepender triggered by ${1} ${2}."
    NAMESERVER_IP="fd35:919d:4042:2:c7ed:9a9f:a9ec:7"
    set +e
    if [[ -n "$NAMESERVER_IP" ]]; then
        logger -s "NM resolv-prepender: Prepending 'nameserver $NAMESERVER_IP' to /etc/resolv.conf (other nameservers from /var/run/NetworkManager/resolv.conf)"
        sed "/^search .*$/a nameserver $NAMESERVER_IP" /var/run/NetworkManager/resolv.conf > /etc/resolv.conf
    else
        logger -s "NM resolv-prepender: Couldn't find a Virtual IP, just updating resolv.conf"
        cp /var/run/NetworkManager/resolv.conf /etc/resolv.conf
    fi
    ;;
    *)
    ;;
esac
===========

- resolv.conf:
===========
# Generated by NetworkManager
nameserver fd35:919d:4042:2::1000
===========

- dhclient.conf:
===========
supersede domain-search "kni7.cloud.lab.eng.bos.redhat.com";
===========
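
One possible reading of the files above, offered as a sketch rather than a confirmed root cause: the sed expression in 30-resolv-prepender appends the VIP nameserver only after a line starting with "search". The NetworkManager-generated resolv.conf captured here has no such line, so sed would pass the file through unchanged. The snippet below reproduces this on temporary copies and does not touch any system file. The function name prepend_vip and the temporary file names are made up for the illustration. It relies on GNU sed's one-line "a" form, as the original script does.

```shell
#!/bin/bash
# Illustration only: same sed expression as in 30-resolv-prepender,
# applied to temporary files instead of /var/run/NetworkManager/resolv.conf.
NAMESERVER_IP="fd35:919d:4042:2:c7ed:9a9f:a9ec:7"

prepend_vip() {
    # Append the VIP nameserver after the search line, if there is one.
    sed "/^search .*$/a nameserver $NAMESERVER_IP" "$1"
}

tmpdir=$(mktemp -d)

# Case 1: the NetworkManager file as captured in this report (no search line).
printf '%s\n' '# Generated by NetworkManager' \
    'nameserver fd35:919d:4042:2::1000' > "$tmpdir/without_search"

# Case 2: the same file with a search line (domain taken from dhclient.conf).
printf '%s\n' '# Generated by NetworkManager' \
    'search kni7.cloud.lab.eng.bos.redhat.com' \
    'nameserver fd35:919d:4042:2::1000' > "$tmpdir/with_search"

without_search_out=$(prepend_vip "$tmpdir/without_search")
with_search_out=$(prepend_vip "$tmpdir/with_search")

echo "--- without search line ---"; echo "$without_search_out"
echo "--- with search line ---";    echo "$with_search_out"

rm -rf "$tmpdir"
```

In the first case the VIP never appears in the output. In the second it lands directly after the search line, ahead of the dnsmasq nameserver.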

Comment 1 Juan Manuel Parrilla Madrid 2020-02-12 16:29:40 UTC
Working fine on: 4.3.0-0.nightly-2020-02-10-055634-ipv6.3 (validated on my side)

Comment 4 Victor Voronkov 2020-02-26 16:10:07 UTC
Verified on 4.3.0-0.nightly-2020-02-21-091838-ipv6.3

After a few hours and a few lease renewals, everything seems fine:

[core@master-0 ~]$ sudo cat /var/lib/NetworkManager/dhclient6-6b03ac63-8515-451d-88c1-3e51b630a8b1-ens4.lease | grep ens4
  interface "ens4";
  interface "ens4";
  interface "ens4";
  interface "ens4";
  interface "ens4";
  interface "ens4";
  interface "ens4";
  interface "ens4";
[core@master-0 ~]$ cat /etc/resolv.conf 
# Generated by KNI resolv prepender NM dispatcher script
search vvoron-cluster.qe.lab.redhat.com
nameserver fd2e:6f44:5dd8:c956:0:0:0:2
nameserver fe80::5054:ff:fe20:d53b%ens4
nameserver fd2e:6f44:5dd8:c956::1

[kni@provisionhost-0 ~]$ oc get nodes
NAME                                        STATUS   ROLES    AGE    VERSION
master-0.vvoron-cluster.qe.lab.redhat.com   Ready    master   176m   v1.16.2
master-1.vvoron-cluster.qe.lab.redhat.com   Ready    master   176m   v1.16.2
master-2.vvoron-cluster.qe.lab.redhat.com   Ready    master   176m   v1.16.2
worker-0.vvoron-cluster.qe.lab.redhat.com   Ready    worker   158m   v1.16.2
worker-1.vvoron-cluster.qe.lab.redhat.com   Ready    worker   158m   v1.16.2

Comment 6 errata-xmlrpc 2020-03-10 23:53:52 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0676

