Bug 2174710 - failures when DNS is set to auto with DHCP and there is a static DNS search string defined
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: nmstate
Version: 8.8
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Gris Ge
QA Contact: Mingyu Shi
URL:
Whiteboard:
Depends On: 2177762
Blocks: 2186178 2186179 2186180
 
Reported: 2023-03-02 08:20 UTC by qiowang
Modified: 2023-11-14 16:12 UTC (History)

Fixed In Version: nmstate-1.4.4-1.el8
Doc Type: No Doc Update
Doc Text:
Clone Of:
: 2186178 2186179 2186180 (view as bug list)
Environment:
Last Closed: 2023-11-14 15:26:03 UTC
Type: Bug
Target Upstream Version:
Embargoed:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | nmstate/nmstate/pull/2274 | 0 | None | open | [nmstate-1.4] DNS: fix error when server has static DNS search with auto DNS server | 2023-03-16 07:23:14 UTC
Github | nmstate/nmstate/pull/2308 | 0 | None | open | [nmstate-1.4] dns: Support static DNS search with auto DNS nameserver | 2023-04-11 09:22:38 UTC
Red Hat Issue Tracker | NMT-355 | 0 | None | None | None | 2023-03-02 08:21:23 UTC
Red Hat Issue Tracker | OCPBUGS-7761 | 0 | None | None | None | 2023-03-02 08:34:31 UTC
Red Hat Issue Tracker | RHELPLAN-150390 | 0 | None | None | None | 2023-03-02 08:21:27 UTC
Red Hat Product Errata | RHBA-2023:6918 | 0 | None | None | None | 2023-11-14 15:26:35 UTC

Description qiowang 2023-03-02 08:20:54 UTC
Description of problem:
Creating a dummy interface with an nmstate NNCP fails with a stack trace; the NNCP and NNCE are degraded.


Version-Release number of selected component (if applicable):
OCP version: 4.13.0-0.nightly-2023-02-17-090603
knmstate operator version: 4.13.0-202212050852
node version: Red Hat Enterprise Linux 8.6 (Ootpa)


How reproducible:
100% (so far only on one specific environment)


Steps to Reproduce:
1. Install the knmstate operator

2. Apply the NNCP below to create a dummy interface
---
apiVersion: nmstate.io/v1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: dummy-policy
spec:
  nodeSelector:
    kubernetes.io/hostname: qiowang-02200-n26p4-compute-0
  desiredState:
    interfaces:
    - name: dummy0
      description: config ethernet
      type: dummy
      state: up

3. Check the NNCP status and NNCE status


Actual results:
The NNCP and NNCE are degraded; creating the dummy interface failed.


Expected results:
The NNCP and NNCE are Available; the dummy interface is created successfully.


Additional info: I have only found this issue on one specific CI template so far, and I'm still not sure what is special about it:
upi-on-baremetal/versioned-installer-openstack-https_proxy-remove_rhcos_worker-fips-ci
https://gitlab.cee.redhat.com/aosqe/flexy-templates/-/blob/master/functionality-testing/aos-4_13/upi-on-baremetal/versioned-installer-openstack-https_proxy-remove_rhcos_worker-fips-ci 


Running nmstatectl in the nmstate-handler pod produces a stack trace:

% oc rsh nmstate-handler-kcqht
sh-4.4# vi /usr/tmp/dummy.yaml
sh-4.4# cat /usr/tmp/dummy.yaml 
interfaces:
- name: dummy0
  description: config ethernet
  type: dummy
  state: up
sh-4.4# nmstatectl apply /usr/tmp/dummy.yaml --no-commit --timeout 120
2023-02-20 10:16:15,431 root         DEBUG    Nmstate version: 1.2.1
2023-02-20 10:16:15,431 root         DEBUG    Applying desire state: {'interfaces': [{'name': 'dummy0', 'description': 'config ethernet', 'type': 'dummy', 'state': 'up'}]}
2023-02-20 10:16:15,432 root         WARNING  Failed to load plugin nmstate_plugin_ovsdb: No module named 'ovs'
2023-02-20 10:16:15,592 root         DEBUG    NetworkManager version 1.36.0
2023-02-20 10:16:15,603 root         DEBUG    Async action: Retrieve applied config: ethernet ens3 started
2023-02-20 10:16:15,605 root         DEBUG    Async action: Retrieve applied config: ethernet ens3 finished
2023-02-20 10:16:15,607 root         DEBUG    Interface ethernet.ens3 found. Merging the interface information.
2023-02-20 10:16:15,644 root         DEBUG    Interface lo is type unknown and will be ignored during the activation
2023-02-20 10:16:15,644 root         DEBUG    The current route {'table-id': 254, 'destination': '10.128.0.0/14', 'next-hop-interface': 'tun0', 'next-hop-address': '0.0.0.0', 'metric': 0} has been discarded due to Route {'table-id': 254, 'metric': 0, 'destination': '10.128.0.0/14', 'next-hop-address': '0.0.0.0', 'next-hop-interface': 'tun0'} next hop to down/absent interface
2023-02-20 10:16:15,645 root         DEBUG    The current route {'table-id': 254, 'destination': '172.30.0.0/16', 'next-hop-interface': 'tun0', 'next-hop-address': '0.0.0.0', 'metric': 0} has been discarded due to Route {'table-id': 254, 'metric': 0, 'destination': '172.30.0.0/16', 'next-hop-address': '0.0.0.0', 'next-hop-interface': 'tun0'} next hop to down/absent interface
Traceback (most recent call last):
  File "/usr/bin/nmstatectl", line 11, in <module>
    load_entry_point('nmstate==1.2.1', 'console_scripts', 'nmstatectl')()
  File "/usr/lib/python3.6/site-packages/nmstatectl/nmstatectl.py", line 74, in main
    return args.func(args)
  File "/usr/lib/python3.6/site-packages/nmstatectl/nmstatectl.py", line 355, in apply
    args.save_to_disk,
  File "/usr/lib/python3.6/site-packages/nmstatectl/nmstatectl.py", line 419, in apply_state
    save_to_disk=save_to_disk,
  File "/usr/lib/python3.6/site-packages/libnmstate/netapplier.py", line 86, in apply
    desired_state, ignored_ifnames, current_state, save_to_disk
  File "/usr/lib/python3.6/site-packages/libnmstate/net_state.py", line 72, in __init__
    self._ifaces.gen_dns_metadata(self._dns, self._route)
  File "/usr/lib/python3.6/site-packages/libnmstate/ifaces/ifaces.py", line 670, in gen_dns_metadata
    self._kernel_ifaces[iface_name].store_dns_metadata(dns_metadata)
KeyError: None
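
The last frame of the traceback indexes `self._kernel_ifaces` with an interface name that turned out to be `None`. The sketch below reproduces only that failure mode in isolation. The dictionary, both function names, and the guarded variant are hypothetical stand-ins for illustration; they are not the libnmstate code and not the fix that was eventually shipped.

```python
from typing import Dict, Optional

# Hypothetical stand-in for a mapping of interface name -> interface data.
kernel_ifaces: Dict[str, dict] = {"ens3": {}}


def store_dns_metadata(ifaces: Dict[str, dict], iface_name: Optional[str], metadata: dict) -> None:
    # Plain indexing: raises KeyError when iface_name is None or unknown.
    ifaces[iface_name]["dns"] = metadata


def store_dns_metadata_guarded(ifaces: Dict[str, dict], iface_name: Optional[str], metadata: dict) -> bool:
    # Skip the write when no usable interface name was selected.
    if iface_name is None or iface_name not in ifaces:
        return False
    ifaces[iface_name]["dns"] = metadata
    return True


try:
    store_dns_metadata(kernel_ifaces, None, {"search": ["example.org"]})
except KeyError as err:
    # Same message shape as the last line of the traceback above.
    failure = f"KeyError: {err}"
```

The unguarded call fails regardless of what the user asked for, which is why a desired state that only adds `dummy0` can still hit this error.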

Comment 1 Till Maas 2023-03-02 08:40:03 UTC
Is this a problem in RHEL 9, too?

Comment 2 Gris Ge 2023-03-02 10:52:31 UTC
RHEL 9.2 will not have this problem.

Comment 3 Gris Ge 2023-03-02 10:55:56 UTC
The key to the reproducer is having an interface set to DHCPv4 with a static DNS search domain such as `example.org`, then trying to create a new interface (any type works: vlan, dummy, bridge, bond, etc.).

This is a bug in nmstate: we should not raise an error that is not caused by the user's desired state.
But to fix it in RHEL 8, we need clarification from the OpenShift team that this is a valid and supported setup -- a DHCP interface holding static DNS search domains.

Comment 4 Gris Ge 2023-03-06 02:02:51 UTC
Forgot to mention: this is unlike `auto_dns: false` in nmstate. The problematic environment is `semi` auto DNS: static DNS search domains + DNS nameservers learned automatically from DHCP.
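
To make the distinction concrete, here is a small Python sketch of the three cases being discussed. The function name, its parameters, and the returned labels are hypothetical and chosen only for illustration; nmstate does not expose such a helper.

```python
from typing import Sequence


def classify_dns_mode(auto_dns: bool, static_search: Sequence[str]) -> str:
    """Label a DNS setup (hypothetical labels, for illustration only)."""
    if not auto_dns:
        # Equivalent in spirit to `auto_dns: false`: nothing is learned from DHCP.
        return "static"
    if static_search:
        # Nameservers come from DHCP, search domains are configured by hand.
        return "semi-auto"
    return "auto"


# The reproducer environment: DHCP nameservers plus a static search domain.
reproducer_mode = classify_dns_mode(auto_dns=True, static_search=["example.org"])
```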

Comment 5 qiowang 2023-03-07 03:44:52 UTC
(In reply to Gris Ge from comment #3)
> The key of reproducer is having a interface set as DHCPv4 with static dns
> search like `example.org`, then try to create new interface(any type is OK,
> vlan, dummy, bridge, bond and etc).
> 
> It is a bug in nmstate that we should not raise error which not caused by
> user's desire state.
> But to fix in RHEL 8, we need clarification from openshift team that it is
> valid and supported setup -- a DHCP interface holding static DNS search
> domains.

Thanks @fge for helping find the key to the reproducer.

I think this is a valid and supported setup for OpenShift. @mifiedle @zzhao, could you please check and correct me if I'm wrong? Thanks a lot!

Comment 6 qiowang 2023-03-15 09:23:30 UTC
Hi @fge, per the comments from @bnemec in the JIRA issue https://issues.redhat.com/browse/OCPBUGS-7761, we have quite a few customers who use DHCP for addressing but want to manage DNS with nmstate. Although this issue will not occur in OCP 4.13 (RHEL 9.2), customers who cannot upgrade to 4.13 for some reason may have problems using knmstate. So could you please help to fix this bug in OCP 4.12? Please let me know if there are any concerns. Thanks so much.

Comment 7 Gris Ge 2023-03-15 12:55:58 UTC
Thanks! I will prepare a scratch build for testing.

Comment 8 Gris Ge 2023-03-16 07:22:48 UTC
Hi Wang Qiong,

I have built the scratch build. Could you try again after upgrading nmstate in the OpenShift nmstate handler container?

https://people.redhat.com/fge/bz_2174710/

Thank you!

Comment 9 qiowang 2023-03-17 04:21:48 UTC
Hi @fge, thanks for your quick support. I tried with nmstatectl in the handler container and it succeeded.

sh-4.4# rpm -qa | grep nmstate
python3-libnmstate-1.2.1-15.bz2174710.el8.noarch
nmstate-plugin-ovsdb-1.2.1-15.bz2174710.el8.noarch
nmstate-1.2.1-15.bz2174710.el8.x86_64
nmstate-libs-1.2.1-15.bz2174710.el8.x86_64
sh-4.4# 
sh-4.4# 
sh-4.4# 
sh-4.4# cat /usr/tmp/dummy.yaml 
interfaces:
- name: dummy0
  description: config ethernet
  type: dummy
  state: up
sh-4.4# nmstatectl apply /usr/tmp/dummy.yaml --no-commit --timeout 120
2023-03-17 04:12:34,729 root         DEBUG    Nmstate version: 1.2.2
2023-03-17 04:12:34,730 root         DEBUG    Applying desire state: {'interfaces': [{'name': 'dummy0', 'description': 'config ethernet', 'type': 'dummy', 'state': 'up'}]}
2023-03-17 04:12:34,730 root         WARNING  Failed to load plugin nmstate_plugin_ovsdb: No module named 'ovs'
2023-03-17 04:12:34,963 root         DEBUG    NetworkManager version 1.36.0
2023-03-17 04:12:34,979 root         DEBUG    Async action: Retrieve applied config: ethernet ens3 started
2023-03-17 04:12:34,981 root         DEBUG    Async action: Retrieve applied config: ethernet ens3 finished
2023-03-17 04:12:34,984 root         DEBUG    Interface ethernet.ens3 found. Merging the interface information.
2023-03-17 04:12:35,049 root         DEBUG    Interface lo is type unknown and will be ignored during the activation
2023-03-17 04:12:35,049 root         DEBUG    The current route {'table-id': 254, 'destination': '10.128.0.0/14', 'next-hop-interface': 'tun0', 'next-hop-address': '0.0.0.0', 'metric': 0} has been discarded due to Route {'table-id': 254, 'metric': 0, 'destination': '10.128.0.0/14', 'next-hop-address': '0.0.0.0', 'next-hop-interface': 'tun0'} next hop to down/absent interface
2023-03-17 04:12:35,049 root         DEBUG    The current route {'table-id': 254, 'destination': '172.30.0.0/16', 'next-hop-interface': 'tun0', 'next-hop-address': '0.0.0.0', 'metric': 0} has been discarded due to Route {'table-id': 254, 'metric': 0, 'destination': '172.30.0.0/16', 'next-hop-address': '0.0.0.0', 'next-hop-interface': 'tun0'} next hop to down/absent interface
2023-03-17 04:12:35,052 root         DEBUG    Async action: Create checkpoint started
2023-03-17 04:12:35,055 root         DEBUG    Checkpoint /org/freedesktop/NetworkManager/Checkpoint/1 created for all devices
2023-03-17 04:12:35,055 root         DEBUG    Async action: Create checkpoint finished
...
...
2023-03-17 04:12:35,143 root         DEBUG    Connection activation initiated: iface=dummy0 type=dummy con-state=<enum NM_ACTIVE_CONNECTION_STATE_ACTIVATING of type NM.ActiveConnectionState>
2023-03-17 04:12:35,326 root         DEBUG    Connection activation succeeded: iface=dummy0, type=dummy, con_state=<enum NM_ACTIVE_CONNECTION_STATE_ACTIVATED of type NM.ActiveConnectionState>, dev_state=<enum NM_DEVICE_STATE_ACTIVATED of type NM.DeviceState>, state_flags=<flags NM_ACTIVATION_STATE_FLAG_LAYER2_READY | NM_ACTIVATION_STATE_FLAG_IP4_READY | NM_ACTIVATION_STATE_FLAG_IP6_READY of type NM.ActivationStateFlags>
2023-03-17 04:12:35,326 root         DEBUG    Async action: Activate profile uuid:dffb464b-28ab-4f0d-9cca-ac5c5f75fe69 iface:dummy0 type: dummy finished
2023-03-17 04:12:35,343 root         DEBUG    Async action: Retrieve applied config: ethernet ens3 started
2023-03-17 04:12:35,344 root         DEBUG    Async action: Retrieve applied config: dummy dummy0 started
2023-03-17 04:12:35,346 root         DEBUG    Async action: Retrieve applied config: ethernet ens3 finished
2023-03-17 04:12:35,346 root         DEBUG    Async action: Retrieve applied config: dummy dummy0 finished
2023-03-17 04:12:35,347 root         DEBUG    Interface dummy.dummy0 found. Merging the interface information.
2023-03-17 04:12:35,347 root         DEBUG    Interface ethernet.ens3 found. Merging the interface information.
Desired state applied: 
---
interfaces:
- name: dummy0
  type: dummy
  state: up
  description: config ethernet
Checkpoint: NetworkManager|/org/freedesktop/NetworkManager/Checkpoint/1


Checking on the node, dummy0 has been created:
% oc debug node/qeci-d52067-fg742-compute-1 -n openshift-infra
Starting pod/qeci-d52067-fg742-compute-1-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.96.73
If you don't see a command prompt, try pressing enter.
sh-4.4# chroot /host
sh-4.4# nmcli con show
NAME    UUID                                  TYPE      DEVICE 
ens3    21d47e65-8523-1a06-af22-6f121086f085  ethernet  ens3   
dummy0  dffb464b-28ab-4f0d-9cca-ac5c5f75fe69  dummy     dummy0

Comment 10 qiowang 2023-03-17 04:23:04 UTC
I will also run some automated tests for the knmstate operator and will give an update when I'm done, thanks.

Comment 12 qiowang 2023-03-17 09:53:17 UTC
Found one issue during automated testing.
Applying the NNCP below to disable IPv6 failed:

% cat disablev6.yaml 
apiVersion: nmstate.io/v1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: disable-v6
spec:
  nodeSelector:
    kubernetes.io/hostname: qiowang-03166-tldrt-compute-1
  desiredState:
    interfaces:
    - name: ens3
      description: disable ipv6
      type: ethernet
      state: up
      ipv6:
        enabled: false

% oc apply -f disablev6.yaml 
nodenetworkconfigurationpolicy.nmstate.io/disable-v6 created
% oc get nncp
NAME         STATUS     REASON
disable-v6   Degraded   FailedToConfigure


The message in the NNCE shows:
      message: |
        error reconciling NodeNetworkConfigurationPolicy at desired state apply: ,
        failed to execute nmstatectl set --no-commit --timeout 480: 'exit status 1'
        libnmstate.error.NmstateVerificationError:
        desired
        =======
        ---
        dns-resolver:
          search:
          - qiowang-03166.qe.devcluster.openshift.com
          server: []
        current
        =======
        ---
        dns-resolver:
          search: []
          server: []
        difference
        ==========
        --- desired
        +++ current
        @@ -1,5 +1,4 @@
         ---
         dns-resolver:
        -  search:
        -  - qiowang-03166.qe.devcluster.openshift.com
        +  search: []
           server: []

The full content of the NNCE is attached.
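
The `difference` section of the message has the layout of a unified diff between the desired and current DNS state. As a reading aid, the Python sketch below rebuilds the same diff with the standard-library `difflib` from the two YAML snippets quoted in the message. Using `difflib` here is an assumption made for illustration; the source does not say how nmstate generates this section.

```python
import difflib

# The two states, copied from the NNCE message above.
desired = """\
---
dns-resolver:
  search:
  - qiowang-03166.qe.devcluster.openshift.com
  server: []
"""

current = """\
---
dns-resolver:
  search: []
  server: []
"""

# Unified diff of desired vs. current, one entry per output line.
diff_lines = list(
    difflib.unified_diff(
        desired.splitlines(),
        current.splitlines(),
        fromfile="desired",
        tofile="current",
        lineterm="",
    )
)

print("\n".join(diff_lines))
```

Lines prefixed with `-` exist only in the desired state and lines prefixed with `+` only in the current state, so the diff says the static search domain was requested but is missing after the apply.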

Comment 13 Gris Ge 2023-03-20 11:16:14 UTC
The second failure happens because this semi-auto DNS config is on the desired interface and nmstate discarded its DNS config.

To fix this use case (static DNS search with auto DHCP name server), we need to wait for https://bugzilla.redhat.com/show_bug.cgi?id=2177762
to be finished. Considering the massive work required, this will take at least 1 month.

Comment 14 Gris Ge 2023-03-20 11:50:28 UTC
I have created RHEL 9 bug https://bugzilla.redhat.com/show_bug.cgi?id=2179916 to make sure the same use case will also be supported in RHEL 9.2+.

Comment 15 Gris Ge 2023-04-11 09:22:38 UTC
Patch sent to upstream: https://github.com/nmstate/nmstate/pull/2308


With this patch, nmstate now supports static DNS search domains along with dynamic DNS nameservers learned from DHCP/autoconf.

To test this problem:

Case 1:
 * Use `nmcli c modify <connection_name> ipv4.dns-search example.org` to set a static DNS search domain on an auto IP interface.
 * Use nmstatectl to create a dummy interface.

Case 2:
 * Set up a fully auto IP interface.
 * Apply this desired state via nmstatectl:

---
dns-resolver:
  config:
    search:
      - example.com
      - example.org

 * Then verify that `nmcli c show <connection_name>` has the static DNS search stored, with all other DHCP options untouched.
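
The verification step of Case 2 could be scripted roughly as follows. This Python sketch only parses text; it assumes the connection properties were fetched in nmcli's terse `key:value` form (for example with `nmcli -t -f ipv4.method,ipv4.dns-search c show <connection_name>`). The helper names and the sample text are hypothetical and were not captured from a real system.

```python
from typing import Dict, List


def parse_terse(output: str) -> Dict[str, str]:
    """Split terse `key:value` lines into a dictionary."""
    fields: Dict[str, str] = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key] = value
    return fields


def static_search_kept(output: str, expected: List[str]) -> bool:
    """True when IPv4 is still automatic and the static search list matches."""
    fields = parse_terse(output)
    searches = [s for s in fields.get("ipv4.dns-search", "").split(",") if s]
    return fields.get("ipv4.method") == "auto" and searches == expected


# Illustrative sample only, shaped like the assumed terse output.
sample_output = "ipv4.method:auto\nipv4.dns-search:example.com,example.org\n"
case2_ok = static_search_kept(sample_output, ["example.com", "example.org"])
```

Checking `ipv4.method` together with the search list covers both halves of the requirement: the static domains are stored and the interface has not been switched away from DHCP.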

Scratch builds for RHEL 8.6/8.8/8.9 are stored in: https://people.redhat.com/fge/bz_2174710/

Comment 16 Gris Ge 2023-04-12 09:52:19 UTC
The above scratch build nmstate-1.4.4-0.20230411.1796git59a6a39f.el8 has been tested by Qiong Wang in the cluster that initially reproduced this problem.

Comment 26 Mingyu Shi 2023-05-21 09:00:45 UTC
Verified with:
nmstate-1.4.4-2.el8.x86_64
nispor-1.2.10-1.el8.x86_64
NetworkManager-1.40.16-4.el8.x86_64
DISTRO=RHEL-8.9.0-updates-20230520.49

Same as the above.

Comment 28 errata-xmlrpc 2023-11-14 15:26:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (nmstate bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:6918

