Bug 2174710
| Summary: | failures when DNS is set to auto with DHCP and there is a static DNS search string defined | |||
|---|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | qiowang | |
| Component: | nmstate | Assignee: | Gris Ge <fge> | |
| Status: | VERIFIED --- | QA Contact: | Mingyu Shi <mshi> | |
| Severity: | unspecified | Docs Contact: | ||
| Priority: | unspecified | |||
| Version: | 8.8 | CC: | ferferna, fge, jiji, jishi, mifiedle, network-qe, sfaye, till, zzhao | |
| Target Milestone: | rc | Keywords: | Triaged, ZStream | |
| Target Release: | --- | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | nmstate-1.4.4-1.el8 | Doc Type: | No Doc Update | |
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 2186178 2186179 2186180 (view as bug list) | Environment: | ||
| Last Closed: | Type: | Bug | ||
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | 2177762 | |||
| Bug Blocks: | 2186178, 2186179, 2186180 | |||
Is this a problem in RHEL 9, too?

RHEL 9.2 will not have this problem.

The key to the reproducer is having an interface set to DHCPv4 with a static DNS search domain such as `example.org`, then trying to create a new interface (any type works: vlan, dummy, bridge, bond, etc.).

It is a bug in nmstate: we should not raise an error that is not caused by the user's desired state.
But to fix this in RHEL 8, we need clarification from the OpenShift team that this is a valid and supported setup -- a DHCP interface holding static DNS search domains.

Forgot to mention: this is unlike `auto-dns: false` in nmstate. The problematic environment is `semi`-auto DNS: static DNS search domains + DNS name servers learned automatically from DHCP.

(In reply to Gris Ge from comment #3)
> The key to the reproducer is having an interface set to DHCPv4 with a static
> DNS search domain such as `example.org`, then trying to create a new
> interface (any type works: vlan, dummy, bridge, bond, etc.).
>
> It is a bug in nmstate: we should not raise an error that is not caused by
> the user's desired state.
> But to fix this in RHEL 8, we need clarification from the OpenShift team that
> this is a valid and supported setup -- a DHCP interface holding static DNS
> search domains.

Thanks @fge for helping find the key to the reproducer.
I think this is a valid and supported setup for OpenShift. @mifiedle @zzhao could you please check and correct me if I'm wrong? Thanks a lot!

Hi @fge, per the comments from @bnemec in the JIRA issue https://issues.redhat.com/browse/OCPBUGS-7761, we have quite a few customers who use DHCP for addressing but want to manage DNS with nmstate.
Although this issue will not occur in OCP 4.13 (RHEL 9.2), customers who cannot upgrade to 4.13 for some reason may hit a problem when using knmstate. So could you please help fix this bug in OCP 4.12? Please let me know if there are any concerns. Thanks so much.

Thanks! I will prepare a scratch build for testing.
Hi Wang Qiong,

I have built the scratch build. Could you try again after upgrading nmstate in the OpenShift nmstate handler container?

https://people.redhat.com/fge/bz_2174710/

Thank you!

Hi @fge, thanks for your quick support. I have tried with nmstatectl in the handler container and it succeeded.

sh-4.4# rpm -qa | grep nmstate
python3-libnmstate-1.2.1-15.bz2174710.el8.noarch
nmstate-plugin-ovsdb-1.2.1-15.bz2174710.el8.noarch
nmstate-1.2.1-15.bz2174710.el8.x86_64
nmstate-libs-1.2.1-15.bz2174710.el8.x86_64
sh-4.4#
sh-4.4#
sh-4.4#
sh-4.4# cat /usr/tmp/dummy.yaml
interfaces:
- name: dummy0
  description: config ethernet
  type: dummy
  state: up
sh-4.4# nmstatectl apply /usr/tmp/dummy.yaml --no-commit --timeout 120
2023-03-17 04:12:34,729 root DEBUG Nmstate version: 1.2.2
2023-03-17 04:12:34,730 root DEBUG Applying desire state: {'interfaces': [{'name': 'dummy0', 'description': 'config ethernet', 'type': 'dummy', 'state': 'up'}]}
2023-03-17 04:12:34,730 root WARNING Failed to load plugin nmstate_plugin_ovsdb: No module named 'ovs'
2023-03-17 04:12:34,963 root DEBUG NetworkManager version 1.36.0
2023-03-17 04:12:34,979 root DEBUG Async action: Retrieve applied config: ethernet ens3 started
2023-03-17 04:12:34,981 root DEBUG Async action: Retrieve applied config: ethernet ens3 finished
2023-03-17 04:12:34,984 root DEBUG Interface ethernet.ens3 found. Merging the interface information.
2023-03-17 04:12:35,049 root DEBUG Interface lo is type unknown and will be ignored during the activation
2023-03-17 04:12:35,049 root DEBUG The current route {'table-id': 254, 'destination': '10.128.0.0/14', 'next-hop-interface': 'tun0', 'next-hop-address': '0.0.0.0', 'metric': 0} has been discarded due to Route {'table-id': 254, 'metric': 0, 'destination': '10.128.0.0/14', 'next-hop-address': '0.0.0.0', 'next-hop-interface': 'tun0'} next hop to down/absent interface
2023-03-17 04:12:35,049 root DEBUG The current route {'table-id': 254, 'destination': '172.30.0.0/16', 'next-hop-interface': 'tun0', 'next-hop-address': '0.0.0.0', 'metric': 0} has been discarded due to Route {'table-id': 254, 'metric': 0, 'destination': '172.30.0.0/16', 'next-hop-address': '0.0.0.0', 'next-hop-interface': 'tun0'} next hop to down/absent interface
2023-03-17 04:12:35,052 root DEBUG Async action: Create checkpoint started
2023-03-17 04:12:35,055 root DEBUG Checkpoint /org/freedesktop/NetworkManager/Checkpoint/1 created for all devices
2023-03-17 04:12:35,055 root DEBUG Async action: Create checkpoint finished
...
...
2023-03-17 04:12:35,143 root DEBUG Connection activation initiated: iface=dummy0 type=dummy con-state=<enum NM_ACTIVE_CONNECTION_STATE_ACTIVATING of type NM.ActiveConnectionState>
2023-03-17 04:12:35,326 root DEBUG Connection activation succeeded: iface=dummy0, type=dummy, con_state=<enum NM_ACTIVE_CONNECTION_STATE_ACTIVATED of type NM.ActiveConnectionState>, dev_state=<enum NM_DEVICE_STATE_ACTIVATED of type NM.DeviceState>, state_flags=<flags NM_ACTIVATION_STATE_FLAG_LAYER2_READY | NM_ACTIVATION_STATE_FLAG_IP4_READY | NM_ACTIVATION_STATE_FLAG_IP6_READY of type NM.ActivationStateFlags>
2023-03-17 04:12:35,326 root DEBUG Async action: Activate profile uuid:dffb464b-28ab-4f0d-9cca-ac5c5f75fe69 iface:dummy0 type: dummy finished
2023-03-17 04:12:35,343 root DEBUG Async action: Retrieve applied config: ethernet ens3 started
2023-03-17 04:12:35,344 root DEBUG Async action: Retrieve applied config: dummy dummy0 started
2023-03-17 04:12:35,346 root DEBUG Async action: Retrieve applied config: ethernet ens3 finished
2023-03-17 04:12:35,346 root DEBUG Async action: Retrieve applied config: dummy dummy0 finished
2023-03-17 04:12:35,347 root DEBUG Interface dummy.dummy0 found. Merging the interface information.
2023-03-17 04:12:35,347 root DEBUG Interface ethernet.ens3 found. Merging the interface information.
Desired state applied:
---
interfaces:
- name: dummy0
  type: dummy
  state: up
  description: config ethernet
Checkpoint: NetworkManager|/org/freedesktop/NetworkManager/Checkpoint/1

Check on the node, the dummy0 is created:

% oc debug node/qeci-d52067-fg742-compute-1 -n openshift-infra
Starting pod/qeci-d52067-fg742-compute-1-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.0.96.73
If you don't see a command prompt, try pressing enter.
sh-4.4# chroot /host
sh-4.4# nmcli con show
NAME    UUID                                  TYPE      DEVICE
ens3    21d47e65-8523-1a06-af22-6f121086f085  ethernet  ens3
dummy0  dffb464b-28ab-4f0d-9cca-ac5c5f75fe69  dummy     dummy0

I will also run some automated tests for the knmstate operator and will give an update when I'm done, thanks.

Found one issue during automated testing.
Tried the NNCP below to disable IPv6, but it failed:
% cat disablev6.yaml
apiVersion: nmstate.io/v1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: disable-v6
spec:
  nodeSelector:
    kubernetes.io/hostname: qiowang-03166-tldrt-compute-1
  desiredState:
    interfaces:
    - name: ens3
      description: disable ipv6
      type: ethernet
      state: up
      ipv6:
        enabled: false
% oc apply -f disablev6.yaml
nodenetworkconfigurationpolicy.nmstate.io/disable-v6 created
% oc get nncp
NAME STATUS REASON
disable-v6 Degraded FailedToConfigure
The message in the NNCE shows:
message: |
  error reconciling NodeNetworkConfigurationPolicy at desired state apply: ,
  failed to execute nmstatectl set --no-commit --timeout 480: 'exit status 1'
  libnmstate.error.NmstateVerificationError:
  desired
  =======
  ---
  dns-resolver:
    search:
    - qiowang-03166.qe.devcluster.openshift.com
    server: []
  current
  =======
  ---
  dns-resolver:
    search: []
    server: []
  difference
  ==========
  --- desired
  +++ current
  @@ -1,5 +1,4 @@
   ---
   dns-resolver:
  -  search:
  -  - qiowang-03166.qe.devcluster.openshift.com
  +  search: []
     server: []
full content of nnce is attached.
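
The `difference` section of the verification error above is a unified diff between the desired and current `dns-resolver` state. As a rough illustration only, the sketch below reproduces that kind of diff with Python's standard `difflib`. The helper names `render_dns_resolver` and `verification_diff` are hypothetical and are not part of nmstate; how nmstate itself builds the message is not shown in this report.

```python
import difflib


def render_dns_resolver(dns):
    """Render a small dns-resolver mapping as YAML-like lines.

    Hypothetical helper for illustration; it only handles the
    'search' and 'server' lists that appear in the report.
    """
    lines = ["---", "dns-resolver:"]
    for key in ("search", "server"):
        values = dns.get(key, [])
        if values:
            lines.append(f"  {key}:")
            lines.extend(f"  - {value}" for value in values)
        else:
            lines.append(f"  {key}: []")
    return lines


def verification_diff(desired, current):
    """Return unified-diff lines between desired and current state."""
    return list(
        difflib.unified_diff(
            render_dns_resolver(desired),
            render_dns_resolver(current),
            fromfile="desired",
            tofile="current",
            lineterm="",
        )
    )


if __name__ == "__main__":
    desired = {
        "search": ["qiowang-03166.qe.devcluster.openshift.com"],
        "server": [],
    }
    current = {"search": [], "server": []}
    print("\n".join(verification_diff(desired, current)))
```

A non-empty diff here corresponds to the situation in the report: the static search domain is present in the desired state but missing from the state read back after applying.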
The second failure occurs because this semi-auto DNS config is on the desired interface and nmstate discarded its DNS config.

To fix this use case (static DNS search with auto DHCP name server), we need to wait for https://bugzilla.redhat.com/show_bug.cgi?id=2177762 to be finished. Considering the massive work required, this will take at least 1 month to finish.

I have created RHEL 9 bug https://bugzilla.redhat.com/show_bug.cgi?id=2179916 to make sure the same use case will also be supported in RHEL 9.2+.

Patch sent to upstream: https://github.com/nmstate/nmstate/pull/2308

With this patch, nmstate now supports static DNS search domains along with dynamic DNS name servers learned from DHCP/autoconf.

To test this problem:

Case 1:
* Use `nmcli c modify <connection_name> ipv4.dns-search example.org` to set a static DNS search domain on an auto IP interface.
* Use nmstatectl to create a dummy interface.

Case 2:
* Set up a fully auto IP interface.
* Apply this desired state via nmstatectl:

---
dns-resolver:
  config:
    search:
    - example.com
    - example.org

* Then verify that `nmcli c show <connection_name>` has the static DNS search domains stored, with all other DHCP options untouched.

Scratch builds for RHEL 8.6/8.8/8.9 are stored in: https://people.redhat.com/fge/bz_2174710/

The above scratch build nmstate-1.4.4-0.20230411.1796git59a6a39f.el8 has been tested by Qiong Wang in the cluster that initially reproduced this problem.

Verified with:
nmstate-1.4.4-2.el8.x86_64
nispor-1.2.10-1.el8.x86_64
NetworkManager-1.40.16-4.el8.x86_64
DISTRO=RHEL-8.9.0-updates-20230520.49

Same as the above.
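
Case 2 of the test procedure in the comments above applies a desired state that contains only static DNS search domains and no name servers. The sketch below shows one way to generate that state programmatically. The function names `build_dns_search_state` and `to_state_file_text` are hypothetical and are not part of nmstate. The output is JSON, which is also valid YAML, on the assumption that it is saved to a file and passed to `nmstatectl apply`.

```python
import json


def build_dns_search_state(search_domains):
    """Build a desired state holding only static DNS search domains.

    No 'server' key is set, so name servers are expected to keep
    coming from DHCP/autoconf (the semi-auto DNS case in this bug).
    """
    domains = list(search_domains)
    if not domains:
        raise ValueError("at least one search domain is required")
    return {"dns-resolver": {"config": {"search": domains}}}


def to_state_file_text(state):
    """Serialize the desired state as indented JSON text."""
    return json.dumps(state, indent=2)


if __name__ == "__main__":
    state = build_dns_search_state(["example.com", "example.org"])
    # Save this output to a file and apply it with nmstatectl.
    print(to_state_file_text(state))
```

After applying, the check described in Case 2 still has to be done by hand with `nmcli c show <connection_name>`.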
Description of problem:
Creating a dummy interface with an nmstate NNCP failed, producing a stack trace; the NNCP and NNCE are degraded.

Version-Release number of selected component (if applicable):
OCP version: 4.13.0-0.nightly-2023-02-17-090603
knmstate operator version: 4.13.0-202212050852
node version: Red Hat Enterprise Linux 8.6 (Ootpa)

How reproducible:
100% (so far seen on only one specific env)

Steps to Reproduce:
1. Install the knmstate operator
2. Apply the NNCP below to create a dummy interface
---
apiVersion: nmstate.io/v1
kind: NodeNetworkConfigurationPolicy
metadata:
  name: dummy-policy
spec:
  nodeSelector:
    kubernetes.io/hostname: qiowang-02200-n26p4-compute-0
  desiredState:
    interfaces:
    - name: dummy0
      description: config ethernet
      type: dummy
      state: up
3. Check the NNCP status and NNCE status

Actual results:
The NNCP and NNCE are degraded; creating the dummy interface failed.

Expected results:
The NNCP and NNCE are Available; the dummy interface is created successfully.

Additional info:
So far this issue has only been found on one specific CI template, and I'm still not sure what is special about it: upi-on-baremetal/versioned-installer-openstack-https_proxy-remove_rhcos_worker-fips-ci
https://gitlab.cee.redhat.com/aosqe/flexy-templates/-/blob/master/functionality-testing/aos-4_13/upi-on-baremetal/versioned-installer-openstack-https_proxy-remove_rhcos_worker-fips-ci

Running nmstatectl in the nmstate-handler pod produced a stack trace:

% oc rsh nmstate-handler-kcqht
sh-4.4# vi /usr/tmp/dummy.yaml
sh-4.4# cat /usr/tmp/dummy.yaml
interfaces:
- name: dummy0
  description: config ethernet
  type: dummy
  state: up
sh-4.4# nmstatectl apply /usr/tmp/dummy.yaml --no-commit --timeout 120
2023-02-20 10:16:15,431 root DEBUG Nmstate version: 1.2.1
2023-02-20 10:16:15,431 root DEBUG Applying desire state: {'interfaces': [{'name': 'dummy0', 'description': 'config ethernet', 'type': 'dummy', 'state': 'up'}]}
2023-02-20 10:16:15,432 root WARNING Failed to load plugin nmstate_plugin_ovsdb: No module named 'ovs'
2023-02-20 10:16:15,592 root DEBUG NetworkManager version 1.36.0
2023-02-20 10:16:15,603 root DEBUG Async action: Retrieve applied config: ethernet ens3 started
2023-02-20 10:16:15,605 root DEBUG Async action: Retrieve applied config: ethernet ens3 finished
2023-02-20 10:16:15,607 root DEBUG Interface ethernet.ens3 found. Merging the interface information.
2023-02-20 10:16:15,644 root DEBUG Interface lo is type unknown and will be ignored during the activation
2023-02-20 10:16:15,644 root DEBUG The current route {'table-id': 254, 'destination': '10.128.0.0/14', 'next-hop-interface': 'tun0', 'next-hop-address': '0.0.0.0', 'metric': 0} has been discarded due to Route {'table-id': 254, 'metric': 0, 'destination': '10.128.0.0/14', 'next-hop-address': '0.0.0.0', 'next-hop-interface': 'tun0'} next hop to down/absent interface
2023-02-20 10:16:15,645 root DEBUG The current route {'table-id': 254, 'destination': '172.30.0.0/16', 'next-hop-interface': 'tun0', 'next-hop-address': '0.0.0.0', 'metric': 0} has been discarded due to Route {'table-id': 254, 'metric': 0, 'destination': '172.30.0.0/16', 'next-hop-address': '0.0.0.0', 'next-hop-interface': 'tun0'} next hop to down/absent interface
Traceback (most recent call last):
  File "/usr/bin/nmstatectl", line 11, in <module>
    load_entry_point('nmstate==1.2.1', 'console_scripts', 'nmstatectl')()
  File "/usr/lib/python3.6/site-packages/nmstatectl/nmstatectl.py", line 74, in main
    return args.func(args)
  File "/usr/lib/python3.6/site-packages/nmstatectl/nmstatectl.py", line 355, in apply
    args.save_to_disk,
  File "/usr/lib/python3.6/site-packages/nmstatectl/nmstatectl.py", line 419, in apply_state
    save_to_disk=save_to_disk,
  File "/usr/lib/python3.6/site-packages/libnmstate/netapplier.py", line 86, in apply
    desired_state, ignored_ifnames, current_state, save_to_disk
  File "/usr/lib/python3.6/site-packages/libnmstate/net_state.py", line 72, in __init__
    self._ifaces.gen_dns_metadata(self._dns, self._route)
  File "/usr/lib/python3.6/site-packages/libnmstate/ifaces/ifaces.py", line 670, in gen_dns_metadata
    self._kernel_ifaces[iface_name].store_dns_metadata(dns_metadata)
KeyError: None
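
The last frame of the traceback shows a dictionary lookup `self._kernel_ifaces[iface_name]` failing with `KeyError: None`, meaning the lookup was made with `iface_name` set to `None`. The sketch below is a simplified, hypothetical model of that failure mode and of a defensive guard. The `Iface` class and both functions are invented for illustration. This is neither the nmstate source nor the fix that was shipped.

```python
class Iface:
    """Minimal stand-in for an interface object (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.dns_metadata = None

    def store_dns_metadata(self, metadata):
        self.dns_metadata = metadata


def store_dns_metadata_unguarded(kernel_ifaces, iface_name, dns_metadata):
    """Mirror the failing pattern: index the dict without checking.

    If no interface was chosen to hold the DNS config, iface_name is
    None and this raises KeyError: None, as in the traceback.
    """
    kernel_ifaces[iface_name].store_dns_metadata(dns_metadata)


def store_dns_metadata_guarded(kernel_ifaces, iface_name, dns_metadata):
    """Skip storing when no suitable interface was found.

    Returns True when metadata was stored, False otherwise, so the
    caller can decide how to report the situation.
    """
    iface = kernel_ifaces.get(iface_name) if iface_name is not None else None
    if iface is None:
        return False
    iface.store_dns_metadata(dns_metadata)
    return True


if __name__ == "__main__":
    ifaces = {"ens3": Iface("ens3")}
    metadata = {"search": ["example.org"]}
    try:
        store_dns_metadata_unguarded(ifaces, None, metadata)
    except KeyError as exc:
        print(f"KeyError: {exc}")
    print(store_dns_metadata_guarded(ifaces, None, metadata))
```

The guard only avoids the crash. Supporting static search domains together with DHCP name servers required the separate work tracked in the bugs referenced in the comments.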