Bug 2221648

Summary: nmpolicy capture leaves dhcp values which blocks further policies
Product: Red Hat Enterprise Linux 9 Reporter: Yossi Segev <ysegev>
Component: nmstate Assignee: Gris Ge <fge>
Status: CLOSED MIGRATED QA Contact: Mingyu Shi <mshi>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 9.2 CC: ferferna, jiji, jishi, network-qe, sfaye, till
Target Milestone: rc Keywords: MigratedToJIRA, Triaged
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Last Closed: 2023-08-17 09:57:44 UTC Type: Bug
Attachments:
Description Flags
1-static-ip-primary-net.yaml none

Description Yossi Segev 2023-07-10 13:46:44 UTC
Created attachment 1974978
1-static-ip-primary-net.yaml

Description of problem:
Applying a dynamic IP (DHCP) configuration on a cluster node's primary interface leaves DHCP-specific values (dhcp-client-id, dhcp-duid) in the node's network state, which then block other policies from being applied.


Version-Release number of selected component (if applicable):
OCP cluster version 4.13.4
kubernetes-nmstate-operator.4.13.0-202306070816
nmstate-2.2.9-6.rhaos4.13.el8.x86_64


How reproducible:
Always


Steps to Reproduce:
1.
Label a cluster node with {"capture": "allow"}:
$ oc label node cnv-qe-18.cnvqe.lab.eng.rdu2.redhat.com "capture"="allow"
* Change the node name to the name of a node in your cluster.

2.
Apply the attached setup policies 1-static-ip-primary-net.yaml and 2-capture-br1-deployment.yaml
* Wait for the first policy to finish successfully before applying the second policy.
* Set the nodeSelector in 1-static-ip-primary-net.yaml to the name of the same node you labeled.

3.
Apply the attached teardown policies 3-capture-br1-teardown.yaml and 4-dynamic-ip-primary-net.yaml
* Once again, set the nodeSelector in 4-dynamic-ip-primary-net.yaml to the name of the node you labeled.

4.
Check the NNS of the node, specifically the primary interface:
$ oc get nns net-ys-4132s-2-k8htj-worker-0-2mn4r -o yaml | less
      ipv4:
        address:
        - ip: 192.168.0.176
          prefix-length: 18
        dhcp: false
        dhcp-client-id: ll
        enabled: true
      ipv6:
        addr-gen-mode: eui64
        address:
        - ip: fe80::3998:9201:288e:90c5
          prefix-length: 64
        - ip: fe80::f816:3eff:fefc:319b
          prefix-length: 64
        autoconf: false
        dhcp: false
        dhcp-duid: ll
        enabled: true
      lldp:
        enabled: false
      mac-address: FA:16:3E:FC:31:9B
      max-mtu: 7950
      min-mtu: 68
      mptcp:
        address-flags: []
      mtu: 7950
      name: ens3
      state: up
      type: ethernet
      wait-ip: any

Note that "dhcp-client-id: ll" was added to ipv4 and "dhcp-duid: ll" was added to ipv6, even though dhcp is false for both.

5.
Try applying the 2 setup policies again (1-static-ip-primary-net.yaml and 2-capture-br1-deployment.yaml).
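
The check in step 4 could be scripted rather than done by eye. The following is only a sketch: the helper name leftover_dhcp_options is hypothetical, and the dict layout is assumed to mirror the interface entry of the NNS output shown in step 4 (already loaded from YAML into a Python dict).

```python
# Hypothetical helper, not part of nmstate or kubernetes-nmstate.
# Assumes the interface entry has the same keys as the NNS excerpt in step 4.
def leftover_dhcp_options(iface):
    """Return DHCP-only keys that are still set while DHCP is disabled."""
    leftovers = []

    ipv4 = iface.get("ipv4", {})
    if not ipv4.get("dhcp", False) and "dhcp-client-id" in ipv4:
        leftovers.append("ipv4.dhcp-client-id")

    ipv6 = iface.get("ipv6", {})
    # Assumption: the DUID only counts as stale when both DHCPv6 and autoconf are off.
    ipv6_dynamic = ipv6.get("dhcp", False) or ipv6.get("autoconf", False)
    if not ipv6_dynamic and "dhcp-duid" in ipv6:
        leftovers.append("ipv6.dhcp-duid")

    return leftovers


# Trimmed copy of the ens3 entry from step 4.
ens3 = {
    "name": "ens3",
    "ipv4": {"dhcp": False, "dhcp-client-id": "ll", "enabled": True},
    "ipv6": {"autoconf": False, "dhcp": False, "dhcp-duid": "ll", "enabled": True},
}

print(leftover_dhcp_options(ens3))
```

For the ens3 entry above, this reports both ipv4.dhcp-client-id and ipv6.dhcp-duid.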


Actual results:
Applying the second policy (2-capture-br1-deployment.yaml) fails with a verification error.


Expected results:
Configuration applied successfully.


Additional info:
1. According to the NNCE, the newly added parameter is to blame:
    [2023-07-10T13:17:10Z INFO  nmstate::query_apply::net_state] Retrying on: VerificationError: Verification failure: capture-br1.interface.ipv4.dhcp-client-id desire '"ll"', current 'null'
2. The attached journalctl and nmstate-handler pod logs start just before applying the first policy, and stop after applying the last teardown policy (in order to avoid unnecessary noise).
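
When going through the attached logs, it may help to pull the failing property out of each verification message. The sketch below assumes the messages are shaped like the one in item 1; that format is inferred from this single log line, not from a documented nmstate contract, and the function name parse_verification_failure is hypothetical.

```python
import re

# Assumed message shape, inferred from the one log line quoted in item 1:
#   Verification failure: <property path> desire '<value>', current '<value>'
PATTERN = re.compile(
    r"Verification failure: (?P<path>\S+) "
    r"desire '(?P<desired>.*?)', current '(?P<current>.*?)'"
)


def parse_verification_failure(line):
    """Return the property path and both values, or None if the line does not match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    return {
        "path": match.group("path"),
        "desired": match.group("desired").strip(),
        "current": match.group("current").strip(),
    }


sample = (
    "Retrying on: VerificationError: Verification failure: "
    "capture-br1.interface.ipv4.dhcp-client-id desire '\"ll\"', current 'null'"
)

print(parse_verification_failure(sample))
```

Here the extracted path is capture-br1.interface.ipv4.dhcp-client-id, with "ll" desired and null current, which matches the leftover value seen on the primary interface in step 4.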