Bug 2189437

Summary: An invalid state file for the nmstate service causes loss of network connectivity
Product: Red Hat Enterprise Linux 9
Reporter: Mingyu Shi <mshi>
Component: nmstate
Assignee: Gris Ge <fge>
Status: CLOSED ERRATA
QA Contact: Mingyu Shi <mshi>
Severity: high
Docs Contact:
Priority: medium
Version: 9.3
CC: ferferna, jiji, jishi, network-qe, sfaye, till
Target Milestone: rc
Keywords: Triaged
Target Release: ---
Flags: pm-rhel: mirror+
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: nmstate-2.2.11-1.el9
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2023-11-07 08:24:03 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Mingyu Shi 2023-04-25 08:50:52 UTC
Description of problem:
If an invalid state file is placed in /etc/nmstate and the system is rebooted, the default gateway interface is deactivated after the reboot and network connectivity is lost.

Version-Release number of selected component (if applicable):
nmstate-2.2.9-1.el9.x86_64
nispor-1.2.10-1.el9.x86_64
NetworkManager-1.43.6-1.el9.x86_64
DISTRO=RHEL-9.3.0-20230423.30

How reproducible:
100%

Steps to Reproduce:
cat << EOF > /etc/nmstate/veth0.yml
dns-resolver: {}
route-rules: {}
routes:
  running:
  - destination: 2620:52:0:49c2::/64
    next-hop-interface: veth0
    next-hop-address: '::'
    metric: 100
    table-id: 254
  - destination: ::/0
    next-hop-interface: veth0
    next-hop-address: fe80::ee3e:f702:1d90:80a1
    metric: 100
    table-id: 254
  - destination: 0.0.0.0/0
    next-hop-interface: veth0
    next-hop-address: 10.73.195.254
    metric: 100
    table-id: 254
interfaces:
- name: veth0
  type: veth
  state: up
  veth:
    peer: veth0_p
  mac-address: 14:18:77:6D:51:08
  mtu: 8500
  min-mtu: 60
  max-mtu: 9000
  wait-ip: any
  ipv4:
    enabled: true
    dhcp: true
    address:
    - ip: 192.168.199.1
      prefix-length: 23
    auto-dns: true
    auto-gateway: true
    auto-routes: true
    auto-route-table-id: 0
  ipv6:
    enabled: true
    dhcp: true
    autoconf: true
    address:
    - ip: 2620:52:0:49c2:1618:77ff:fe6d:5107
      prefix-length: 64
    - ip: fe80::1618:77ff:fe6d:5107
      prefix-length: 64
    auto-dns: true
    auto-gateway: true
    auto-routes: true
    auto-route-table-id: 0
    addr-gen-mode: eui64
  mptcp:
    address-flags: []
  accept-all-mac-addresses: false
  lldp:
    enabled: false
EOF

systemctl enable nmstate
reboot

# Minutes later, I could not log in to the server via ssh. After logging in via the console, I saw that only the `lo` connection was active in `nmcli con`.

Actual results:
Network connectivity is lost (in this example, the default gateway interface `eno1` is deactivated).

Expected results:
A failure to apply /etc/nmstate/veth0.yml should not affect the rest of the network.

Additional info:
This is the correct state before rebooting:
# nmcli c
NAME      UUID                                  TYPE      DEVICE 
eno1      79a38ec3-db52-49cd-b172-85fc7f59cff9  ethernet  eno1   
lo        52b1f734-246f-4032-8fcf-8c017ee65949  loopback  lo     
eno2      5ca4cb28-18bd-4493-9390-906024a2a123  ethernet  --     
eno3      1b5661b6-b42e-4f56-b816-20e7f4349c5e  ethernet  --     
eno4      d1f9d3af-87f1-468d-af04-c9347e61d367  ethernet  --     
enp6s0f0  c486a668-2d7d-4c17-8ad3-342d82c84483  ethernet  --     
enp6s0f1  82b43fd1-01a5-4e11-9f37-55ccdfc6da6c  ethernet  --     
   
# nms show eno1
dns-resolver: {}
route-rules: {}
routes:
  running:
  - destination: 2620:52:0:49c2::/64
    next-hop-interface: eno1
    next-hop-address: '::'
    metric: 100
    table-id: 254
  - destination: ::/0
    next-hop-interface: eno1
    next-hop-address: fe80::ee3e:f702:1d90:80a1
    metric: 100
    table-id: 254
  - destination: 0.0.0.0/0
    next-hop-interface: eno1
    next-hop-address: 10.73.195.254
    metric: 100
    table-id: 254
interfaces:
- name: eno1
  type: ethernet
  state: up
  mac-address: 14:18:77:6D:51:07
  mtu: 1500
  min-mtu: 60
  max-mtu: 9000
  wait-ip: any
  ipv4:
    enabled: true
    dhcp: true
    address:
    - ip: 10.73.194.142
      prefix-length: 23
    auto-dns: true
    auto-gateway: true
    auto-routes: true
    auto-route-table-id: 0
  ipv6:
    enabled: true
    dhcp: true
    autoconf: true
    address:
    - ip: 2620:52:0:49c2:1618:77ff:fe6d:5107
      prefix-length: 64
    - ip: fe80::1618:77ff:fe6d:5107
      prefix-length: 64
    auto-dns: true
    auto-gateway: true
    auto-routes: true
    auto-route-table-id: 0
    addr-gen-mode: eui64
  mptcp:
    address-flags: []
  accept-all-mac-addresses: false
  lldp:
    enabled: false
  ethtool:
    pause:
      rx: true
      tx: true
      autoneg: true
    feature:
      tx-tcp-mangleid-segmentation: false
      tx-nocache-copy: false
      tx-tcp6-segmentation: true
      rx-gro-list: false
      rx-udp-gro-forwarding: false
      highdma: true
      tx-checksum-ipv4: true
      rx-gro: true
      tx-generic-segmentation: true
      tx-tcp-ecn-segmentation: true
      rx-checksum: true
      tx-tcp-segmentation: true
      tx-checksum-ipv6: true
    coalesce:
      rx-frames: 5
      rx-usecs: 20
      rx-usecs-irq: 0
      stats-block-usecs: 0
      tx-frames: 53
      tx-frames-irq: 5
      tx-usecs: 72
      tx-usecs-irq: 0
    ring:
      rx: 200
      rx-max: 2047
      tx: 511
      tx-max: 511
  ethernet:
    auto-negotiation: true
    speed: 1000
    duplex: full
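
To check whether a reboot cost any active connection, the `nmcli c` listing above can be captured before and after the reboot and compared. The following is a minimal sketch, assuming each snapshot was saved with `nmcli -t -f NAME,DEVICE connection show --active`. The function name `lost_connections` and the snapshot file names are hypothetical and not part of nmstate or NetworkManager.

```shell
# Hypothetical helper (not part of nmstate or NetworkManager).
# Prints the active connections that are present in the "before" snapshot
# but missing from the "after" snapshot. Each snapshot is assumed to be
# the saved output of:
#   nmcli -t -f NAME,DEVICE connection show --active
lost_connections() {
    before=$1
    after=$2
    # -F: fixed strings, -x: match whole lines, -v: invert the match,
    # -f: read the patterns (the "after" lines) from a file.
    # grep exits with status 1 when nothing was lost (no output).
    grep -Fvx -f "$after" "$before"
}

# Example usage (file names are placeholders):
#   nmcli -t -f NAME,DEVICE connection show --active > before.txt
#   ... reboot ...
#   nmcli -t -f NAME,DEVICE connection show --active > after.txt
#   lost_connections before.txt after.txt
```

For the situation in this report, a "before" snapshot with the lines `eno1:eno1` and `lo:lo` and an "after" snapshot with only `lo:lo` would print `eno1:eno1`.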

Comment 1 Mingyu Shi 2023-04-25 08:52:09 UTC
journalctl -u nmstate.service:

Apr 24 16:58:54 localhost.localdomain systemd[1]: Starting Apply nmstate on-disk state...
Apr 24 16:58:56 localhost.localdomain nmstatectl[1468]: [2023-04-24T08:58:56Z INFO  nmstate::query_apply::net_state] Created checkpoint /org/freedesktop/NetworkManager/Checkpoint/1
Apr 24 16:58:56 localhost.localdomain nmstatectl[1468]: [2023-04-24T08:58:56Z WARN  nmstate::ip] Static addresses [InterfaceIpAddr { ip: 192.168.199.1, prefix_length: 23, mptcp_flags: None }] are ignored when dynamic IP is enabled
Apr 24 16:58:56 localhost.localdomain nmstatectl[1468]: [2023-04-24T08:58:56Z WARN  nmstate::ip] Static addresses [InterfaceIpAddr { ip: 2620:52:0:49c2:1618:77ff:fe6d:5107, prefix_length: 64, mptcp_flags: None }, InterfaceIpAddr { ip: fe80::1618:77ff:fe6d:5107, prefix_length: 64, mptcp_flags: None }] are ignored when dynamic IP is enabled
Apr 24 16:58:56 localhost.localdomain nmstatectl[1468]: [2023-04-24T08:58:56Z INFO  nmstate::nm::settings::connection] Creating veth peer profile veth0_p for veth0
Apr 24 16:58:56 localhost.localdomain nmstatectl[1468]: [2023-04-24T08:58:56Z INFO  nmstate::nm::query_apply::profile] Creating connection UUID Some("2790387d-3c0d-4249-b572-45dc0747a821"), ID Some("veth0"), type Some("veth") name Some("veth0")
Apr 24 16:58:56 localhost.localdomain nmstatectl[1468]: [2023-04-24T08:58:56Z INFO  nmstate::nm::query_apply::profile] Creating connection UUID Some("1b538c0c-8bb0-4a08-bf0d-07d901ee6c27"), ID Some("veth0_p"), type Some("veth") name Some("veth0_p")
Apr 24 16:58:56 localhost.localdomain nmstatectl[1468]: [2023-04-24T08:58:56Z INFO  nmstate::nm::query_apply::profile] Activating connection 2790387d-3c0d-4249-b572-45dc0747a821: veth0/veth
Apr 24 16:58:56 localhost.localdomain nmstatectl[1468]: [2023-04-24T08:58:56Z INFO  nmstate::nm::query_apply::profile] Activating connection 1b538c0c-8bb0-4a08-bf0d-07d901ee6c27: veth0_p/veth
Apr 24 16:58:56 localhost.localdomain nmstatectl[1468]: [2023-04-24T08:58:56Z INFO  nmstate::nm::query_apply::profile] Got activation failure Bug: Manager(UnknownConnection): Connection 'veth0_p' is not available on device veth0_p because device is strictly unmanaged
Apr 24 16:58:56 localhost.localdomain nmstatectl[1468]: [2023-04-24T08:58:56Z INFO  nmstate::nm::query_apply::profile] Will retry activation 2 seconds
Apr 24 16:58:58 localhost.localdomain nmstatectl[1468]: [2023-04-24T08:58:58Z INFO  nmstate::nm::query_apply::profile] Activating connection 1b538c0c-8bb0-4a08-bf0d-07d901ee6c27: veth0_p/veth
Apr 24 16:58:59 localhost.localdomain nmstatectl[1468]: [2023-04-24T08:58:59Z INFO  nmstate::query_apply::net_state] Retrying on: VerificationError: Verification failure: veth0.interface.mptcp desire '{"address-flags":[]}', current 'null'
Apr 24 16:59:00 localhost.localdomain nmstatectl[1468]: [2023-04-24T08:59:00Z INFO  nmstate::query_apply::net_state] Retrying on: VerificationError: Verification failure: veth0.interface.mptcp desire '{"address-flags":[]}', current 'null'
Apr 24 16:59:01 localhost.localdomain nmstatectl[1468]: [2023-04-24T08:59:01Z INFO  nmstate::query_apply::net_state] Retrying on: VerificationError: Verification failure: veth0.interface.mptcp desire '{"address-flags":[]}', current 'null'
Apr 24 16:59:02 dell-per730-20.rhts.eng.pek2.redhat.com nmstatectl[1468]: [2023-04-24T08:59:02Z INFO  nmstate::query_apply::net_state] Retrying on: VerificationError: Verification failure: veth0.interface.mptcp desire '{"address-flags":[]}', current 'null'
Apr 24 16:59:03 dell-per730-20.rhts.eng.pek2.redhat.com nmstatectl[1468]: [2023-04-24T08:59:03Z INFO  nmstate::query_apply::net_state] Retrying on: VerificationError: Verification failure: veth0.interface.mptcp desire '{"address-flags":[]}', current 'null'
Apr 24 16:59:05 dell-per730-20.rhts.eng.pek2.redhat.com nmstatectl[1468]: [2023-04-24T08:59:05Z INFO  nmstate::nm::settings::connection] Creating veth peer profile veth0_p for veth0
Apr 24 16:59:05 dell-per730-20.rhts.eng.pek2.redhat.com nmstatectl[1468]: [2023-04-24T08:59:05Z INFO  nmstate::nm::query_apply::profile] Deactivating connection 2790387d-3c0d-4249-b572-45dc0747a821: veth0/veth
Apr 24 16:59:05 dell-per730-20.rhts.eng.pek2.redhat.com nmstatectl[1468]: [2023-04-24T08:59:05Z INFO  nmstate::nm::query_apply::profile] Modifying connection UUID Some("2790387d-3c0d-4249-b572-45dc0747a821"), ID Some("veth0"), type Some("veth") name Some("veth0")
Apr 24 16:59:05 dell-per730-20.rhts.eng.pek2.redhat.com nmstatectl[1468]: [2023-04-24T08:59:05Z INFO  nmstate::nm::query_apply::profile] Modifying connection UUID Some("1b538c0c-8bb0-4a08-bf0d-07d901ee6c27"), ID Some("veth0_p"), type Some("veth") name Some("veth0_p")
Apr 24 16:59:05 dell-per730-20.rhts.eng.pek2.redhat.com nmstatectl[1468]: [2023-04-24T08:59:05Z INFO  nmstate::nm::query_apply::profile] Activating connection 2790387d-3c0d-4249-b572-45dc0747a821: veth0/veth
Apr 24 16:59:05 dell-per730-20.rhts.eng.pek2.redhat.com nmstatectl[1468]: [2023-04-24T08:59:05Z INFO  nmstate::nm::query_apply::profile] Activating connection 1b538c0c-8bb0-4a08-bf0d-07d901ee6c27: veth0_p/veth
Apr 24 16:59:05 dell-per730-20.rhts.eng.pek2.redhat.com nmstatectl[1468]: [2023-04-24T08:59:05Z INFO  nmstate::nm::query_apply::profile] Got activation failure Bug: Manager(UnknownConnection): Connection 'veth0_p' is not available on device veth0_p because device is strictly unmanaged
Apr 24 16:59:05 dell-per730-20.rhts.eng.pek2.redhat.com nmstatectl[1468]: [2023-04-24T08:59:05Z INFO  nmstate::nm::query_apply::profile] Will retry activation 2 seconds
Apr 24 16:59:07 dell-per730-20.rhts.eng.pek2.redhat.com nmstatectl[1468]: [2023-04-24T08:59:07Z INFO  nmstate::nm::query_apply::profile] Activating connection 1b538c0c-8bb0-4a08-bf0d-07d901ee6c27: veth0_p/veth
Apr 24 16:59:07 dell-per730-20.rhts.eng.pek2.redhat.com nmstatectl[1468]: [2023-04-24T08:59:07Z INFO  nmstate::query_apply::net_state] Retrying on: VerificationError: Verification failure: veth0.interface.mptcp desire '{"address-flags":[]}', current 'null'
Apr 24 16:59:08 dell-per730-20.rhts.eng.pek2.redhat.com nmstatectl[1468]: [2023-04-24T08:59:08Z INFO  nmstate::query_apply::net_state] Retrying on: VerificationError: Verification failure: veth0.interface.mptcp desire '{"address-flags":[]}', current 'null'
Apr 24 16:59:09 dell-per730-20.rhts.eng.pek2.redhat.com nmstatectl[1468]: [2023-04-24T08:59:09Z INFO  nmstate::query_apply::net_state] Retrying on: VerificationError: Verification failure: veth0.interface.mptcp desire '{"address-flags":[]}', current 'null'
Apr 24 16:59:10 dell-per730-20.rhts.eng.pek2.redhat.com nmstatectl[1468]: [2023-04-24T08:59:10Z INFO  nmstate::query_apply::net_state] Retrying on: VerificationError: Verification failure: veth0.interface.mptcp desire '{"address-flags":[]}', current 'null'
Apr 24 16:59:11 localhost.localdomain nmstatectl[1468]: [2023-04-24T08:59:11Z INFO  nmstate::query_apply::net_state] Rollbacked to checkpoint /org/freedesktop/NetworkManager/Checkpoint/1
Apr 24 16:59:11 localhost.localdomain nmstatectl[1468]: [2023-04-24T08:59:11Z ERROR nmstatectl::service] Failed to apply state file /etc/nmstate/veth0.yml: NmstateError: VerificationError: Verification failure: veth0.interface.mptcp desire '{"address-flags":[]}', current 'null'
Apr 24 16:59:11 localhost.localdomain systemd[1]: Finished Apply nmstate on-disk state.
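
The journal above is long, but only a few kinds of lines decide the outcome: the verification retries, the rollback, and the final error from the service. The following is a rough sketch that condenses such a log read from standard input, assuming the message texts match the ones shown in this comment. The function name `summarize_nmstate_log` is hypothetical and not part of nmstate.

```shell
# Hypothetical helper (not part of nmstate).
# Reads an nmstate.service journal on standard input and prints a short
# summary. The patterns are copied from the log lines in this report and
# may differ in other nmstate versions.
summarize_nmstate_log() {
    awk '
        /Retrying on: VerificationError/ { retries++ }
        /Rollbacked to checkpoint/       { rollback = 1 }
        /ERROR nmstatectl::service/      { errors++ }
        END {
            printf "verification retries: %d\n", retries
            printf "rolled back: %s\n", (rollback ? "yes" : "no")
            printf "failed state files: %d\n", errors
        }
    '
}

# Example usage:
#   journalctl -u nmstate.service | summarize_nmstate_log
```

The summary only counts matching lines; it does not explain why the default gateway interface was deactivated after the rollback.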

Comment 2 Gris Ge 2023-05-15 12:09:35 UTC
Hi Mingyu,

I tried this in my VM using nmstate-2.2.10-3.el9, and it works well even with an incorrect YAML file.

Can you try again?

Comment 3 Mingyu Shi 2023-05-23 06:20:04 UTC
Looks good now:
nmstate-2.2.10-3.el9.x86_64
nispor-1.2.10-1.el9.x86_64
NetworkManager-1.43.8-1.el9.x86_64

BTW, the incorrect state in #comment0 can now be applied successfully, so I used another incorrect state (a veth without a peer) instead:
interfaces:
- name: veth10
  type: veth
  state: up

The rollback did not cause any other issues.
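
A state file like the one above can also be tried by hand before it is left in /etc/nmstate for the service to pick up at boot. The following is a sketch only, assuming the `--no-commit` and `--timeout` options of `nmstatectl apply` and the `nmstatectl rollback` subcommand; the function name `trial_apply` is hypothetical. Since this report shows connectivity being lost after a failed apply and rollback on nmstate-2.2.9, it is probably safest to run this from a console rather than over ssh.

```shell
# Hypothetical helper (not part of nmstate).
# Trial-applies a state file and then undoes the change.
# Returns 0 if nmstatectl accepted the state, non-zero otherwise.
trial_apply() {
    state_file=$1
    # --no-commit keeps the checkpoint open after a successful apply;
    # --timeout 60 is an assumed safety net so the checkpoint does not
    # stay open indefinitely if this script is interrupted.
    if nmstatectl apply --no-commit --timeout 60 "$state_file" >/dev/null; then
        # The state was accepted: roll back so the system is unchanged.
        nmstatectl rollback >/dev/null
        echo "OK: $state_file"
        return 0
    fi
    # In the journal in comment 1, a failed apply ended with nmstate
    # rolling back to its checkpoint by itself, so nothing is undone here.
    echo "FAILED: $state_file" >&2
    return 1
}

# Example usage (path is a placeholder):
#   trial_apply /root/veth10.yml
```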

Comment 6 Mingyu Shi 2023-06-01 08:42:29 UTC
Verified with:
nmstate-2.2.11-1.el9.x86_64
nispor-1.2.10-1.el9.x86_64
NetworkManager-1.43.8-1.el9.x86_64

Comment 8 errata-xmlrpc 2023-11-07 08:24:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (nmstate bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:6323