Description of problem: after the lease expires, my VM running rawhide manages to retrieve an IP from DHCP however no route is installed. Version-Release number of selected component (if applicable): Fedora-Cloud-Base-Rawhide-20211217.n.1.x86_64.qcow2 NetworkManager-1.36.0-0.3.fc36.x86_64 dhcp-client-4.4.2-17.P1.fc36.x86_64 How reproducible: discovered the problem by suspending the hypervisor, then after waking up seeing the VMs not getting network back. I manage to reproduce with a little bit of set-up: Steps to Reproduce (using libvirt and latest rawhide VM) 1. on hypervisor, ensure the dhcp lease is set to 2min (alternatively, prepare to wait for 1h) $ sudo grep dhcp-range /var/lib/libvirt/dnsmasq/default.conf dhcp-range=10.224.122.11,10.224.122.100,255.255.255.0,2m 2. boot a VM, check its ip address and route is fine: $ ip a show dev eth0 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000 link/ether 52:54:00:9e:82:6c brd ff:ff:ff:ff:ff:ff altname enp1s0 inet 10.224.122.84/24 brd 10.224.122.255 scope global dynamic noprefixroute eth0 valid_lft 97sec preferred_lft 97sec inet6 fe80::5054:ff:fe9e:826c/64 scope link valid_lft forever preferred_lft forever $ ip route default via 10.224.122.1 dev eth0 proto dhcp metric 100 10.224.122.0/24 dev eth0 proto kernel scope link src 10.224.122.84 metric 100 3. on hypervisor, pause the dnsmasq process for 2 minutes $ sudo pkill -STOP -u libvirt-dnsmasq dnsmasq $ sleep 120 $ sudo pkill -CONT -u libvirt-dnsmasq dnsmasq 4. on VM, check again the ip and route Actual results: $ ip a show dev eth0 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000 link/ether 52:54:00:9e:82:6c brd ff:ff:ff:ff:ff:ff altname enp1s0 inet 10.224.122.84/24 brd 10.224.122.255 scope global dynamic noprefixroute eth0 valid_lft 73sec preferred_lft 73sec inet6 fe80::5054:ff:fe9e:826c/64 scope link valid_lft forever preferred_lft forever $ ip route <empty> Expected results: ip route should be properly set Additional info: it "Recovers" by restarting NetworkManager: logs (actual results above with ip address but not route was taken at 11:13:27) Dec 19 11:10:00 raw NetworkManager[1367]: <info> [1639912200.4280] device (eth0): Activation: starting connection 'System eth0' (5fb06bd0-0bb0-7ffb-45f1-d6edd65f3e0> Dec 19 11:10:00 raw NetworkManager[1367]: <info> [1639912200.4285] device (eth0): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'managed') Dec 19 11:10:00 raw NetworkManager[1367]: <info> [1639912200.4306] manager: NetworkManager state is now CONNECTING Dec 19 11:10:00 raw NetworkManager[1367]: <info> [1639912200.4320] device (eth0): state change: prepare -> config (reason 'none', sys-iface-state: 'managed') Dec 19 11:10:00 raw NetworkManager[1367]: <info> [1639912200.4346] device (eth0): state change: config -> ip-config (reason 'none', sys-iface-state: 'managed') Dec 19 11:10:00 raw NetworkManager[1367]: <info> [1639912200.4363] dhcp4 (eth0): activation: beginning transaction (timeout in 45 seconds) Dec 19 11:10:44 raw NetworkManager[1367]: <info> [1639912244.2055] dhcp4 (eth0): state changed new lease, address=10.224.122.84 Dec 19 11:10:44 raw NetworkManager[1367]: <info> [1639912244.2074] policy: set 'System eth0' (eth0) as default for IPv4 routing and DNS Dec 19 11:10:44 raw NetworkManager[1367]: <info> [1639912244.2279] device (eth0): state change: ip-config -> ip-check (reason 'none', sys-iface-state: 'managed') Dec 19 11:10:44 raw NetworkManager[1367]: <info> [1639912244.2555] device (eth0): state change: ip-check -> secondaries (reason 'none', sys-iface-state: 'managed') Dec 19 11:10:44 raw NetworkManager[1367]: <info> [1639912244.2560] device (eth0): state change: secondaries -> activated (reason 'none', sys-iface-state: 'managed') Dec 19 11:10:44 raw NetworkManager[1367]: <info> [1639912244.2575] manager: NetworkManager state is now CONNECTED_SITE Dec 19 11:10:44 raw NetworkManager[1367]: <info> [1639912244.2589] device (eth0): Activation: successful, device activated. Dec 19 11:10:44 raw NetworkManager[1367]: <info> [1639912244.2601] manager: NetworkManager state is now CONNECTED_GLOBAL Dec 19 11:12:33 raw NetworkManager[1367]: <info> [1639912353.0510] dhcp4 (eth0): activation: beginning transaction (timeout in 45 seconds) Dec 19 11:12:33 raw NetworkManager[1367]: <info> [1639912353.0513] dhcp4 (eth0): state changed no lease Dec 19 11:12:43 raw NetworkManager[1367]: <info> [1639912363.3796] dhcp4 (eth0): state changed new lease, address=10.224.122.84 Dec 19 11:13:31 raw NetworkManager[1367]: <info> [1639912411.5304] caught SIGTERM, shutting down normally. Dec 19 11:13:31 raw NetworkManager[1367]: <info> [1639912411.5329] dhcp4 (eth0): canceled DHCP transaction Dec 19 11:13:31 raw NetworkManager[1367]: <info> [1639912411.5329] dhcp4 (eth0): activation: beginning transaction (timeout in 45 seconds) Dec 19 11:13:31 raw NetworkManager[1367]: <info> [1639912411.5329] dhcp4 (eth0): state changed no lease Dec 19 11:13:31 raw systemd[1]: Stopping Network Manager... I believe the policy is missing (policy: set 'System eth0' (eth0) as default for IPv4 routing and DNS)
with traces activated, we see there was a network unreachable while adding a route: Dec 19 12:18:00 raw NetworkManager[1310]: <debug> [1639916280.4896] platform: (eth0) address: adding or updating IPv4 address: 10.224.122.84/24 brd 10.224.122.255 lft 116sec pref 116sec lifetime 295-0[116,116] dev 2 flags noprefixroute src unknown Dec 19 12:18:00 raw NetworkManager[1310]: <trace> [1639916280.4921] platform-linux: event-notification: RTM_NEWADDR, flags 0, seq 28: 10.224.122.84/24 brd 10.224.122.255 lft 116sec pref 116sec lifetime 295-294[117,117] dev 2 flags noprefixroute src kernel Dec 19 12:18:00 raw NetworkManager[1310]: <debug> [1639916280.4922] platform: (eth0) signal: address 4 added: 10.224.122.84/24 brd 10.224.122.255 lft 116sec pref 116sec lifetime 295-294[117,117] dev 2 flags noprefixroute src kernel Dec 19 12:18:00 raw NetworkManager[1310]: <debug> [1639916280.4923] l3cfg[30ba27dff5063a0e,ifindex=2]: obj-state: appeared in platform: [05979bb96a16e910, ip4-address, 10.224.122.84/24 brd* 10.224.122.255 lft 116sec pref 116sec lifetime 295-291[120,120] dev 2 src dhcp force-commit], nm-configured, in-platform Dec 19 12:18:00 raw NetworkManager[1310]: <trace> [1639916280.4923] l3cfg[30ba27dff5063a0e,ifindex=2]: emit signal (platform-change, obj-type=ip4-address, change=added, obj=10.224.122.84/24 brd 10.224.122.255 lft 116sec pref 116sec lifetime 295-294[117,117] dev 2 flags noprefixroute src kernel) Dec 19 12:18:00 raw NetworkManager[1310]: <trace> [1639916280.4924] platform-linux: event-notification: RTM_NEWROUTE, flags excl,create, seq 0: type local table 255 10.224.122.84/32 via 0.0.0.0 dev 2 metric 0 mss 0 rt-src rt-kernel scope host pref-src 10.224.122.84 Dec 19 12:18:00 raw NetworkManager[1310]: <debug> [1639916280.4925] platform: (eth0) signal: route 4 added: type local table 255 10.224.122.84/32 via 0.0.0.0 dev 2 metric 0 mss 0 rt-src rt-kernel scope host pref-src 10.224.122.84 Dec 19 12:18:00 raw NetworkManager[1310]: <trace> [1639916280.4925] l3cfg[30ba27dff5063a0e,ifindex=2]: emit signal (platform-change, obj-type=ip4-route, change=added, obj=type local table 255 10.224.122.84/32 via 0.0.0.0 dev 2 metric 0 mss 0 rt-src rt-kernel scope host pref-src 10.224.122.84) Dec 19 12:18:00 raw NetworkManager[1310]: <trace> [1639916280.4926] platform-linux: event-notification: RTM_NEWROUTE, flags excl,create, seq 0: ignore Dec 19 12:18:00 raw NetworkManager[1310]: <debug> [1639916280.4927] platform-linux: do-add-ip4-address[2: 10.224.122.84/24]: success Dec 19 12:18:00 raw NetworkManager[1310]: <debug> [1639916280.4928] platform: (eth0) route: append IPv4 route: type unicast table 0 0.0.0.0/0 via 10.224.122.1 dev 2 metric 100 mss 0 rt-src dhcp force-commit Dec 19 12:18:00 raw NetworkManager[1310]: <debug> [1639916280.4931] platform-linux: netlink: recvmsg: error message from kernel: Network is unreachable (101) "Nexthop has invalid gateway" for request 29 Dec 19 12:18:00 raw NetworkManager[1310]: <debug> [1639916280.4932] platform-linux: do-add-ip4-route[type unicast 0.0.0.0/0 via 10.224.122.1 dev 2 metric 100 mss 0 rt-src rt-dhcp scope global force-commit]: failure 101 (Network is unreachable - Nexthop has invalid gateway) Dec 19 12:18:00 raw NetworkManager[1310]: <debug> [1639916280.4933] platform: (eth0) route-sync: ignore failure to add IPv4 route: type unicast table 0 0.0.0.0/0 via 10.224.122.1 dev 2 metric 100 mss 0 rt-src dhcp force-commit: Network is unreachable Dec 19 12:18:00 raw NetworkManager[1310]: <trace> [1639916280.4948] l3cfg[30ba27dff5063a0e,ifindex=2]: committing IPv6 configuration (update)
This should be fixed by: https://gitlab.freedesktop.org/NetworkManager/NetworkManager/-/merge_requests/1049 The loss of prefix route causes a failure when adding again the default route because the next-hop is not directly reachable.
This bug appears to have been reported against 'rawhide' during the Fedora 36 development cycle. Changing version to 36.
This message is a reminder that Fedora Linux 36 is nearing its end of life. Fedora will stop maintaining and issuing updates for Fedora Linux 36 on 2023-05-16. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a 'version' of '36'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, change the 'version' to a later Fedora Linux version. Note that the version field may be hidden. Click the "Show advanced fields" button if you do not see it. Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora Linux 36 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora Linux, you are encouraged to change the 'version' to a later version prior to this bug being closed.
Fedora Linux 36 entered end-of-life (EOL) status on 2023-05-16. Fedora Linux 36 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora Linux please feel free to reopen this bug against that version. Note that the version field may be hidden. Click the "Show advanced fields" button if you do not see the version field. If you are unable to reopen this bug, please file a new report against an active release. Thank you for reporting this bug and we are sorry it could not be fixed.