Bug 1914957

Summary: [RHEL-8.4] fails to boot on x86_64, NetworkManager in endless loop
Product: Red Hat Enterprise Linux 8 Reporter: Petr Zatko <pzatko>
Component: dracut Assignee: Lukáš Nykrýn <lnykryn>
Status: CLOSED ERRATA QA Contact: Frantisek Sumsal <fsumsal>
Severity: urgent Docs Contact:
Priority: unspecified    
Version: 8.4 CC: acardace, afedorova, amasolov, atragler, bgalvani, drosario, dtardon, fsumsal, guazhang, hannsj_uhl, jcastran, jjb, jkaluza, jlyle, jstodola, kristen.davis, ldu, lmiksik, lnykryn, lrintel, lzap, mgandhi, mvadkert, nm-team, qzhao, rkhan, rmetrich, sukulkar, till, vmarsik, zguo
Target Milestone: beta Keywords: Regression, TestBlocker, Triaged
Target Release: 8.4   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: dracut-049-133.git20210112 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1915356 Environment:
Last Closed: 2021-05-18 15:02:58 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1879086, 1903942, 1928608    
Attachments:
Description Flags
rdsosreport.txt none

Description Petr Zatko 2021-01-11 15:10:10 UTC
Created attachment 1746298 [details]
rdsosreport.txt

Description of problem:
RHEL-8.4.0-20210111.d.1 fails to boot on x86_64 and ends up in the dracut shell. According to rdsosreport.txt, NetworkManager seems to be stuck in an endless loop:

[  236.913907] localhost NetworkManager[15131]: <info>  [1610375254.9128] NetworkManager (version 1.30.0-0.5.el8) is starting... (after a restart) 
[  236.914190] localhost NetworkManager[15131]: <info>  [1610375254.9131] Read config: /etc/NetworkManager/NetworkManager.conf 
[  236.915415] localhost NetworkManager[15131]: <info>  [1610375254.9143] auth[0x558563ddc640]: create auth-manager: D-Bus connection not available. Polkit is disabled and only root will be authorized. 
[  236.916135] localhost NetworkManager[15131]: <info>  [1610375254.9151] manager[0x558563def070]: monitoring kernel firmware directory '/lib/firmware'. 
[  236.916430] localhost NetworkManager[15131]: <info>  [1610375254.9154] hostname: hostname: hostnamed not used as proxy creation failed with: Could not connect: No such file or directory 
[  236.916583] localhost NetworkManager[15131]: <info>  [1610375254.9155] dns-mgr[0x558563ddf1c0]: init: dns=default,systemd-resolved rc-manager=symlink 
[  236.917741] localhost NetworkManager[15131]: <info>  [1610375254.9167] Loaded device plugin: NMTeamFactory (/usr/lib64/NetworkManager/1.30.0-0.5.el8/libnm-device-plugin-team.so) 
[  236.917839] localhost NetworkManager[15131]: <info>  [1610375254.9168] manager: rfkill: Wi-Fi enabled by radio killswitch; enabled by state file 
[  236.917902] localhost NetworkManager[15131]: <info>  [1610375254.9168] manager: rfkill: WWAN enabled by radio killswitch; enabled by state file 
[  236.917945] localhost NetworkManager[15131]: <info>  [1610375254.9169] manager: Networking is enabled by state file 
[  236.918000] localhost NetworkManager[15131]: <info>  [1610375254.9169] dhcp-init: Using DHCP client 'internal' 
[  236.918405] localhost NetworkManager[15131]: <warn>  [1610375254.9173] ifcfg-rh: dbus: don't use D-Bus for com.redhat.ifcfgrh1 service 
[  236.918497] localhost NetworkManager[15131]: <info>  [1610375254.9174] settings: Loaded settings plugin: ifcfg-rh ("/usr/lib64/NetworkManager/1.30.0-0.5.el8/libnm-settings-plugin-ifcfg-rh.so") 
[  236.918558] localhost NetworkManager[15131]: <info>  [1610375254.9175] settings: Loaded settings plugin: keyfile (internal) 
[  236.919657] localhost NetworkManager[15131]: <info>  [1610375254.9186] device (lo): carrier: link connected 
[  236.919742] localhost NetworkManager[15131]: <info>  [1610375254.9187] manager: (lo): new Generic device (/org/freedesktop/NetworkManager/Devices/1) 
[  236.919996] localhost NetworkManager[15131]: <info>  [1610375254.9189] device (ens3): carrier: link connected 
[  236.920178] localhost NetworkManager[15131]: <info>  [1610375254.9191] manager: (ens3): new Ethernet device (/org/freedesktop/NetworkManager/Devices/2) 
[  236.920515] localhost NetworkManager[15131]: <info>  [1610375254.9194] manager: (ens3): assume: will attempt to assume matching connection 'Wired Connection' (31b0dc26-a665-4740-8dfa-79520de56c63) (indicated) 
[  236.920595] localhost NetworkManager[15131]: <info>  [1610375254.9195] device (ens3): state change: unmanaged -> unavailable (reason 'connection-assumed', sys-iface-state: 'assume') 
[  236.921580] localhost NetworkManager[15131]: <info>  [1610375254.9205] device (ens3): state change: unavailable -> disconnected (reason 'connection-assumed', sys-iface-state: 'assume') 
[  236.921944] localhost NetworkManager[15131]: <info>  [1610375254.9209] device (ens3): Activation: starting connection 'Wired Connection' (31b0dc26-a665-4740-8dfa-79520de56c63) 
[  236.922206] localhost NetworkManager[15131]: <warn>  [1610375254.9211] sleep-monitor-sd: failed to acquire D-Bus proxy: Could not connect: No such file or directory 
[  236.922412] localhost NetworkManager[15131]: <info>  [1610375254.9213] device (ens3): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'assume') 
[  236.922496] localhost NetworkManager[15131]: <info>  [1610375254.9214] device (ens3): state change: prepare -> config (reason 'none', sys-iface-state: 'assume') 
[  236.922583] localhost NetworkManager[15131]: <info>  [1610375254.9215] device (ens3): state change: config -> ip-config (reason 'none', sys-iface-state: 'assume') 
[  236.922726] localhost NetworkManager[15131]: <info>  [1610375254.9216] dhcp4 (ens3): activation: beginning transaction (timeout in 45 seconds) 
[  236.932754] localhost NetworkManager[15131]: <info>  [1610375254.9317] dhcp4 (ens3): option dhcp_lease_time      => '86400' 
[  236.932835] localhost NetworkManager[15131]: <info>  [1610375254.9317] dhcp4 (ens3): option domain_name          => 'hv2.lab.eng.bos.redhat.com' 
[  236.932906] localhost NetworkManager[15131]: <info>  [1610375254.9318] dhcp4 (ens3): option domain_name_servers  => '10.19.42.41 10.11.5.19 10.5.30.160' 
[  236.932960] localhost NetworkManager[15131]: <info>  [1610375254.9319] dhcp4 (ens3): option expiry               => '1610461654' 
[  236.932999] localhost NetworkManager[15131]: <info>  [1610375254.9319] dhcp4 (ens3): option ip_address           => '10.16.56.23' 
[  236.933060] localhost NetworkManager[15131]: <info>  [1610375254.9320] dhcp4 (ens3): option next_server          => '10.19.42.13' 
[  236.933101] localhost NetworkManager[15131]: <info>  [1610375254.9320] dhcp4 (ens3): option nis_domain           => 'redhat.com' 
[  236.933143] localhost NetworkManager[15131]: <info>  [1610375254.9321] dhcp4 (ens3): option ntp_servers          => '10.16.59.254 10.5.30.160 10.2.32.37 10.2.32.38 10.5.26.10 10.5.27.10 10.11.160.238 10.18.52.10 10.18.100.10' 
[  236.933196] localhost NetworkManager[15131]: <info>  [1610375254.9321] dhcp4 (ens3): option requested_broadcast_address => '1' 
[  236.933233] localhost NetworkManager[15131]: <info>  [1610375254.9322] dhcp4 (ens3): option requested_domain_name => '1' 
[  236.933268] localhost NetworkManager[15131]: <info>  [1610375254.9322] dhcp4 (ens3): option requested_domain_name_servers => '1' 
[  236.933307] localhost NetworkManager[15131]: <info>  [1610375254.9322] dhcp4 (ens3): option requested_domain_search => '1' 
[  236.933342] localhost NetworkManager[15131]: <info>  [1610375254.9323] dhcp4 (ens3): option requested_host_name  => '1' 
[  236.933381] localhost NetworkManager[15131]: <info>  [1610375254.9323] dhcp4 (ens3): option requested_interface_mtu => '1' 
[  236.933417] localhost NetworkManager[15131]: <info>  [1610375254.9323] dhcp4 (ens3): option requested_ms_classless_static_routes => '1' 
[  236.933452] localhost NetworkManager[15131]: <info>  [1610375254.9324] dhcp4 (ens3): option requested_nis_domain => '1' 
[  236.933487] localhost NetworkManager[15131]: <info>  [1610375254.9324] dhcp4 (ens3): option requested_nis_servers => '1' 
[  236.933525] localhost NetworkManager[15131]: <info>  [1610375254.9325] dhcp4 (ens3): option requested_ntp_servers => '1' 
[  236.933561] localhost NetworkManager[15131]: <info>  [1610375254.9325] dhcp4 (ens3): option requested_rfc3442_classless_static_routes => '1' 
[  236.933595] localhost NetworkManager[15131]: <info>  [1610375254.9325] dhcp4 (ens3): option requested_root_path  => '1' 
[  236.933630] localhost NetworkManager[15131]: <info>  [1610375254.9326] dhcp4 (ens3): option requested_routers    => '1' 
[  236.933668] localhost NetworkManager[15131]: <info>  [1610375254.9326] dhcp4 (ens3): option requested_static_routes => '1' 
[  236.933704] localhost NetworkManager[15131]: <info>  [1610375254.9326] dhcp4 (ens3): option requested_subnet_mask => '1' 
[  236.933739] localhost NetworkManager[15131]: <info>  [1610375254.9327] dhcp4 (ens3): option requested_time_offset => '1' 
[  236.933773] localhost NetworkManager[15131]: <info>  [1610375254.9327] dhcp4 (ens3): option requested_wpad       => '1' 
[  236.933812] localhost NetworkManager[15131]: <info>  [1610375254.9327] dhcp4 (ens3): option routers              => '10.16.59.254' 
[  236.933848] localhost NetworkManager[15131]: <info>  [1610375254.9328] dhcp4 (ens3): option subnet_mask          => '255.255.252.0' 
[  236.933896] localhost NetworkManager[15131]: <info>  [1610375254.9328] dhcp4 (ens3): state changed unknown -> bound 
[  236.934220] localhost NetworkManager[15131]: <info>  [1610375254.9331] device (ens3): state change: ip-config -> ip-check (reason 'none', sys-iface-state: 'assume') 
[  236.934320] localhost NetworkManager[15131]: <info>  [1610375254.9332] device (ens3): state change: ip-check -> secondaries (reason 'none', sys-iface-state: 'assume') 
[  236.934392] localhost NetworkManager[15131]: <info>  [1610375254.9333] device (ens3): state change: secondaries -> activated (reason 'none', sys-iface-state: 'assume') 
[  236.934554] localhost NetworkManager[15131]: <info>  [1610375254.9335] manager: NetworkManager state is now CONNECTED_LOCAL 
[  236.934859] localhost NetworkManager[15131]: <info>  [1610375254.9338] manager: NetworkManager state is now CONNECTED_SITE 
[  236.934940] localhost NetworkManager[15131]: <info>  [1610375254.9339] policy: set 'Wired Connection' (ens3) as default for IPv4 routing and DNS 
[  236.935816] localhost NetworkManager[15131]: <info>  [1610375254.9347] device (ens3): Activation: successful, device activated. 
[  236.935939] localhost NetworkManager[15131]: <info>  [1610375254.9349] manager: NetworkManager state is now CONNECTED_GLOBAL 
[  236.936209] localhost NetworkManager[15131]: <info>  [1610375254.9351] manager: startup complete 
[  236.936272] localhost NetworkManager[15131]: <info>  [1610375254.9352] quitting now that startup is complete 
[  236.936597] localhost NetworkManager[15131]: <info>  [1610375254.9355] dhcp4 (ens3): canceled DHCP transaction 
[  236.936652] localhost NetworkManager[15131]: <info>  [1610375254.9356] dhcp4 (ens3): state changed bound -> done 
[  236.936701] localhost NetworkManager[15131]: <info>  [1610375254.9356] device (ens3): DHCPv4: trying to acquire a new lease within 90 seconds 
[  236.936767] localhost NetworkManager[15131]: <info>  [1610375254.9357] manager: NetworkManager state is now CONNECTED_SITE 
[  236.936868] localhost NetworkManager[15131]: <info>  [1610375254.9358] exiting (success) 
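
To check whether a log like the excerpt above really shows NetworkManager being started over and over, the start and exit messages can be counted per PID. The following is only a minimal sketch: the regular expression is written against the line format shown in this excerpt, and the function names and the `threshold` value are hypothetical, not part of dracut or NetworkManager.

```python
import re
from collections import Counter

# Matches lines in the format seen in rdsosreport.txt, for example:
# "[  236.913907] localhost NetworkManager[15131]: <info>  [1610375254.9128] ..."
LINE_RE = re.compile(
    r"^\[\s*(?P<uptime>\d+\.\d+)\]\s+\S+\s+NetworkManager\[(?P<pid>\d+)\]:\s+"
    r"<(?P<level>\w+)>\s+\[(?P<ts>[\d.]+)\]\s+(?P<msg>.*?)\s*$"
)


def summarize_nm_runs(lines):
    """Count 'is starting...' and 'exiting (...)' messages per NetworkManager PID."""
    starts, exits = Counter(), Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # not a NetworkManager line in the expected format
        pid, msg = int(m["pid"]), m["msg"]
        if "is starting..." in msg:
            starts[pid] += 1
        elif msg.startswith("exiting ("):
            exits[pid] += 1
    return starts, exits


def looks_like_restart_loop(lines, threshold=3):
    """Heuristic (assumed threshold): many distinct PIDs each starting NetworkManager."""
    starts, _ = summarize_nm_runs(lines)
    return len(starts) >= threshold
```

With the full rdsosreport.txt, `looks_like_restart_loop(open("rdsosreport.txt"))` would return True if at least three separate NetworkManager processes logged a start message. The excerpt above alone contains a single PID, so it would not trigger the heuristic by itself.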

Associated beaker job:
https://beaker.engineering.redhat.com/jobs/4971795

Attaching rdsosreport.txt

Comment 1 Beniamino Galvani 2021-01-11 15:58:13 UTC
Can you please reproduce this with the 'rd.debug' kernel command line argument added, and attach the new logs? Thank you.

Comment 2 Petr Zatko 2021-01-12 06:54:08 UTC
Hi Beniamino,

I have reproduced it with rd.debug, but the console log is quite enormous so I can't attach it.

You can pull it from:
http://lab-02.rhts.eng.bos.redhat.com/beaker/logs/recipes/9378+/9378599/console.log 

Beaker job:
https://beaker.engineering.redhat.com/jobs/4973435

Comment 3 Jan Stodola 2021-01-12 09:32:52 UTC
This problem now exists in RHEL-8.4.0-20210112.n.0; the last working nightly compose was RHEL-8.4.0-20210108.n.0.
Here is the changelog between the nightly composes:

===== UPGRADED PACKAGES =====
alsa-lib: alsa-lib-1.2.4-3.el8 -> alsa-lib-1.2.4-4.el8
alsa-utils: alsa-utils-1.2.4-1.el8 -> alsa-utils-1.2.4-2.el8
bash: bash-4.4.19-12.el8 -> bash-4.4.19-14.el8
bind-dyndb-ldap: bind-dyndb-ldap-11.6-1.module+el8.4.0+8826+e9043163 -> bind-dyndb-ldap-11.6-2.module+el8.4.0+9328+4ec4e316
binutils: binutils-2.30-85.el8 -> binutils-2.30-87.el8
brotli: brotli-1.0.6-2.el8 -> brotli-1.0.6-3.el8
buildah: buildah-1.11.6-8.module+el8.3.0+8377+eff33c85 -> buildah-1.11.6-8.module+el8.3.0+9348+d780f094
cockpit: cockpit-234-1.el8 -> cockpit-235-1.el8
cockpit-appstream: cockpit-appstream-234-1.el8 -> cockpit-appstream-235-1.el8
cockpit-podman: cockpit-podman-26-1.module+el8.3.1+9107+df0d2892 -> cockpit-podman-27.1-3.module+el8.3.1+9380+85743958
conmon: conmon-2:2.0.15-1.module+el8.3.0+8377+eff33c85 -> conmon-2:2.0.15-1.module+el8.3.0+9348+d780f094
container-selinux: container-selinux-2:2.130.0-1.module+el8.3.0+8377+eff33c85 -> container-selinux-2:2.155.0-1.module+el8.3.1+9380+85743958
containernetworking-plugins: containernetworking-plugins-0.8.3-4.module+el8.3.0+8377+eff33c85 -> containernetworking-plugins-0.9.0-1.module+el8.3.1+9380+85743958
criu: criu-3.15-1.module+el8.3.1+9107+df0d2892 -> criu-3.15-1.module+el8.3.1+9380+85743958
crontabs: crontabs-1.11-16.20150630git.el8 -> crontabs-1.11-17.20190603git.el8
crun: crun-0.16-2.module+el8.3.1+9107+df0d2892 -> crun-0.16-2.module+el8.3.1+9380+85743958
dovecot: dovecot-1:2.3.8-6.el8 -> dovecot-1:2.3.8-7.el8
dracut: dracut-049-95.git20200804.el8_3.4 -> dracut-049-131.git20210107.el8
dwarves: dwarves-1.17-1.el8 -> dwarves-1.19-1.el8
firefox: firefox-78.6.0-1.el8_3 -> firefox-78.6.1-1.el8_3
frr: frr-7.0-10.el8 -> frr-7.5-2.el8
fuse-overlayfs: fuse-overlayfs-1.3.0-1.module+el8.3.1+9107+df0d2892 -> fuse-overlayfs-1.3.0-1.module+el8.3.1+9380+85743958
fwupd: fwupd-1.4.2-4.el8 -> fwupd-1.5.5-1.el8
ipa: ipa-4.9.0-0.5.rc3.module+el8.4.0+9124+ced20601 -> ipa-4.9.0-1.module+el8.4.0+9275+6e05eb02
kernel: kernel-4.18.0-270.el8 -> kernel-4.18.0-272.el8
kernel-rt: kernel-rt-4.18.0-269.rt7.34.el8 -> kernel-rt-4.18.0-272.rt7.37.el8
libsepol: libsepol-2.9-1.el8 -> libsepol-2.9-2.el8
libslirp: libslirp-4.3.1-1.module+el8.3.1+9107+df0d2892 -> libslirp-4.3.1-1.module+el8.3.1+9380+85743958
libyang: libyang-0.16.105-3.el8_1.2 -> libyang-1.0.184-1.el8
linux-firmware: linux-firmware-20201022-100.gitdae4b4cd.el8 -> linux-firmware-20201118-101.git7455a360.el8
lvm2: lvm2-8:2.03.11-0.4.20201222gitb84a992.el8 -> lvm2-8:2.03.11-1.el8
mesa: mesa-20.3.1-1.el8 -> mesa-20.3.2-1.el8
meson: meson-0.55.3-2.el8 -> meson-0.55.3-3.el8
mokutil: mokutil-1:0.3.0-10.el8 -> mokutil-1:0.3.0-11.el8
oci-seccomp-bpf-hook: oci-seccomp-bpf-hook-1.2.0-1.module+el8.3.1+9107+df0d2892 -> oci-seccomp-bpf-hook-1.2.0-1.module+el8.3.1+9380+85743958
opencv: opencv-3.4.6-5.el8 -> opencv-3.4.6-6.el8
p11-kit: p11-kit-0.23.21-4.el8 -> p11-kit-0.23.22-1.el8
perl: perl-4:5.26.3-418.el8 -> perl-4:5.26.3-419.el8
plymouth: plymouth-0.9.4-7.20200615git1e36e30.el8 -> plymouth-0.9.4-8.20200615git1e36e30.el8
podman: podman-1.6.4-23.module+el8.3.0+8377+eff33c85 -> podman-2.2.1-3.module+el8.3.1+9380+85743958
python-ply: python-ply-3.9-8.el8 -> python-ply-3.9-9.el8
rear: rear-2.4-17.el8 -> rear-2.4-18.el8
rhel-system-roles: rhel-system-roles-1.0-21.el8 -> rhel-system-roles-1.0-23.el8
rt-tests: rt-tests-1.9-2.el8 -> rt-tests-1.10-1.el8
rteval-loads: rteval-loads-1.4-10.el8 -> rteval-loads-1.4-11.el8
runc: runc-1.0.0-69.rc92.module+el8.3.1+9107+df0d2892 -> runc-1.0.0-69.rc92.module+el8.3.1+9380+85743958
scap-security-guide: scap-security-guide-0.1.53-2.el8 -> scap-security-guide-0.1.53-3.el8
skopeo: skopeo-1:1.2.0-4.module+el8.3.1+9107+df0d2892 -> skopeo-1:1.2.0-9.module+el8.3.1+9380+85743958
sssd: sssd-2.4.0-3.el8 -> sssd-2.4.0-5.el8
stratis-cli: stratis-cli-2.3.0-2.el8 -> stratis-cli-2.3.0-3.el8
udica: udica-0.2.1-2.module+el8.3.0+8377+eff33c85 -> udica-0.2.1-2.module+el8.3.0+9348+d780f094
uuid: uuid-1.6.2-42.el8 -> uuid-1.6.2-43.el8
virtio-win: virtio-win-1.9.14-4.el8 -> virtio-win-1.9.15-0.el8
xfsprogs: xfsprogs-5.0.0-7.el8 -> xfsprogs-5.0.0-8.el8

===== DOWNGRADED PACKAGES =====
python-podman-api: python-podman-api-1.2.0-0.2.gitd0a45fe.module+el8.3.1+9107+df0d2892 -> python-podman-api-1.2.0-0.2.gitd0a45fe.module+el8.3.0+9348+d780f094
slirp4netns: slirp4netns-1.1.8-1.module+el8.3.1+9107+df0d2892 -> slirp4netns-0.4.2-3.git21fdece.module+el8.3.0+9348+d780f094
toolbox: toolbox-0.0.8-1.module+el8.3.1+9107+df0d2892 -> toolbox-0.0.7-1.module+el8.3.0+9348+d780f094

This blocks testing and gating of other packages.

Lukas, I can see dracut has changed, can you please have a look at this bug?
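
A compose changelog in the format pasted in this comment can also be filtered mechanically for the packages of interest. This is a rough sketch only: the function names are hypothetical, the parser assumes the "name: old-nvr -> new-nvr" lines and "===== ... PACKAGES =====" section headers exactly as they appear in this comment, and which packages are worth checking is a judgment call.

```python
import re

# "dracut: dracut-049-95.git20200804.el8_3.4 -> dracut-049-131.git20210107.el8"
ENTRY_RE = re.compile(r"^(?P<name>[^:\s]+):\s+(?P<old>\S+)\s+->\s+(?P<new>\S+)\s*$")
# "===== UPGRADED PACKAGES =====" or "===== DOWNGRADED PACKAGES ====="
SECTION_RE = re.compile(r"^=====\s+(?P<kind>\w+) PACKAGES\s+=====$")


def parse_compose_diff(text):
    """Return a list of (kind, name, old_nvr, new_nvr) tuples from a changelog."""
    changes = []
    kind = None
    for raw in text.splitlines():
        line = raw.strip()
        section = SECTION_RE.match(line)
        if section:
            kind = section["kind"].lower()  # "upgraded" or "downgraded"
            continue
        entry = ENTRY_RE.match(line)
        if entry and kind:
            changes.append((kind, entry["name"], entry["old"], entry["new"]))
    return changes


def changes_for(changes, names):
    """Keep only the entries whose package name is in `names`."""
    wanted = set(names)
    return [c for c in changes if c[1] in wanted]
```

For example, `changes_for(parse_compose_diff(changelog), {"dracut", "kernel"})` applied to the list above would return the dracut and kernel entries, which narrows the comparison to the components involved in early boot.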

Comment 5 Beniamino Galvani 2021-01-12 10:32:58 UTC
Jan or Petr, could you also please start a job with the last working compose (RHEL-8.4.0-20210108.n.0) and rd.debug so that we can compare the differences? It seems something changed in dracut (restoring the needinfo on Lukas).

Comment 10 Honggang LI 2021-01-13 04:25:04 UTC
*** Bug 1915615 has been marked as a duplicate of this bug. ***

Comment 11 Miroslav Vadkerti 2021-01-14 11:30:21 UTC
I wonder why this bug was not caught by gating. Does anybody have an idea which test we are missing here?

Comment 14 guazhang@redhat.com 2021-01-15 01:54:10 UTC
*** Bug 1916168 has been marked as a duplicate of this bug. ***

Comment 17 Lukáš Nykrýn 2021-01-20 11:35:22 UTC
*** Bug 1915675 has been marked as a duplicate of this bug. ***

Comment 18 David Tardon 2021-01-20 11:58:07 UTC
*** Bug 1916467 has been marked as a duplicate of this bug. ***

Comment 25 Desnes Nunes (IBM - old account, do not use) 2021-03-02 19:04:45 UTC
Hello everyone,

Red Hat Bug 1916168 was closed as a duplicate of this one, and we may have reproduced that bug on IBM's Beaker.

Furthermore, we found that in our case PXE boots on powerpc LPARs were being disrupted by another misconfigured DHCP server sending NACKs into the network during the DHCPREQUEST/DHCPOFFER negotiation.
I don't know whether this is the same issue here, but I have left a comment with further details on that bug: https://bugzilla.redhat.com/show_bug.cgi?id=1916168#c2

Comment 26 jpbn 2021-04-10 07:17:31 UTC
Same (?) bug here, on the Fedora 34 Workstation nightly build (x86_64):

How to reproduce:
1. Push the power button.
2. Look at the grey screen; Esc does not work.

Comment 27 jpbn 2021-04-10 11:11:37 UTC
After installing the new nightly build (Fedora Workstation Live x86_64 34 20210409), the bug is gone.

Comment 28 jpbn 2021-04-11 11:34:22 UTC
This was a hardware failure: the 2008 machine broke down.

Comment 30 errata-xmlrpc 2021-05-18 15:02:58 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (dracut bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:1661