Bug 2051753 - Install transaction gets stuck at the end of system install, between post-transaction script phase and verification phase
Summary: Install transaction gets stuck at the end of system install, between post-tra...
Keywords:
Status: CLOSED DUPLICATE of bug 2054230
Alias: None
Product: Fedora
Classification: Fedora
Component: dnf
Version: 39
Hardware: All
OS: Linux
unspecified
high
Target Milestone: ---
Assignee: rpm-software-management
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2022-02-08 00:26 UTC by Adam Williamson
Modified: 2023-08-25 09:27 UTC (History)
11 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-08-25 09:27:21 UTC
Type: Bug
Embargoed:


Attachments (Terms of Use)
anaconda.log from a failed test - https://openqa.fedoraproject.org/tests/1120794 (423.47 KB, text/plain)
2022-02-08 00:31 UTC, Adam Williamson
no flags Details
packaging.log from a failed test - https://openqa.fedoraproject.org/tests/1120794 (24.55 KB, text/plain)
2022-02-08 00:32 UTC, Adam Williamson
no flags Details
syslog (system log messages) from a failed test - https://openqa.fedoraproject.org/tests/1120794 (5.20 MB, text/plain)
2022-02-08 00:32 UTC, Adam Williamson
no flags Details

Description Adam Williamson 2022-02-08 00:26:00 UTC
This has been going on a long time, but I haven't previously filed it since it's tricky to debug and doesn't seem to reproduce easily outside of openQA. But, it happened a lot today, so I thought I'd dig into it as far as I can and file a bug in case anyone else can figure out what's going on.

Occasionally, an openQA system install test will fail in a distinctive way: the test times out with the installer apparently stuck showing "Configuring rootfiles.noarch" (usually). Today, we had several failures like that and also several which instead got stuck at "Configuring NetworkManager-config-connectivity-fedora.noarch". The installer stays like this for ten minutes plus, and would likely stay like that for even longer if openQA didn't eventually time out and give up.

Looking at the logs from this kind of failure, this is what we see at the end of anaconda.log:

06:30:10,826 INF modules.payloads.payload.dnf.transaction_progress: Configuring (running scriptlet for): NetworkManager-config-connectivity-fedora-1:1.36.0-0.9.fc36.noarch 1643999192 6320938716751e3f38b2a8a1516202df117bccea12a72e67682ba53a3078b4c0
06:30:12,654 INF modules.payloads.payload.dnf.transaction_progress: Configuring (running scriptlet for): NetworkManager-config-connectivity-fedora-1:1.36.0-0.9.fc36.noarch 1643999192 6320938716751e3f38b2a8a1516202df117bccea12a72e67682ba53a3078b4c0
06:30:13,380 INF modules.payloads.payload.dnf.transaction_progress: Configuring (running scriptlet for): NetworkManager-config-connectivity-fedora-1:1.36.0-0.9.fc36.noarch 1643999192 6320938716751e3f38b2a8a1516202df117bccea12a72e67682ba53a3078b4c0
06:30:15,333 INF modules.payloads.payload.dnf.transaction_progress: Configuring (running scriptlet for): NetworkManager-config-connectivity-fedora-1:1.36.0-0.9.fc36.noarch 1643999192 6320938716751e3f38b2a8a1516202df117bccea12a72e67682ba53a3078b4c0
06:30:15,567 INF modules.payloads.payload.dnf.transaction_progress: Configuring (running scriptlet for): NetworkManager-config-connectivity-fedora-1:1.36.0-0.9.fc36.noarch 1643999192 6320938716751e3f38b2a8a1516202df117bccea12a72e67682ba53a3078b4c0
06:30:16,082 INF modules.payloads.payload.dnf.transaction_progress: Configuring (running scriptlet for): NetworkManager-config-connectivity-fedora-1:1.36.0-0.9.fc36.noarch 1643999192 6320938716751e3f38b2a8a1516202df117bccea12a72e67682ba53a3078b4c0
06:30:16,576 INF modules.payloads.payload.dnf.transaction_progress: Configuring (running scriptlet for): NetworkManager-config-connectivity-fedora-1:1.36.0-0.9.fc36.noarch 1643999192 6320938716751e3f38b2a8a1516202df117bccea12a72e67682ba53a3078b4c0
06:30:16,840 INF modules.payloads.payload.dnf.transaction_progress: Configuring (running scriptlet for): NetworkManager-config-connectivity-fedora-1:1.36.0-0.9.fc36.noarch 1643999192 6320938716751e3f38b2a8a1516202df117bccea12a72e67682ba53a3078b4c0
06:30:17,299 INF modules.payloads.payload.dnf.transaction_progress: Configuring (running scriptlet for): NetworkManager-config-connectivity-fedora-1:1.36.0-0.9.fc36.noarch 1643999192 6320938716751e3f38b2a8a1516202df117bccea12a72e67682ba53a3078b4c0
06:30:17,640 INF modules.payloads.payload.dnf.transaction_progress: Configuring (running scriptlet for): NetworkManager-config-connectivity-fedora-1:1.36.0-0.9.fc36.noarch 1643999192 6320938716751e3f38b2a8a1516202df117bccea12a72e67682ba53a3078b4c0
06:30:17,850 INF modules.payloads.payload.dnf.transaction_progress: Configuring (running scriptlet for): NetworkManager-config-connectivity-fedora-1:1.36.0-0.9.fc36.noarch 1643999192 6320938716751e3f38b2a8a1516202df117bccea12a72e67682ba53a3078b4c0
06:30:18,131 INF modules.payloads.payload.dnf.transaction_progress: Configuring (running scriptlet for): NetworkManager-config-connectivity-fedora-1:1.36.0-0.9.fc36.noarch 1643999192 6320938716751e3f38b2a8a1516202df117bccea12a72e67682ba53a3078b4c0
06:30:19,733 INF modules.payloads.payload.dnf.transaction_progress: Configuring (running scriptlet for): NetworkManager-config-connectivity-fedora-1:1.36.0-0.9.fc36.noarch 1643999192 6320938716751e3f38b2a8a1516202df117bccea12a72e67682ba53a3078b4c0
06:30:19,980 INF modules.payloads.payload.dnf.transaction_progress: Configuring (running scriptlet for): NetworkManager-config-connectivity-fedora-1:1.36.0-0.9.fc36.noarch 1643999192 6320938716751e3f38b2a8a1516202df117bccea12a72e67682ba53a3078b4c0
06:30:21,062 INF modules.payloads.payload.dnf.transaction_progress: Configuring (running scriptlet for): NetworkManager-config-connectivity-fedora-1:1.36.0-0.9.fc36.noarch 1643999192 6320938716751e3f38b2a8a1516202df117bccea12a72e67682ba53a3078b4c0
06:30:23,965 INF modules.payloads.payload.dnf.transaction_progress: Configuring (running scriptlet for): NetworkManager-config-connectivity-fedora-1:1.36.0-0.9.fc36.noarch 1643999192 6320938716751e3f38b2a8a1516202df117bccea12a72e67682ba53a3078b4c0
06:30:24,460 INF modules.payloads.payload.dnf.transaction_progress: Configuring (running scriptlet for): NetworkManager-config-connectivity-fedora-1:1.36.0-0.9.fc36.noarch 1643999192 6320938716751e3f38b2a8a1516202df117bccea12a72e67682ba53a3078b4c0
06:30:25,046 INF modules.payloads.payload.dnf.transaction_progress: Configuring (running scriptlet for): NetworkManager-config-connectivity-fedora-1:1.36.0-0.9.fc36.noarch 1643999192 6320938716751e3f38b2a8a1516202df117bccea12a72e67682ba53a3078b4c0
06:30:53,568 INF modules.payloads.payload.dnf.transaction_progress: Configuring (running scriptlet for): NetworkManager-config-connectivity-fedora-1:1.36.0-0.9.fc36.noarch 1643999192 6320938716751e3f38b2a8a1516202df117bccea12a72e67682ba53a3078b4c0
06:30:54,144 INF modules.payloads.payload.dnf.transaction_progress: Configuring (running scriptlet for): NetworkManager-config-connectivity-fedora-1:1.36.0-0.9.fc36.noarch 1643999192 6320938716751e3f38b2a8a1516202df117bccea12a72e67682ba53a3078b4c0
06:30:54,991 INF modules.payloads.payload.dnf.transaction_progress: Configuring (running scriptlet for): NetworkManager-config-connectivity-fedora-1:1.36.0-0.9.fc36.noarch 1643999192 6320938716751e3f38b2a8a1516202df117bccea12a72e67682ba53a3078b4c0
06:30:55,513 INF modules.payloads.payload.dnf.transaction_progress: Configuring (running scriptlet for): NetworkManager-config-connectivity-fedora-1:1.36.0-0.9.fc36.noarch 1643999192 6320938716751e3f38b2a8a1516202df117bccea12a72e67682ba53a3078b4c0
06:30:56,866 INF modules.payloads.payload.dnf.transaction_progress: Configuring (running scriptlet for): NetworkManager-config-connectivity-fedora-1:1.36.0-0.9.fc36.noarch 1643999192 6320938716751e3f38b2a8a1516202df117bccea12a72e67682ba53a3078b4c0

This is what we see at the end of packaging.log:

06:30:17,820 INF dnf.rpm: Running in chroot, ignoring command 'daemon-reload'
Running in chroot, ignoring command 'reload-or-restart'

06:30:19,941 INF dnf.rpm: Running in chroot, ignoring request.

06:30:57,388 DDEBUG dnf: RPM transaction over.

and this is what we see around the same time in syslog (system log):

11:30:57,085 DEBUG systemd:Bus private-bus-connection: changing state UNSET → OPENING
11:30:57,087 DEBUG systemd:sd-bus: starting bus private-bus-connection on fds 54/54 (socket:[95263], socket:[95263])...
11:30:57,087 DEBUG systemd:Bus private-bus-connection: changing state OPENING → AUTHENTICATING
11:30:57,101 DEBUG systemd:Registering bus object implementation for path=/org/freedesktop/systemd1 iface=org.freedesktop.systemd1.Manager
11:30:57,102 DEBUG systemd:Registering bus object implementation for path=/org/freedesktop/systemd1/job iface=org.freedesktop.systemd1.Job
11:30:57,102 DEBUG systemd:Registering bus object implementation for path=/org/freedesktop/systemd1/unit iface=org.freedesktop.systemd1.Unit
11:30:57,103 DEBUG systemd:Registering bus object implementation for path=/org/freedesktop/systemd1/unit iface=org.freedesktop.systemd1.Automount
11:30:57,103 DEBUG systemd:Registering bus object implementation for path=/org/freedesktop/systemd1/unit iface=org.freedesktop.systemd1.Device
11:30:57,103 DEBUG systemd:Registering bus object implementation for path=/org/freedesktop/systemd1/unit iface=org.freedesktop.systemd1.Mount
11:30:57,103 DEBUG systemd:Registering bus object implementation for path=/org/freedesktop/systemd1/unit iface=org.freedesktop.systemd1.Path
11:30:57,103 DEBUG systemd:Registering bus object implementation for path=/org/freedesktop/systemd1/unit iface=org.freedesktop.systemd1.Scope
11:30:57,104 DEBUG systemd:Registering bus object implementation for path=/org/freedesktop/systemd1/unit iface=org.freedesktop.systemd1.Service
11:30:57,104 DEBUG systemd:Registering bus object implementation for path=/org/freedesktop/systemd1/unit iface=org.freedesktop.systemd1.Slice
11:30:57,104 DEBUG systemd:Registering bus object implementation for path=/org/freedesktop/systemd1/unit iface=org.freedesktop.systemd1.Socket
11:30:57,105 DEBUG systemd:Registering bus object implementation for path=/org/freedesktop/systemd1/unit iface=org.freedesktop.systemd1.Swap
11:30:57,105 DEBUG systemd:Registering bus object implementation for path=/org/freedesktop/systemd1/unit iface=org.freedesktop.systemd1.Target
11:30:57,105 DEBUG systemd:Registering bus object implementation for path=/org/freedesktop/systemd1/unit iface=org.freedesktop.systemd1.Timer
11:30:57,106 DEBUG systemd:Registering bus object implementation for path=/org/freedesktop/LogControl1 iface=org.freedesktop.LogControl1
11:30:57,106 DEBUG systemd:Accepted new private connection.
11:30:57,107 DEBUG systemd:Bus private-bus-connection: changing state AUTHENTICATING → RUNNING
11:30:57,107 DEBUG systemd:Got message type=method_call sender=n/a destination=org.freedesktop.systemd1 path=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=StartTransientUnit cookie=1 reply_cookie=0 signature=ssa(sv)a(sa(sv)) error-name=n/a error-message=n/a
11:30:57,119 DEBUG systemd:SELinux access check scon=system_u:system_r:rpm_script_t:s0 tcon=system_u:system_r:kernel_t:s0 tclass=system perm=start state=permissive function=method_start_transient_unit path=n/a cmdline=n/a: Success
11:30:57,137 DEBUG systemd:run-rd032bb8810564d35a392bbe9a8f0051a.service: Failed to load configuration: No such file or directory
11:30:57,148 DEBUG systemd:SELinux reload 7
11:30:57,184 DEBUG systemd:Successfully loaded SELinux database in 35.400ms, size on heap is 1106K.
11:30:57,216 DEBUG systemd:SELinux access check scon=system_u:system_r:rpm_script_t:s0 tcon=system_u:object_r:systemd_unit_file_t:s0 tclass=service perm=start state=permissive function=bus_unit_queue_job path=/run/systemd/transient/run-rd032bb8810564d35a392bbe9a8f0051a.service cmdline=n/a: Success
11:30:57,216 DEBUG systemd:run-rd032bb8810564d35a392bbe9a8f0051a.service: Trying to enqueue job run-rd032bb8810564d35a392bbe9a8f0051a.service/start/fail
11:30:57,217 DEBUG systemd:run-rd032bb8810564d35a392bbe9a8f0051a.service: Installed new job run-rd032bb8810564d35a392bbe9a8f0051a.service/start as 1451
11:30:57,218 DEBUG systemd:run-rd032bb8810564d35a392bbe9a8f0051a.service: Enqueued job run-rd032bb8810564d35a392bbe9a8f0051a.service/start as 1451
11:30:57,218 DEBUG systemd:Sent message type=signal sender=org.freedesktop.systemd1 destination=n/a path=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=UnitNew cookie=1 reply_cookie=0 signature=so error-name=n/a error-message=n/a
11:30:57,218 DEBUG systemd:Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=UnitNew cookie=1906 reply_cookie=0 signature=so error-name=n/a error-message=n/a
11:30:57,218 DEBUG systemd:Sent message type=signal sender=org.freedesktop.systemd1 destination=n/a path=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=JobNew cookie=2 reply_cookie=0 signature=uos error-name=n/a error-message=n/a
11:30:57,219 DEBUG systemd:Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=JobNew cookie=1907 reply_cookie=0 signature=uos error-name=n/a error-message=n/a
11:30:57,219 DEBUG systemd:Sent message type=method_return sender=org.freedesktop.systemd1 destination=n/a path=n/a interface=n/a member=n/a cookie=3 reply_cookie=1 signature=o error-name=n/a error-message=n/a
11:30:57,222 DEBUG systemd:Failed to read pids.max attribute of root cgroup, ignoring: No data available
11:30:57,223 DEBUG systemd:run-rd032bb8810564d35a392bbe9a8f0051a.service: Passing 0 fds to service
11:30:57,223 DEBUG systemd:run-rd032bb8810564d35a392bbe9a8f0051a.service: About to execute /usr/bin/systemctl start man-db-cache-update
11:30:57,226 DEBUG systemd:run-rd032bb8810564d35a392bbe9a8f0051a.service: Forked /usr/bin/systemctl as 25939
11:30:57,228 DEBUG systemd:Sent message type=signal sender=org.freedesktop.systemd1 destination=n/a path=/org/freedesktop/systemd1/unit/run_2drd032bb8810564d35a392bbe9a8f0051a_2eservice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=4 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
11:30:57,228 DEBUG systemd:Sent message type=signal sender=org.freedesktop.systemd1 destination=n/a path=/org/freedesktop/systemd1/unit/run_2drd032bb8810564d35a392bbe9a8f0051a_2eservice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=5 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
11:30:57,229 DEBUG systemd:Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/run_2drd032bb8810564d35a392bbe9a8f0051a_2eservice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=1908 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
11:30:57,229 DEBUG systemd:Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/run_2drd032bb8810564d35a392bbe9a8f0051a_2eservice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=1909 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
11:30:57,229 DEBUG systemd-logind:Got message type=signal sender=:1.2 destination=n/a path=/org/freedesktop/systemd1/unit/run_2drd032bb8810564d35a392bbe9a8f0051a_2eservice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=1908 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
11:30:57,230 DEBUG systemd-logind:Got message type=signal sender=:1.2 destination=n/a path=/org/freedesktop/systemd1/unit/run_2drd032bb8810564d35a392bbe9a8f0051a_2eservice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=1909 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
11:30:57,231 DEBUG systemd:run-rd032bb8810564d35a392bbe9a8f0051a.service: Changed dead -> running
11:30:57,231 DEBUG systemd:run-rd032bb8810564d35a392bbe9a8f0051a.service: Job 1451 run-rd032bb8810564d35a392bbe9a8f0051a.service/start finished, result=done
11:30:57,231 INFO systemd:Started run-rd032bb8810564d35a392bbe9a8f0051a.service - /usr/bin/systemctl start man-db-cache-update.
11:30:57,231 DEBUG systemd:Sent message type=signal sender=org.freedesktop.systemd1 destination=n/a path=/org/freedesktop/systemd1/unit/run_2drd032bb8810564d35a392bbe9a8f0051a_2eservice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=6 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
11:30:57,232 DEBUG systemd:Sent message type=signal sender=org.freedesktop.systemd1 destination=n/a path=/org/freedesktop/systemd1/unit/run_2drd032bb8810564d35a392bbe9a8f0051a_2eservice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=7 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
11:30:57,232 DEBUG systemd:Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/run_2drd032bb8810564d35a392bbe9a8f0051a_2eservice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=1910 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
11:30:57,232 DEBUG systemd:Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/run_2drd032bb8810564d35a392bbe9a8f0051a_2eservice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=1911 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
11:30:57,233 DEBUG systemd-logind:Got message type=signal sender=:1.2 destination=n/a path=/org/freedesktop/systemd1/unit/run_2drd032bb8810564d35a392bbe9a8f0051a_2eservice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=1910 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
11:30:57,233 DEBUG systemd-logind:Got message type=signal sender=:1.2 destination=n/a path=/org/freedesktop/systemd1/unit/run_2drd032bb8810564d35a392bbe9a8f0051a_2eservice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=1911 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
11:30:57,233 DEBUG systemd:Sent message type=signal sender=org.freedesktop.systemd1 destination=n/a path=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=JobRemoved cookie=8 reply_cookie=0 signature=uoss error-name=n/a error-message=n/a
11:30:57,233 DEBUG systemd:Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=JobRemoved cookie=1912 reply_cookie=0 signature=uoss error-name=n/a error-message=n/a
11:30:57,234 DEBUG systemd-journald:Successfully sent stream file descriptor to service manager.
11:30:57,235 DEBUG systemd:run-rd032bb8810564d35a392bbe9a8f0051a.service: Executing: /usr/bin/systemctl start man-db-cache-update
11:30:57,236 NOTICE audit:SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:kernel_t:s0 msg='unit=run-rd032bb8810564d35a392bbe9a8f0051a comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
11:30:57,237 DEBUG systemd:Bus private-bus-connection: changing state RUNNING → CLOSING
11:30:57,237 DEBUG systemd:Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/run_2drd032bb8810564d35a392bbe9a8f0051a_2eservice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=1913 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
11:30:57,237 DEBUG systemd:Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/run_2drd032bb8810564d35a392bbe9a8f0051a_2eservice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=1914 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
11:30:57,238 DEBUG systemd:run-rd032bb8810564d35a392bbe9a8f0051a.service: Failed to send unit change signal for run-rd032bb8810564d35a392bbe9a8f0051a.service: Connection reset by peer
11:30:57,239 DEBUG systemd:systemd-journald.service: Got notification message from PID 1478 (FDSTORE=1)
11:30:57,239 DEBUG systemd-logind:Got message type=signal sender=:1.2 destination=n/a path=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=JobRemoved cookie=1912 reply_cookie=0 signature=uoss error-name=n/a error-message=n/a
11:30:57,239 DEBUG systemd-logind:Got message type=signal sender=:1.2 destination=n/a path=/org/freedesktop/systemd1/unit/run_2drd032bb8810564d35a392bbe9a8f0051a_2eservice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=1913 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
11:30:57,239 DEBUG systemd-logind:Got message type=signal sender=:1.2 destination=n/a path=/org/freedesktop/systemd1/unit/run_2drd032bb8810564d35a392bbe9a8f0051a_2eservice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=1914 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
11:30:57,240 DEBUG systemd:systemd-journald.service: Added fd 57 (n/a) to fd store.
11:30:57,240 DEBUG systemd:Bus private-bus-connection: changing state CLOSING → CLOSED
11:30:57,240 DEBUG systemd:Got disconnect on private connection.
11:30:57,326 INFO systemctl:Bus n/a: changing state UNSET → OPENING
11:30:57,328 INFO systemctl:sd-bus: starting bus by connecting to /run/systemd/private...
11:30:57,332 DEBUG systemd:Bus private-bus-connection: changing state UNSET → OPENING
11:30:57,332 DEBUG systemd:sd-bus: starting bus private-bus-connection on fds 54/54 (socket:[95064], socket:[95064])...
11:30:57,332 DEBUG systemd:Bus private-bus-connection: changing state OPENING → AUTHENTICATING
11:30:57,332 DEBUG systemd:Registering bus object implementation for path=/org/freedesktop/systemd1 iface=org.freedesktop.systemd1.Manager
11:30:57,333 DEBUG systemd:Registering bus object implementation for path=/org/freedesktop/systemd1/job iface=org.freedesktop.systemd1.Job
11:30:57,333 DEBUG systemd:Registering bus object implementation for path=/org/freedesktop/systemd1/unit iface=org.freedesktop.systemd1.Unit
11:30:57,333 DEBUG systemd:Registering bus object implementation for path=/org/freedesktop/systemd1/unit iface=org.freedesktop.systemd1.Automount
11:30:57,333 DEBUG systemd:Registering bus object implementation for path=/org/freedesktop/systemd1/unit iface=org.freedesktop.systemd1.Device
11:30:57,333 DEBUG systemd:Registering bus object implementation for path=/org/freedesktop/systemd1/unit iface=org.freedesktop.systemd1.Mount
11:30:57,333 DEBUG systemd:Registering bus object implementation for path=/org/freedesktop/systemd1/unit iface=org.freedesktop.systemd1.Path
11:30:57,333 DEBUG systemd:Registering bus object implementation for path=/org/freedesktop/systemd1/unit iface=org.freedesktop.systemd1.Scope
11:30:57,333 DEBUG systemd:Registering bus object implementation for path=/org/freedesktop/systemd1/unit iface=org.freedesktop.systemd1.Service
11:30:57,333 DEBUG systemd:Registering bus object implementation for path=/org/freedesktop/systemd1/unit iface=org.freedesktop.systemd1.Slice
11:30:57,333 DEBUG systemd:Registering bus object implementation for path=/org/freedesktop/systemd1/unit iface=org.freedesktop.systemd1.Socket
11:30:57,334 DEBUG systemd:Registering bus object implementation for path=/org/freedesktop/systemd1/unit iface=org.freedesktop.systemd1.Swap
11:30:57,334 DEBUG systemd:Registering bus object implementation for path=/org/freedesktop/systemd1/unit iface=org.freedesktop.systemd1.Target
11:30:57,334 DEBUG systemd:Registering bus object implementation for path=/org/freedesktop/systemd1/unit iface=org.freedesktop.systemd1.Timer
11:30:57,334 DEBUG systemd:Registering bus object implementation for path=/org/freedesktop/LogControl1 iface=org.freedesktop.LogControl1
11:30:57,334 DEBUG systemd:Accepted new private connection.
11:30:57,335 INFO systemctl:Bus n/a: changing state OPENING → AUTHENTICATING
11:30:57,336 INFO systemctl:Executing dbus call org.freedesktop.systemd1.Manager StartUnit(man-db-cache-update.service, replace)
11:30:57,336 INFO systemctl:Bus n/a: changing state AUTHENTICATING → RUNNING
11:30:57,337 DEBUG systemd:Bus private-bus-connection: changing state AUTHENTICATING → RUNNING
11:30:57,338 DEBUG systemd:Got message type=method_call sender=n/a destination=org.freedesktop.systemd1 path=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=StartUnit cookie=1 reply_cookie=0 signature=ss error-name=n/a error-message=n/a
11:30:57,338 DEBUG systemd:man-db-cache-update.service: Failed to load configuration: No such file or directory
11:30:57,338 DEBUG systemd:SELinux access check scon=system_u:system_r:kernel_t:s0 tcon=system_u:system_r:kernel_t:s0 tclass=system perm=start state=permissive function=bus_unit_method_start_generic path=n/a cmdline=n/a: Success
11:30:57,338 DEBUG systemd:SELinux access check scon=system_u:system_r:kernel_t:s0 tcon=system_u:system_r:kernel_t:s0 tclass=system perm=start state=permissive function=bus_unit_queue_job path=n/a cmdline=n/a: Success
11:30:57,338 DEBUG systemd:man-db-cache-update.service: Trying to enqueue job man-db-cache-update.service/start/replace
11:30:57,338 DEBUG systemd:Sent message type=error sender=org.freedesktop.systemd1 destination=n/a path=n/a interface=n/a member=n/a cookie=1 reply_cookie=1 signature=s error-name=org.freedesktop.systemd1.NoSuchUnit error-message=Unit man-db-cache-update.service not found.
11:30:57,338 DEBUG systemd:Failed to process message type=method_call sender=n/a destination=org.freedesktop.systemd1 path=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=StartUnit cookie=1 reply_cookie=0 signature=ss error-name=n/a error-message=n/a: Unit man-db-cache-update.service not found.
11:30:57,339 DEBUG systemd:man-db-cache-update.service: Collecting.
11:30:57,339 DEBUG systemd:Sent message type=signal sender=org.freedesktop.systemd1 destination=n/a path=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=UnitNew cookie=2 reply_cookie=0 signature=so error-name=n/a error-message=n/a
11:30:57,339 DEBUG systemd:Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=UnitNew cookie=1915 reply_cookie=0 signature=so error-name=n/a error-message=n/a
11:30:57,339 DEBUG systemd:Sent message type=signal sender=org.freedesktop.systemd1 destination=n/a path=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=UnitRemoved cookie=3 reply_cookie=0 signature=so error-name=n/a error-message=n/a
11:30:57,339 DEBUG systemd:Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=UnitRemoved cookie=1916 reply_cookie=0 signature=so error-name=n/a error-message=n/a
11:30:57,339 DEBUG systemd-logind:Got message type=signal sender=:1.2 destination=n/a path=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=UnitRemoved cookie=1916 reply_cookie=0 signature=so error-name=n/a error-message=n/a
11:30:57,340 INFO systemctl:Sent message type=method_call sender=n/a destination=org.freedesktop.systemd1 path=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=StartUnit cookie=1 reply_cookie=0 signature=ss error-name=n/a error-message=n/a
11:30:57,340 INFO systemctl:Got message type=error sender=org.freedesktop.systemd1 destination=n/a path=n/a interface=n/a member=n/a cookie=1 reply_cookie=1 signature=s error-name=org.freedesktop.systemd1.NoSuchUnit error-message=Unit man-db-cache-update.service not found.
11:30:57,340 INFO systemctl:Failed to start man-db-cache-update.service: Unit man-db-cache-update.service not found.
11:30:57,340 INFO systemctl:Bus n/a: changing state RUNNING → CLOSED
11:30:57,341 DEBUG systemd:Bus private-bus-connection: changing state RUNNING → CLOSING
11:30:57,341 DEBUG systemd:Bus private-bus-connection: changing state CLOSING → CLOSED
11:30:57,341 DEBUG systemd:Got disconnect on private connection.
11:30:57,345 DEBUG systemd:systemd-journald.service: Received EPOLLHUP on stored fd 57 (stored), closing.
11:30:57,348 DEBUG systemd:Received SIGCHLD from PID 25939 (systemctl).
11:30:57,348 DEBUG systemd:Child 25939 (systemctl) died (code=exited, status=5/NOTINSTALLED)
11:30:57,348 DEBUG systemd:run-rd032bb8810564d35a392bbe9a8f0051a.service: Child 25939 belongs to run-rd032bb8810564d35a392bbe9a8f0051a.service.
11:30:57,348 NOTICE systemd:run-rd032bb8810564d35a392bbe9a8f0051a.service: Main process exited, code=exited, status=5/NOTINSTALLED
11:30:57,348 WARNING systemd:run-rd032bb8810564d35a392bbe9a8f0051a.service: Failed with result 'exit-code'.
11:30:57,348 DEBUG systemd:run-rd032bb8810564d35a392bbe9a8f0051a.service: Service will not restart (restart setting)
11:30:57,348 DEBUG systemd:run-rd032bb8810564d35a392bbe9a8f0051a.service: Changed running -> failed
11:30:57,349 DEBUG systemd:Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1 interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=1917 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
11:30:57,349 DEBUG systemd-logind:Got message type=signal sender=:1.2 destination=n/a path=/org/freedesktop/systemd1 interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=1917 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
11:30:57,363 DEBUG systemd:run-rd032bb8810564d35a392bbe9a8f0051a.service: Failed to delete cgroup entry from LSM BPF map: No such file or directory
11:30:57,364 DEBUG systemd:run-rd032bb8810564d35a392bbe9a8f0051a.service: Unit entered failed state.
11:30:57,365 NOTICE audit:SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:kernel_t:s0 msg='unit=run-rd032bb8810564d35a392bbe9a8f0051a comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
11:30:57,365 DEBUG systemd:run-rd032bb8810564d35a392bbe9a8f0051a.service: Consumed 62ms CPU time.
11:30:57,366 DEBUG systemd:Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/run_2drd032bb8810564d35a392bbe9a8f0051a_2eservice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=1918 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
11:30:57,366 DEBUG systemd-logind:Got message type=signal sender=:1.2 destination=n/a path=/org/freedesktop/systemd1/unit/run_2drd032bb8810564d35a392bbe9a8f0051a_2eservice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=1918 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
11:30:57,366 DEBUG systemd-logind:Got message type=signal sender=:1.2 destination=n/a path=/org/freedesktop/systemd1/unit/run_2drd032bb8810564d35a392bbe9a8f0051a_2eservice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=1919 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
11:30:57,366 DEBUG systemd:Sent message type=signal sender=n/a destination=n/a path=/org/freedesktop/systemd1/unit/run_2drd032bb8810564d35a392bbe9a8f0051a_2eservice interface=org.freedesktop.DBus.Properties member=PropertiesChanged cookie=1919 reply_cookie=0 signature=sa{sv}as error-name=n/a error-message=n/a
11:30:57,368 DEBUG systemd:run-rd032bb8810564d35a392bbe9a8f0051a.service: Control group is empty.

After that, nothing else is logged in syslog for another 40 seconds, and what gets logged then is just regular NetworkManager activity; basically for the next 10-15 minutes while openQA is just waiting for the install to complete, nothing gets logged except NetworkManager and systemd doing stuff they do at regular intervals.

So it seems like the dnf transaction is somehow getting stuck. Comparing with a successful install, it seems like what's meant to come next is the verification stage. On a successful install, the next messages we see in anaconda.log are:

06:58:55,236 INF modules.payloads.payload.dnf.transaction_progress: Verifying: Box2D-2.4.1-7.fc36.x86_64 1642619958 1e94adde3f8e713ae01df924af4cb23d6cf42282291a8e294a2e10d5b52ac91c
06:58:55,237 INF modules.payloads.payload.dnf.transaction_progress: Verifying: CharLS-2.0.0-8.fc36.x86_64 1642620127 73b3ffe5936f582487f0a06702ed8b1e6b4f230e9a5d0b8bd2fbf788bc17bca0

etc. The next messages we see in packaging.log are:

06:58:56,620 DDEBUG dnf: timer: verify transaction: 2531 ms
06:58:56,627 DDEBUG dnf: timer: transaction: 905790 ms
06:58:57,375 DBG dnf: Installed: Box2D-2.4.1-7.fc36.x86_64
06:58:57,375 DBG dnf: Installed: CharLS-2.0.0-8.fc36.x86_64

etc. It's a little hard to tell what the next expected messages in syslog would be as it's a bit noisy.

It's particularly interesting to me that we get the message "dnf: RPM transaction over." in packaging.log in the failure case. dnf logs that message here: https://github.com/rpm-software-management/dnf/blob/master/dnf/base.py#L1077

so it seems like dnf is somehow getting stuck between that point and https://github.com/rpm-software-management/dnf/blob/master/dnf/base.py#L1126 , where the verification should happen. But I can't pinpoint it more precisely than that yet, or see *why* it's getting stuck.

As I mentioned, I haven't reproduced this manually, but then openQA does run over a hundred install tests on every compose. Usually we get maybe one or two failures like this per compose; today for some reason was a particularly bad day, we had I think five. So the chances of hitting it aren't super high and I really don't want to sit around here running a hundred installs in a row unless I really have to. :D

openQA tests are set to run on VMs with 2GB of RAM and a 13GB hard disk in most cases, which is pretty tight I guess, but there are no indications really of this being a resource starvation issue in the logs or how the VM behaves. When openQA eventually does give up, it switches to a VT and runs a bunch of commands to upload log files for us to examine; if the system was starved for resources I'd expect that to run slow or fail, but it does not. Also, two of the tests that failed today were configurations with 3GB of RAM.

Comment 2 Adam Williamson 2022-02-08 00:32:05 UTC
Created attachment 1859691 [details]
packaging.log from a failed test - https://openqa.fedoraproject.org/tests/1120794

Comment 3 Adam Williamson 2022-02-08 00:32:52 UTC
Created attachment 1859692 [details]
syslog (system log messages) from a failed test - https://openqa.fedoraproject.org/tests/1120794

Comment 4 Ben Cotton 2022-02-08 20:07:09 UTC
This bug appears to have been reported against 'rawhide' during the Fedora 36 development cycle.
Changing version to 36.

Comment 5 Lukáš Hrázký 2022-02-14 15:23:15 UTC
I would assume it's not feasible to run strace on the process automatically to see what it hangs on?

One of the possible culprits could be the sqlite history db. It doesn't seem to be possible to disable the history on dnf level (just to see if the issue would go away).

Comment 6 Adam Williamson 2022-02-14 16:57:12 UTC
Um, off the top of my head, not sure if it's feasible, but it'd probably be difficult. anaconda doesn't just run dnf, it uses dnf's python bindings - via multiprocessing, just to make things more fun - so we'd have to attach strace to whatever anaconda process is doing that, somehow. It might wind up running too slowly to be practical.

What I was going to do was patch dnf to log on every line around the affected area to try and figure out where we were getting to. However, I changed focus temporarily, because of an interesting coincidence. The problem happening more often seemed to just about coincide with the installer images getting bigger - https://bugzilla.redhat.com/show_bug.cgi?id=2031215 - which means the installer environment itself is bigger, which means that it's taking up more RAM (the installer environment is loaded into RAM because there's nowhere else to safely put it). So I worked on making the installer environment smaller instead - I'm getting several lorax PRs to that end merged at the moment. I'm hoping that when that gets done and the images get smaller again, this problem will become less common again, which would indicate it really is to do with memory consumption.

Comment 7 Lukáš Hrázký 2022-02-14 17:36:05 UTC
(In reply to Adam Williamson from comment #6)
> What I was going to do was patch dnf to log on every line around the
> affected area to try and figure out where we were getting to.

I was reluctant to suggest that, but if you'd do it, it should lead to the culprit...

> The problem happening more often seemed to just about coincide with the
> installer images getting bigger -
> https://bugzilla.redhat.com/show_bug.cgi?id=2031215 - which means the
> installer environment itself is bigger, which means that it's taking up more
> RAM (the installer environment is loaded into RAM because there's nowhere
> else to safely put it). So I worked on making the installer environment
> smaller instead - I'm getting several lorax PRs to that end merged at the
> moment. I'm hoping that when that gets done and the images get smaller
> again, this problem will become less common again, which would indicate it
> really is to do with memory consumption.

Alright, let's see if that helps.

Comment 8 Ben Cotton 2023-04-25 16:53:44 UTC
This message is a reminder that Fedora Linux 36 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora Linux 36 on 2023-05-16.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
'version' of '36'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, change the 'version' 
to a later Fedora Linux version. Note that the version field may be hidden.
Click the "Show advanced fields" button if you do not see it.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora Linux 36 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora Linux, you are encouraged to change the 'version' to a later version
prior to this bug being closed.

Comment 9 Adam Williamson 2023-04-27 22:56:10 UTC
This is still happening, and I still didn't get the roundtuits to do the detailed investigation. But I still mean to! Otherwise, maybe when we switch to DNF 5 it'll help. :D

Comment 10 Fedora Release Engineering 2023-08-16 08:07:17 UTC
This bug appears to have been reported against 'rawhide' during the Fedora Linux 39 development cycle.
Changing version to 39.

Comment 11 Jaroslav Mracek 2023-08-23 14:25:01 UTC
Fixing hanging issues is a nightmare because there is no hint where to start.

Is it possible to try older Fedoras to investigate the first version of dnf that exhibits the issue? This might be a good start point. Then we could modify dnf version or environment or package content in the transaction to get a better picture of the trigger.

Comment 12 Jaroslav Mracek 2023-08-23 14:26:38 UTC
Maybe the issue is related to https://bugzilla.redhat.com/show_bug.cgi?id=2054230 where we have the same problem with reproducing the issue.

Comment 13 Adam Williamson 2023-08-23 15:58:21 UTC
"Is it possible to try older Fedoras to investigate the first version of dnf that exhibits the issue?"

The trouble is it doesn't happen often at all, I'm only aware of it because it happens in openQA where obviously we run a *lot* of install tests (dozens every day), of which maybe 1 in 50 or so (rough guess) run into this. It'd be extremely tedious installing Fedora 22 50 times in a row to try and hit this, plus, it's very hard to decide when to *stop* testing a release that isn't failing - maybe you're just getting lucky?

We don't keep the openQA tests backwards-compatible forever, it'd just require too many ugly branches in the code and keeping too many old needles around (which affects performance). So we can't run current openQA tests on older releases. Theoretically we could check out older states of the openQA tests and run *those* on older releases, but in practice that's a bit difficult (since we only have two large-scale openQA deployments and we kinda need to those to run the *current* test code...)

I appreciate this isn't very helpful, and I'm not blaming anyone else for this bug not moving forward, it's definitely on me to try and get some more useful info on it. But it just keeps not being the most important thing I have to do right now so it never happens :/ I *do* think I had a viable plan to fiddle around with anaconda code a bit to get some more info on at least exactly what's going on when it gets stuck, I just never got around to doing it.

Comment 14 Adam Williamson 2023-08-23 16:00:16 UTC
Jaroslav: that does look very similar, yeah.

Comment 15 Jaroslav Mracek 2023-08-25 09:27:21 UTC
I am closing as a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=2054230. The last comment in the issue says that the issue is not reproducible anymore. Please feel free to open the issue if the problem reappear. From the investigation it looks like that it hangs on object in DNF/libdnf that were not modified for ages and during list comprehension of object defined in c-python. It also looks like that the issue is related of running DNF it threads, that is only used by Anaconda. I have to say that there is very low chance to find the cause of such a problem.

*** This bug has been marked as a duplicate of bug 2054230 ***


Note You need to log in before you can comment on or make changes to this bug.