Description of problem:
rpm-ostree crashes when attempting to install packages on Fedora IoT on a Raspberry Pi Zero 2 W. It appears to be killed by systemd-oomd.

Version-Release number of selected component (if applicable):
Fedora-IoT-36-20220326.0

How reproducible:
100%

Steps to Reproduce:
1. Flash an SD card with Fedora-IoT-36-20220326.0.aarch64.xz using the Fedora `arm-image-installer` package. The full command I used to create the SD card is provided here.

```
$ sudo arm-image-installer \
    --image="$HOME/Downloads/Fedora-IoT-36-20220326.0.aarch64.raw.xz" \
    --media=/dev/sda \
    --addconsole \
    --addkey="$HOME/.ssh/id_rsa.pub" \
    --norootpass \
    --relabel \
    --resizefs \
    --showboot \
    --target=rpi3 \
    -y
```

2. Insert the SD card in the Raspberry Pi Zero 2 W and boot.
3. Log in and attempt to install any package or packages with rpm-ostree.

```
$ rpm-ostree install fish
```

Actual results:
The package fails to install.

```
$ rpm-ostree install fish
Checking out tree 8d4a5a2... done
Enabled rpm-md repositories: updates fedora-cisco-openh264 updates-testing fedora
Importing rpm-md... done
error: Bus owner changed, aborting. This likely means the daemon crashed; check logs with `journalctl -xe`.
```

Expected results:
rpm-ostree should successfully overlay the package.

Additional info:
The rpm-ostreed log output when the failure occurs is provided below.

```
$ journalctl -b -u rpm-ostreed
Mar 26 11:56:46 zero2w-01.jwillikers.io systemd[1]: Starting rpm-ostreed.service - rpm-ostree System Management Daemon...
Mar 26 11:56:46 zero2w-01.jwillikers.io rpm-ostree[932]: Reading config file '/etc/rpm-ostreed.conf'
Mar 26 11:56:49 zero2w-01.jwillikers.io rpm-ostree[932]: In idle state; will auto-exit in 63 seconds
Mar 26 11:56:49 zero2w-01.jwillikers.io systemd[1]: Started rpm-ostreed.service - rpm-ostree System Management Daemon.
Mar 26 11:56:49 zero2w-01.jwillikers.io rpm-ostree[932]: Allowing active client :1.20 (uid 0)
Mar 26 11:56:49 zero2w-01.jwillikers.io rpm-ostree[932]: client(id:cli dbus:1.20 unit:session-1.scope uid:0) added; new total=1
Mar 26 11:56:49 zero2w-01.jwillikers.io rpm-ostree[932]: Locked sysroot
Mar 26 11:56:49 zero2w-01.jwillikers.io rpm-ostree[932]: Initiated txn PkgChange for client(id:cli dbus:1.20 unit:session-1.scope uid:0): /org/projectatomic/rpmostree1/fedora_iot
Mar 26 11:56:49 zero2w-01.jwillikers.io rpm-ostree[932]: Process [pid: 925 uid: 0 unit: session-1.scope] connected to transaction progress
Mar 26 11:56:50 zero2w-01.jwillikers.io rpm-ostree[932]: Librepo version: 1.14.2 with CURL_GLOBAL_ACK_EINTR support (libcurl/7.81.0 OpenSSL/3.0.2 zlib/1.2.11 brotli/1.0.9 libidn2/2.3.2 libpsl/0.21.1 (+libidn2/2.3.2) libssh/0.9.6/openssl/zlib nghttp2/1.46.0 OpenLDAP/2.6.1)
Mar 26 11:57:12 zero2w-01.jwillikers.io rpm-ostree[932]: Downloading: https://pkgs.tailscale.com/stable/fedora/repo.gpg
Mar 26 11:57:13 zero2w-01.jwillikers.io rpm-ostree[932]: Downloading: https://pkgs.tailscale.com/stable/fedora/aarch64/repodata/repomd.xml
Mar 26 11:57:14 zero2w-01.jwillikers.io rpm-ostree[932]: Downloading: https://pkgs.tailscale.com/stable/fedora/aarch64/repodata/repomd.xml.asc
Mar 26 11:57:15 zero2w-01.jwillikers.io rpm-ostree[932]: Downloading: https://pkgs.tailscale.com/stable/fedora/aarch64/repodata/fbc4d973f6685b79fc3235dc4d3973e30a6abd6dca9a37c48b8c814ace531c4f-filelists.xml.gz
Mar 26 11:57:15 zero2w-01.jwillikers.io rpm-ostree[932]: Downloading: https://pkgs.tailscale.com/stable/fedora/aarch64/repodata/5dcfb2273ed51ec0f857dd873fbf1fda0d46e7b43be7b8c11110b6e3aec525cf-primary.xml.gz
Mar 26 12:00:10 zero2w-01.jwillikers.io systemd[1]: rpm-ostreed.service: A process of this unit has been killed by the OOM killer.
Mar 26 12:00:10 zero2w-01.jwillikers.io systemd[1]: rpm-ostreed.service: Main process exited, code=killed, status=9/KILL
Mar 26 12:00:10 zero2w-01.jwillikers.io systemd[1]: rpm-ostreed.service: Failed with result 'oom-kill'.
Mar 26 12:00:11 zero2w-01.jwillikers.io systemd[1]: rpm-ostreed.service: Consumed 57.722s CPU time.
Mar 26 12:03:12 zero2w-01.jwillikers.io systemd[1]: Starting rpm-ostreed.service - rpm-ostree System Management Daemon...
Mar 26 12:03:12 zero2w-01.jwillikers.io rpm-ostree[1415]: Reading config file '/etc/rpm-ostreed.conf'
Mar 26 12:03:13 zero2w-01.jwillikers.io rpm-ostree[1415]: In idle state; will auto-exit in 64 seconds
Mar 26 12:03:13 zero2w-01.jwillikers.io systemd[1]: Started rpm-ostreed.service - rpm-ostree System Management Daemon.
Mar 26 12:03:13 zero2w-01.jwillikers.io rpm-ostree[1415]: Allowing active client :1.28 (uid 0)
Mar 26 12:03:13 zero2w-01.jwillikers.io rpm-ostree[1415]: client(id:cli dbus:1.28 unit:session-4.scope uid:0) added; new total=1
Mar 26 12:03:13 zero2w-01.jwillikers.io rpm-ostree[1415]: client(id:cli dbus:1.28 unit:session-4.scope uid:0) vanished; remaining=0
Mar 26 12:03:13 zero2w-01.jwillikers.io rpm-ostree[1415]: In idle state; will auto-exit in 61 seconds
Mar 26 12:04:14 zero2w-01.jwillikers.io systemd[1]: rpm-ostreed.service: Deactivated successfully.
Mar 26 12:05:26 zero2w-01.jwillikers.io systemd[1]: Starting rpm-ostreed.service - rpm-ostree System Management Daemon...
Mar 26 12:05:26 zero2w-01.jwillikers.io rpm-ostree[1486]: Reading config file '/etc/rpm-ostreed.conf'
Mar 26 12:05:26 zero2w-01.jwillikers.io systemd[1]: Started rpm-ostreed.service - rpm-ostree System Management Daemon.
Mar 26 12:05:26 zero2w-01.jwillikers.io rpm-ostree[1486]: In idle state; will auto-exit in 63 seconds
Mar 26 12:05:26 zero2w-01.jwillikers.io rpm-ostree[1486]: Allowing active client :1.30 (uid 0)
Mar 26 12:05:26 zero2w-01.jwillikers.io rpm-ostree[1486]: client(id:cli dbus:1.30 unit:session-4.scope uid:0) added; new total=1
Mar 26 12:05:26 zero2w-01.jwillikers.io rpm-ostree[1486]: Locked sysroot
Mar 26 12:05:26 zero2w-01.jwillikers.io rpm-ostree[1486]: Initiated txn PkgChange for client(id:cli dbus:1.30 unit:session-4.scope uid:0): /org/projectatomic/rpmostree1/fedora_iot
Mar 26 12:05:26 zero2w-01.jwillikers.io rpm-ostree[1486]: Process [pid: 1478 uid: 0 unit: session-4.scope] connected to transaction progress
Mar 26 12:05:27 zero2w-01.jwillikers.io rpm-ostree[1486]: Librepo version: 1.14.2 with CURL_GLOBAL_ACK_EINTR support (libcurl/7.81.0 OpenSSL/3.0.2 zlib/1.2.11 brotli/1.0.9 libidn2/2.3.2 libpsl/0.21.1 (+libidn2/2.3.2) libssh/0.9.6/openssl/zlib nghttp2/1.46.0 OpenLDAP/2.6.1)
Mar 26 12:07:03 zero2w-01.jwillikers.io systemd[1]: rpm-ostreed.service: A process of this unit has been killed by the OOM killer.
Mar 26 12:07:03 zero2w-01.jwillikers.io systemd[1]: rpm-ostreed.service: Main process exited, code=killed, status=9/KILL
Mar 26 12:07:03 zero2w-01.jwillikers.io systemd[1]: rpm-ostreed.service: Failed with result 'oom-kill'.
Mar 26 12:07:03 zero2w-01.jwillikers.io systemd[1]: rpm-ostreed.service: Consumed 35.293s CPU time.
Mar 26 12:15:22 zero2w-01.jwillikers.io systemd[1]: Starting rpm-ostreed.service - rpm-ostree System Management Daemon...
Mar 26 12:15:22 zero2w-01.jwillikers.io rpm-ostree[1806]: Reading config file '/etc/rpm-ostreed.conf'
Mar 26 12:15:22 zero2w-01.jwillikers.io rpm-ostree[1806]: In idle state; will auto-exit in 60 seconds
Mar 26 12:15:22 zero2w-01.jwillikers.io systemd[1]: Started rpm-ostreed.service - rpm-ostree System Management Daemon.
Mar 26 12:15:22 zero2w-01.jwillikers.io rpm-ostree[1806]: Allowing active client :1.33 (uid 0)
Mar 26 12:15:22 zero2w-01.jwillikers.io rpm-ostree[1806]: client(id:cli dbus:1.33 unit:session-4.scope uid:0) added; new total=1
Mar 26 12:15:22 zero2w-01.jwillikers.io rpm-ostree[1806]: Locked sysroot
Mar 26 12:15:22 zero2w-01.jwillikers.io rpm-ostree[1806]: Initiated txn PkgChange for client(id:cli dbus:1.33 unit:session-4.scope uid:0): /org/projectatomic/rpmostree1/fedora_iot
Mar 26 12:15:22 zero2w-01.jwillikers.io rpm-ostree[1806]: Process [pid: 1798 uid: 0 unit: session-4.scope] connected to transaction progress
Mar 26 12:15:23 zero2w-01.jwillikers.io rpm-ostree[1806]: Librepo version: 1.14.2 with CURL_GLOBAL_ACK_EINTR support (libcurl/7.81.0 OpenSSL/3.0.2 zlib/1.2.11 brotli/1.0.9 libidn2/2.3.2 libpsl/0.21.1 (+libidn2/2.3.2) libssh/0.9.6/openssl/zlib nghttp2/1.46.0 OpenLDAP/2.6.1)
Mar 26 12:15:24 zero2w-01.jwillikers.io rpm-ostree[1806]: Txn PkgChange on /org/projectatomic/rpmostree1/fedora_iot failed: Removing extensions/rpmostree/private/commit: Operation was cancelled
Mar 26 12:15:24 zero2w-01.jwillikers.io rpm-ostree[1806]: Unlocked sysroot
Mar 26 12:15:24 zero2w-01.jwillikers.io rpm-ostree[1806]: Process [pid: 1798 uid: 0 unit: session-4.scope] disconnected from transaction progress
Mar 26 12:15:24 zero2w-01.jwillikers.io rpm-ostree[1806]: client(id:cli dbus:1.33 unit:session-4.scope uid:0) vanished; remaining=0
Mar 26 12:15:24 zero2w-01.jwillikers.io rpm-ostree[1806]: In idle state; will auto-exit in 60 seconds
Mar 26 12:16:24 zero2w-01.jwillikers.io rpm-ostree[1806]: In idle state; will auto-exit in 60 seconds
Mar 26 12:16:24 zero2w-01.jwillikers.io systemd[1]: rpm-ostreed.service: Deactivated successfully.
Mar 26 12:16:24 zero2w-01.jwillikers.io systemd[1]: rpm-ostreed.service: Consumed 1.476s CPU time.
Mar 26 12:18:23 zero2w-01.jwillikers.io systemd[1]: Starting rpm-ostreed.service - rpm-ostree System Management Daemon...
Mar 26 12:18:24 zero2w-01.jwillikers.io rpm-ostree[2024]: Reading config file '/etc/rpm-ostreed.conf'
Mar 26 12:18:24 zero2w-01.jwillikers.io rpm-ostree[2024]: In idle state; will auto-exit in 64 seconds
Mar 26 12:18:24 zero2w-01.jwillikers.io systemd[1]: Started rpm-ostreed.service - rpm-ostree System Management Daemon.
Mar 26 12:18:24 zero2w-01.jwillikers.io rpm-ostree[2024]: Allowing active client :1.42 (uid 0)
Mar 26 12:18:24 zero2w-01.jwillikers.io rpm-ostree[2024]: client(id:cli dbus:1.42 unit:session-5.scope uid:0) added; new total=1
Mar 26 12:18:24 zero2w-01.jwillikers.io rpm-ostree[2024]: Locked sysroot
Mar 26 12:18:24 zero2w-01.jwillikers.io rpm-ostree[2024]: Initiated txn PkgChange for client(id:cli dbus:1.42 unit:session-5.scope uid:0): /org/projectatomic/rpmostree1/fedora_iot
Mar 26 12:18:24 zero2w-01.jwillikers.io rpm-ostree[2024]: Process [pid: 2017 uid: 0 unit: session-5.scope] connected to transaction progress
Mar 26 12:18:24 zero2w-01.jwillikers.io rpm-ostree[2024]: Librepo version: 1.14.2 with CURL_GLOBAL_ACK_EINTR support (libcurl/7.81.0 OpenSSL/3.0.2 zlib/1.2.11 brotli/1.0.9 libidn2/2.3.2 libpsl/0.21.1 (+libidn2/2.3.2) libssh/0.9.6/openssl/zlib nghttp2/1.46.0 OpenLDAP/2.6.1)
Mar 26 12:19:35 zero2w-01.jwillikers.io systemd[1]: rpm-ostreed.service: A process of this unit has been killed by the OOM killer.
Mar 26 12:19:35 zero2w-01.jwillikers.io systemd[1]: rpm-ostreed.service: Main process exited, code=killed, status=9/KILL
Mar 26 12:19:35 zero2w-01.jwillikers.io systemd[1]: rpm-ostreed.service: Failed with result 'oom-kill'.
Mar 26 12:19:35 zero2w-01.jwillikers.io systemd[1]: rpm-ostreed.service: Consumed 33.281s CPU time.
```
TBH I don't think this is rpm-ostree as I'm also seeing things die on a traditional rpm based system when trying to update a Zero2W. I think something else has changed, I think it might be a libdnf change or something that both dnf and rpm-ostree link against.
(In reply to Peter Robinson from comment #1)
> TBH I don't think this is rpm-ostree as I'm also seeing things die on a
> traditional rpm based system when trying to update a Zero2W. I think
> something else has changed, I think it might be a libdnf change or something
> that both dnf and rpm-ostree link against.

Yes, I just tried Fedora Minimal 36. `dnf upgrade` is also killed by systemd-oomd.
Hello, the Raspberry Pi Zero 2 W has 512MB of RAM. This is way below the Fedora minimal spec of 2GB. Your process is being killed because you've run out of memory. I'm not saying Fedora shouldn't work on these minimal platforms, but it's a fact that dnf eats several hundreds of megabytes when reading the repository data into memory, so that's already pushing it. The amount of memory is obviously directly tied to the size of the repositories, so what likely changed is just that the repos have grown over time.

I'm not entirely sure if rpm-ostree changes something in the grand scheme of things but I don't think it does.

So it'd be best to investigate the repository size (possibly see what caused the increase) and see if it could be reduced.

There's a possible optimization on the dnf side that we are planning for the next version of dnf (dnf 5): not always loading the rpm filelists, which should reduce the footprint. But that's still somewhat far in the future (the first release is aimed at Fedora 38 right now), and it is a bit problematic because the filelists are used by dependency resolution (though that should really be migrated away from).
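As a starting point for investigating repository size, here is a minimal sketch that reads the compressed and uncompressed sizes of each metadata file from a repository's `repomd.xml`. The sample document and its numbers are made up for illustration; only the element names (`data`, `size`, `open-size`) and the namespace follow the rpm-md format. Uncompressed metadata size is only a rough proxy for the memory needed to load it.

```python
import xml.etree.ElementTree as ET

# Namespace used by rpm-md repomd.xml files.
NS = {"repo": "http://linux.duke.edu/metadata/repo"}

# Hypothetical repomd.xml excerpt; the sizes are invented for illustration.
SAMPLE_REPOMD = """
<repomd xmlns="http://linux.duke.edu/metadata/repo">
  <data type="primary">
    <size>1000</size>
    <open-size>8000</open-size>
  </data>
  <data type="filelists">
    <size>3000</size>
    <open-size>40000</open-size>
  </data>
</repomd>
"""


def metadata_sizes(repomd_text):
    """Map each metadata type to (compressed bytes, uncompressed bytes)."""
    root = ET.fromstring(repomd_text.strip())
    sizes = {}
    for data in root.findall("repo:data", NS):
        compressed = int(data.findtext("repo:size", default="0", namespaces=NS))
        opened = int(data.findtext("repo:open-size", default="0", namespaces=NS))
        sizes[data.get("type")] = (compressed, opened)
    return sizes


for kind, (compressed, opened) in metadata_sizes(SAMPLE_REPOMD).items():
    print(f"{kind}: {compressed} bytes compressed, {opened} bytes uncompressed")
```

Running this against the real `repomd.xml` of each enabled repository, for an older and a newer snapshot, would show whether the filelists or primary metadata grew.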
(In reply to Lukáš Hrázký from comment #4)
> Hello, the Raspberry Pi Zero 2 W has 512MB of RAM. This is way below the
> Fedora minimal spec of 2GB. Your process is being killed because you've run
> out of memory. I'm not saying Fedora shouldn't work on these minimal
> platforms, but it's a fact that dnf eats several hundreds of megabytes when
> reading the repository data into memory, so that's already pushing it. The
> amount of memory is obviously directly tied to the size of the repositories,
> so what likely changed is just that the repos have grown over time.

The 2GB minimum requirement for Fedora is for running anaconda and dealing with loading the selinux into memory in a RAM disk. Fedora has run just fine in even 265MB of RAM on Arm devices for some time, including dnf; this is a recent regression in memory use. In cases like cloud images and Arm images, where you're not running the anaconda installer, the memory usage is much lower. It's not uncommon to have cloud instances with just 512MB of RAM.

> I'm not entirely sure if rpm-ostree changes something in the grand scheme of
> things but I don't think it does.

I also see this problem with a traditional Fedora on the Zero2W.

> So it'd be best to investigate the repository size (possibly see what caused
> the increase) and see if it could be reduced.

This was working fine until the last update of dnf.
What are the good and bad versions then? Could you make a comparison of the memory footprint using the exact same repositories?
The last time dnf worked on my rpi-zero2w the following was upgraded:

[root@rpi-zero2w ~]# dnf upgrade --exclude=grub2-tools-extra --refresh
Fedora 36 -aarch64                               13 kB/s |  12 kB     00:00
Fedora 36 openh264 (From Cisco) -aarch64        1.0 kB/s | 990  B     00:00
Fedora 36 -aarch64 - Updates                    9.3 kB/s |  19 kB     00:02
Fedora 36 -aarch64 - Test Updates                13 kB/s |  13 kB     00:00
Dependencies resolved.
===========================================================================================
 Package                              Architecture  Version             Repository      Size
===========================================================================================
Upgrading:
 NetworkManager                       aarch64       1:1.36.2-1.fc36     updates-testing 1.9 M
 NetworkManager-initscripts-updown    noarch        1:1.36.2-1.fc36     updates-testing  14 k
 NetworkManager-libnm                 aarch64       1:1.36.2-1.fc36     updates-testing 1.6 M
 NetworkManager-wifi                  aarch64       1:1.36.2-1.fc36     updates-testing 118 k
 bind-libs                            aarch64       32:9.16.27-1.fc36   updates-testing 1.2 M
 bind-license                         noarch        32:9.16.27-1.fc36   updates-testing  16 k
 bind-utils                           aarch64       32:9.16.27-1.fc36   updates-testing 206 k
 curl                                 aarch64       7.82.0-2.fc36       updates-testing 305 k
 dnf                                  noarch        4.11.1-1.fc36       updates-testing 454 k
 dnf-data                             noarch        4.11.1-1.fc36       updates-testing  42 k
 dnf-plugins-core                     noarch        4.1.0-1.fc36        updates-testing  34 k
 libcurl                              aarch64       7.82.0-2.fc36       updates-testing 294 k
 libdnf                               aarch64       0.66.0-1.fc36       updates-testing 613 k
 microdnf                             aarch64       3.8.1-1.fc36        updates-testing  48 k
 openssl-libs                         aarch64       1:3.0.2-1.fc36      updates-testing 2.0 M
 python-unversioned-command           noarch        3.10.3-1.fc36       updates-testing  10 k
 python3                              aarch64       3.10.3-1.fc36       updates-testing  27 k
 python3-dnf                          noarch        4.11.1-1.fc36       updates-testing 414 k
 python3-dnf-plugins-core             noarch        4.1.0-1.fc36        updates-testing 220 k
 python3-hawkey                       aarch64       0.66.0-1.fc36       updates-testing 101 k
 python3-libdnf                       aarch64       0.66.0-1.fc36       updates-testing 743 k
 python3-libs                         aarch64       3.10.3-1.fc36       updates-testing 7.3 M
 systemd                              aarch64       250.3-8.fc36        updates-testing 4.1 M
 systemd-libs                         aarch64       250.3-8.fc36        updates-testing 586 k
 systemd-networkd                     aarch64       250.3-8.fc36        updates-testing 516 k
 systemd-oomd-defaults                noarch        250.3-8.fc36        updates-testing  27 k
 systemd-pam                          aarch64       250.3-8.fc36        updates-testing 323 k
 systemd-resolved                     aarch64       250.3-8.fc36        updates-testing 258 k
 systemd-udev                         aarch64       250.3-8.fc36        updates-testing 1.8 M
 vim-data                             noarch        2:8.2.4579-1.fc36   updates-testing  28 k
 vim-minimal                          aarch64       2:8.2.4579-1.fc36   updates-testing 696 k
 wget                                 aarch64       1.21.3-1.fc36       updates-testing 771 k
 yum                                  noarch        4.11.1-1.fc36       updates-testing  40 k
Installing dependencies:
 NetworkManager-initscripts-ifcfg-rh  aarch64       1:1.36.2-1.fc36     updates-testing 113 k

Transaction Summary
===========================================================================================
Install   1 Package
Upgrade  33 Packages

Total download size: 27 M
Is this ok [y/N]:
Proposed as a Blocker for 36-final by Fedora user jwillikers using the blocker tracking app because:

Basic Release Criterion: https://fedoraproject.org/wiki/Basic_Release_Criteria#Installing.2C_removing_and_updating_software
"The installed system must be able appropriately to install, remove, and update software with the default console tool for the relevant software type (e.g. default console package manager). This includes downloading of packages to be installed/updated."

Basic Release Criterion (Fedora IoT): https://fedoraproject.org/wiki/Basic_Release_Criteria#rpm-ostree_requirements
"It must be possible to install additional software with the rpm-ostree install command. Software installation must also include dependencies where necessary and installed software should provide the intended functionality."

This bug makes it impossible to install additional software via DNF or rpm-ostree on any system with 1/2 GiB of RAM, which is common for Arm devices and cloud images.
Discussed during the 2022-04-04 blocker review meeting: [1]

The decision to classify this bug as a RejectedBlocker was made:

"The RPi Zero 2W is explicitly stated that it is not currently supported by Fedora ARM, so we reject this bug as a blocker."

[1] https://meetbot-raw.fedoraproject.org/fedora-blocker-review/2022-04-04/f36-blocker-review.2022-04-04-16.00.log.html
A test in an F36 container on x86_64:

# dnf install time
...
# dnf rq --installed dnf
dnf-0:4.10.0-2.fc36.noarch
# dnf clean all
36 files removed
# /usr/bin/time -v dnf upgrade
...
Command exited with non-zero status 1
    Command being timed: "dnf upgrade"
    User time (seconds): 25.46
    System time (seconds): 1.47
    Percent of CPU this job got: 52%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:50.83
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 523572    <<<<<< the total memory consumed: 523MB
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 253
    Minor (reclaiming a frame) page faults: 347318
    Voluntary context switches: 12644
    Involuntary context switches: 378
    Swaps: 0
    File system inputs: 7672
    File system outputs: 625976
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096
    Exit status: 1

# dnf upgrade dnf
...
# dnf rq --installed dnf
dnf-0:4.11.1-2.fc36.noarch
# dnf clean all
36 files removed
# /usr/bin/time -v dnf upgrade
...
Command exited with non-zero status 1
    Command being timed: "dnf upgrade"
    User time (seconds): 26.40
    System time (seconds): 1.48
    Percent of CPU this job got: 72%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:38.49
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 533084    <<<<<< the total memory consumed: 533MB
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 245
    Minor (reclaiming a frame) page faults: 329871
    Voluntary context switches: 12366
    Involuntary context switches: 280
    Swaps: 0
    File system inputs: 32
    File system outputs: 564184
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096
    Exit status: 1

We can see an increase of about 10MB in memory consumption, but when I ran the second part again it was 517MB, which is actually lower. I see no significant increase in memory consumption between the two versions, but feel free to provide a similar test in your environment to show that there is one in your case.

Unfortunately the older versions of the packages are no longer in the F36 repos; I downloaded them manually from here:
https://koji.fedoraproject.org/koji/buildinfo?buildID=1881803
https://koji.fedoraproject.org/koji/buildinfo?buildID=1847273
https://koji.fedoraproject.org/koji/buildinfo?buildID=1847272

Again, the memory consumption is indeed high and we'd like dnf to work on these low-spec devices, but it is what it is right now. If it wasn't clear originally, the memory is actually consumed by libsolv, which is what loads the repository data into memory. It is far from trivial to fix, but we'd welcome any contributions in this space.
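For anyone repeating this comparison, a small sketch that pulls the peak RSS out of two saved `/usr/bin/time -v` outputs and reports the difference may help. The function names are illustrative, and the two input strings reuse the figures from the runs above.

```python
import re

# Matches the peak-RSS line printed by GNU time's verbose mode.
_RSS_RE = re.compile(r"Maximum resident set size \(kbytes\):\s*(\d+)")


def peak_rss_kb(time_output):
    """Return the peak RSS in kB found in `/usr/bin/time -v` output."""
    match = _RSS_RE.search(time_output)
    if match is None:
        raise ValueError("no 'Maximum resident set size' line found")
    return int(match.group(1))


def rss_delta(before, after):
    """Return (difference in kB, difference in percent) between two runs."""
    old, new = peak_rss_kb(before), peak_rss_kb(after)
    return new - old, (new - old) / old * 100


# Figures from the dnf 4.10.0 and dnf 4.11.1 runs in this comment.
old_run = "Maximum resident set size (kbytes): 523572"
new_run = "Maximum resident set size (kbytes): 533084"

delta_kb, delta_pct = rss_delta(old_run, new_run)
print(f"{delta_kb} kB ({delta_pct:.1f}%)")  # prints "9512 kB (1.8%)"
```

A difference of under 2% is within the run-to-run variation seen here, since a repeat of the second run measured lower than the first.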
Does openSUSE organize their repositories differently? I'm a little confused about why zypper doesn't have this problem while DNF does, since they both use libsolv. Zypper, however, has much lower memory usage. The following is the performance of using zypper to install a package on openSUSE Tumbleweed on the Raspberry Pi Zero 2 W.

# /usr/bin/time -v zypper install podman
Loading repository data...
Reading installed packages...
Resolving package dependencies...

The following 3 recommended packages were automatically selected:
  cni-plugins criu podman-cni-config

The following 23 NEW packages are going to be installed:
  catatonit cni cni-plugins conmon criu fuse-overlayfs iptables libbsd0 libcontainers-common libfuse3-3 libip6tc2 libnet9 libnetfilter_conntrack3 libnfnetlink0 libprotobuf-c1 libslirp0 podman podman-cni-config python38-ipaddr python38-protobuf runc slirp4netns xtables-plugins

23 new packages to install.
...
Checking for file conflicts: .............................................[done]
...
Executing %posttrans scripts .............................................[done]
    Command being timed: "zypper install podman"
    User time (seconds): 23.00
    System time (seconds): 8.51
    Percent of CPU this job got: 38%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 1:21.70
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 110612
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 1757
    Minor (reclaiming a frame) page faults: 104141
    Voluntary context switches: 17447
    Involuntary context switches: 3463
    Swaps: 0
    File system inputs: 39024
    File system outputs: 434432
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096
    Exit status: 0
So (as you've touched on), you have a whole different set of repositories there. They contain a different number of packages, which may be the most significant factor.

Second, I don't know the details of how zypper works, but did you clean its cache first? With how dnf works, it downloads the repo metadata, loads it into memory (that's the memory peak) and then writes libsolv cache files for it. On another run, it only loads the cache files and the memory consumption is much lower due to some optimizations. If I re-run dnf with loading from the libsolv cache I get:

    Maximum resident set size (kbytes): 160308

Another thing that could be causing this is that, from what I've heard, zypper doesn't use file provides for dependency resolution (they've gotten rid of it, as I think we also should), so it may not be loading the file lists, and that can make a big difference in memory consumption.

It'd be great if someone did a more thorough analysis and came to some conclusions (there may also be savings to be made on the side of the repo metadata as it's generated). Right now we have higher priorities with dnf 5, which is also where e.g. a change to not use file lists would be done anyway; we most likely wouldn't be doing it for dnf 4.
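The cold-cache versus warm-cache difference described above can be measured without `/usr/bin/time`. The following is a sketch, assuming Linux (where `ru_maxrss` is reported in kilobytes) and Python 3.9 or newer. It runs the same command twice and prints the peak RSS of each run separately; the helper name is illustrative.

```python
import os
import sys


def run_and_measure(argv):
    """Run argv as a child process; return (exit code, peak RSS in kB).

    os.wait4() reports resource usage for that one child only, so
    consecutive runs can be compared. On Linux ru_maxrss is in kB.
    """
    pid = os.posix_spawnp(argv[0], argv, os.environ)
    _, status, usage = os.wait4(pid, 0)
    return os.waitstatus_to_exitcode(status), usage.ru_maxrss


if __name__ == "__main__":
    # Command to measure; defaults to a trivial Python child if none is given.
    command = sys.argv[1:] or [sys.executable, "-c", "pass"]
    for label in ("first run", "second run"):
        code, rss_kb = run_and_measure(command)
        print(f"{label}: exit={code} peak_rss={rss_kb} kB")
```

Saved under a hypothetical name such as `measure.py`, this could be run as `python3 measure.py dnf upgrade --assumeno` right after `dnf clean all`, so that the first run builds the libsolv cache and the second run loads from it.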
If I remove all the existing repository metadata and the cache before installing a package, there is a slight increase in the maximum memory usage, but it's still close to 100 MiB. The command output is shown below. Running the `zypper update` and `zypper refresh` commands separately shows similar results.

If I find some more time I may be able to profile DNF and compare the Fedora / openSUSE repositories. It would be great to find someone with some working knowledge of Zypper. I'd be happy to see this situation improved for DNF 5. I appreciate all the feedback on this, thanks!

# zypper clean --all
All repositories have been cleaned up.
# /usr/bin/time -v zypper install podman
Retrieving repository 'openSUSE-Tumbleweed-Oss' metadata .................[done]
Building repository 'openSUSE-Tumbleweed-Oss' cache ......................[done]
Retrieving repository 'openSUSE-Tumbleweed-Update' metadata ..............[done]
Building repository 'openSUSE-Tumbleweed-Update' cache ...................[done]
Loading repository data...
Reading installed packages...
Resolving package dependencies...

The following 3 recommended packages were automatically selected:
  cni-plugins criu podman-cni-config

The following 23 NEW packages are going to be installed:
  catatonit cni cni-plugins conmon criu fuse-overlayfs iptables libbsd0 libcontainers-common libfuse3-3 libip6tc2 libnet9 libnetfilter_conntrack3 libnfnetlink0 libprotobuf-c1 libslirp0 podman podman-cni-config python38-ipaddr python38-protobuf runc slirp4netns xtables-plugins

23 new packages to install.
Overall download size: 27.4 MiB. Already cached: 0 B. After the operation, additional 142.8 MiB will be used.
Continue? [y/n/v/...? shows all options] (y): y
...
Checking for file conflicts: .............................................[done]
...
Executing %posttrans scripts .............................................[done]
    Command being timed: "zypper install podman"
    User time (seconds): 51.63
    System time (seconds): 10.17
    Percent of CPU this job got: 52%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 1:56.78
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 111044
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 1765
    Minor (reclaiming a frame) page faults: 157141
    Voluntary context switches: 37330
    Involuntary context switches: 4680
    Swaps: 0
    File system inputs: 40168
    File system outputs: 499456
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096
    Exit status: 0

Refresh:

# zypper clean --all
All repositories have been cleaned up.
# /usr/bin/time -v zypper refresh
Retrieving repository 'openSUSE-Tumbleweed-Oss' metadata .................]
Building repository 'openSUSE-Tumbleweed-Oss' cache ......................]
Retrieving repository 'openSUSE-Tumbleweed-Update' metadata ..............]
Building repository 'openSUSE-Tumbleweed-Update' cache ...................]
All repositories have been refreshed.
    Command being timed: "zypper refresh"
    User time (seconds): 34.86
    System time (seconds): 3.38
    Percent of CPU this job got: 80%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 0:47.44
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 82956
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 4
    Minor (reclaiming a frame) page faults: 69197
    Voluntary context switches: 15830
    Involuntary context switches: 1683
    Swaps: 0
    File system inputs: 1400
    File system outputs: 99928
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096
    Exit status: 0
(In reply to Lukáš Hrázký from comment #12)
>
> Another thing that could be causing this is that from what I've heard,
> zypper doesn't use file provides for dependency resolution (they've gotten
> rid of it, as we also should I think), so it may not be loading the file
> lists and that can make a big difference in memory consumption.
>

Zypper supports file provides just fine; it just doesn't load filelists.xml by default, so it's restricted to the ones in primary.xml. It will load filelists.xml on demand if a non-primary.xml file dependency is detected, though.
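The on-demand behaviour can be sketched roughly as follows. This is not Zypper's actual code. It assumes the usual createrepo convention for which file paths are listed in primary.xml (anything under `/etc/`, any path containing `bin/`, and `/usr/lib/sendmail`), and both function names are illustrative.

```python
def is_primary_file(path):
    """Assumed createrepo rule for file paths that appear in primary.xml."""
    return (
        path.startswith("/etc/")
        or "bin/" in path
        or path == "/usr/lib/sendmail"
    )


def needs_filelists(requires):
    """True if any file dependency cannot be resolved from primary.xml alone.

    `requires` is an iterable of dependency strings; those starting with
    "/" are treated as file dependencies, the rest as ordinary provides.
    """
    return any(
        dep.startswith("/") and not is_primary_file(dep) for dep in requires
    )


# Only the second set would force loading filelists.xml.
print(needs_filelists(["/bin/sh", "/usr/sbin/useradd", "libc.so.6"]))
print(needs_filelists(["/bin/sh", "/usr/share/fonts/example.ttf"]))
```

Under this rule the large filelists metadata stays unloaded for the common case, where file dependencies point at binaries or configuration files.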
In the next major version of the package manager (the DNF5 project), we plan to make it possible to disable loading of filelists. Of course, that will disable some functionality in DNF, so some people will suffer. But that is in the future: Fedora 39.
I am proposing to close it as deferred.