Bug 1623547
Summary: | anaconda too resource hungry | ||
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Dan Horák <dan> |
Component: | pygobject3 | Assignee: | Colin Walters <walters> |
Status: | CLOSED ERRATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
Severity: | unspecified | Docs Contact: | |
Priority: | unspecified | ||
Version: | 29 | CC: | anaconda-maint-list, awilliam, bugproxy, dan, hannsj_uhl, icq, jbicha, john.j5live, jonathan, jstodola, kellin, kevin, pbrobinson, pwhalen, vanmeeuwen+fedora, vponcova, walters, wwoods |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | s390x | ||
OS: | Linux | ||
See Also: | https://gitlab.gnome.org/GNOME/pygobject/issues/247 | ||
Whiteboard: | AcceptedFreezeException | ||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2018-09-18 15:35:36 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | |||
Bug Blocks: | 245418, 467765, 1517012 |
Description
Dan Horák
2018-08-29 15:15:07 UTC
F-28 installs without any issue on the guest. Seeing the same on ARM, the initrd uncompressed comes in around 1.5Gb I have also hit this testing on arm, the F29 container images seem to be failing as a result where the host has 2G RAM. Fedora-29-20180827.n.0 install.img is 394 MB Fedora 28 GA install.img is 412 MB Doesn't zswap cause the problem? screenshot of top from a started installation top - 09:10:39 up 15 min, 1 user, load average: 16.88, 14.45, 8.35 Tasks: 122 total, 3 running, 119 sleeping, 0 stopped, 0 zombie %Cpu(s): 0.0 us, 26.2 sy, 0.0 ni, 4.0 id, 67.9 wa, 0.4 hi, 0.0 si, 1.5 st MiB Mem : 1997.3 total, 10.0 free, 1409.8 used, 577.6 buff/cache MiB Swap: 0.0 total, 0.0 free, 0.0 used. 27.2 avail Mem PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 2086 root 20 0 411820 170396 2764 D 0.0 8.3 0:14.28 python3 -m pyanaconda.modules.localization 2094 root 20 0 374884 132680 1876 D 0.0 6.5 0:12.24 python3 -m pyanaconda.modules.services 2085 root 20 0 373612 131764 2224 D 0.0 6.4 0:12.98 python3 -m pyanaconda.modules.security 2093 root 20 0 379624 130408 2780 D 0.2 6.4 0:11.78 python3 -m pyanaconda.modules.storage 2077 root 20 0 378272 129556 2588 D 0.0 6.3 0:13.30 python3 -m pyanaconda.modules.network 2084 root 20 0 363964 121832 2100 D 0.0 6.0 0:11.63 python3 -m pyanaconda.modules.timezone 2083 root 20 0 359468 117232 1848 D 0.0 5.7 0:11.19 python3 -m pyanaconda.modules.users 2082 root 20 0 345544 104360 2896 D 0.0 5.1 0:12.46 python3 -m pyanaconda.modules.payload 2095 root 20 0 342688 101268 2816 D 0.2 5.0 0:12.04 python3 -m pyanaconda.dbus_addons.baz 1926 root 20 0 360044 89196 24776 S 0.0 4.4 0:03.05 /usr/sbin/rsyslogd -n 2015 root 20 0 322360 56760 3056 D 0.2 2.8 0:03.55 /usr/bin/python3 /usr/sbin/anaconda 2066 root 20 0 297620 55748 2284 D 0.2 2.7 0:08.82 python3 -m pyanaconda.modules.boss 1735 root 20 0 69576 25996 24904 D 0.2 1.3 0:05.54 /usr/lib/systemd/systemd-journald 1866 root rt 0 280212 18036 8000 S 0.0 0.9 0:00.06 /sbin/multipathd -d -s 1 root 20 0 103212 7628 3108 D 0.2 0.4 0:06.48 /usr/lib/systemd/systemd --switched-root --system --deserialize 32 1983 polkitd 20 0 1975044 6900 2404 S 0.0 0.3 0:00.10 /usr/lib/polkit-1/polkitd --no-debug 1777 root 20 0 28552 6300 2836 S 0.0 0.3 0:00.14 /usr/lib/systemd/systemd-udevd 2020 root 20 0 394288 5868 2960 S 0.0 0.3 0:00.10 /usr/libexec/udisks2/udisksd 1927 root 20 0 17548 5032 2848 S 0.0 0.2 0:00.13 /usr/lib/systemd/systemd-logind 1845 root 20 0 368384 4772 2480 S 0.0 0.2 0:00.06 /usr/sbin/lvmetad -f -t 3600 1943 root 20 0 251120 4220 1624 S 0.0 0.2 0:00.46 /usr/sbin/NetworkManager --no-daemon 2062 root 20 0 15580 3660 3016 S 0.0 0.2 0:00.02 dbus-daemon --print-address --syslog --config-file=/usr/share/anaconda/dbus/anaconda-bus.conf 2141 root 20 0 11416 3660 2760 S 0.0 0.2 0:00.04 sshd: root [priv] 2014 root 20 0 11416 3464 2624 S 0.0 0.2 0:00.00 /usr/sbin/sshd -D -f /etc/ssh/sshd_config.anaconda 2168 root 20 0 7044 3456 2700 R 0.2 0.2 0:01.22 top 1947 dbus 20 0 15872 3324 2424 D 0.0 0.2 0:00.36 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation --syslog-only 2151 root 20 0 11764 3228 2284 D 0.2 0.2 0:00.82 sshd: root@pts/0 1923 root 20 0 14988 2892 2080 S 0.0 0.1 0:01.76 /sbin/rngd -f 2152 root 20 0 3788 2036 1612 S 0.0 0.1 0:00.00 -bash 1937 root 20 0 3788 2032 1612 S 0.0 0.1 0:00.00 /bin/bash --login 2079 root 20 0 214132 1684 1444 S 0.0 0.1 0:00.00 /usr/bin/bash /usr/libexec/anaconda/start-module pyanaconda.modules.security 2081 root 20 0 214132 1684 1444 S 0.0 0.1 0:00.00 /usr/bin/bash /usr/libexec/anaconda/start-module pyanaconda.modules.payload 2091 root 20 0 214132 1684 1444 S 0.0 0.1 0:00.00 /usr/bin/bash /usr/libexec/anaconda/start-module pyanaconda.modules.services 2065 root 20 0 214132 1680 1444 S 0.0 0.1 0:00.00 /usr/bin/bash /usr/libexec/anaconda/start-module pyanaconda.modules.boss 2070 root 20 0 214132 1680 1444 S 0.0 0.1 0:00.00 /usr/bin/bash /usr/libexec/anaconda/start-module pyanaconda.modules.timezone 2073 root 20 0 214132 1680 1444 S 0.0 0.1 0:00.00 /usr/bin/bash /usr/libexec/anaconda/start-module pyanaconda.modules.network 2074 root 20 0 214132 1680 1444 S 0.0 0.1 0:00.00 /usr/bin/bash /usr/libexec/anaconda/start-module pyanaconda.modules.localization 2080 root 20 0 214132 1680 1444 S 0.0 0.1 0:00.00 /usr/bin/bash /usr/libexec/anaconda/start-module pyanaconda.modules.users 2090 root 20 0 214132 1680 1444 S 0.0 0.1 0:00.00 /usr/bin/bash /usr/libexec/anaconda/start-module pyanaconda.modules.storage 2092 root 20 0 214132 1680 1444 S 0.0 0.1 0:00.00 /usr/bin/bash /usr/libexec/anaconda/start-module pyanaconda.dbus_addons.baz 2061 root 20 0 2200 296 168 S 0.0 0.0 0:00.00 /sbin/auditd 2 root 20 0 0 0 0 S 0.0 0.0 0:00.00 [kthreadd] 3 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 [rcu_gp] 4 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 [rcu_par_gp] 5 root 20 0 0 0 0 I 0.2 0.0 0:00.75 [kworker/0:0-events] 6 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 [kworker/0:0H-kblockd] 7 root 20 0 0 0 0 I 0.0 0.0 0:00.58 [kworker/u128:0-events_unbound] 8 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 [mm_percpu_wq] No real change with 3GB, both were with inst.zram=off. Load reaching 20 is not sane ... top - 10:05:00 up 20 min, 1 user, load average: 19.37, 18.00, 11.84 Tasks: 126 total, 3 running, 123 sleeping, 0 stopped, 0 zombie %Cpu(s): 0.0 us, 26.0 sy, 0.0 ni, 13.1 id, 59.0 wa, 0.3 hi, 0.0 si, 1.5 st MiB Mem : 3005.3 total, 13.9 free, 2367.4 used, 624.1 buff/cache MiB Swap: 0.0 total, 0.0 free, 0.0 used. 24.1 avail Mem PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 1410 root 0 -20 0 0 0 R 98.6 0.0 14:10.05 loop1 50 root 20 0 0 0 0 S 3.8 0.0 0:33.11 kswapd0 1405 root 0 -20 0 0 0 R 1.1 0.0 0:09.10 loop0 ... The loop1 process (decompressing the install.img?) consuming 1 whole CPU is also not good ... and no change with 4GB, I wonder if there is a kernel issue with the squashfs install.img top - 11:09:24 up 38 min, 1 user, load average: 19.75, 20.26, 17.65 Tasks: 123 total, 6 running, 116 sleeping, 0 stopped, 1 zombie %Cpu(s): 0.0 us, 26.8 sy, 0.0 ni, 11.4 id, 59.7 wa, 0.4 hi, 0.0 si, 1.8 st MiB Mem : 4013.3 total, 19.7 free, 3266.7 used, 727.0 buff/cache MiB Swap: 0.0 total, 0.0 free, 0.0 used. 20.8 avail Mem PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 1419 root 0 -20 0 0 0 R 97.0 0.0 28:23.56 loop1 50 root 20 0 0 0 0 R 6.9 0.0 1:41.26 kswapd0 1413 root 0 -20 0 0 0 R 1.1 0.0 0:19.78 loop0 2334 root 20 0 535908 14988 2900 S 0.2 0.4 0:27.02 lvmdbusd One hint from mkolman - the memory consumption of the python3 processes managing the anaconda panels is much higher then in other environments. The overloaded "loop1" process might be result of a close-to-OOM situation. Hi, Anaconda runs in multiple processes since Fedora 28 and I don't think there was any major change in Fedora 29 that could cause this problem. Also it looks like only ARM and s390x are affected. I would reassign the bug to someone else, but I don't know to whom. Seems we have a reproducer - install f28 and update to f29 (I've used fedora-upgrade) - dnf install anaconda - truncate -s 4G /mnt/image - anaconda <--vnc|--text> --image /mnt/image --vnc needs metacity installed too narrowed reproducer - export DBUS_STARTER_ADDRESS="$DBUS_SESSION_BUS_ADDRESS" - run "python3.7 -m pyanaconda.modules.storage" - after a short while DEBUG:anaconda.dbus.connection:Publishing an object at /org/fedoraproject/Anaconda/Modules/Storage/DiskInitialization. Traceback (most recent call last): File "/usr/lib64/python3.7/site-packages/pyanaconda/modules/storage/storage.py", line 80, in publish kickstart_module.publish() File "/usr/lib64/python3.7/site-packages/pyanaconda/modules/storage/disk_initialization/initialization.py", line 65, in publish DiskInitializationInterface(self)) File "/usr/lib64/python3.7/site-packages/pyanaconda/dbus/connection.py", line 99, in publish_object reg = self.connection.register_object(object_path, obj, None) File "/usr/lib/python3.7/site-packages/pydbus/registration.py", line 162, in register_object return ObjectRegistration(self, path, interfaces, wrapper, own_wrapper=True) File "/usr/lib/python3.7/site-packages/pydbus/registration.py", line 136, in __init__ ids = [bus.con.register_object(path, interface, wrapper.call_method, None, None) for interface in interfaces] File "/usr/lib/python3.7/site-packages/pydbus/registration.py", line 136, in <listcomp> ids = [bus.con.register_object(path, interface, wrapper.call_method, None, None) for interface in interfaces] gi.repository.GLib.Error: g-io-error-quark: An object is already exported for the interface org.fedoraproject.Anaconda.Modules.Storage.DiskInitialization at /org/fedoraproject/Anaconda/Modules/Storage/DiskInitialization (2) DEBUG:anaconda.dbus.connection:Publishing an object at /org/fedoraproject/Anaconda/Modules/Storage/DiskInitialization. Traceback (most recent call last): File "/usr/lib64/python3.7/site-packages/pyanaconda/modules/storage/storage.py", line 80, in publish starts to appear and increasing memory consumption is visible in top output for python3 process The problem is that the publish method is called repeatedly, but it should be called only once. We use GLib.idle_add to run the publish method once the main loop is started. It looks like the function behaves differently on s390x. The following script should print "Running" and "Processing" and wait. It works that way on x86_64, but it keeps printing "Processing" on s390x. Same for Fedora 29 and Rawhide. Reproducer: from gi.repository import GLib loop = GLib.MainLoop() def f(): print("Processing") return False GLib.idle_add(f) print("Running") loop.run() Tested on Fedora 29: pygobject3-3.29.2-1.fc29 Tested on Rawhide: pygobject3-3.30.0-1.fc30 Reassigning to pygobject3. If the following C code is equivalent to the python version, then it behaves correctly meaning the python bindings are buggy, not Glib. Compiled with "gcc -Wall -O2 `pkg-config --cflags glib-2.0` `pkg-config --libs glib-2.0` -g -o g g.c" #include <glib.h> gboolean f(gpointer data) { g_print("Processing...\n"); return FALSE; } int main(void) { GMainLoop *loop = g_main_loop_new(NULL, FALSE); g_idle_add(f, NULL); g_print("Running...\n"); g_main_loop_run(loop); return 0; } I'm getting closer - it's reproduceable with F-28 and the latest python3-gobject-3.30.0-1.fc30.s390x (built locally). Next step is bisect between pygobject 3.28 and 3.30 Bisected to https://gitlab.gnome.org/GNOME/pygobject/commit/25a5b066467523547d3b3c1947ae3722c3fc38d5 as the first bad commit, based on the result of the internal mainloop test. And it's likely an endianness problem, because I get the issue on ppc64. We are unblocked now - https://gitlab.gnome.org/GNOME/pygobject/issues/247#note_316214 pygobject3-3.30.0-2.fc29 has been submitted as an update to Fedora 29. https://bodhi.fedoraproject.org/updates/FEDORA-2018-5685089f8b Proposed as a Freeze Exception for 29-beta by Fedora user sharkcz using the blocker tracking app because: Anaconda crashes without the fix with OOM making it impossible to install F-29 on s390x. +1 FE +1 FE pygobject3-3.30.0-2.fc29 has been pushed to the Fedora 29 testing repository. If problems still persist, please make note of it in this bug report. See https://fedoraproject.org/wiki/QA:Updates_Testing for instructions on how to install test updates. You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2018-5685089f8b +1 FE, that's +3, setting accepted. pygobject3-3.30.0-2.fc29 has been pushed to the Fedora 29 stable repository. If problems still persist, please make note of it in this bug report. Can someone test that Beta RC3 worked OK on s390x, then we can close this? Thanks! https://kojipkgs.fedoraproject.org/compose/29/Fedora-29-20180916.0/ There is no problem with running anaconda any more with the Beta RC3 compose. FWIW, reverting that patch looks safe to me. for the record - the revert went into upstream with https://gitlab.gnome.org/GNOME/pygobject/commit/aa6dc30a1eb25eea9587f8e96dad3caa12d02e37 (until there is a better solution) Thanks for the confirmation. |