Bug 1929856 - systemd-oomd kills anaconda in the middle of system install when installing from KDE live image with 2GB RAM
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: systemd
Version: 34
Hardware: All
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Michel Lind
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard: openqa
Depends On:
Blocks: F34BetaBlocker 1913794
Reported: 2021-02-17 19:15 UTC by Adam Williamson
Modified: 2021-02-20 00:49 UTC

Fixed In Version: systemd-oomd-defaults-247.3-3
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-02-19 08:01:40 UTC
Type: Bug
Embargoed:


Attachments
journalctl (503.20 KB, text/plain)
2021-02-17 21:55 UTC, Chris Murphy
oomctl data when invoking liveinst (12.11 KB, text/plain)
2021-02-18 00:24 UTC, Michel Lind

Description Adam Williamson 2021-02-17 19:15:01 UTC
All openQA install tests on the KDE live image failed in today's Branched. In all cases, the logs show it's because anaconda was killed by systemd-oomd.

One detail about how anaconda runs on the live images may matter, though I don't know for sure: it is launched through consolehelper, so the process starts by running `liveinst` as a *regular user*, and consolehelper machinations eventually result in anaconda running as *root*.

The system journal shows this:

Feb 17 09:27:25 localhost-live systemd-oomd[915]: Memory pressure for /user.slice/user-1000.slice/user is greater than 4 for more than 10 seconds and there was reclaim activity

The user journal shows this:

Feb 17 09:27:25 localhost-live systemd[1271]: app-liveinst-675e1db00e51458ea9b5b7462b78d4fb.scope: systemd-oomd killed 74 process(es) in this unit.
Feb 17 09:27:26 localhost-live systemd[1271]: app-liveinst-675e1db00e51458ea9b5b7462b78d4fb.scope: Succeeded.
Feb 17 09:27:26 localhost-live systemd[1271]: app-liveinst-675e1db00e51458ea9b5b7462b78d4fb.scope: Consumed 20.409s CPU time.

The test runs with 2GB of RAM. Granted that's a bit low, but it's not unusual for VMs, and the install has not failed due to memory pressure before (except for the issue with debug kernels and KASAN last month). I can't actually find our 'official' system requirements anywhere any more, but AFAIK last time I checked, 2GB of RAM was mentioned. I don't think "installer suddenly killed by oomd" is a reasonable outcome for a 2GB install attempt.

Comment 1 Adam Williamson 2021-02-17 19:16:04 UTC
Proposing as a Beta blocker as a conditional violation of "The installer must be able to complete an installation using any supported locally connected storage interface" (and all other criteria that imply successful installation), when attempted with 2GB of RAM or less from the KDE live.

Comment 2 Chris Murphy 2021-02-17 20:44:57 UTC
https://docs.fedoraproject.org/en-US/fedora/rawhide/release-notes/welcome/Hardware_Overview/#hardware_overview-specs
Minimum System Configuration
1GHz or faster processor
2GB System Memory
10GB unallocated drive space

However, there's a 'low memory installations' section that proposes both "less than 768MB of system memory" and "less than 1GB of memory".

https://getfedora.org/en/workstation/download/
* Fedora requires a minimum of 20GB disk, 2GB RAM, to install and run successfully. Double those amounts is recommended.

I think that'd be consistent with KDE too. 

If installing with 2GB isn't possible, then (a) oomd should disable itself with a suitable message, and/or (b) we need to increase the minimum memory requirements.

Comment 3 Geraldo Simião 2021-02-17 21:07:51 UTC
It installed normally with 4GB of RAM and 4 CPUs in a KVM/QEMU guest created with virt-manager.

Comment 4 Chris Murphy 2021-02-17 21:55:32 UTC
Created attachment 1757662
journalctl

$ journalctl --no-hostname -o short-monotonic --no-pager

This is from a qemu-kvm VM with 2GB of RAM. The swap (on zram) was only ~20% full at the time of the kill, so it's strictly a memory pressure kill.
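
For anyone wanting to look at the pressure numbers themselves, here is a minimal Python sketch that parses the kernel's pressure stall information (PSI) text format, which is what memory.pressure files in the cgroup tree and /proc/pressure/memory contain. The helper name parse_psi and the sample values are made up for illustration; they are not taken from this VM or from the attachments.

```python
def parse_psi(text):
    """Parse PSI text into {"some": {...}, "full": {...}}.

    avg10/avg60/avg300 are percentages (floats); total is in microseconds (int).
    """
    result = {}
    for line in text.strip().splitlines():
        kind, *fields = line.split()
        entry = {}
        for field in fields:
            key, _, value = field.partition("=")
            entry[key] = int(value) if key == "total" else float(value)
        result[kind] = entry
    return result


# Illustrative values only (hypothetical, not measured on the test VM).
SAMPLE_PSI = """\
some avg10=8.90 avg60=3.10 avg300=0.85 total=4021337
full avg10=6.20 avg60=2.05 avg300=0.55 total=2890114
"""

psi = parse_psi(SAMPLE_PSI)
print(psi["some"]["avg10"], psi["full"]["avg10"])
```

On a live system the same function could be fed the contents of a cgroup's memory.pressure file instead of SAMPLE_PSI.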

There are two kill events:

[  325.044689] systemd-oomd[993]: Memory pressure for /user.slice/user-1000.slice/user is greater than 4 for more than 10 seconds and there was reclaim activity
[  325.057782] systemd[1313]: plasma-plasmashell.service: systemd-oomd killed 69 process(es) in this unit.
[  340.059781] systemd-oomd[993]: Memory pressure for /user.slice/user-1000.slice/user is greater than 4 for more than 10 seconds and there was reclaim activity
[  340.108661] systemd[1313]: app-org.kde.korgac-autostart.service: systemd-oomd killed 65 process(es) in this unit.
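
To pull the same kill events out of a longer journal, something like the following Python sketch might help. It assumes the short-monotonic output format shown above and only the two message shapes quoted in this comment; the regexes and the parse_kill_events helper are my own and may need adjusting if the message wording differs in other systemd versions.

```python
import re

# Message shapes as they appear in the journal excerpt above.
PRESSURE_RE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\] systemd-oomd\[\d+\]: Memory pressure for "
    r"(?P<cgroup>\S+) is greater than (?P<limit>\d+) for more than "
    r"(?P<seconds>\d+) seconds"
)
KILL_RE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\] systemd\[\d+\]: (?P<unit>\S+): "
    r"systemd-oomd killed (?P<count>\d+) process\(es\) in this unit\."
)


def parse_kill_events(lines):
    """Pair each pressure message with the kill message that follows it."""
    events = []
    pending = None
    for line in lines:
        match = PRESSURE_RE.match(line)
        if match:
            pending = match
            continue
        match = KILL_RE.match(line)
        if match and pending is not None:
            events.append({
                "time": float(match["ts"]),
                "cgroup": pending["cgroup"],
                "limit": int(pending["limit"]),
                "seconds": int(pending["seconds"]),
                "unit": match["unit"],
                "killed": int(match["count"]),
            })
            pending = None
    return events


SAMPLE = [
    "[  325.044689] systemd-oomd[993]: Memory pressure for /user.slice/user-1000.slice/user is greater than 4 for more than 10 seconds and there was reclaim activity",
    "[  325.057782] systemd[1313]: plasma-plasmashell.service: systemd-oomd killed 69 process(es) in this unit.",
    "[  340.059781] systemd-oomd[993]: Memory pressure for /user.slice/user-1000.slice/user is greater than 4 for more than 10 seconds and there was reclaim activity",
    "[  340.108661] systemd[1313]: app-org.kde.korgac-autostart.service: systemd-oomd killed 65 process(es) in this unit.",
]

for event in parse_kill_events(SAMPLE):
    print(event["time"], event["unit"], event["killed"])
```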

Comment 5 Michel Lind 2021-02-18 00:24:11 UTC
Created attachment 1757672
oomctl data when invoking liveinst

cmurf provided this data, which we're using to tweak oomd's settings.

Comment 6 Michel Lind 2021-02-18 00:25:18 UTC
So the data shows memory pressure peaking at just under 9% right before oomd kills anaconda. We verified that bumping the threshold to 10% allows anaconda to complete the installation.

This change is in https://src.fedoraproject.org/rpms/systemd/pull-request/51
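
To illustrate why the threshold matters, here is a rough Python sketch of the rule in the log message ("greater than N for more than 10 seconds"). It is a simplification, not oomd's actual logic: it ignores the reclaim-activity condition and assumes one pressure sample per second. The would_kill helper and the trace values are hypothetical; the trace is shaped only to peak just under 9% and is not the data from the attachment.

```python
def would_kill(samples, limit_percent, duration_sec, interval_sec=1.0):
    """Return True if pressure stays above limit_percent for more than duration_sec.

    samples are pressure percentages taken every interval_sec seconds.
    """
    above = 0.0
    for value in samples:
        if value > limit_percent:
            above += interval_sec
            if above > duration_sec:
                return True
        else:
            # Pressure dropped back to or below the limit, so start counting again.
            above = 0.0
    return False


# Hypothetical trace, one sample per second, peaking just under 9%.
trace = [1.0, 2.5, 4.5, 5.0, 6.0, 7.0, 8.0, 8.5, 8.9, 8.7,
         8.0, 7.5, 6.5, 5.5, 4.8, 3.0]

print(would_kill(trace, limit_percent=4, duration_sec=10))   # old limit
print(would_kill(trace, limit_percent=10, duration_sec=10))  # raised limit
```

With a trace like this the 4% limit is exceeded for long enough to trigger a kill, while a 10% limit is never crossed.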

Comment 7 Michel Lind 2021-02-19 08:01:40 UTC
Should be fixed in https://bodhi.fedoraproject.org/updates/FEDORA-2021-dadbaaac54

