Bug 1941170 - Systemd-oomd very aggressive in killing apps
Summary: Systemd-oomd very aggressive in killing apps
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Fedora
Classification: Fedora
Component: systemd
Version: 34
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Assignee: Anita Zhang
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-03-20 15:49 UTC by Isaac Bernadus
Modified: 2022-06-13 19:47 UTC
23 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-05-07 10:09:30 UTC
Type: Bug


Attachments (Terms of Use)
htop (974.12 KB, image/png)
2021-04-07 14:20 UTC, Mikhail
no flags Details
system log (107 bytes, text/plain)
2021-04-07 14:21 UTC, Mikhail
no flags Details
system log (1.47 MB, text/plain)
2021-04-07 14:33 UTC, Mikhail
no flags Details
journalctl -- oomd kills (4.48 KB, text/plain)
2022-05-31 04:59 UTC, Bogdan Vitoc
no flags Details
macos force quit GUI screenshot (21.91 KB, image/jpeg)
2022-06-01 18:33 UTC, Bogdan Vitoc
no flags Details
journalctl -u systemd-oomd -g Killed | grep -v Boot (4.78 KB, text/plain)
2022-06-04 17:17 UTC, Sam
no flags Details

Description Isaac Bernadus 2021-03-20 15:49:58 UTC
Description of problem:

Systemd-oomd is very aggressive when it comes to memory management. In Fedora 33 I've been able to run quite a few apps without a problem, but in Fedora 34 apps get killed way too quickly. Here's an example of Atom getting killed by systemd-oomd:

Mar 20 22:18:34 x505za systemd-oomd[1020]: Memory pressure for /user.slice/user-1000.slice/user@1000.service is greater than 10 for more than 10 seconds and there was reclaim activity
Mar 20 22:18:34 x505za systemd[1604]: app-gnome-atom-11930.scope: systemd-oomd killed 47 process(es) in this unit.
Mar 20 22:18:36 x505za systemd[1604]: app-gnome-atom-11930.scope: Deactivated successfully.
Mar 20 22:18:36 x505za systemd[1604]: app-gnome-atom-11930.scope: Consumed 7.557s CPU time

This event is triggered when around 70-80% of my memory is filled up despite still having space in swap.

Version-Release number of selected component (if applicable): systemd 248 (v248~rc4-1.fc34)


How reproducible:
Always

Steps to Reproduce:
1. Load up a bunch of apps to fill up memory
2. Wait for systemd-oomd to trigger reclaim activity
3.

Actual results:
Apps get killed very quickly

Expected results:
Apps to run normally until memory and swap are almost full

Additional info:

System Specs:

Ryzen 3 2200u
4GB RAM
Kernel 5.11.7-300.fc34.x86_64

Comment 1 Isaac Bernadus 2021-03-20 16:44:30 UTC
It seems that if memory gets filled fast enough, systemd-oomd will even decide to kill GNOME

Comment 2 Davide Repetto 2021-03-23 14:59:38 UTC
Same problem here.
The Fedora defaults are too aggressive; they make systemd-oomd very trigger-happy (ManagedOOMMemoryPressureLimit=10% for 10 seconds).

With 4GB and zram, you can barely use anything. With 2GB it is a task-massacre all the time.

The defaults suggested in the manual (60% & 30s) still prevent excessive spinning while working way more predictably.
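For anyone wanting to try the manual's suggested values locally, a drop-in along these lines should work (the file name is my own choice; DefaultMemoryPressureLimit= and DefaultMemoryPressureDurationSec= are the documented oomd.conf keys), followed by `sudo systemctl restart systemd-oomd`:

```ini
# /etc/systemd/oomd.conf.d/99-relaxed-defaults.conf  (hypothetical file name)
[OOM]
DefaultMemoryPressureLimit=60%
DefaultMemoryPressureDurationSec=30s
```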

Comment 3 iolo 2021-03-24 22:59:27 UTC
I've noticed this too while testing Fedora Workstation 34 this week. I'll leave NetBeans or Brave running to go do something else, with maybe 4 GB of free RAM out of 8 GB total at that point. Then, some time later, I will try to go back to NetBeans or Brave, or whatever it is, only to find that it's been killed. I've never had anything like this happen before.

Comment 4 Anita Zhang 2021-03-25 07:12:19 UTC
I will work on updating the pressure defaults now that the test week results have come in. I agree that the defaults are a bit aggressive, but that's what the test week and beta were meant to iron out.

Comment 5 Anita Zhang 2021-03-30 09:04:29 UTC
I've submitted https://src.fedoraproject.org/rpms/systemd/pull-request/58# to bump pressure defaults to 50% for 20s. Hopefully these more conservative values will perform better for most people.

Comment 6 Fedora Update System 2021-03-31 09:18:33 UTC
FEDORA-2021-8595b30af3 has been submitted as an update to Fedora 34. https://bodhi.fedoraproject.org/updates/FEDORA-2021-8595b30af3

Comment 7 Fedora Update System 2021-04-01 02:04:06 UTC
FEDORA-2021-8595b30af3 has been pushed to the Fedora 34 testing repository.
Soon you'll be able to install the update with the following command:
`sudo dnf upgrade --enablerepo=updates-testing --advisory=FEDORA-2021-8595b30af3`
You can provide feedback for this update here: https://bodhi.fedoraproject.org/updates/FEDORA-2021-8595b30af3

See also https://fedoraproject.org/wiki/QA:Updates_Testing for more information on how to test updates.

Comment 8 Fedora Update System 2021-04-03 01:28:14 UTC
FEDORA-2021-8595b30af3 has been pushed to the Fedora 34 stable repository.
If the problem still persists, please make note of it in this bug report.

Comment 9 Mikhail 2021-04-07 14:20:40 UTC
Created attachment 1769904 [details]
htop

Today oomd again killed my container.


$ cat /usr/lib/systemd/oomd.conf.d/10-oomd-defaults.conf
[OOM]
DefaultMemoryPressureDurationSec=20s


$ cat /usr/lib/systemd/system/user@.service.d/10-oomd-user-service-defaults.conf
[Service]
ManagedOOMMemoryPressure=kill
ManagedOOMMemoryPressureLimit=50%

Comment 10 Mikhail 2021-04-07 14:21:13 UTC
Created attachment 1769916 [details]
system log

Comment 11 Mikhail 2021-04-07 14:33:13 UTC
Created attachment 1769918 [details]
system log

Comment 12 Anita Zhang 2021-04-08 22:42:10 UTC
@Mikhail Was the system responsive and performing well at 54% pressure on the user service cgroup? Also can you try stopping systemd-oomd (sudo systemctl stop systemd-oomd) and recording what the highest tolerable pressure value was from `/sys/fs/cgroup/user.slice/user-$UID.slice/user@$UID.service/memory.pressure` while your container is running? We can't control for all workloads but it's worthwhile to see what pressure is tolerable or not.
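Anita's request boils down to sampling the cgroup's PSI file while the workload runs. A minimal sketch, assuming Python is available; `parse_psi` and `watch` are my own helpers, not part of systemd:

```python
import time

def parse_psi(text):
    """Parse the kernel's PSI format, e.g.:
         some avg10=0.00 avg60=0.00 avg300=0.13 total=1698253169
         full avg10=0.00 avg60=0.00 avg300=0.11 total=1515028054
    into {"some": {"avg10": ..., ...}, "full": {...}}."""
    out = {}
    for line in text.splitlines():
        kind, *fields = line.split()
        out[kind] = {k: float(v) for k, v in (f.split("=") for f in fields)}
    return out

def watch(path, seconds=60):
    """Sample the cgroup's memory.pressure once a second and report the
    peak 'full' avg10 value seen, i.e. the highest short-term pressure."""
    peak = 0.0
    for _ in range(seconds):
        with open(path) as f:
            peak = max(peak, parse_psi(f.read())["full"]["avg10"])
        time.sleep(1)
    return peak
```

Running `watch("/sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/memory.pressure")` while the container is busy (and systemd-oomd stopped) would give a rough idea of the highest tolerable pressure.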

Comment 13 Mikhail 2021-09-07 19:56:26 UTC
(In reply to Anita Zhang from comment #12)
> @Mikhail Was the system responsive and performing well at 54% pressure on
> the user service cgroup? Also can you try stopping systemd-oomd (sudo
> systemctl stop systemd-oomd) and recording what the highest tolerable
> pressure value was from
> `/sys/fs/cgroup/user.slice/user-$UID.slice/user@$UID.service/memory.
> pressure` while your container is running? We can't control for all
> workloads but it's worthwhile to see what pressure is tolerable or not.

$ cat /sys/fs/cgroup/user.slice/user-$UID.slice/user@$UID.service/memory.pressure
some avg10=0.00 avg60=0.00 avg300=0.13 total=1698253169
full avg10=0.00 avg60=0.00 avg300=0.11 total=1515028054

$ journalctl -b -u systemd-oomd --no-pager
-- Journal begins at Thu 2021-07-29 17:02:00 +05, ends at Wed 2021-09-08 00:51:09 +05. --
Sep 04 03:16:03 primary-ws systemd[1]: Starting Userspace Out-Of-Memory (OOM) Killer...
Sep 04 03:16:03 primary-ws systemd[1]: Started Userspace Out-Of-Memory (OOM) Killer.
Sep 08 00:23:23 primary-ws systemd-oomd[1552]: Killed /user.slice/user-1000.slice/user@1000.service/app.slice/app-org.gnome.Terminal.slice/vte-spawn-887e6f17-fa6d-44cd-aa80-798d5c0c71ce.scope due to memory pressure for /user.slice/user-1000.slice/user@1000.service being 52.46% > 50.00% for > 20s with reclaim activity

Comment 14 Mikhail 2021-09-07 19:58:06 UTC
^^^ This is F36 and systemd-oomd is still killing my terminal tabs.

Comment 15 Anita Zhang 2021-09-13 17:28:18 UTC
(In reply to Mikhail from comment #13)
> $ journalctl -b -u systemd-oomd --no-pager
> -- Journal begins at Thu 2021-07-29 17:02:00 +05, ends at Wed 2021-09-08
> 00:51:09 +05. --
> Sep 04 03:16:03 primary-ws systemd[1]: Starting Userspace Out-Of-Memory
> (OOM) Killer...
> Sep 04 03:16:03 primary-ws systemd[1]: Started Userspace Out-Of-Memory (OOM)
> Killer.
> Sep 08 00:23:23 primary-ws systemd-oomd[1552]: Killed
> /user.slice/user-1000.slice/user@1000.service/app.slice/app-org.gnome.
> Terminal.slice/vte-spawn-887e6f17-fa6d-44cd-aa80-798d5c0c71ce.scope due to
> memory pressure for /user.slice/user-1000.slice/user@1000.service being
> 52.46% > 50.00% for > 20s with reclaim activity

You're pretty close to the default limits set up for Fedora so if you're fine with the added pressure you may want to try bumping them for your system with an override like so:

$ cat /etc/systemd/system/user@.service.d/99-oomd-override.conf 
[Service]
ManagedOOMMemoryPressureLimit=65%
$ sudo systemctl daemon-reload
$ oomctl # check if new limit was applied

The default values will likely be reworked once https://github.com/systemd/systemd/pull/20690 is merged. This will allow setting more tuned pressure values on slices within a user session rather than relying on one value for all of user@UID.service.

Comment 16 Mikhail 2021-09-15 08:02:02 UTC
(In reply to Anita Zhang from comment #15)
> 
> You're pretty close to the default limits set up for Fedora so if you're
> fine with the added pressure you may want to try bumping them for your
> system with an override like so:
> 
> $ cat /etc/systemd/system/user@.service.d/99-oomd-override.conf 

Directory `user@.service.d` is absent on my system.

$ ls /etc/systemd/system/user@.service.d
ls: cannot access '/etc/systemd/system/user@.service.d': No such file or directory

> [Service]
> ManagedOOMMemoryPressureLimit=65%
> $ sudo systemctl daemon-reload
> $ oomctl # check if new limit was applied
> 
> The default values will likely be reworked once
> https://github.com/systemd/systemd/pull/20690 is merged. This will allow
> setting more tuned pressure values on slices within a user session rather
> than relying on one value for all of user@UID.service.

As I understand it, the PressureLimit should be 50% by default.

$ cat /usr/lib/systemd/oomd.conf.d/10-oomd-defaults.conf
[OOM]
DefaultMemoryPressureDurationSec=20s

$ cat /usr/lib/systemd/system/user@.service.d/10-oomd-user-service-defaults.conf
[Service]
ManagedOOMMemoryPressure=kill
ManagedOOMMemoryPressureLimit=50%

But oomctl shows 60%. Why?

$ oomctl
Dry Run: no
Swap Used Limit: 90.00%
Default Memory Pressure Limit: 60.00%
Default Memory Pressure Duration: 20s
System Context:
        Memory: Used: 55.3G Total: 62.6G
        Swap: Used: 104.5M Total: 63.9G
Swap Monitored CGroups:
        Path: /
                Swap Usage: (see System Context)
Memory Pressure Monitored CGroups:
        Path: /user.slice/user-1000.slice/user@1000.service
                Memory Pressure Limit: 50.00%
                Pressure: Avg10: 0.00 Avg60: 0.00 Avg300: 0.00 Total: 14s
                Current Memory Usage: 49.6G
                Memory Min: 250.0M
                Memory Low: 0B
                Pgscan: 85039860
                Last Pgscan: 85039860

Comment 17 Anita Zhang 2021-09-16 18:45:28 UTC
(In reply to Mikhail from comment #16)
> (In reply to Anita Zhang from comment #15)
> Directory `user@.service.d` is absent on my system.

You need to make it. Directories under /etc/systemd/system are managed by the system maintainer.

> As I am understand by default PressureLimit should be 50%
> 
> But oomctl show 60%, why?
> 
> $ oomctl
> Dry Run: no
> Swap Used Limit: 90.00%
> Default Memory Pressure Limit: 60.00%
> Default Memory Pressure Duration: 20s
> System Context:
>         Memory: Used: 55.3G Total: 62.6G
>         Swap: Used: 104.5M Total: 63.9G
> Swap Monitored CGroups:
>         Path: /
>                 Swap Usage: (see System Context)
> Memory Pressure Monitored CGroups:
>         Path: /user.slice/user-1000.slice/user@1000.service
>                 Memory Pressure Limit: 50.00%
>                 Pressure: Avg10: 0.00 Avg60: 0.00 Avg300: 0.00 Total: 14s
>                 Current Memory Usage: 49.6G
>                 Memory Min: 250.0M
>                 Memory Low: 0B
>                 Pgscan: 85039860
>                 Last Pgscan: 85039860

The default memory pressure limit is 60%, meaning that if a unit doesn't override it, it will use 60%. But since we ship a config for user@.service, the memory pressure limit for that cgroup is overridden to 50% (you can see it in the output, on the line above "Pressure").
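In other words, the per-unit setting takes precedence over the daemon-wide default. A toy model of that selection logic (names and values are illustrative, not systemd code):

```python
DEFAULT_LIMIT = 0.60  # the "Default Memory Pressure Limit" oomctl reports

def effective_limit(unit_limit=None):
    # A per-unit ManagedOOMMemoryPressureLimit (Fedora ships 50% for
    # user@.service) overrides the daemon-wide default; units without
    # their own setting fall back to the default.
    return unit_limit if unit_limit is not None else DEFAULT_LIMIT
```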

Comment 18 Mikhail 2021-11-12 14:57:19 UTC
> You're pretty close to the default limits set up for Fedora so if you're fine with the added pressure you may want to try bumping them for your system with an override like so

ManagedOOMMemoryPressureLimit=65% didn't help :(

$ journalctl -b -u systemd-oomd --no-pager
-- Journal begins at Thu 2021-10-07 03:47:38 +05, ends at Fri 2021-11-12 19:54:14 +05. --
Nov 12 14:42:28 primary-ws systemd[1]: Starting Userspace Out-Of-Memory (OOM) Killer...
Nov 12 14:42:28 primary-ws systemd[1]: Started Userspace Out-Of-Memory (OOM) Killer.
Nov 12 17:50:48 primary-ws systemd-oomd[1172]: Killed /user.slice/user-1000.slice/user@1000.service/app.slice/app-org.gnome.Terminal.slice/vte-spawn-f92b7041-15da-41fb-8076-8221774567da.scope due to memory pressure for /user.slice/user-1000.slice/user@1000.service being 65.70% > 65.00% for > 20s with reclaim activity

$ cat /sys/fs/cgroup/user.slice/user-$UID.slice/user@$UID.service/memory.pressure
some avg10=3.68 avg60=31.49 avg300=25.59 total=424288160
full avg10=3.59 avg60=29.41 avg300=23.58 total=390639367

$ oomctl 
Dry Run: no
Swap Used Limit: 90.00%
Default Memory Pressure Limit: 60.00%
Default Memory Pressure Duration: 20s
System Context:
        Memory: Used: 60.4G Total: 62.6G
        Swap: Used: 16.5G Total: 71.9G
Swap Monitored CGroups:
        Path: /
                Swap Usage: (see System Context)
Memory Pressure Monitored CGroups:
        Path: /user.slice/user-1000.slice/user@1000.service
                Memory Pressure Limit: 65.00%
                Pressure: Avg10: 2.11 Avg60: 26.64 Avg300: 23.10 Total: 6min 30s
                Current Memory Usage: 25.5G
                Memory Min: 250.0M
                Memory Low: 0B
                Pgscan: 36702397
                Last Pgscan: 36690917

Comment 19 Mohamed Akram 2022-04-14 11:46:24 UTC
How do I disable this completely? It's constantly killing apps despite plenty of RAM and swap space available. I do `sudo systemctl disable --now systemd-oomd.service` and it comes back when I restart. I don't want to use any userspace OOM killer.

Comment 20 Davide Repetto 2022-04-14 13:09:35 UTC
(In reply to Mohamed Akram from comment #19)
> How do I disable this completely?

You can disable it completely with "systemctl mask systemd-oomd"
(Masked services won't start even if you launch them manually.)

Comment 21 Anita Zhang 2022-04-14 17:33:26 UTC
(In reply to Mohamed Akram from comment #19)
> It's constantly killing apps despite plenty of RAM and swap space available.

Hey this sounds like a legit bug? Do you still have the logs from this event? They should be visible in the journal by doing `journalctl -u systemd-oomd -g Killed`

Comment 22 Bogdan Vitoc 2022-05-31 04:57:02 UTC
Just wanted to chime in that I've had systemd-oomd kill my GNOME Shell four times in the past two months on my current install of F35 (leading to a pretty jarring experience). I'm a fairly non-technical Linux user; I'll try to attach logs. Let me know if anything else would be helpful.

Comment 23 Bogdan Vitoc 2022-05-31 04:59:38 UTC
Created attachment 1885395 [details]
journalctl -- oomd kills

Comment 24 Chris Murphy 2022-06-01 16:15:44 UTC
>Mar 25 19:00:46 fedora systemd-oomd[1612]: Killed /user.slice/user-1000.slice/user@1000.service/session.slice/org.gnome.Shell@wayland.service due to memory used (16360116224) / total (16541884416) and swap used (7752622080) / total (8589930496) being more than 90.00%

That something gets killed off at 90% swap usage makes sense, but not GNOME Shell. That's exchanging one big problem for another, and I'm not sure we can ever consider killing the desktop itself a solution to the swap performance problem. It makes me wonder whether the only thing we can do is ensure resource control constrains everything well enough that the user can choose which program gets killed, rather than having it done for them. As in: should oomd really only kill the most obvious candidates, and permit the less obvious ones while still constraining the resources they can use to 90% or whatever keeps the shell and terminal responsive enough (i.e. not perfect) that the user doesn't reach for the power button, but instead reaches for top or systemd-cgtop to find out what's hogging resources and decides whether to clobber it or not?
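The "constrain rather than kill" direction can be approximated today with cgroup memory throttling. A hypothetical drop-in (the path, file name, and 90% value are illustrative; MemoryHigh= is the real systemd resource-control setting):

```ini
# /etc/systemd/system/user@.service.d/50-throttle.conf  (hypothetical)
[Service]
# Above this threshold the kernel reclaims aggressively from the unit
# instead of anything being killed; processes keep running, just slower.
MemoryHigh=90%
```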

Comment 25 Bogdan Vitoc 2022-06-01 18:31:57 UTC
I agree.

I might add though (this is my first time on a Linux forum, so I'm not sure how this is generally addressed) that this problem can easily affect non-technical end users as well, who would likely not be comfortable with shell commands such as top or systemd-cgtop. In case it is helpful, I'll attach a screenshot of how macOS solves this problem: a GUI for force-quitting that shows each application's current memory usage. I think this is a nice way of empowering the user to make the decision. However, I could not find an existing Linux/GNOME GUI that does something similar.

In the meantime, perhaps improving the heuristics of the resource manager so that core services like GNOME Shell are never killed would help. And as long as auto-killing is the active solution, it would be nice for a system alert to inform the user whenever an app is killed due to memory constraints, since the event otherwise seems quite anomalous.
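One way to prototype such an alert is to follow the systemd-oomd journal and turn each kill line into a desktop notification. A sketch of just the formatting step, tested against a kill line from this bug; the regex and `kill_summary` helper are mine, and actually displaying the alert would use something like notify-send:

```python
import re

# Matches kill lines like the ones in this bug, e.g.
# "Killed /user.slice/.../vte-spawn-....scope due to memory pressure
#  for /user.slice/... being 65.70% > 65.00% for > 20s ..."
KILL_RE = re.compile(
    r"Killed (?P<cgroup>\S+) due to memory pressure for \S+ "
    r"being (?P<actual>[\d.]+)% > (?P<limit>[\d.]+)%"
)

def kill_summary(journal_line):
    """Return a short human-readable summary of an oomd kill line,
    or None if the line is not a pressure-kill message."""
    m = KILL_RE.search(journal_line)
    if not m:
        return None
    unit = m.group("cgroup").rsplit("/", 1)[-1]  # leaf scope name
    return (f"{unit} was killed by systemd-oomd "
            f"(pressure {m.group('actual')}% > limit {m.group('limit')}%)")
```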

Comment 26 Bogdan Vitoc 2022-06-01 18:33:07 UTC
Created attachment 1885855 [details]
macos force quit GUI screenshot

Comment 27 Sam 2022-06-04 17:17:10 UTC
Created attachment 1886712 [details]
journalctl -u systemd-oomd -g Killed | grep -v Boot

My daughter is using Fedora 35 with Cinnamon Desktop and has been complaining about randomly being logged out.

We'll see how things go with systemd-oomd disabled/masked.

