Cause: systemd used the default OOMScoreAdjust value for the fapolicyd service.
Consequence: The service was quickly killed by the OOM killer when there was not enough memory.
Fix: The OOM killer is disabled for the fapolicyd service.
Result: The service is no longer killed when the system runs out of memory.
Description: Jan Pazdziora (Red Hat)
2022-06-15 15:00:05 UTC
Description of problem:
fapolicyd is a security-relevant daemon that authorizes the execution of processes on a system. When it gets killed, the authorization is suddenly no longer available and anything is allowed to execute.
The out-of-memory killer in the kernel can easily kill fapolicyd, for example when an unauthorized process creates many small processes. This causes the OOM algorithm to see fapolicyd as the best candidate to be killed, instead of aiming at some of those unprivileged processes.
Version-Release number of selected component (if applicable):
fapolicyd-1.1-103.el9_0.x86_64
How reproducible:
Non-deterministic, but it is possible to get the system into such a state.
Steps to Reproduce:
1. Have a machine with, say, 2.5 GB of memory.
2. Disable swap to put more pressure on the memory: swapoff -a
3. Set permissive = 1 in /etc/fapolicyd/fapolicyd.conf so that we can run the custom fork bomb from step 6 to demonstrate the problem.
4. systemctl start fapolicyd
5. In /etc/security/limits.conf add
    test soft nproc unlimited
    test hard nproc unlimited
    test soft sigpending unlimited
    test hard sigpending unlimited
6. Compile the following program (it is run as test-alloc in step 9):
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>
    #include <sched.h>
    #include <sys/mman.h>

    // Map len bytes of anonymous memory and touch them so they are really allocated.
    void alloc_memory(int len)
    {
        void *ret = mmap(0, len, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
        if (ret == MAP_FAILED) {
            perror("mmap");
            return;
        }
        memset(ret, 'x', len);
    }

    int main(int argc, char **argv)
    {
        if (argc < 3) {
            fprintf(stderr, "usage: %s <bytes> <processes>\n", argv[0]);
            return 1;
        }
        int bytes = atoi(argv[1]);
        int procs = atoi(argv[2]);
        while (procs-- > 0) {
            switch (fork()) {
            case -1:
                perror("first fork");
                return 1;
            case 0:
                // Fork a second time; the intermediate child exits right away,
                // so the grandchild holding the memory is orphaned.
                switch (fork()) {
                case -1:
                    perror("second fork");
                    return 1;
                case 0:
                    alloc_memory(bytes);
                    sleep(1000);
                    return 0;
                default:
                    return 0;
                }
            default:
                break;
            }
            sched_yield(); // be somewhat inconspicuous
        }
        return 0;
    }
    // Thanks to Jiří Jabůrek.
7. Log in as the user test.
8. As root, increase the maximum number of processes for the session, with something like
    echo 1000000 > /sys/fs/cgroup/user.slice/user-1000.slice/pids.max
On my system that value is 7039 by default, and I did not find a way to increase it before the user logs in.
9. Run the test program as user test with some reasonable parameters. On my system, running it a couple of times as
    ./test-alloc 10 10000
will eventually exhaust the memory. You can also use
    systemctl status $$
to see the memory used, and poke the system in other ways as well.
Eventually the system will get slow, the OOM killer will kick in, and if sshd does not get killed, you will be able to check the journal.
10. As root, run journalctl -l | grep kill
Actual results:
Jun 15 16:56:00 machine.example.com kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/system.slice/fapolicyd.service,task=fapolicyd,pid=14931,uid=990
Jun 15 16:56:00 machine.example.com kernel: Out of memory: Killed process 14931 (fapolicyd) total-vm:166680kB, anon-rss:27324kB, file-rss:0kB, shmem-rss:0kB, UID:990 pgtables:156kB oom_score_adj:0
Expected results:
fapolicyd is a security- and authorization-relevant service, so it should set its OOMScoreAdjust to avoid being killed before the unprivileged processes.
Additional info:
Upstream already seems to have the change in
https://github.com/linux-application-whitelisting/fapolicyd/commit/42661aafe4efe882fbd4532ce6fa9c396f548e92
but it seems to go in the opposite direction -- the value should really be -900 or something similar.
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory (fapolicyd bug fix and enhancement update), and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2022:8236