Bug 2179701

Summary: fapolicyd-cli --update then mount/umount twice causes fapolicyd daemon to block (state 'D')
Product: Red Hat Enterprise Linux 9
Component: fapolicyd
Version: 9.1
Hardware: Unspecified
OS: Linux
Status: CLOSED MIGRATED
Severity: high
Priority: unspecified
Target Milestone: rc
Target Release: ---
Reporter: Daniel Reynolds <dareynol>
Assignee: Radovan Sroka <rsroka>
QA Contact: BaseOS QE Security Team <qe-baseos-security>
CC: dapospis
Keywords: MigratedToJIRA, Triaged
Flags: pm-rhel: mirror+
Doc Type: If docs needed, set a value
Last Closed: 2023-07-19 12:00:46 UTC
Type: Bug

Description Daniel Reynolds 2023-03-20 00:01:36 UTC
Description of problem:

After `fapolicyd-cli --update` has been run, the fapolicyd daemon blocks on the second mount/umount event of a watched filesystem type. From that point on, no new processes can be spawned, and "INFO: task fapolicyd:{pid} blocked for more than X seconds." messages are emitted to the console.
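
One way to confirm the blocked state named in the summary (state 'D', uninterruptible sleep) is to read the state field from /proc/{pid}/stat. The following is a minimal sketch, not part of fapolicyd; the helper names state_from_stat_line() and read_proc_state() are hypothetical, and it assumes the Linux /proc/{pid}/stat layout in which the state letter follows the parenthesised command name.

```c
#include <stdio.h>
#include <string.h>

/* Extract the process state letter from one /proc/<pid>/stat line.
 * The command name is wrapped in parentheses and may itself contain
 * spaces or ')', so search for the LAST ')' and take the character
 * after the following space. Returns '\0' if the line is malformed. */
char state_from_stat_line(const char *line)
{
    const char *rparen = strrchr(line, ')');

    if (rparen == NULL || rparen[1] != ' ' || rparen[2] == '\0')
        return '\0';
    return rparen[2];
}

/* Read /proc/<pid>/stat and return the state letter ('R', 'S', 'D', ...),
 * or '\0' if the file cannot be read or parsed. */
char read_proc_state(long pid)
{
    char path[64];
    char line[512];
    char state = '\0';

    snprintf(path, sizeof path, "/proc/%ld/stat", pid);

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return '\0';

    if (fgets(line, sizeof line, f) != NULL)
        state = state_from_stat_line(line);

    fclose(f);
    return state;
}
```

Calling read_proc_state() with the daemon's pid and comparing the result against 'D' would indicate whether fapolicyd is in uninterruptible sleep at that moment.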

Version-Release number of selected component (if applicable):
- fapolicyd-1.1.3-102


How reproducible:

Always

Steps to Reproduce:

1. Fresh 'minimal' install of RHEL9.1 (system registered during install)
2. Make sure system is up to date: `dnf upgrade -y`
3. Install fapolicyd: `dnf install -y fapolicyd`
4. Start fapolicyd service: `systemctl start fapolicyd`
5. Issue update request: `fapolicyd-cli --update`
6. Cause two mount or umount events for a filesystem type defined in /etc/fapolicyd/fapolicyd.conf. The two events do not need to be related; they only need to involve filesystem types being watched by fapolicyd. The hang always occurred on the second event.

Two methods were used to create mount/umount events in the investigation:
- adding an additional disk to the system and manually mounting then unmounting the device, and
- defining a simple job in a user's crontab, e.g.: `sudo crontab -e -u root` -> '* * * * * true'.

If using the cron approach, the user must not already be logged in; otherwise systemd-logind will not create and then destroy the user session in /run/user/{uid}, and no mount/umount events are generated.

Actual results:

- fapolicyd hangs.
- new processes cannot be spawned.


Expected results:

- fapolicyd does not hang.


Additional info:

- Red Hat STE has reproduced this issue on a VM.

- Further notes from case:

Looking at the code of fapolicyd-cli on GitHub (https://github.com/linux-application-whitelisting/fapolicyd/blob/ca4d4f5093b907abfa5d32007cd9edc831de166a/src/cli/fapolicyd-cli.c#L490), the '--update' command simply writes "1\n" to /run/fapolicyd/fapolicyd.fifo. The above process was therefore followed again, but with `echo 1 > /run/fapolicyd/fapolicyd.fifo` used in step 5 instead of fapolicyd-cli. The fapolicyd daemon recognised the request (/var/log/messages showed "fapolicyd[{pid}]: It looks like there was an update of the system... Syncing DB."), but interestingly, no number of subsequent mount/umount events triggered the issue.
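
For comparison, the request that `echo 1 > /run/fapolicyd/fapolicyd.fifo` sends can be sketched in C as below. This is an illustrative sketch only, not the fapolicyd-cli source: the function name send_update_request() is hypothetical, and the path is passed in by the caller rather than hard-coded. The point it shows is that exactly two bytes, '1' and '\n', are written.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: send an update request by writing exactly the two
 * bytes '1' and '\n' to the FIFO at `fifo_path`. Note that open() on a
 * FIFO blocks until a reader has it open. Returns 0 on success and -1 on
 * failure. */
int send_update_request(const char *fifo_path)
{
    static const char msg[] = "1\n";      /* sizeof msg == 3 (includes NUL) */
    const size_t len = sizeof msg - 1;    /* 2: leave the NUL out */

    int fd = open(fifo_path, O_WRONLY);
    if (fd == -1) {
        fprintf(stderr, "Open: %s -> %s\n", fifo_path, strerror(errno));
        return -1;
    }

    ssize_t ret = write(fd, msg, len);
    int saved_errno = errno;              /* close() may overwrite errno */
    close(fd);

    if (ret != (ssize_t)len) {
        fprintf(stderr, "Write: %s -> %s\n", fifo_path,
                ret == -1 ? strerror(saved_errno) : "short write");
        return -1;
    }
    return 0;
}
```

Deriving the length from `sizeof msg - 1` rather than a literal number keeps the byte count tied to the message and leaves the terminating NUL out of the write.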

Inspection of the fapolicyd SRPM shows that 'fapolicyd-fgets-update-thread.patch' modifies the fapolicyd-cli.c file:

diff -up ./src/cli/fapolicyd-cli.c.upgrade-thread ./src/cli/fapolicyd-cli.c
--- ./src/cli/fapolicyd-cli.c.upgrade-thread    2022-08-03 18:00:02.374999369 +0200
+++ ./src/cli/fapolicyd-cli.c   2022-08-03 18:00:09.802830497 +0200
@@ -482,7 +482,7 @@ static int do_update(void)
                }
        }

-       ssize_t ret = write(fd, "1", 2);
+       ssize_t ret = write(fd, "1\n", 3);

        if (ret == -1) {
                fprintf(stderr, "Write: %s -> %s\n", _pipe, strerror(errno));

Here you can see that the write() call is updated, but its third parameter (the size) is '3', while the corresponding line in the GitHub version uses '2'. Because the string "1\n" is only two characters long, a size of 3 also writes the string's terminating NUL byte to the FIFO. Suspecting this to be the cause, new RPMs were built from the SRPM with the mentioned patch changed to use '2' as the size value, and then installed over the top of the existing packages. The process was then followed from step 4 (using fapolicyd-cli in step 5) and the issue no longer occurred.

Comment 2 Radovan Sroka 2023-07-19 11:57:20 UTC
This bug is going to be migrated.

Contact point for migration questions or issues: rsroka
Guidance for Bugzilla users to test their Jira account or create one if needed:

https://redhat.service-now.com/help?id=kb_article_view&sysparm_article=KB0016394
https://redhat.service-now.com/help?id=kb_article_view&sysparm_article=KB0016694
https://redhat.service-now.com/help?id=kb_article_view&sysparm_article=KB0016774