Bug 2135159

Summary: OpenSCAP runs slower with SSSD enumeration enabled.
Product: Red Hat Enterprise Linux 8 Reporter: Têko Mihinto <tmihinto>
Component: openscap Assignee: Jan Černý <jcerny>
Status: NEW --- QA Contact: BaseOS QE Security Team <qe-baseos-security>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 8.6 CC: abokovoy, aboscatt, abroy, amepatil, atikhono, dhalasz, ekolesni, jcerny, maburgha, matyc, mhaicman, mmarhefk, myllynen, pbrezina
Target Milestone: rc Keywords: MigratedToJIRA, Triaged
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Têko Mihinto 2022-10-16 20:28:01 UTC
Description of problem:

When SSSD enumeration is enabled, oscap runs slower.
Subsequent oscap runs (after the initial enumeration has completed) are also slow.

* Without enumeration:

$ date ; time oscap xccdf eval --profile xccdf_org.ssgproject.content_profile_stig --rule xccdf_org.ssgproject.content_rule_no_files_unowned_by_user --thin-results /usr/share/xml/scap/ssg/content/ssg-rhel8-ds.xml
Fri Oct  7 18:48:53 IST 2022
WARNING: Datastream component 'scap_org.open-scap_cref_security-data-oval-com.redhat.rhsa-RHEL8.xml.bz2' points out to the remote 'https://access.redhat.com/security/data/oval/com.redhat.rhsa-RHEL8.xml.bz2'. Use '--fetch-remote-resources' option to download it.
WARNING: Skipping 'https://access.redhat.com/security/data/oval/com.redhat.rhsa-RHEL8.xml.bz2' file which is referenced from datastream
WARNING: Skipping ./security-data-oval-com.redhat.rhsa-RHEL8.xml.bz2 file which is referenced from XCCDF content
--- Starting Evaluation ---

Title   Ensure All Files Are Owned by a User
Rule    xccdf_org.ssgproject.content_rule_no_files_unowned_by_user
Ident   CCE-83499-4
Result  fail


real    0m34.565s
user    0m27.348s
sys     0m8.094s
$


* With enumeration:

$ date ; time oscap xccdf eval --profile xccdf_org.ssgproject.content_profile_stig --rule xccdf_org.ssgproject.content_rule_no_files_unowned_by_user --thin-results /usr/share/xml/scap/ssg/content/ssg-rhel8-ds.xml
Fri Oct  7 19:11:27 IST 2022
WARNING: Datastream component 'scap_org.open-scap_cref_security-data-oval-com.redhat.rhsa-RHEL8.xml.bz2' points out to the remote 'https://access.redhat.com/security/data/oval/com.redhat.rhsa-RHEL8.xml.bz2'. Use '--fetch-remote-resources' option to download it.
WARNING: Skipping 'https://access.redhat.com/security/data/oval/com.redhat.rhsa-RHEL8.xml.bz2' file which is referenced from datastream
WARNING: Skipping ./security-data-oval-com.redhat.rhsa-RHEL8.xml.bz2 file which is referenced from XCCDF content
--- Starting Evaluation ---

Title   Ensure All Files Are Owned by a User
Rule    xccdf_org.ssgproject.content_rule_no_files_unowned_by_user
Ident   CCE-83499-4
Result  fail


real    0m56.001s
user    0m26.325s
sys     0m7.080s
$


Version-Release number of selected component (if applicable):

$ cat /etc/redhat-release
Red Hat Enterprise Linux release 8.6 (Ootpa)
$
$  rpm -qa | grep sssd
sssd-2.6.2-4.el8_6.1.x86_64
sssd-client-debuginfo-2.6.2-4.el8_6.1.x86_64
sssd-common-2.6.2-4.el8_6.1.x86_64
sssd-ipa-2.6.2-4.el8_6.1.x86_64
sssd-krb5-2.6.2-4.el8_6.1.x86_64
sssd-debugsource-2.6.2-4.el8_6.1.x86_64
sssd-client-2.6.2-4.el8_6.1.x86_64
sssd-dbus-2.6.2-4.el8_6.1.x86_64
sssd-krb5-common-2.6.2-4.el8_6.1.x86_64
python3-sssdconfig-2.6.2-4.el8_6.1.noarch
sssd-nfs-idmap-2.6.2-4.el8_6.1.x86_64
sssd-tools-2.6.2-4.el8_6.1.x86_64
sssd-kcm-2.6.2-4.el8_6.1.x86_64
sssd-common-pac-2.6.2-4.el8_6.1.x86_64
sssd-ad-2.6.2-4.el8_6.1.x86_64
sssd-ldap-2.6.2-4.el8_6.1.x86_64
sssd-proxy-2.6.2-4.el8_6.1.x86_64
sssd-debuginfo-2.6.2-4.el8_6.1.x86_64
$

$ rpm -qa | grep openscap
openscap-scanner-1.3.6-3.el8.x86_64
openscap-debugsource-1.3.6-3.el8.x86_64
openscap-scanner-debuginfo-1.3.6-3.el8.x86_64
openscap-debuginfo-1.3.6-3.el8.x86_64
openscap-1.3.6-3.el8.x86_64
$

How reproducible:
Always.

Steps to Reproduce:
1. Create 10K IPA users, each user having its own home directory.
2. Enable SSSD enumeration.
3. Run the oscap command.

Actual results:
oscap takes longer with enumeration.

Expected results:
After the initial SSSD enumeration, one would expect oscap to run faster.

Additional info:
The SSSD cache was mounted on tmpfs.
Setting ignore_group_members to true doesn't help.

Comment 12 Jan Černý 2022-10-31 08:08:16 UTC
Hi everyone, I'm very sorry for the long delay.

Yes, OpenSCAP calls `getpwent()` to get a list of users on a system (reference to source code: https://github.com/OpenSCAP/openscap/blob/d10c40e43e1c627912374b8fbdfa1a84967fcc92/src/OVAL/probes/unix/password_probe.c#L232).

From your comments I can see that you experience issues when evaluating the scap-security-guide rule "no_files_unowned_by_user". This rule can be problematic for performance. Its goal is to verify that no file is owned by a nonexistent user, i.e. that all files on the system are owned by valid users. To find such files, it needs to check the ownership of every file in the whole filesystem. This can be very performance intensive, especially when there are a lot of files on the system. It uses `getpwent()` to build a list of users and checks, for each file, whether the file owner is on the list.
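To make the approach described above concrete, here is a minimal Python sketch of "enumerate users once, then check every file against the list". It is not OpenSCAP's code: the function names `enumerate_uids` and `find_unowned` are hypothetical, and Python's `pwd.getpwall()` stands in for repeated `getpwent()` calls.

```python
import os
import pwd


def enumerate_uids():
    """Collect every UID the passwd database is willing to enumerate.

    pwd.getpwall() walks the database, much like repeated getpwent() calls.
    """
    return {entry.pw_uid for entry in pwd.getpwall()}


def find_unowned(root, known_uids):
    """Yield paths below `root` whose owner UID is not in `known_uids`."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                uid = os.lstat(path).st_uid  # lstat: do not follow symlinks
            except OSError:
                continue  # entry vanished or is unreadable
            if uid not in known_uids:
                yield path
```

A call such as `list(find_unowned("/srv", enumerate_uids()))` would then report candidates. Note that the result is only as complete as the enumeration: any user that the enumeration does not return makes that user's files look unowned.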

Can you help me understand what exactly the problem is here? Is the problem that OpenSCAP uses the `getpwent()` call when it should use a different function or a different way of obtaining password entries? Or is the problem that it handles a return code incorrectly? Or something else? If it's something else, how can we start debugging?

Comment 13 Alexey Tikhonov 2022-10-31 16:45:02 UTC
Hi Jan,

(In reply to Jan Černý from comment #12)
> 
> Yes, OpenSCAP calls `getpwent()` to get a list of users on a system
> (reference to source code:
> https://github.com/OpenSCAP/openscap/blob/d10c40e43e1c627912374b8fbdfa1a84967fcc92/src/OVAL/probes/unix/password_probe.c#L232).
> 
> From your comments I can see that you experience issues when evaluating
> scap-security-guide rule "no_files_unowned_by_user". The rule
> "no_files_unowned_by_user" can be problematic with performance. Its goal is
> to verify whether there exists any file that is owned by a nonexisting user
> and make sure all files on the system are owned by valid users. To find such
> files, it needs to check all files in the whole filesystem and check their
> ownership. This can be very performance intensive, especially when there is
> a lot of files on the system. It uses the `getpwent()` to find a list of
> users and checks if file owners are on the list for each file.

I presume 'list of users' is composed once, and then used for every file, right?
Could you please point me to the code that performs this check ("if file owners are on the list")?


> Can you help me understand what exactly is the problem here?

I think there are a number of problems.

(1) The implementation of this "check" is probably not optimal and consumes too much CPU if the list of users is large (5483 users in this example).
The code would need to be checked to confirm this, though.

(2) If users are sourced from SSSD (i.e. /etc/nsswitch.conf: `passwd: sss`) but enumeration is disabled in sssd.conf, this just won't work properly because `getpwent()` won't return anything.
This is probably "just" a documentation issue, but the reporter clearly doesn't understand this.
See also the next item.

(3) I'm not sure (it's not entirely clear to me from `man nsswitch.conf`), but it looks like `getpwent()` does *not* merge different databases.
This means that if `passwd: sss files` is configured, and there are files owned both by users from LDAP (served by SSSD) and by local users (served by `libnss_files.so`), this approach with `getpwent()` can't work properly, because `getpwent()` might not return the complete user set.
(Well, it can still work in this example *if* SSSD serves both LDAP and local users, which is the default on RHEL8 now. But that's not the case on RHEL7, RHEL9, or current Fedora, and even on RHEL8 users are free to configure SSSD to *not* serve local users and to rely on `libnss_files.so` instead.)

Why doesn't 'openscap' use `getpwuid()`/`getgrgid()` instead?
Is this a kind of premature optimization?
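As an illustration of the alternative being asked about, a per-UID lookup needs no enumeration at all. The sketch below is hypothetical (the name `uid_is_known` is not from any project) and uses Python's `pwd.getpwuid()`, which goes through the same NSS lookup path as the C `getpwuid()` call:

```python
import pwd


def uid_is_known(uid):
    """Return True if the passwd database can resolve `uid` to an entry."""
    try:
        pwd.getpwuid(uid)
    except KeyError:
        # Python raises KeyError when no entry is found for the UID.
        return False
    return True
```

With this shape, only the UIDs that actually appear as file owners are ever looked up, so the cost depends on the number of distinct owners rather than on the size of the user directory.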

Comment 14 Jan Černý 2022-11-01 10:20:13 UTC
> I presume 'list of users' is composed once, and then used for every file, right?

Yes, it should work this way.

> Could you please reference me to the code that performs this check ("if file owners are on the list")?

It's a little complicated.

OpenSCAP merely interprets checks written in OVAL (Open Vulnerability and Assessment Language), an XML-based language for creating compliance checks. The OVAL file serves as input. We generate these files from the ComplianceAsCode project, but in general users can provide their own. The upstream source of the check for the rule "no_files_unowned_by_user" discussed in this BZ can be found at https://github.com/ComplianceAsCode/content/blob/master/linux_os/guide/system/permissions/files/no_files_unowned_by_user/oval/shared.xml . To summarize its contents briefly: it says to collect all files under / and its subdirectories, but to filter out every file whose UID equals the UID of any known user (users whose username matches the regular expression .*).
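The filter logic described above can be paraphrased in a few lines of Python. This is only a sketch of the semantics, not a translation of the OVAL file: the helper names are hypothetical, and treating the pattern as an unanchored search is an assumption.

```python
import pwd
import re


def uids_of_matching_users(pattern, entries=None):
    """Return the UIDs of passwd entries whose user name matches `pattern`."""
    if entries is None:
        entries = pwd.getpwall()  # enumeration, as with getpwent()
    regex = re.compile(pattern)
    # Assumption: the pattern is applied as an unanchored search.
    return {e.pw_uid for e in entries if regex.search(e.pw_name)}


def is_flagged(file_uid, allowed_uids):
    """A file survives the filter (is reported) if its UID is not allowed."""
    return file_uid not in allowed_uids
```

With the pattern `.*`, every enumerated user contributes a UID, so a file is reported only when its owner UID belongs to no enumerated user.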

Then, when OpenSCAP interprets the OVAL, this code is responsible for collection of files: https://github.com/OpenSCAP/openscap/blob/maint-1.3/src/OVAL/probes/unix/file_probe.c and this code is responsible for collection of user data and password: https://github.com/OpenSCAP/openscap/blob/maint-1.3/src/OVAL/probes/unix/password_probe.c

>  I think there are a number of problems.

Thank you very much for this detailed explanation. I will try to discuss this with our team and we will see how much of a concern these are for our users. I can imagine that if the LDAP users aren't fetched, it can cause false results and trouble for customers.

Fortunately, some of these are covered in the rule description, which contains the following warnings:


8<---8<---8<---8<---8<---8<---8<---8<---8<---

Warning:  For this rule to evaluate centralized user accounts, getent must be working properly so that running the command

getent passwd

returns a list of all users in your organization. If using the System Security Services Daemon (SSSD),

enumerate = true

must be configured in your organization's domain to return a complete list of users

8<---8<---8<---8<---8<---8<---8<---8<---8<---

and 

8<---8<---8<---8<---8<---8<---8<---8<---8<---

Warning:  Enabling this rule will result in slower scan times depending on the size of your organization and number of centralized users.

8<---8<---8<---8<---8<---8<---8<---8<---8<---


> Why 'openscap' doesn't use `getpwuid()`/`getgrgid()` instead?
> Is this kind of premature optimization?

I'm now looking at the man pages of `getpwent()` and `getpwuid()`. I can see that `getpwuid()` requires a uid as a parameter, whereas `getpwent()` takes no uid and returns successive entries when called repeatedly. We need to iterate over all users because the user name can be specified as a regex, and we also don't know all UIDs on a system, so `getpwent()` looks more convenient to me.



Another thing:
I have found that a problem with this rule has also been reported on the content side: https://bugzilla.redhat.com/show_bug.cgi?id=2129400

Comment 16 Alexander Bokovoy 2022-11-01 10:37:25 UTC
> I'm now looking at the man pages of `getpwent()` and `getpwuid()`. I can see that `getpwuid()` requires a uid as a parameter, whereas `getpwent()` takes no uid and returns successive entries when called repeatedly. We need to iterate over all users because the user name can be specified as a regex, and we also don't know all UIDs on a system, so `getpwent()` looks more convenient to me.

You have file ownership information, either a name or a uid/gid pair. For the former, use getpwnam(); for the latter, use getpwuid()/getgrgid(). Once you have resolved the ID to a name, run the regexp evaluation.

The use of getpwent() and iterating over potentially millions of users in a centralized database is not only wrong, it is effectively impossible to capture the state of a user database this way.
If you need a caching operation, then SSSD and many other NSS providers already do caching themselves so repeating getpwnam()/getpwuid()/... calls would operate on a cached entry anyway.
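A minimal sketch of the flow suggested here (file → uid → name → regexp) might look as follows in Python. The function names are hypothetical, and the in-process `lru_cache` is an optional addition on top of whatever caching the NSS provider already does; it merely avoids repeating the lookup for every file with the same owner.

```python
import functools
import os
import pwd
import re


@functools.lru_cache(maxsize=None)
def name_for_uid(uid):
    """Resolve a UID to a user name, or None if it cannot be resolved."""
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        return None


def owner_is_valid(path, name_pattern=".*"):
    """True if the owner of `path` resolves to a name matching the pattern."""
    name = name_for_uid(os.lstat(path).st_uid)
    return name is not None and re.search(name_pattern, name) is not None
```

The number of NSS lookups is then bounded by the number of distinct owner UIDs found on disk, independent of how many users exist in the centralized database.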

Comment 21 Alexey Tikhonov 2022-11-01 11:31:49 UTC
(In reply to Jan Černý from comment #14)
> 
> Thank you very much for this detailed explanation. I will try to discuss
> this with our team and we will see how much these are concerns for our
> users.

Thank you. I'll change component meanwhile to reflect this.


> Warning:  For this rule to evaluate centralized user accounts, getent must
> be working properly so that running the command
> 
> getent passwd
> 
> returns a list of all users in your organization.

Ok, so this is documented.
But, if I understand correctly, it is still impossible to compose a complete set of users this way with a typical nsswitch.conf setting of 'files sss systemd'.
So the check is doomed to fail if the system has files owned by users from all databases.


> > Why 'openscap' doesn't use `getpwuid()`/`getgrgid()` instead?
> > Is this kind of premature optimization?
> 
> I'm looking now to man pages of `getpwent()` and `getpwuid()` and I can see
> that the  `getpwuid()` requires a uid as a parameter

`uid` is known from `fstat(file_to_check)`


> We need to iterate over all users because the user name can be
> specified as a regex

Is the regexp used to describe a "valid user"?

As Alexander mentioned: getpwuid(fstat(file)) => name => check against regexp?
Or do I completely misunderstand the task..?

Comment 26 RHEL Program Management 2023-08-17 14:13:05 UTC
Issue migration from Bugzilla to Jira is in process at this time. This will be the last message in Jira copied from the Bugzilla bug.