2111742 – pcp-selinux 5.3.5-8.el8 breaks selinux

Summary: pcp-selinux 5.3.5-8.el8 breaks selinux

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 8
Classification:	Red Hat
Component:	pcp
Sub Component:
Version:	8.6
Hardware:	All
OS:	Linux
Priority:	unspecified
Severity:	urgent
Target Milestone:	rc
Target Release:	8.7
Assignee:	Nathan Scott
QA Contact:	Jan Kurik
Docs Contact:	Jacob Taylor Valdez
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2022-07-28 04:58 UTC by John
Modified:	2023-05-16 08:40 UTC
CC List:	3 users
Fixed In Version:	pcp-5.3.7-15.el8
Doc Type:	No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed:	2023-05-16 08:13:26 UTC
Type:	Bug
Target Upstream Version:
Embargoed:
Dependent Products:
Flags:	pm-rhel: mirror+

Attachments
cil file from colleagues vm audctstmr003 (12.14 KB, text/plain) 2022-07-28 08:48 UTC, John

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Issue Tracker	RHELPLAN-129347	0	None	None	None	2022-07-28 05:02:55 UTC
Red Hat Product Errata	RHBA-2023:2745	0	None	None	None	2023-05-16 08:13:33 UTC

Description John 2022-07-28 04:58:52 UTC

Description of problem:

selinux commands failing with error:

Failed to resolve typeattributeset statement at /var/lib/selinux/targeted/tmp/modules/400/pcpupstream/cil:42

Version-Release number of selected component (if applicable):

Error is present in the latest versions available for EL8:
pcp-selinux 5.3.5-8.el8 
selinux-policy-3.14.3-95.el8.noarch

How reproducible:

UNAVOIDABLE, if anyone is stupid enough to install pcp.

Steps to Reproduce:
1. install pcp-selinux
2. try to run an selinux command such as semodule
3.

Actual results:

[root@audctstmr002 07-27 17:39:35 ~]# semodule -i my-rhsmcertdworke.pp
Failed to resolve typeattributeset statement at /var/lib/selinux/targeted/tmp/modules/400/pcpupstream/cil:63
semodule:  Failed!

Expected results:
success.

Additional info:

Comment 1 John 2022-07-28 05:15:59 UTC

Also leads to hopeless failures like this:

[root@audccfots809 07-28 14:16:21 ~]# semanage fcontext -a -t insights_client_etc_t -s system_u /etc/insights-client/machine-id
libsepol.context_from_record: type insights_client_etc_t is not defined (No such file or directory).
libsepol.context_from_record: could not create context structure (Invalid argument).
libsemanage.validate_handler: invalid context system_u:object_r:insights_client_etc_t:s0 specified for /etc/insights-client/machine-id [all files] (Invalid argument).
libsemanage.dbase_llist_iterate: could not iterate over records (Invalid argument).
OSError: [Errno 22] Invalid argument

Absolutely disgraceful for this garbage to be released into an "Enterprise" operating system.

I have better things to do than come in here logging bugs that should have never been released into the wild, or if released, should have been noticed and FIXED before my time is wasted.

I have no doubt the next comment in this bug will be to reprimand me for being, so let me get in first and say I don't care.

I'm a Red hat customer.
It's not my job to debug your operating system, my hands are full trying to USE it.

Comment 2 Nathan Scott 2022-07-28 06:05:15 UTC

Hi John,

Really sorry to hear about this problem you've encountered. This is the first I'm hearing about this particular issue and I appreciate you taking the time to report it.

I can assure you that extensive QE is performed prior to each release, and this includes manual verification by a separate department (from mine) of each issue that has been reported as "fixed" in a release. The earlier BZs you found that are similar all have subtle differences unfortunately, and I suspect this case you're encountering is a new wrinkle on an old problem (note that the line numbers in the pcpupstream/cil:XXX reports are different each time - this is indicating different selinux policy interaction problems).

Could you help me understand the problem you are seeing further? That "cil" file where the error message is generated is a temporary file, generated specially for your system during rpm installation, and then discarded. This makes it difficult for me to see the root cause straight away - could you run:

# /usr/libexec/selinux/hll/pp /var/lib/pcp/selinux/pcpupstream.pp /tmp/pcpupstream.cil

and if you could then attach the output /tmp/pcpupstream.cil file to this issue, that'd help me immensely with getting to the bottom of this issue.

I've started discussions with other developers here and in the upstream PCP community to look into further defensive measures we can take to prevent this class of problems (selinux policy mismatches between packages) from having the kinds of impact you're seeing.

Again, apologies that this has adversely impacted on your systems and wasted your time.

Comment 3 John 2022-07-28 08:47:06 UTC

I already removed pcp & pcp-selinux from most affected systems, and when i did so, i found these problems were immediately fixed.
But now when I reinstall them on a test system, the issue does not come back...

I did not remove pcp-selinux from a colleague's test VM though, so it still has the same issue, eg if i run:

[root@audctstmr003 07-28 17:42:30 ~]# semanage fcontext -a -t insights_client_etc_t -s system_u /etc/insights-client/machine-id
libsepol.context_from_record: type insights_client_etc_t is not defined (No such file or directory).
libsepol.context_from_record: could not create context structure (Invalid argument).
libsemanage.validate_handler: invalid context system_u:object_r:insights_client_etc_t:s0 specified for /etc/insights-client/machine-id [all files] (Invalid argument).
libsemanage.dbase_llist_iterate: could not iterate over records (Invalid argument).
OSError: [Errno 22] Invalid argument
[root@audctstmr003 07-28 17:44:59 ~]#

I have used /usr/libexec/selinux/hll/pp to dump to a cil file on this VM (audctstmr003), and attached as requested.

Unfortunately this VM has selinux disabled, so i cannot generate a policy to load with semodule and see if it errors with a cil line number, like my other VMs were doing (at least not until I check with colleague that its ok to enable selinux and reboot his VM tomorrow). Not sure the attached cil will be any use to you without knowing whether semodule hits an error on this vm, and the particular line number for this VM, but i may be able to get that info tomorrow.

I also have a different copy of the actual /var/lib/selinux/targeted/tmp/modules/400/pcpupstream/cil file from one of the other VMs that had the issue before I uninstalled.
Unfortunately I don't know what line numbers in the cil semodule was complaining about on this specific VM. It may well have been line 63 (as it was on other VM audctstmr002), but i can't be sure.
If it was line 63 of this file on this VM, then that would be: (typeattributeset cil_gen_require glusterd_log_t)

Anyway, you say this file is a temporary file "generated specially for your system during rpm installation, and then discarded", but if that is the case, why did it still exist on my VMs and why was it being referred to in error messages?

Also, since this problem goes away when pcp-selinux is reinstalled, that suggests it will be a pig to track down how it arose. It sounds like the sort of bug that should be avoided in the first place, by making your rpms more consistent and static. Relying on some supposedly temporary file generated by some unknown past version, sounds like a recipe for disaster, and i'd say bugs like this prove it.

Comment 4 John 2022-07-28 08:48:08 UTC

Created attachment 1899877
cil file from colleagues vm audctstmr003

Comment 5 Nathan Scott 2022-07-29 00:27:57 UTC

(In reply to John from comment #3)
> ...
> I have used /usr/libexec/selinux/hll/pp to dump to a cil file on this VM
> (audctstmr003), and attached as requested. 

Thanks!

> Unfortunately this VM has selinux disabled, so i cannot generate a policy to
> load with semodule and see if it errors with a cil line number, like my
> other VMs were doing (at least not until I check with colleague that its ok
> to enable selinux and reboot his VM tomorrow). Not sure the attached cil
> will be any use to you without knowing whether semodule hits an error on
> this vm, and the particular line number for this VM, but i may be able to
> get that info tomorrow.

OK, appreciate it.

> I also have a different copy of the actual
> /var/lib/selinux/targeted/tmp/modules/400/pcpupstream/cil file from one of
> the other VMs that had the issue before I uninstalled.
> Unfortunately I don't know what line numbers in the cil semodule was
> complaining about on this specific VM. It may well have been line 63 (as it
> was on other VM audctstmr002), but i can't be sure.
> If it was line 63 of this file on this VM, then that would be:
> (typeattributeset cil_gen_require glusterd_log_t)
> 

FWIW, here's a sed one-liner to extract a specific line N -
$ sed 'N,N! d' ~/pcpupstream.cil

If it turns out line 63 is the problem line on the VM 003 too, from the
attached file that's pointing us toward:

$ sed '63,63! d' ~/pcpupstream.cil
(typeattributeset cil_gen_require sbd_exec_t)

there's also mention of line 42 in your report (#c1, line 5 of this BZ).
If that happens to be the same location in the file from this VM, that's:

$ sed '42,42! d' ~/pcpupstream.cil
(typeattributeset cil_gen_require numad_t)

... but let's wait and see if we can find a problematic line number that
is specific to this VM.
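
For repeated lookups, the sed one-liner above can be wrapped in a small helper. This is only a sketch: cil_line and cil_required_name are hypothetical names, not part of PCP or the SELinux tooling, and the second helper assumes the statement has the "(typeattributeset cil_gen_require NAME)" form seen in the lines quoted above.

```shell
#!/bin/sh
# Hypothetical helpers (not shipped by any package).

# Print line N of a CIL file.
cil_line() {
    # $1 = line number, $2 = path to the .cil file
    sed -n "${1}p" "$2"
}

# Print NAME when line N looks like
# "(typeattributeset cil_gen_require NAME)"; print nothing otherwise.
cil_required_name() {
    # $1 = line number, $2 = path to the .cil file
    cil_line "$1" "$2" |
        sed -n 's/^[[:space:]]*(typeattributeset cil_gen_require \([^ )]*\)).*$/\1/p'
}
```

With the functions loaded into a shell, "cil_required_name 42 ~/pcpupstream.cil" prints just the required name from line 42, or nothing if that line holds some other kind of statement.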

Comment 6 John 2022-07-29 03:28:19 UTC

Here we go, on colleague's VM audctstmr003:

When reinstalling selinux-policy, for example, we see this:

...
  Running scriptlet: selinux-policy-3.14.3-95.el8.noarch    1/2
Failed to resolve typeattributeset statement at /var/lib/selinux/targeted/tmp/modules/400/pcpupstream/cil:42
semodule:  Failed!

And we also have this:

[root@audctstmr003 07-29 13:26:30 z_cil]# ausearch -x /usr/sbin/ifconfig --raw | audit2allow -D -M my-ifconfig
******************** IMPORTANT ***********************
To make this policy package active, execute:

semodule -i my-ifconfig.pp

[root@audctstmr003 07-29 13:26:42 z_cil]# semodule -X 300 -i my-ifconfig.pp
Failed to resolve typeattributeset statement at /var/lib/selinux/targeted/tmp/modules/400/pcpupstream/cil:42
semodule:  Failed!
[root@audctstmr003 07-29 13:27:02 z_cil]#



And we have:
[root@audctstmr003 07-29 13:20:55 ~]# /usr/libexec/selinux/hll/pp /var/lib/pcp/selinux/pcpupstream.pp /tmp/pcpupstream.audctstmr003.cil

[root@audctstmr003 07-29 13:23:24 z_cil]# sed '42,42! d' ./pcpupstream.audctstmr003.cil
(typeattributeset cil_gen_require numad_t)

So on this VM, it seems to be this numad_t causing grief.
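
Since the failing line differs between VMs, it may help to list every name the dumped module requires, rather than checking one line at a time. The following is a rough sketch under stated assumptions: cil_requires and cil_missing are hypothetical helper names, and the list of names known to the installed policy is assumed to already exist as a plain text file with one name per line (how that list is produced is left open here).

```shell
#!/bin/sh
# Hypothetical helpers (not shipped by any package).

# List every "(typeattributeset cil_gen_require NAME)" statement in a
# dumped CIL file as "LINE NAME", one per output line.
cil_requires() {
    # $1 = path to the .cil file
    awk '$1 == "(typeattributeset" && $2 == "cil_gen_require" {
             name = $3
             sub(/\)+$/, "", name)   # drop the closing parenthesis
             print NR, name
         }' "$1"
}

# Print the required names that are absent from a list of known names.
cil_missing() {
    # $1 = path to the .cil file
    # $2 = text file with one known name per line (assumed input)
    cil_requires "$1" | while read -r line name; do
        grep -qxF -- "$name" "$2" || echo "$line $name"
    done
}
```

Each line printed by cil_missing is a candidate for the "Failed to resolve typeattributeset statement" error, with the line number matching the one semodule reports only if the dumped file and the temporary cil file have the same layout.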

Comment 7 Nathan Scott 2022-08-01 05:38:05 UTC

Thanks John, that's an interesting data point.

One other question - can you describe the RHEL installation method that
you're using there?

I'm wondering if there's a different approach being used which is making
it more likely for some folks to encounter this problem than others.  In
particular, I'm wondering if it involves a "simultaneous" rpm install of
both selinux-policy and pcp-selinux?  (I think it's more common for tests
to run on a scenario where there's an initial RHEL install, and then PCP
is installed "on top" subsequently - perhaps there's a race in the first
approach not present in the second.)

Do you use kickstart?  virt-manager with ISOs?  Something else?  Thanks.

Comment 8 John 2022-08-04 00:48:20 UTC

My VMs are all created from a VMware template built from an early el8 release, maybe el8.0, but more likely 8.1 i think.
Over time, as releases come out, the template has been upgraded through all releases to 8.6

Basically the template has followed the same process that any production VM (deployed early in EL8 lifecycle) would undergo.
Deployed, then updated periodically as patches & releases come out.

I'm not sure if PCP was installed as part of the base package selection during original installation, or if it was installed manually after installation from the ISO.
It was installed quite early, as I have already seen PCP cause a number of problems, which i have tolerated for too long already:

1) filling filesystems due to excessive rate of data collection, which I've had to cut back by manually updating the configuration.
2) excessive and extended CPU usage at midnight to zip logs.
3) the final straw recently is the excessive and extended CPU consumption which occurs when VMs are rebooted. When you reboot a production VM after patching, you have dracut & other business going on, and you may have services also attempting to start up. The last thing you need at this point is some useless piece of garbage like PCP insisting upon indexing & compressing logs. It should defer this nonsense until after midnight, but no, if you reboot a VM, PCP thinks it's important enough to slow your VM to a crawl, and will chew CPU for 10 mins or more.

I have had enough of this bad behaviour from PCP.
I won't have it on my VMs anymore, it will be removed from every one at the earliest opportunity, and it won't ever be going back on.

Comment 22 errata-xmlrpc 2023-05-16 08:13:26 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (pcp bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:2745
