2297965 – selinux-policy-41.8-4.fc41 breaks libvirt-dbus

Summary: selinux-policy-41.8-4.fc41 breaks libvirt-dbus

Keywords:
Status:	CLOSED RAWHIDE
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	selinux-policy
Sub Component:
Version:	rawhide
Hardware:	Unspecified
OS:	Linux
Priority:	unspecified
Severity:	urgent
Target Milestone:	---
Assignee:	Zdenek Pytela
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:	https://artifacts.dev.testing-farm.io...
Whiteboard:	CockpitTest
Depends On:
Blocks:

Reported:	2024-07-15 15:33 UTC by Martin Pitt
Modified:	2024-08-13 19:27 UTC
CC List:	9 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2024-08-13 19:27:37 UTC
Type:	---
Embargoed:
Dependent Products:

Attachments
screenshot showing Cockpit "Virtual machines" page when no VMs configured (65.26 KB, image/png) 2024-07-16 17:28 UTC, Steve

Description Martin Pitt 2024-07-15 15:33:32 UTC

This is essentially the same as https://issues.redhat.com/browse/RHEL-46893:

# busctl call org.libvirt /org/libvirt/QEMU org.libvirt.Connect ListDomains u 0
Call failed: Failed to connect socket to '/var/run/libvirt/virtqemud-sock': Permission denied

AVC avc:  denied  { connectto } for  pid=1454 comm="pool-libvirt-db" path="/run/libvirt/virtqemud-sock" scontext=system_u:system_r:virt_dbus_t:s0 tcontext=system_u:system_r:virtqemud_t:s0 tclass=unix_stream_socket permissive=0
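
For readers triaging similar denials, the AVC above can be split into its fields mechanically. The following is a minimal sketch, assuming the message has the usual "key=value" layout seen here; the helper names parse_avc and selinux_type are hypothetical and are not part of any SELinux tooling.

```python
import re

# The denial exactly as quoted in this report (without the leading "AVC" label).
AVC_LINE = (
    'avc:  denied  { connectto } for  pid=1454 comm="pool-libvirt-db" '
    'path="/run/libvirt/virtqemud-sock" '
    'scontext=system_u:system_r:virt_dbus_t:s0 '
    'tcontext=system_u:system_r:virtqemud_t:s0 '
    'tclass=unix_stream_socket permissive=0'
)


def parse_avc(line):
    """Split an AVC denial into its permission list and key=value fields."""
    perms = re.search(r"\{([^}]*)\}", line)
    fields = {
        key: value.strip('"')
        for key, value in re.findall(r'(\w+)=("[^"]*"|\S+)', line)
    }
    fields["permissions"] = perms.group(1).split() if perms else []
    return fields


def selinux_type(context):
    """Return the type part of a user:role:type:level security context."""
    return context.split(":")[2]


avc = parse_avc(AVC_LINE)
print(selinux_type(avc["scontext"]), "->", selinux_type(avc["tcontext"]),
      avc["tclass"], avc["permissions"])
```

Read this way, the denial says that a process of type virt_dbus_t (libvirt-dbus) was refused the connectto permission on a unix_stream_socket owned by virtqemud_t.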


Reproducible: Always

Comment 1 Steve 2024-07-15 17:15:41 UTC

The AVC is identical to the one in:

Bug 2295200 - SELinux is preventing pool-libvirt-db from 'connectto' accesses on the unix_stream_socket /run/libvirt/virtqemud-sock.

Perhaps the openQA cockpit tests could be extended to include that "busctl" test, because the three cockpit tests against rawhide passed:

https://openqa.fedoraproject.org/tests/overview?arch=&flavor=Server-dvd-iso&machine=&test=&modules=&module_re=&group_glob=&not_group_glob=&comment=&distri=fedora&version=Rawhide&build=Fedora-Rawhide-20240715.n.0&groupid=1#

server_cockpit_basic
server_cockpit_default
server_cockpit_updates

Comment 2 Steve 2024-07-15 17:20:32 UTC

Adam, what do you think about adding Martin's "busctl" test to the openQA tests for cockpit?

Comment 3 Steve 2024-07-15 18:04:43 UTC

As this screenshot shows, openQA doesn't install cockpit-machines:

https://openqa.fedoraproject.org/tests/2730650#step/server_cockpit_basic/4

Comment 4 Steve 2024-07-15 19:32:03 UTC

(In reply to Steve from comment #3)
> As this screenshot shows, openQA doesn't install cockpit-machines:
> 
> https://openqa.fedoraproject.org/tests/2730650#step/server_cockpit_basic/4

Based on a test with F40 Workstation and with cockpit-machines installed (in a VM), no virtual machines need to be configured to trigger the AVC reported in this bug. The test system has libvirt installed and configured to use modular daemons.

Simply click on "Virtual machines" in the cockpit web console navigation panel.

Comment 5 Adam Williamson 2024-07-16 01:53:52 UTC

We can certainly talk about extending the coverage of the openQA cockpit test. It has never been changed since we added it years ago when cockpit was much smaller. But there's always the trade-off between coverage, reliability and resource usage; I don't want to turn the openQA test all the way into an exhaustive functional test for the whole of cockpit, that seems like it'd be fragile and eat up openQA worker time. But if the "machines" functionality is a sufficiently important part of cockpit, we can look at extending it for sure.

Comment 6 Steve 2024-07-16 02:58:16 UTC

After posting that, I found that the Cockpit Project has its own QA process.* However, here is the problem:

Bug 2295200 was reported 2024-07-02 after informal manual testing (meaning I was just trying Cockpit to see what it does).

This bug, which is a duplicate of Bug 2295200, was reported 2024-07-15 -- almost *two weeks later*.

As for what I am proposing -- it would be a one-click test -- click on "Virtual machines" (possibly as Administrator).

The point of Comment 4 is:

1. This AVC would have been detected by that test.

2. No VM needs to be installed, which should greatly simplify the test.

> I don't want to turn the openQA test all the way into an exhaustive functional test for the whole of cockpit, that seems like it'd be fragile and eat up openQA worker time.

Good point. So how would you divide the QA work between openQA tests and Cockpit QA tests?

> But if the "machines" functionality is a sufficiently important part of cockpit, we can look at extending it for sure.

That's a huge area and not what I was proposing:
https://github.com/cockpit-project/cockpit-machines/issues

* Integration Tests of Cockpit
https://github.com/cockpit-project/cockpit/blob/main/test/README.md

Comment 7 Steve 2024-07-16 03:20:59 UTC

BTW, this selinux-policy commit appears to fix this bug:

Allow virt_dbus_t connect to virtqemud_t over a unix stream socket
https://github.com/fedora-selinux/selinux-policy/commit/665ae4822df967f38e9acb5aff89068a933f1393
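
To illustrate what the commit title describes, here is a minimal sketch of the plain type-enforcement rule that corresponds to the AVC in the description. This is an assumption about the shape of the permission only: the commit itself may express it through a policy interface or macro, which is not checked here. The helper allow_rule is hypothetical.

```python
def allow_rule(source_type, target_type, tclass, permissions):
    """Format a type-enforcement allow rule in SELinux policy syntax."""
    if len(permissions) == 1:
        perms = permissions[0]
    else:
        # Multiple permissions are grouped in braces.
        perms = "{ " + " ".join(permissions) + " }"
    return f"allow {source_type} {target_type}:{tclass} {perms};"


# Values taken from the scontext, tcontext, tclass and permission in the AVC.
rule = allow_rule("virt_dbus_t", "virtqemud_t", "unix_stream_socket", ["connectto"])
print(rule)  # allow virt_dbus_t virtqemud_t:unix_stream_socket connectto;
```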

Comment 8 Martin Pitt 2024-07-16 05:31:41 UTC

Hello Steve,

(In reply to Steve from comment #6)
> Bug 2295200 was reported 2024-07-02 after informal manual testing (meaning I
> was just trying Cockpit to see what it does).
> 
> This bug, which is a duplicate of Bug 2295200, was reported 2024-07-15 --
> almost *two weeks later*.

Curious -- we only saw the failure in rawhide yesterday for the first time. We saw and reported it in RHEL 10 a week ago: https://issues.redhat.com/browse/RHEL-46893
We usually notice when our "fedora-rawhide" tests in upstream PRs go red - but then again, they literally do that half of the time anyway, and last week there was a prolonged testing farm outage which could just have hidden that bug.

So the problem is that rawhide gets broken from so many different sides all the time that it is difficult to see/concentrate on a single regression. :-(

Wrt. Adam's response:
> > I don't want to turn the openQA test all the way into an exhaustive functional test for the whole of cockpit, that seems like it'd be fragile and eat up openQA worker time.

Yes, I sympathize with that. SELinux QA is currently adding a libvirt smoke test to their test suite as well, but that will happen very late in the game (and thus not protect Fedora, and it's also kind of too late). I lean towards running these tests in upstream selinux-policy PRs: https://github.com/fedora-selinux/selinux-policy/pull/2235

Comment 9 Steve 2024-07-16 07:21:17 UTC

(In reply to Martin Pitt from comment #8)
...
> Curious -- we only saw the failure in rawhide yesterday for the first time.
> We saw and reported it in RHEL 10 a week ago: https://issues.redhat.com/browse/RHEL-46893
...

Thanks for that link.

After reviewing Bug 2295200, I now remember that selinux-policy-targeted-41.7-1.fc41.noarch was never pushed:
https://bodhi.fedoraproject.org/updates/FEDORA-2024-64134f8805
(I downloaded it from Koji, so that would explain why the bug did not show up in rawhide until later.)

Notwithstanding any development issues, I think Cockpit is fantastic. In particular, I can see all my VMs in one list, regardless of whether they are in user or system sessions.

Comment 10 Steve 2024-07-16 15:47:36 UTC

selinux-policy-41.9-1.fc41
https://bodhi.fedoraproject.org/updates/FEDORA-2024-3510bbb1bb

Comment 11 Steve 2024-07-16 17:28:15 UTC

Created attachment 2039750
screenshot showing Cockpit "Virtual machines" page when no VMs configured

selinux-policy-41.9-1.fc41 restores basic cockpit-machines functionality.

This screenshot shows the Cockpit "Virtual machines" page when there are no VMs configured.

Adam and Martin: How would you assess the fragility of that image when used in an openQA test?

Comment 12 Martin Pitt 2024-07-17 06:06:35 UTC

Steve: The general layout, when you test this with CSS selectors and text comparison: stable enough (we do that in our tests as well).

If you mean pixel perfection: poorly. These tend to get subtle noise with every other browser, font, harfbuzz etc. update in the distributions. I take it you don't propose to compare the whole picture anyway (as that includes the host name, menu entries etc. which are all unrelated), but at least just the cockpit1:localhost/machines iframe. 

I'd recommend testing the text content of the page for "No VM is running"; that's very robust and enough for a smoke test.
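
A text-based check of this kind could look roughly like the sketch below. It assumes the test harness can hand over the HTML of the machines iframe as a string; the sample markup and the names TextExtractor, page_text and machines_page_ok are hypothetical, and the real Cockpit tests use their own browser helpers rather than this code.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the text nodes of an HTML document, ignoring all markup."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def page_text(html):
    """Return the visible text with runs of whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())


def machines_page_ok(html):
    """Smoke check: the empty-state message is present on the page."""
    return "No VM is running" in page_text(html)


# Hypothetical markup standing in for the real iframe content.
SAMPLE = "<main><h1>Virtual machines</h1><p>No VM is\n   running</p></main>"
print(machines_page_ok(SAMPLE))
```

Because only the text is compared, changes in fonts, rendering or layout do not affect the result, which is the robustness argument made above.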

Comment 13 Steve 2024-07-17 20:57:36 UTC

Martin: Thanks for your suggestion. My knowledge of the openQA system is very limited, but I can say that it uses fuzzy image matching, so it can tolerate small changes in an image.

Here is an example from the server_cockpit_basic test against Server-dvd-iso:
https://openqa.fedoraproject.org/tests/2733108#step/server_cockpit_basic/6

The orange line is a slider that can be used to visually compare the test image and a "needle". There is a pull-down menu that lists "Candidate needles and tags".

A "needle" is "a full screenshot in PNG format and a json file with the same name (e.g. foo.png and foo.json) containing the associated data, like which areas inside the full screenshot are relevant or the mentioned list of tags."
https://open.qa/docs/#_needles
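
As an illustration of the JSON half of a needle, here is a minimal sketch that builds one and checks that its match area fits inside the screenshot. The field names ("area", "xpos", "ypos", "width", "height", "type", "tags", "properties") follow my reading of the openQA documentation linked above and should be verified there; the coordinates, the tag name and the 1024x768 screenshot size are made-up assumptions.

```python
import json

# Hypothetical needle data: one rectangular area that must match,
# plus the tag a test would look for.
needle = {
    "area": [
        {"xpos": 300, "ypos": 200, "width": 400, "height": 60, "type": "match"}
    ],
    "tags": ["cockpit_machines_no_vms"],
    "properties": [],
}


def areas_fit(needle_data, screen_width, screen_height):
    """Check that every area lies completely inside the screenshot."""
    return all(
        area["xpos"] >= 0
        and area["ypos"] >= 0
        and area["xpos"] + area["width"] <= screen_width
        and area["ypos"] + area["height"] <= screen_height
        for area in needle_data["area"]
    )


needle_json = json.dumps(needle, indent=2)
print(needle_json)
print(areas_fit(needle, 1024, 768))
```

Restricting the area to the empty-state message, rather than the whole page, would keep unrelated elements such as the host name and menu entries out of the comparison.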

Comment 14 Zdenek Pytela 2024-08-13 19:27:37 UTC

Closing as I believe all major virt-related problems have been resolved in rawhide.
I will backport the fixes to F40, too, and continue with solving other reported issues.
