Bug 2010306

Summary: Operation not permitted in setting unpriv_sgio for KubeVirt
Product: Red Hat Enterprise Linux Advanced Virtualization Reporter: Alice Frosi <afrosi>
Component: libvirt Assignee: Michal Privoznik <mprivozn>
Status: CLOSED ERRATA QA Contact: Han Han <hhan>
Severity: high Docs Contact:
Priority: high    
Version: 8.5 CC: abologna, fdeutsch, jdenemar, jsuchane, lmen, mprivozn, virt-maint, yafu, yalzhang, ymankad
Target Milestone: rc Keywords: Triaged, Upstream, ZStream
Target Release: 8.5   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: libvirt-7.6.0-5.module+el8.5.0+12933+58cb48a1 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 2012223 (view as bug list) Environment:
Last Closed: 2021-11-16 07:55:27 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version: 7.9.0
Embargoed:
Bug Depends On:    
Bug Blocks: 2012223    

Description Alice Frosi 2021-10-04 12:39:33 UTC
Description of problem:
KubeVirt currently hits a permission issue [1] when it tries to use a lun device: libvirt fails to write the value to /sys/dev/block/<maj:min>/queue/unpriv_sgio.

Libvirt runs in an unprivileged container (the virt-launcher pod), and in that container sysfs is mounted ro. To be able to set the value, the container would need to be privileged and have sysfs mounted rw (more details in [1]).
In a KubeVirt setup the value should be set by virt-handler (the privileged component); libvirt should only check whether the value is the expected one and avoid performing the write.

[1] https://github.com/kubevirt/kubevirt/issues/6507
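
To confirm from inside the container that sysfs really is mounted ro, the mount table can be inspected. The following is only a diagnostic sketch in Python, not part of KubeVirt or libvirt; the function names are hypothetical, and it assumes the usual /proc/mounts field layout (device, mount point, filesystem type, options).

```python
from typing import List, Optional


def sysfs_mount_options(mounts_text: str, mount_point: str = "/sys") -> Optional[List[str]]:
    """Return the options of the last sysfs mount at mount_point, or None."""
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        # Assumed layout: device, mount point, fs type, options, dump, pass.
        if len(fields) >= 4 and fields[1] == mount_point and fields[2] == "sysfs":
            # A later entry shadows an earlier one at the same mount point.
            found = fields[3].split(",")
    return found


def sysfs_is_read_only(mounts_text: str, mount_point: str = "/sys") -> bool:
    """True if sysfs is mounted at mount_point with the 'ro' option."""
    options = sysfs_mount_options(mounts_text, mount_point)
    return options is not None and "ro" in options


# Usage inside the pod (reads the mount table of the current process):
#   with open("/proc/self/mounts") as f:
#       print(sysfs_is_read_only(f.read()))
```

If this reports a read-only sysfs, any write to unpriv_sgio from that container is expected to fail regardless of file permissions.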


How reproducible:
Always

Steps to Reproduce:
Apply the YAML reported in [1] and check the logs.

Actual results:
The VM in the virt-launcher pod fails with "failed to set /sys/dev/block/251:0/queue/unpriv_sgio: Operation not permitted"

Expected results:
The VM with a lun device starts and runs.

Comment 2 Michal Privoznik 2021-10-05 07:37:11 UTC
FYI: unpriv_sgio was removed from RHEL-9 in bug 1810667 (kernel) and bug 1810661 (libvirt). So maybe the sysfs exported to the pod comes from a different kernel (that still has unpriv_sgio)? Nevertheless, it's worth fixing for older kernels (non-RHEL-9).

Comment 3 Alice Frosi 2021-10-05 12:20:45 UTC
Yes, we currently see the problem in the KubeVirt development environment, which is based on CentOS 8.

Comment 5 Michal Privoznik 2021-10-05 12:43:28 UTC
Yeah, I think this should be filed against RHEL-AV-8 because, as I said in comment 2, unpriv_sgio doesn't exist in RHEL-9, so the "Operation not permitted" error cannot occur there. Let me switch this to RHEL-AV.

Comment 6 Michal Privoznik 2021-10-05 12:49:07 UTC
Patch proposed on the list:

https://listman.redhat.com/archives/libvir-list/2021-October/msg00073.html

Comment 7 Michal Privoznik 2021-10-05 13:00:22 UTC
Merged upstream:

commit f60bc4f6208696efbe87fab22ccca8345381a6e1
Author:     Michal Prívozník <mprivozn>
AuthorDate: Tue Oct 5 14:37:12 2021 +0200
Commit:     Michal Prívozník <mprivozn>
CommitDate: Tue Oct 5 14:58:52 2021 +0200

    qemu: Check if unpriv_sgio is already set before trying to set it
    
    In case when libvirt runs inside a restricted container it may
    not have enough permissions to modify unpriv_sgio. However, it
    may have been set beforehand by sysadmin or an orchestration
    tool. Therefore, let's check whether the currently set value is
    the one we want and if it is refrain from writing to the file.
    
    Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2010306
    Signed-off-by: Michal Privoznik <mprivozn>
    Reviewed-by: Ján Tomko <jtomko>

v7.8.0-39-gf60bc4f620
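
The check-before-write idea from the commit message can be sketched as follows. This is an illustration in Python, not the libvirt code (libvirt is written in C); the function name `ensure_unpriv_sgio` and its return convention are hypothetical.

```python
def ensure_unpriv_sgio(path: str, want: int) -> bool:
    """Make sure the file at path holds `want`.

    Returns False if the value was already set (no write attempted) and
    True if a write was performed. Raises OSError if a write is needed
    but not permitted.
    """
    try:
        with open(path) as f:
            current = int(f.read().strip())
    except (OSError, ValueError):
        # Unreadable or unparsable: treat the current value as unknown.
        current = None

    if current == want:
        # Already set, e.g. by a sysadmin or an orchestration tool,
        # so refrain from touching the file at all.
        return False

    with open(path, "w") as f:
        f.write(f"{want}\n")
    return True
```

With this logic, a read-only unpriv_sgio file is only a problem when its value actually differs from the wanted one, which is the case the original "Operation not permitted" error should still report.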

Comment 16 Alice Frosi 2021-10-07 11:42:46 UTC
I verified Michal's patch with KubeVirt by manually setting /sys/dev/block/<maj:min>/queue/unpriv_sgio to 1 and by adding sgio=unfiltered. This solves the permission problem as far as libvirt is concerned.

Comment 18 Han Han 2021-10-09 06:41:02 UTC
Set verified:tested according to the result in comment 16.

Comment 22 Han Han 2021-10-21 07:14:36 UTC
Here is a simple way to reproduce:

Setup:
Prepare a SCSI disk:
# modprobe scsi_debug

Prepare a VM XML with a SCSI lun:
# xmllint --xpath //disk rhel9.xml
<disk type="file" device="disk">
      <driver name="qemu" type="qcow2"/>
      <source file="/var/lib/libvirt/images/rhel9.qcow2" index="2"/>
      <backingStore/>
      <target dev="vda" bus="virtio"/>
      <alias name="virtio-disk0"/>
      <address type="pci" domain="0x0000" bus="0x00" slot="0x07" function="0x0"/>
    </disk>
<disk type="block" device="lun" sgio="filtered">
      <driver name="qemu" type="raw" cache="none" error_policy="stop" io="native" discard="unmap"/>
      <source dev="/dev/sdb" index="1"/>
      <backingStore/>
      <target dev="sda" bus="scsi"/>
      <alias name="ua-blockpvcdisk"/>
      <address type="drive" controller="0" bus="0" target="0" unit="0"/>
    </disk>

Remove write permission from /sys/dev/block/8:16/queue/unpriv_sgio
# chmod -w /sys/dev/block/8:16/queue/unpriv_sgio
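
The 8:16 in the path above is the major:minor number of /dev/sdb. A small Python sketch of how such a path can be derived from a device node follows; the helper names are hypothetical, and it assumes the node is a whole disk rather than a partition.

```python
import os
import stat


def unpriv_sgio_path(major: int, minor: int, sysfs_root: str = "/sys") -> str:
    """Build the unpriv_sgio path for a block device number."""
    return f"{sysfs_root}/dev/block/{major}:{minor}/queue/unpriv_sgio"


def unpriv_sgio_path_for_device(dev: str) -> str:
    """Resolve a block device node such as /dev/sdb to its unpriv_sgio path."""
    st = os.stat(dev)
    if not stat.S_ISBLK(st.st_mode):
        raise ValueError(f"{dev} is not a block device")
    # st_rdev carries the device number of the node itself.
    return unpriv_sgio_path(os.major(st.st_rdev), os.minor(st.st_rdev))


# Usage on the test host:
#   print(unpriv_sgio_path_for_device("/dev/sdb"))
```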


Reproduced on libvirt-7.6.0-4.module+el8.5.0+12786+c4633d9a.x86_64:
# virsh create rhel9.xml
error: Failed to create domain from rhel9.xml
error: failed to set /sys/dev/block/8:16/queue/unpriv_sgio: Operation not permitted

Verified on libvirt-7.6.0-5.module+el8.5.0+12933+58cb48a1
# virsh create ./rhel9.xml
Domain 'rhel9' created from ./rhel9.xml

Comment 24 errata-xmlrpc 2021-11-16 07:55:27 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (virt:av bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:4684