Bug 1775680
Summary: | unable to execute QEMU command '__com.redhat_drive_add': could not open disk image "Operation not permitted" [rhel-7.7.z] | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | RAD team bot copy to z-stream <autobot-eus-copy> |
Component: | libvirt | Assignee: | Michal Privoznik <mprivozn> |
Status: | CLOSED ERRATA | QA Contact: | Han Han <hhan> |
Severity: | urgent | Docs Contact: | |
Priority: | urgent | ||
Version: | 7.6 | CC: | amashah, chhudson, dyuan, hhan, hreitz, jdenemar, jinzhao, jsuchane, juzhang, kwolf, lmen, mprivozn, mtessun, qinwang, virt-maint, xuzhang |
Target Milestone: | rc | Keywords: | Upstream, ZStream |
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | libvirt-4.5.0-23.el7_7.4 | Doc Type: | Bug Fix |
Doc Text: |
Cause:
To enhance security and to work around races with udev (or any other software that mangles SELinux labels on /dev/* nodes while a domain is running), libvirt spawns each domain in its own private mount namespace with a private /dev. This puts an additional burden on libvirt, which then has to create /dev/* nodes on APIs such as virDomainAttachDevice (virsh attach-device) and remove them on virDomainDetachDevice (virsh detach-device). For generic devices this works perfectly: libvirt looks at the host's /dev and creates the corresponding /dev/* nodes in the domain's private namespace exactly as they appear on the host. There is one exception, however: disks. On disk hot unplug, libvirt does not remove any /dev/* nodes from the private namespace, because they might still be in use by the backing chain of some other disk. This is what causes the problem. Consider a /dev/nvme0n1 disk with some MAJOR:MINOR number (these numbers identify a device uniquely at the kernel level). Hotplug the disk into a domain => libvirt creates an exact copy of the node in the domain's namespace. Hot unplug the disk from the domain => libvirt keeps /dev/nvme0n1 in the domain's namespace. Now detach the NVMe disk from the host and plug it back in => the MINOR number is likely to change. At this point there is a discrepancy between the MINOR number on the host and the one in the domain's namespace.
Consequence:
QEMU tries to open a different device than it thinks it is opening, and the devices cgroup denies it access ("Operation not permitted").
Fix:
The fix forcibly (re)creates /dev/nvme0n1 (or, in general, any other device node) in the domain's private namespace even if it already exists there. This guarantees that the node is an exact copy of the one in the host's namespace.
Result:
Hotplugging a disk multiple times works again.
|
Story Points: | --- |
Clone Of: | 1752978 | Environment: | |
Last Closed: | 2020-02-04 19:29:46 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 1752978 | ||
Bug Blocks: |
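The Doc Text above turns on device MAJOR:MINOR numbers. As a minimal illustration (assuming a Linux host with GNU coreutils), `stat` can print a node's major and minor numbers; `/dev/null` is conventionally 1:3:

```shell
# Print a device node's MAJOR:MINOR numbers in hex (%t:%T).
# /dev/null is conventionally major 1, minor 3 on Linux.
stat -c 'major:minor = %t:%T' /dev/null

# The same check inside a domain's private mount namespace would be run
# via nsenter (requires root; <qemu-pid> is a placeholder, not a real PID):
#   nsenter -m -t <qemu-pid> stat -c '%t:%T' /dev/nvme0n1
# A mismatch against the host's numbers is exactly the discrepancy
# described in the Cause field.
```
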
Description
RAD team bot copy to z-stream
2019-11-22 14:55:13 UTC
Verified on libvirt-4.5.0-23.el7_7.4.x86_64, qemu-kvm-rhev-2.12.0-33.el7_7.7.x86_64

1. Steps
1.1 Start a VM
1.2 Make a new node in the qemu namespace that has a different maj:min number than the device on the host
1.3 Attach/detach the host device

2. Devices to be tested
- block disk device
- usb device
- input device
- serial device

3. Detailed steps
First start a VM:
# virsh list
 Id    Name                           State
----------------------------------------------------
 1     vm                             running

3.1 Test block disk device as in step 1
➜ modprobe scsi_debug add_host=3
➜ lsscsi
[0:0:0:0]    disk    ATA      WDC WD2500AAJS-7 3E01  /dev/sda
[1:0:0:0]    cd/dvd  PLDS     DVD-ROM DH-16D3S SD11  /dev/sr0
[4:0:0:0]    disk    Linux    scsi_debug       0004  /dev/sdb
[5:0:0:0]    disk    Linux    scsi_debug       0004  /dev/sdc
[6:0:0:0]    disk    Linux    scsi_debug       0004  /dev/sdd
➜ cat disk.xml
<disk type='block' device='disk'>
  <driver name='qemu' type='raw'/>
  <source dev='/dev/sdb'/>
  <target dev='sdb' bus='scsi'/>
</disk>
➜ ls -l /dev/sdb
brw-rw----. 1 root disk 8, 16 Dec 19 00:52 /dev/sdb
➜ nsenter -m -t $(pgrep qemu) mknod /dev/sdb b 8 17
➜ virsh attach-device vm disk.xml
Device attached successfully
➜ virsh detach-device vm disk.xml
Device detached successfully

3.2 Test usb device as in step 1
➜ cat hostdev-usb.xml
<hostdev mode='subsystem' type='usb'>
  <source startupPolicy='mandatory'>
    <address bus='1' device='2'/>
  </source>
</hostdev>
➜ lsusb
Bus 001 Device 004: ID 0624:0854 Avocent Corp.
Bus 001 Device 002: ID 2297:0210
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
➜ ls -l /dev/bus/usb/001/002
crw-rw-r--. 1 root root 189, 1 Jul 4 02:33 /dev/bus/usb/001/002
➜ nsenter -m -t $(pgrep qemu) mkdir -p /dev/bus/usb/001/
➜ nsenter -m -t $(pgrep qemu) mknod /dev/bus/usb/001/002 c 189 2
➜ virsh attach-device vm hostdev-usb.xml
Device attached successfully
➜ virsh detach-device vm hostdev-usb.xml
Device detached successfully

3.3 Test input device as in step 1
➜ cat input.xml
<input type='passthrough' bus='virtio'>
  <source evdev='/dev/input/event1'/>
</input>
➜ ls -al /dev/input/event1
crw-rw----. 1 root input 13, 65 Jul 4 02:33 /dev/input/event1
➜ nsenter -m -t $(pgrep qemu) mkdir -p /dev/input/
➜ nsenter -m -t $(pgrep qemu) mknod /dev/input/event1 c 13 33
➜ virsh attach-device vm input.xml
Device attached successfully
➜ virsh detach-device vm input.xml
Device detached successfully

3.4 Test serial device as in step 1
➜ cat serial.xml
<serial type='dev'>
  <source path='/dev/ttyS2'/>
  <target type='pci-serial' port='0'/>
</serial>
➜ ls -l /dev/ttyS2
crw-rw----. 1 root dialout 4, 66 Jul 4 02:33 /dev/ttyS2
➜ nsenter -m -t $(pgrep qemu) mknod /dev/ttyS2 c 4 67
➜ virsh attach-device vm serial.xml
Device attached successfully
➜ virsh detach-device vm serial.xml
Device detached successfully

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0367
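The verification steps above deliberately create a maj:min mismatch between the host node and the node in the qemu namespace. A small hedged helper along these lines could detect such a mismatch; `dev_cmp` is a hypothetical name, and the namespace path via `/proc/<pid>/root` requires root on a real host:

```shell
#!/bin/sh
# Hypothetical helper (not part of libvirt): compare the MAJOR:MINOR of
# two device-node paths, e.g. a host node and its copy in a qemu
# namespace. Returns 0 on match, 1 on mismatch, 2 on stat failure.
dev_cmp() {
    a=$(stat -c '%t:%T' "$1") || return 2
    b=$(stat -c '%t:%T' "$2") || return 2
    if [ "$a" = "$b" ]; then
        echo "match: $a"
    else
        echo "MISMATCH: host=$a ns=$b"
        return 1
    fi
}

# Example on a real host (needs root; paths are illustrative):
#   dev_cmp /dev/sdb /proc/$(pgrep qemu)/root/dev/sdb
```

After the fixed libvirt forcibly recreates the node on attach, the two paths compare equal again.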