Bug 1775679
Summary: | unable to execute QEMU command '__com.redhat_drive_add': could not open disk image "Operation not permitted" [rhel-7.6.z] | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 7 | Reporter: | RAD team bot copy to z-stream <autobot-eus-copy> |
Component: | libvirt | Assignee: | Michal Privoznik <mprivozn> |
Status: | CLOSED ERRATA | QA Contact: | Han Han <hhan> |
Severity: | urgent | Docs Contact: | |
Priority: | urgent | ||
Version: | 7.6 | CC: | amashah, chhudson, dyuan, gveitmic, hhan, hreitz, jdenemar, jinzhao, jsuchane, juzhang, kcleveng, kwolf, libvirt-maint, lmen, mprivozn, mtessun, qinwang, virt-maint, xuzhang |
Target Milestone: | rc | Keywords: | Upstream, ZStream |
Target Release: | --- | ||
Hardware: | Unspecified | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | libvirt-4.5.0-10.el7_6.15 | Doc Type: | Bug Fix |
Doc Text: |
Cause:
To enhance security and work around some races with udev (or any other software that changes SELinux labels on /dev/* nodes while a domain is running), libvirt spawns each domain in its own private namespace with a private /dev. This puts an additional burden on libvirt, which then has to create /dev/* nodes in APIs like virDomainAttachDevice (aka virsh attach-device) and remove them in virDomainDetachDevice (aka virsh detach-device). For generic devices this works well: libvirt looks into the host's /dev and creates the corresponding /dev/* nodes in the domain's private namespace exactly as they are in the host. However, there is one exception: disks. On a disk hot-unplug libvirt does not remove any /dev/* nodes from the private namespace because they might still be in use by the backing chain of some other disk. This is what causes the problem. Imagine a /dev/nvme0n1 disk which has some MAJOR:MINOR number (these identify a device uniquely at the kernel level). Hot-plug the disk into a domain and libvirt creates an exact copy of the node in the domain's namespace. Then hot-unplug the disk from the domain and libvirt keeps /dev/nvme0n1 in the domain's namespace. Now hot-unplug the NVMe disk from the host and hot-plug it back again; the MINOR number is likely to change after this. At this point the MINOR number in the host no longer matches the one in the domain's namespace.
Consequence:
QEMU ends up opening a different device than the one it intends to, and the devices cgroup denies it access.
Fix:
The fix consists of forcibly creating /dev/nvme0n1 (or, in general, any other device node) even if it already exists in the domain's private namespace. This way we can be sure that it is an exact copy of the node in the host's namespace.
Result:
Hotplugging disks multiple times works again.
|
Story Points: | --- |
Clone Of: | 1752978 | Environment: | |
Last Closed: | 2019-12-10 12:38:23 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 1752978 | ||
Bug Blocks: |
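As a rough illustration of the Cause and Fix in the Doc Text above, the sketch below compares a host device number with a stale one and shows what forcibly re-creating a node could look like. It is written in Python for brevity and is not libvirt's actual code (libvirt is written in C); the helper names `format_devnum`, `is_stale` and `force_create` are hypothetical, and the 8:16 / 8:17 numbers are simply the ones used in the reproducer in this report.

```python
import os


def format_devnum(rdev: int) -> str:
    """Render a raw device number as MAJOR:MINOR, e.g. '8:16'."""
    return f"{os.major(rdev)}:{os.minor(rdev)}"


def is_stale(host_rdev: int, ns_rdev: int) -> bool:
    """A namespace node is stale if its device number differs from the host's."""
    return host_rdev != ns_rdev


def force_create(path: str, mode: int, rdev: int) -> None:
    """Hypothetical helper: drop any existing node at `path` and re-create it.

    Creating block or char nodes needs CAP_MKNOD, so this is not called below.
    """
    try:
        os.unlink(path)
    except FileNotFoundError:
        pass
    os.mknod(path, mode, rdev)


if __name__ == "__main__":
    host = os.makedev(8, 16)       # /dev/sdb as seen on the host
    namespace = os.makedev(8, 17)  # stale copy left in the domain's /dev
    print(format_devnum(host), format_devnum(namespace), is_stale(host, namespace))
```

Re-creating the node unconditionally, rather than only when a mismatch is detected, mirrors the wording of the Fix: the node in the private namespace is always made to match the host.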
Description
RAD team bot copy to z-stream
2019-11-22 14:54:45 UTC
Reproduced on libvirt-4.5.0-10.el7_6.14.x86_64, no NVMe disk required:

1. Start a VM.
2. Check the current MAJ:MIN number of the host disk:

    # lsblk
    sdb 8:16 0 10G 0 disk

3. Change the MAJ:MIN number of the disk in the QEMU namespace:

    # nsenter -m -t $(pidof qemu-kvm) mknod /dev/sdb b 8 17

4. Live attach the disk:

    # virsh attach-disk pc /dev/sdb sdb
    error: Failed to attach disk
    error: internal error: unable to execute QEMU command '__com.redhat_drive_add': Device 'drive-scsi0-0-0-1' could not be initialized

Test on libvirt-4.5.0-10.el7_6.15.x86_64:

1. Prepare as in steps 1-3 above.
2. Live attach the disk:

    # virsh attach-disk pc /dev/sdb sdb
    Disk attached successfully

3. Detach and reattach the disk:

    # virsh detach-disk pc /dev/sdb
    Disk detached successfully
    # virsh attach-disk pc /dev/sdb sdb
    Disk attached successfully

It works as expected. Next I will check the patch and run some regression tests to see whether there are any regressions. BTW, I find the fix may affect other host char or block devices. I will update the results for them then.

Verified version:
libvirt-4.5.0-10.el7_6.15.x86_64
qemu-kvm-rhev-2.12.0-18.el7_6.7.x86_64

For the following host passthrough devices, we tested the bug-reproducing scenario (https://bugzilla.redhat.com/show_bug.cgi?id=1775679#c9), VM start with these devices, and device hotplug/hotunplug; all PASS.
- block disk
- hostdev
- usb
- mdev
- scsi
- mdev
- nvdimm memory device
- char device
- input device
- rng device

The hostdev scsi_host and tpm devices are not supported in RHEL 7.6, so they were skipped.
The graphics gl device does not support hotplug/unplug, so only starting a VM with this device was tested; PASS.

I also tested snapshot and blockjob related cases, PASS:
- live external disk snapshot on a block device (LVM LVs)
- blockcommit:
  - shallow commit from the active layer, then pivot to the destination layer
  - commit from the active layer for the whole backing chain, then pivot to the destination layer
- blockcopy:
  - shallow copy to a block device, then pivot to the new layer
  - copy the whole backing chain to a block device, then pivot to the new layer

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:4169
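The MAJ:MIN comparison behind the reproducer steps earlier in this report (the `lsblk` check followed by `mknod` in the QEMU namespace) can also be scripted. The following is a minimal sketch, assuming a simple whole-disk `lsblk` row like the one quoted in the report and assuming the process's view of /dev can be read through `/proc/<pid>/root` (which normally requires root). The function names `parse_lsblk_line` and `namespace_devnum` are hypothetical and not part of libvirt or any test suite.

```python
import os
from typing import Tuple


def parse_lsblk_line(line: str) -> Tuple[str, int, int]:
    """Split an lsblk row such as 'sdb 8:16 0 10G 0 disk' into (name, major, minor)."""
    fields = line.split()
    major, minor = fields[1].split(":")
    return fields[0], int(major), int(minor)


def namespace_devnum(pid: int, dev_path: str) -> Tuple[int, int]:
    """Read MAJ:MIN of `dev_path` as seen by process `pid`.

    /proc/<pid>/root exposes the filesystem view of that process, including
    its mount namespace, so this inspects the domain's private /dev.
    """
    st = os.stat(f"/proc/{pid}/root{dev_path}")
    return os.major(st.st_rdev), os.minor(st.st_rdev)


if __name__ == "__main__":
    name, major, minor = parse_lsblk_line("sdb 8:16 0 10G 0 disk")
    # After step 3 of the reproducer the namespace copy would report (8, 17),
    # which no longer matches the host's (8, 16).
    print(name, (major, minor) == (8, 17))
```

With root privileges, `namespace_devnum(<qemu pid>, "/dev/sdb")` could be compared against the parsed host values before and after attaching the disk.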