Bug 1775679 - unable to execute QEMU command '__com.redhat_drive_add': could not open disk image "Operation not permitted" [rhel-7.6.z]
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: libvirt
Version: 7.6
Hardware: Unspecified
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: ---
Assignee: Michal Privoznik
QA Contact: Han Han
URL:
Whiteboard:
Depends On: 1752978
Blocks:
Reported: 2019-11-22 14:54 UTC by RAD team bot copy to z-stream
Modified: 2023-09-07 21:04 UTC
CC List: 19 users

Fixed In Version: libvirt-4.5.0-10.el7_6.15
Doc Type: Bug Fix
Doc Text:
Cause: To enhance security and to work around races with udev (or any other software that changes SELinux labels on /dev/* nodes while a domain is running), libvirt spawns each domain in its own private namespace with a private /dev. This puts an additional burden on libvirt, which then has to update /dev/* nodes for some APIs such as virDomainAttachDevice (aka virsh attach-device) and remove them for virDomainDetachDevice (aka virsh detach-device). For generic devices this works well: libvirt looks into the host's /dev and creates the corresponding /dev/* nodes in the domain's private namespace exactly as they are on the host. However, there is one exception: disks. On a disk hot-unplug, libvirt does not remove any /dev/* nodes from the private namespace, because they might still be in use by the backing chain of some other disk. This is what causes the problem. Imagine a /dev/nvme0n1 disk with some MAJOR:MINOR number (which identifies the device uniquely at the kernel level). Hotplug the disk into a domain, and libvirt creates an exact copy of the node in the domain's namespace. Hot-unplug the disk from the domain, and libvirt keeps /dev/nvme0n1 in the domain's namespace. Now hot-unplug the NVMe disk from the host and hotplug it back again: the MINOR number is likely to change. At this point the MINOR number on the host no longer matches the one in the domain's namespace.
Consequence: QEMU tries to open a different device than the one it expects, and the devices cgroup denies it access.
Fix: libvirt now forcibly creates /dev/nvme0n1 (or, in general, any other device) even if it already exists in the domain's private namespace. This ensures the node is an exact copy of the one in the host's namespace.
Result: Hotplugging disks multiple times works again.
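The Cause and Fix above can be illustrated with a toy model. This is only a sketch and not libvirt's code: plain files containing a "MAJOR:MINOR" string stand in for device nodes so that it runs without root, and every name in it (host_dev, ns_dev, plug_old, plug_fixed) as well as the 259:0 and 259:1 numbers are assumptions made for illustration.

```shell
#!/bin/sh
# Toy model only: regular files holding "MAJOR:MINOR" stand in for device
# nodes, so nothing here touches the real /dev or needs root.
# host_dev, ns_dev, plug_old and plug_fixed are hypothetical names.
work=$(mktemp -d)
host_dev="$work/host"
ns_dev="$work/ns"
mkdir -p "$host_dev" "$ns_dev"

# Behaviour described under "Cause": create the node only if it is missing.
plug_old() {
    [ -e "$ns_dev/$1" ] || cp "$host_dev/$1" "$ns_dev/$1"
}

# Behaviour described under "Fix": always recreate the node from the host.
plug_fixed() {
    rm -f "$ns_dev/$1"
    cp "$host_dev/$1" "$ns_dev/$1"
}

echo "259:0" > "$host_dev/nvme0n1"   # disk appears on the host
plug_old nvme0n1                     # hotplug into the domain
# Hot-unplug from the domain: the node is deliberately left in the namespace.
echo "259:1" > "$host_dev/nvme0n1"   # host re-plug, MINOR changes

plug_old nvme0n1
stale=$(cat "$ns_dev/nvme0n1")       # namespace still holds the old numbers

plug_fixed nvme0n1
fresh=$(cat "$ns_dev/nvme0n1")       # namespace matches the host again

echo "old logic: $stale, fixed logic: $fresh"
rm -rf "$work"
```

With the old logic the second hotplug leaves the stale numbers in place, which corresponds to the mismatch the Consequence describes; recreating the node unconditionally removes that mismatch.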
Clone Of: 1752978
Environment:
Last Closed: 2019-12-10 12:38:23 UTC
Target Upstream Version:
Embargoed:


Links
System                  ID              Private  Priority  Status  Summary  Last Updated
Red Hat Product Errata  RHBA-2019:4169  0        None      None    None     2019-12-10 12:38:28 UTC

Description RAD team bot copy to z-stream 2019-11-22 14:54:45 UTC
This bug has been copied from bug #1752978 and has been proposed to be backported to 7.6 z-stream (EUS).

Comment 9 Han Han 2019-11-27 02:37:12 UTC
Reproduced on libvirt-4.5.0-10.el7_6.14.x86_64, no nvme disk required:
1. Start a vm
2. Check current MAJ:MIN number of host disk:
# lsblk
sdb                              8:16   0    10G  0 disk 

3. Change MAJ:MIN number of disk in qemu namespace
# nsenter -m -t $(pidof qemu-kvm) mknod /dev/sdb b 8 17

4. Live attach the disk
# virsh attach-disk pc /dev/sdb sdb
error: Failed to attach disk
error: internal error: unable to execute QEMU command '__com.redhat_drive_add': Device 'drive-scsi0-0-0-1' could not be initialized

Test on libvirt-4.5.0-10.el7_6.15.x86_64:
1. Prepare as in steps 1-3 above
2. Live attach the disk
# virsh attach-disk pc /dev/sdb sdb
Disk attached successfully

3. Detach and reattach the disk
# virsh detach-disk pc /dev/sdb
Disk detached successfully

# virsh attach-disk pc /dev/sdb sdb
Disk attached successfully

It works as expected.
Next I will review the patch and run some regression tests to check for regressions.
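To confirm the mismatch that step 3 of the reproduction creates, the device numbers on the host can be compared with those seen inside the QEMU process's mount namespace. The helper below is a minimal sketch, assuming GNU stat is available; dev_numbers is a hypothetical name, not part of libvirt or virsh.

```shell
#!/bin/sh
# Print a device node's numbers as decimal MAJOR:MINOR.
# GNU stat reports them in hex via %t (major) and %T (minor).
# dev_numbers is a hypothetical helper name used for illustration.
dev_numbers() {
    hex=$(stat -c '%t %T' "$1") || return 1
    major=${hex% *}    # text before the space
    minor=${hex#* }    # text after the space
    printf '%d:%d\n' "0x$major" "0x$minor"
}

# Example on a node that exists on any Linux system:
dev_numbers /dev/null

# Host view of the disk from the reproduction (path taken from the steps above):
#   dev_numbers /dev/sdb
# View from inside the QEMU mount namespace (needs root); the raw hex pair
# printed here can be compared with the host's values:
#   nsenter -m -t "$(pidof qemu-kvm)" stat -c '%t %T' /dev/sdb
```

If the two views report different numbers for the same path, the domain's private /dev holds a stale node, which is the condition the reproduction sets up with mknod.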

Comment 10 Han Han 2019-11-27 09:21:08 UTC
BTW, I found that the fix may affect other host char or block devices. I will post the test results for them later.

Comment 11 Han Han 2019-11-29 07:48:12 UTC
Verified version: libvirt-4.5.0-10.el7_6.15.x86_64 qemu-kvm-rhev-2.12.0-18.el7_6.7.x86_64

For the following host passthrough devices, we tested the bug-reproducing scenario
(https://bugzilla.redhat.com/show_bug.cgi?id=1775679#c9), VM start with these devices,
and device hotplug/hot-unplug; all PASS.
- block disk
- hostdev
  - usb
  - mdev
  - scsi
  - mdev
- nvdimm memory device
- char device
- input device
- rng device

The hostdev scsi_host and tpm devices are not supported in RHEL 7.6, so they were skipped.
The graphics gl devices do not support hotplug/hot-unplug, so only starting a VM with
this device was tested: PASS.

And I also tested snapshot and blockjob related cases, PASS:
- live external disk snapshot on block device(lvm LVs)
- blockcommit:
  - shallow commit from the active layer, then pivot to the destination layer
  - commit from the active layer for the whole backing chain, then pivot to the destination layer
- blockcopy:
  - shallow copy to a block device, then pivot to the new layer
  - copy the whole backing chain to a block device, then pivot to the new layer

Comment 15 errata-xmlrpc 2019-12-10 12:38:23 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:4169

