Bug 1775680

Summary: unable to execute QEMU command '__com.redhat_drive_add': could not open disk image "Operation not permitted" [rhel-7.7.z]
Product: Red Hat Enterprise Linux 7 Reporter: RAD team bot copy to z-stream <autobot-eus-copy>
Component: libvirt Assignee: Michal Privoznik <mprivozn>
Status: CLOSED ERRATA QA Contact: Han Han <hhan>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 7.6 CC: amashah, chhudson, dyuan, hhan, hreitz, jdenemar, jinzhao, jsuchane, juzhang, kwolf, lmen, mprivozn, mtessun, qinwang, virt-maint, xuzhang
Target Milestone: rc Keywords: Upstream, ZStream
Target Release: ---   
Hardware: Unspecified   
OS: Linux   
Whiteboard:
Fixed In Version: libvirt-4.5.0-23.el7_7.4 Doc Type: Bug Fix
Doc Text:
Cause: To enhance security and work around races with udev (or any other software that changes SELinux labels on /dev/* nodes while a domain is running), libvirt spawns each domain in its own private namespace with a private /dev. This puts an additional burden on libvirt, which then has to create /dev/* nodes for APIs such as virDomainAttachDevice (virsh attach-device) and remove them for virDomainDetachDevice (virsh detach-device). For generic devices this works well: libvirt looks into the host's /dev and creates the corresponding /dev/* nodes in the domain's private namespace exactly as they are on the host. There is one exception: disks. On a disk hot unplug, libvirt does not remove any /dev/* nodes from the private namespace, because they might still be in use by the backing chain of some other disk. This is what causes the problem. Consider a /dev/nvme0n1 disk with some MAJOR:MINOR number (which identifies the device uniquely at the kernel level). Hotplug the disk into a domain, and libvirt creates an exact copy of the node in the domain's namespace. Hot unplug the disk from the domain, and libvirt keeps /dev/nvme0n1 in the domain's namespace. Now hot unplug the NVMe disk from the host and hotplug it back again: the MINOR number is likely to change. At this point the MINOR number on the host no longer matches the one in the domain's namespace.
Consequence: QEMU tries to open a device other than the one it expects, and the devices cgroup denies it access.
Fix: libvirt now forcibly creates /dev/nvme0n1 (or, in general, any other device node) even if it already exists in the domain's private namespace. This ensures that the node is an exact copy of the one in the host's namespace.
Result: Hotplugging a disk multiple times works again.
Story Points: ---
Clone Of: 1752978 Environment:
Last Closed: 2020-02-04 19:29:46 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1752978    
Bug Blocks:    
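
The discrepancy described in the Doc Text can be observed by comparing a node's MAJOR:MINOR number on the host with the one inside the domain's private namespace. The following is a minimal shell sketch for illustration only, not part of the fix. The function names are hypothetical, it assumes GNU stat (whose %t and %T formats print the major and minor numbers in hex) and util-linux nsenter, and the namespace lookup needs root.

```shell
# hex_to_devnum MAJHEX MINHEX
# Convert the hex major/minor pair printed by GNU stat to decimal MAJOR:MINOR.
hex_to_devnum() {
    printf '%d:%d\n' "0x$1" "0x$2"
}

# devnum PATH
# Print the decimal MAJOR:MINOR of a device node as seen on the host.
devnum() {
    # Left unquoted on purpose so that "%t %T" splits into two arguments.
    hex_to_devnum $(stat -c '%t %T' "$1")
}

# ns_devnum PID PATH
# Same as devnum, but inside the mount namespace of process PID (requires root).
ns_devnum() {
    hex_to_devnum $(nsenter -m -t "$1" stat -c '%t %T' "$2")
}

# compare_devnum HOST NS
# Report whether two MAJOR:MINOR strings match; return non-zero on a mismatch.
compare_devnum() {
    if [ "$1" = "$2" ]; then
        echo "match: $1"
    else
        echo "stale: host=$1 namespace=$2"
        return 1
    fi
}
```

On a host with a single running QEMU process, a check for /dev/sdb might look like: compare_devnum "$(devnum /dev/sdb)" "$(ns_devnum "$(pgrep qemu)" /dev/sdb)". A "stale" result corresponds to the state in which the unfixed libvirt fails with "Operation not permitted".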

Description RAD team bot copy to z-stream 2019-11-22 14:55:13 UTC
This bug has been copied from bug #1752978 and has been proposed to be backported to 7.7 z-stream (EUS).

Comment 4 Han Han 2019-12-19 07:07:49 UTC
Verified on libvirt-4.5.0-23.el7_7.4.x86_64 qemu-kvm-rhev-2.12.0-33.el7_7.7.x86_64

1. Steps
1.1 Start a VM
1.2 Create a node in the QEMU namespace whose maj:min number differs from that of the device on the host
1.3 Attach and detach the host device
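
Step 1.2 can be sketched as a small helper that only prints the command that plants a mismatched node, so it can be reviewed before being run as root. This is an illustrative sketch: the function name is hypothetical, and bumping the minor number by one is an arbitrary choice (any value that differs from the host's serves the purpose).

```shell
# mismatched_mknod PID PATH TYPE MAJOR MINOR
# Print (do not run) an nsenter/mknod command that creates PATH inside the
# mount namespace of PID with the host's MAJOR but with MINOR + 1.
# TYPE is "b" for a block device or "c" for a character device.
mismatched_mknod() {
    echo "nsenter -m -t $1 mknod $2 $3 $4 $(( $5 + 1 ))"
}
```

For the host node /dev/sdb with number 8, 16 and a QEMU process with PID 1234 (a made-up value), mismatched_mknod 1234 /dev/sdb b 8 16 prints: nsenter -m -t 1234 mknod /dev/sdb b 8 17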

2. Devices to be tested
- block disk device
- usb device
- input device
- serial device


3. Detailed steps
First, start a VM
# virsh list
 Id    Name                           State
----------------------------------------------------
 1     vm                             running


3.1 Test a block disk device, following the steps in section 1
➜   modprobe scsi_debug add_host=3
➜   lsscsi
[0:0:0:0]    disk    ATA      WDC WD2500AAJS-7 3E01  /dev/sda
[1:0:0:0]    cd/dvd  PLDS     DVD-ROM DH-16D3S SD11  /dev/sr0
[4:0:0:0]    disk    Linux    scsi_debug       0004  /dev/sdb
[5:0:0:0]    disk    Linux    scsi_debug       0004  /dev/sdc
[6:0:0:0]    disk    Linux    scsi_debug       0004  /dev/sdd
➜   cat disk.xml
  <disk type='block' device='disk'>
    <driver name='qemu' type='raw'/>
    <source dev='/dev/sdb'/>
    <target dev='sdb' bus='scsi'/>
  </disk>

➜   ls -l /dev/sdb
brw-rw----. 1 root disk 8, 16 Dec 19 00:52 /dev/sdb

➜   nsenter -m -t $(pgrep qemu) mknod /dev/sdb b 8 17
➜   virsh attach-device vm disk.xml
Device attached successfully

➜   virsh detach-device vm disk.xml
Device detached successfully


3.2 Test a USB device, following the steps in section 1
➜   cat hostdev-usb.xml 
  <hostdev mode='subsystem' type='usb'>
    <source startupPolicy='mandatory'>
      <address bus='1' device='2'/>
    </source>
  </hostdev>

➜   lsusb              
Bus 001 Device 004: ID 0624:0854 Avocent Corp. 
Bus 001 Device 002: ID 2297:0210  
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 002 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub

➜   ls /dev/bus/usb/001/002 -l
crw-rw-r--. 1 root root 189, 1 Jul  4 02:33 /dev/bus/usb/001/002

➜   nsenter -m -t $(pgrep qemu) mkdir /dev/bus/usb/001/ -p
➜   nsenter -m -t $(pgrep qemu) mknod /dev/bus/usb/001/002 c 189 2

➜   virsh attach-device vm hostdev-usb.xml
Device attached successfully

➜   virsh detach-device vm hostdev-usb.xml
Device detached successfully


3.3 Test an input device, following the steps in section 1
➜   cat input.xml 
  <input type='passthrough' bus='virtio'>
    <source evdev='/dev/input/event1'/>
  </input>

➜   ls -al /dev/input/event1 -al                               
crw-rw----. 1 root input 13, 65 Jul  4 02:33 /dev/input/event1

➜   nsenter -m -t $(pgrep qemu) mkdir /dev/input/ -p
➜   nsenter -m -t $(pgrep qemu) mknod /dev/input/event1 c 13 33
➜   virsh attach-device vm input.xml
Device attached successfully

➜   virsh detach-device vm input.xml
Device detached successfully


3.4 Test a serial device, following the steps in section 1
➜   cat serial.xml 
  <serial type='dev'>
    <source path='/dev/ttyS2'/>
    <target type='pci-serial' port='0'/>
  </serial>

➜   ls /dev/ttyS2 -l
crw-rw----. 1 root dialout 4, 66 Jul  4 02:33 /dev/ttyS2
➜   nsenter -m -t $(pgrep qemu) mknod /dev/ttyS2 c 4 67
➜   virsh attach-device vm serial.xml
Device attached successfully

➜   virsh detach-device vm serial.xml
Device detached successfully

Comment 6 errata-xmlrpc 2020-02-04 19:29:46 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0367