Bug 1729022

Summary: Attaching a device with a taken address succeeds when vm inactive
Product: Red Hat Enterprise Linux Advanced Virtualization
Reporter: Han Han <hhan>
Component: libvirt
Assignee: Michal Privoznik <mprivozn>
Status: CLOSED ERRATA
QA Contact: Han Han <hhan>
Severity: urgent
Docs Contact:
Priority: unspecified
Version: 8.1
CC: bugproxy, chhu, dyuan, hannsj_uhl, jdenemar, jiyan, jsuchane, lmen, mprivozn, xuzhang
Target Milestone: rc
Keywords: Patch, Upstream
Target Release: 8.1
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version: libvirt-5.6.0-1.el8
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1692296
Environment:
Last Closed: 2019-11-06 07:17:49 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1692296, 1692354
Bug Blocks: 1624641

Description Han Han 2019-07-11 08:23:06 UTC
+++ This bug was initially created as a clone of Bug #1692296 +++

Description of problem:
If a user tries to attach a disk whose attach XML contains an <address/> that is already assigned to another disk in the domain, the operation doesn't fail and libvirt attaches the disk successfully.

Version-Release number of selected component (if applicable):


How reproducible:
100%

Steps to Reproduce:
1. Prepare a domain which has a disk with address <address type='drive' controller='0' bus='0' target='0' unit='0'/>
2. Prepare a disk XML for attach and let it have the same <address/>, for instance:

    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='none'/>
      <source file='/var/lib/libvirt/images/fd.img'/>
      <target dev='sdc' bus='scsi'/>
      <readonly/>
      <shareable/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>

3. Try to attach the disk into the domain: virsh attach-device $dom $file

Actual results:
For a running domain an error is produced, but it comes from qemu; libvirt responds by performing a rollback which may detach the disk from step 1).
For an inactive domain, the disk is added to the domain XML, leaving the domain unable to start.


Expected results:
Attaching a device in step 3) must fail.


Additional info:
Fixed upstream as:

f1d6585300 domain_conf: check device address before attach

v5.1.0-367-gf1d6585300
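
For illustration only, a minimal sketch of the kind of check that commit adds -- hypothetical types and names, not libvirt's actual code: walk the disks already present in the domain definition and reject the attach when the new disk's <address/> is already taken.

/* Illustrative sketch, not libvirt's real API. */
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    unsigned int controller;
    unsigned int bus;
    unsigned int target;
    unsigned int unit;
} DriveAddress;                 /* <address type='drive' .../> */

typedef struct {
    const char *target_dev;     /* e.g. "sda" */
    DriveAddress addr;
} DiskDef;

/* Return true if new_disk's drive address collides with an existing disk. */
bool
disk_address_taken(const DiskDef *disks, size_t ndisks, const DiskDef *new_disk)
{
    size_t i;

    for (i = 0; i < ndisks; i++) {
        if (disks[i].addr.controller == new_disk->addr.controller &&
            disks[i].addr.bus == new_disk->addr.bus &&
            disks[i].addr.target == new_disk->addr.target &&
            disks[i].addr.unit == new_disk->addr.unit)
            return true;        /* the attach must be rejected with an error */
    }
    return false;
}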

--- Additional comment from RHEL Product and Program Management on 2019-03-25 09:45:33 UTC ---

Since this bug report was entered in Red Hat Bugzilla, the release flag has been set to ? to ensure that it is properly evaluated for this release.

--- Additional comment from Michal Privoznik on 2019-04-18 16:46:49 UTC ---

To POST:

http://post-office.corp.redhat.com/archives/rhvirt-patches/2019-April/msg00917.html

--- Additional comment from errata-xmlrpc on 2019-04-23 09:06:01 UTC ---

Bug report changed to ON_QA status by Errata System.
A QE request has been submitted for advisory RHEA-2019:40684-01
https://errata.devel.redhat.com/advisory/40684

--- Additional comment from Han Han on 2019-04-30 07:18:06 UTC ---

Reproduced on upstream libvirt-4.9:
1. Start a vm with scsi disk:
<disk type="file" device="disk">
      <driver name="qemu" type="raw" />
      <source file="/tmp/b" />
      <backingStore />
      <target dev="sda" bus="scsi" />
      <readonly />
      <alias name="scsi0-0-0-0" />
      <address type="drive" controller="0" bus="0" target="0" unit="0" />
    </disk>

# virsh qemu-monitor-command rhel --hmp info block
drive-virtio-disk0 (#block181): /var/lib/libvirt/images/rhel.qcow2 (qcow2)
    Attached to:      /machine/peripheral/virtio-disk0/virtio-backend
    Cache mode:       writeback

drive-scsi0-0-0-0 (#block395): /tmp/b (raw, read-only)
    Attached to:      scsi0-0-0-0
    Cache mode:       writeback


2. Attach a scsi disk with the same address:
# cat /tmp/disk.xml
<disk type="file" device="disk">
      <driver name="qemu" type="raw"/>
      <source file="/tmp/a" />
      <readonly/>
      <target dev="sdb" bus="scsi"/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>

# virsh attach-device rhel /tmp/disk.xml          
error: Failed to attach device from /tmp/disk.xml
error: internal error: unable to execute QEMU command 'device_add': Duplicate ID 'scsi0-0-0-0' for device

# virsh qemu-monitor-command rhel --hmp info block
drive-virtio-disk0 (#block181): /var/lib/libvirt/images/rhel.qcow2 (qcow2)
    Attached to:      /machine/peripheral/virtio-disk0/virtio-backend
    Cache mode:       writeback

scsi0-0-0-0: [not inserted]
    Attached to:      scsi0-0-0-0

Check the libvirtd log, and the following is found:
2019-04-30 07:09:11.465+0000: 21782: debug : qemuMonitorJSONCheckError:384 : unable to execute QEMU command {"execute":"device_add","arguments":{"driver":"scsi-hd","bus":"scsi0.0","channel":"0","scsi-id":"0","lun":"0","drive":"drive-scsi0-0-0-0","id":"scsi0-0-0-0"},"id":"libvirt-23"}: {"id":"libvirt-23","error":{"class":"GenericError","desc":"Duplicate ID 'scsi0-0-0-0' for device"}}
2019-04-30 07:09:11.465+0000: 21782: error : qemuMonitorJSONCheckError:395 : internal error: unable to execute QEMU command 'device_add': Duplicate ID 'scsi0-0-0-0' for device
2019-04-30 07:09:11.466+0000: 21782: debug : qemuMonitorDriveDel:2982 : drivestr=drive-scsi0-0-0-0
2019-04-30 07:09:11.466+0000: 21782: debug : qemuMonitorDriveDel:2984 : mon:0x7f4004023840 vm:0x7f3fd4053d60 json:1 fd:22
2019-04-30 07:09:11.466+0000: 21782: debug : qemuMonitorHMPCommandWithFd:1288 : mon:0x7f4004023840 vm:0x7f3fd4053d60 json:1 fd:22
2019-04-30 07:09:11.466+0000: 21782: debug : qemuMonitorJSONCommandWithFd:304 : Send command '{"execute":"human-monitor-command","arguments":{"command-line":"drive_del drive-scsi0-0-0-0"},"id":"libvirt-24"}' for write with FD -1
2019-04-30 07:09:11.466+0000: 21782: info : qemuMonitorSend:1083 : QEMU_MONITOR_SEND_MSG: mon=0x7f4004023840 msg={"execute":"human-monitor-command","arguments":{"command-line":"drive_del drive-scsi0-0-0-0"},"id":"libvirt-24"}^M
 fd=-1
2019-04-30 07:09:11.466+0000: 21778: info : qemuMonitorIOWrite:551 : QEMU_MONITOR_IO_WRITE: mon=0x7f4004023840 buf={"execute":"human-monitor-command","arguments":{"command-line":"drive_del drive-scsi0-0-0-0"},"id":"libvirt-24"}^M
 len=114 ret=114 errno=0
2019-04-30 07:09:11.468+0000: 21778: debug : qemuMonitorJSONIOProcessLine:196 : Line [{"return": "", "id": "libvirt-24"}]

--- Additional comment from Han Han on 2019-04-30 08:48:00 UTC ---

Verified on libvirt-4.5.0-15.virtcov.el7.x86_64 qemu-kvm-rhev-2.12.0-27.el7.x86_64:
1. Prepare a running vm with a scsi disk:
<disk type="file" device="disk">
      <driver name="qemu" type="raw" />
      <source file="/tmp/a" />
      <backingStore />
      <target dev="sda" bus="scsi" />
      <alias name="scsi0-0-0-0" />
      <address type="drive" controller="0" bus="0" target="0" unit="0" />
    </disk>


2. Attach a scsi disk with the same address:
# cat /tmp/disk.xml
<disk type="file" device="disk">
      <driver name="qemu" type="raw" />
      <source file="/tmp/b" />
      <backingStore />
      <target dev="sdb" bus="scsi" />
      <address type="drive" controller="0" bus="0" target="0" unit="0" />
    </disk>


# virsh attach-device rhel7.5 /tmp/disk.xml    
error: Failed to attach device from /tmp/disk.xml
error: Requested operation is not valid: Domain already contains a disk with that address

3. Check the disk status:
# virsh qemu-monitor-command rhel7.5 --hmp info block
drive-virtio-disk0 (#block119): /var/lib/libvirt/images/rhel7.5.img (qcow2)
    Attached to:      /machine/peripheral/virtio-disk0/virtio-backend
    Cache mode:       writeback

drive-scsi0-0-0-0 (#block381): /tmp/a (raw)
    Attached to:      scsi0-0-0-0
    Cache mode:       writeback

The original disk is still attached; it was not taken offline by a rollback.

--- Additional comment from Han Han on 2019-06-25 08:58:52 UTC ---

Hi Michal,
I see that qemuDomainAttachSCSIDisk is only called by qemuDomainAttachDeviceDiskLive. The fix doesn't include the duplicate SCSI address check for an inactive VM, right? Is there capacity to fix it for inactive VMs as well?

--- Additional comment from Han Han on 2019-06-25 09:04:58 UTC ---

Rerun as in comment 5, but for an inactive VM, on libvirt-4.5.0-23.virtcov.el7.x86_64. Libvirt doesn't check for a duplicate SCSI address on an inactive VM when attaching.

1. 
# cat /tmp/a.xml 
<disk type="file" device="disk">
      <driver name="qemu" type="raw" />
      <source file="/tmp/b" />
      <backingStore />
      <target dev="sda" bus="scsi" />
      <readonly />
      <alias name="scsi0-0-0-0" />
      <address type="drive" controller="0" bus="0" target="0" unit="0" />
    </disk>

# cat /tmp/b.xml 
<disk type="file" device="disk">
      <driver name="qemu" type="raw" />
      <source file="/tmp/b" />
      <backingStore />
      <target dev="sda" bus="scsi" />
      <readonly />
      <alias name="scsi0-0-0-0" />
      <address type="drive" controller="0" bus="0" target="0" unit="0" />
    </disk>

2. 
➜  libvirt-4.5.0 virsh -k0 attach-device PC /tmp/a.xml --config                                                                                                             
Device attached successfully

➜  libvirt-4.5.0 virsh -k0 attach-device PC /tmp/b.xml --config
Device attached successfully

➜  libvirt-4.5.0 virsh start PC --console                      
error: Failed to start domain PC
error: unsupported configuration: Found duplicate drive address for disk with target name 'sda' controller='0' bus='0' target='0' unit='0'

--- Additional comment from Michal Privoznik on 2019-06-26 13:48:51 UTC ---

(In reply to Han Han from comment #7)
>

Yes, that is a bug. I've posted a patch here:

https://www.redhat.com/archives/libvir-list/2019-June/msg01251.html

--- Additional comment from Michal Privoznik on 2019-06-26 13:50:53 UTC ---

As stated in comment 8 and comment 7, cold plug is still buggy, but I'm not sure it's worth taking this bug back to ASSIGNED. The scenario is a corner case and does not lead to any data loss. It's going to be fixed upstream, so we will eventually pick it up through a rebase; not in RHEL-7, though.

--- Additional comment from Han Han on 2019-06-27 01:48:08 UTC ---

Hi, chhu
Do you know how RHV or RHOS attaches a SCSI device to an inactive VM? Is a specific SCSI address present in the device XML?

--- Additional comment from  on 2019-06-28 02:52:58 UTC ---

Hi, Yanjing

Would you please help double-check the items below in RHV? Thank you!
- cold-plug a scsi device to the VM and start the VM
  -- check the VM XML after cold-plug without starting the VM - without address?
  -- check the VM XML after cold-plug and starting the VM - with address?
  -- check whether the operation succeeds or not - successfully?

--- Additional comment from jiyan on 2019-07-02 05:46:20 UTC ---

Hi @chhu @hhan
This scenario works well in RHV.

Version:
qemu-kvm-rhev-2.12.0-33.el7.x86_64
vdsm-4.30.22-1.el7ev.x86_64
libvirt-4.5.0-23.el7.x86_64
kernel-3.10.0-1057.el7.x86_64

Steps:
1. Operate in RHV.
   1> Create a new VM
   2> Add 3 disks to the VM, all on the scsi bus
   3> Configure other parameters

2. Start the VM; before it has fully started, check vdsm.log
# vim /var/log/vdsm/vdsm.log
domain xml info:

<disk snapshot="no" type="file" device="disk">
<target dev="sda" bus="scsi"></target>
<source file="/rhev/data-center/34c4810a-989d-11e9-9ac1-1866dae68908/3a2fd551-106c-4a84-9f52-1d0a255e8104/images/b02eacb3-cf98-48b0-bc73-7b55d3ad3c7a/48e0fe6c-c2ad-444d-abe7-d6bcaf96c60a">
<seclabel model="dac" type="none" relabel="no"></seclabel>
</source>
<driver name="qemu" io="threads" type="raw" error_policy="stop" cache="none"></driver>
<alias name="ua-b02eacb3-cf98-48b0-bc73-7b55d3ad3c7a"></alias>
<address bus="0" controller="0" unit="0" type="drive" target="0"></address>   ****************
<boot order="1"></boot>
<serial>b02eacb3-cf98-48b0-bc73-7b55d3ad3c7a</serial>
</disk>

<disk snapshot="no" type="file" device="disk">
<target dev="sdb" bus="scsi"></target>
<source file="/rhev/data-center/34c4810a-989d-11e9-9ac1-1866dae68908/3a2fd551-106c-4a84-9f52-1d0a255e8104/images/59478617-bf21-4484-b92d-5a02dfdb0a61/46f16e14-4ad5-4de9-a5be-86548c6ec5dd">
<seclabel model="dac" type="none" relabel="no"></seclabel>
</source>
<driver name="qemu" io="threads" type="raw" error_policy="stop" cache="none"></driver>
<alias name="ua-59478617-bf21-4484-b92d-5a02dfdb0a61"></alias>
<address bus="0" controller="0" unit="1" type="drive" target="0"></address>   ****************
<serial>59478617-bf21-4484-b92d-5a02dfdb0a61</serial>
</disk>

<disk snapshot="no" type="file" device="disk">
<target dev="sdc" bus="scsi">
</target><source file="/rhev/data-center/34c4810a-989d-11e9-9ac1-1866dae68908/3a2fd551-106c-4a84-9f52-1d0a255e8104/images/6515f144-ccb9-45a5-aac3-ffab78f05a19/c0754e43-a55d-496e-8f2c-9466967f3f8e">
<seclabel model="dac" type="none" relabel="no"></seclabel>
</source>
<driver name="qemu" io="threads" type="raw" error_policy="stop" cache="none"></driver>
<alias name="ua-6515f144-ccb9-45a5-aac3-ffab78f05a19"></alias>
<address bus="0" controller="0" unit="2" type="drive" target="0"></address>   ****************
<serial>6515f144-ccb9-45a5-aac3-ffab78f05a19</serial>
</disk>



3. The VM starts successfully; check the active XML
# virsh dumpxml vm
sasl...

    <disk type='file' device='cdrom'>
      <driver name='qemu' error_policy='report'/>
      <source startupPolicy='optional'/>
      <target dev='hdc' bus='ide'/>
      <readonly/>
      <alias name='ua-c58b9452-d966-485f-b56b-1eb9ff72074c'/>
      <address type='drive' controller='0' bus='1' target='0' unit='0'/>
    </disk>

    <disk type='file' device='disk' snapshot='no'>
      <driver name='qemu' type='raw' cache='none' error_policy='stop' io='threads'/>
      <source file='/rhev/data-center/mnt/intel-i72600-03.qe.lab.eng.nay.redhat.com:_home_nfs/3a2fd551-106c-4a84-9f52-1d0a255e8104/images/b02eacb3-cf98-48b0-bc73-7b55d3ad3c7a/48e0fe6c-c2ad-444d-abe7-d6bcaf96c60a'>
        <seclabel model='dac' relabel='no'/>
      </source>
      <backingStore/>
      <target dev='sda' bus='scsi'/>
      <serial>b02eacb3-cf98-48b0-bc73-7b55d3ad3c7a</serial>
      <boot order='1'/>
      <alias name='ua-b02eacb3-cf98-48b0-bc73-7b55d3ad3c7a'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>

    <disk type='file' device='disk' snapshot='no'>
      <driver name='qemu' type='raw' cache='none' error_policy='stop' io='threads'/>
      <source file='/rhev/data-center/mnt/intel-i72600-03.qe.lab.eng.nay.redhat.com:_home_nfs/3a2fd551-106c-4a84-9f52-1d0a255e8104/images/59478617-bf21-4484-b92d-5a02dfdb0a61/46f16e14-4ad5-4de9-a5be-86548c6ec5dd'>
        <seclabel model='dac' relabel='no'/>
      </source>
      <backingStore/>
      <target dev='sdb' bus='scsi'/>
      <serial>59478617-bf21-4484-b92d-5a02dfdb0a61</serial>
      <alias name='ua-59478617-bf21-4484-b92d-5a02dfdb0a61'/>
      <address type='drive' controller='0' bus='0' target='0' unit='1'/>
    </disk>

    <disk type='file' device='disk' snapshot='no'>
      <driver name='qemu' type='raw' cache='none' error_policy='stop' io='threads'/>
      <source file='/rhev/data-center/mnt/intel-i72600-03.qe.lab.eng.nay.redhat.com:_home_nfs/3a2fd551-106c-4a84-9f52-1d0a255e8104/images/6515f144-ccb9-45a5-aac3-ffab78f05a19/c0754e43-a55d-496e-8f2c-9466967f3f8e'>
        <seclabel model='dac' relabel='no'/>
      </source>
      <backingStore/>
      <target dev='sdc' bus='scsi'/>
      <serial>6515f144-ccb9-45a5-aac3-ffab78f05a19</serial>
      <alias name='ua-6515f144-ccb9-45a5-aac3-ffab78f05a19'/>
      <address type='drive' controller='0' bus='0' target='0' unit='2'/>
    </disk>

--- Additional comment from Michal Privoznik on 2019-07-08 12:30:37 UTC ---

Patch referenced in comment 8 is now merged:

commit 881686d4b15306fd5a5f0592d502ddb33ee6437e
Author:     Michal Privoznik <mprivozn>
AuthorDate: Wed Jun 26 15:35:11 2019 +0200
Commit:     Michal Privoznik <mprivozn>
CommitDate: Mon Jul 8 14:26:20 2019 +0200

    qemu: Validate disk against domain def on coldplug
    
    https://bugzilla.redhat.com/show_bug.cgi?id=1692296#c7
    
    This is a counterpart for ddc72f99027 and implements the same
    check for coldplug.
    
    Signed-off-by: Michal Privoznik <mprivozn>
    Reviewed-by: Erik Skultety <eskultet>

v5.5.0-61-g881686d4b1
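
To illustrate what the commit message means by a counterpart check for coldplug, here is a rough sketch (hypothetical names, not libvirt's real functions) in which the live and the persistent-config attach paths share the same validation:

/* Sketch only.  The live hotplug path already rejected a duplicate address;
 * the commit applies the same validation to the coldplug path. */
typedef struct DomainDef DomainDef;
typedef struct DiskDef DiskDef;

/* Shared check, e.g. the duplicate drive-address test sketched earlier in
 * this bug; stubbed here so the sketch stands alone. */
static int
validate_disk_against_def(const DomainDef *def, const DiskDef *disk)
{
    (void)def;
    (void)disk;
    return 0;       /* return -1 (with an error reported) on a collision */
}

int
attach_disk_live(DomainDef *live_def, DiskDef *disk)
{
    if (validate_disk_against_def(live_def, disk) < 0)
        return -1;              /* hotplug rejected before touching QEMU */
    /* ... issue device_add to QEMU ... */
    return 0;
}

int
attach_disk_config(DomainDef *persistent_def, DiskDef *disk)
{
    if (validate_disk_against_def(persistent_def, disk) < 0)
        return -1;              /* coldplug now rejected too, instead of
                                 * saving an XML that cannot start */
    /* ... append the disk to the persistent definition ... */
    return 0;
}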

Comment 1 Han Han 2019-08-06 06:47:52 UTC
Test on upstream libvirt-5.6:
1. Start a vm with scsi disk:
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2'/>
      <source file='/tmp/scsi.qcow2'/>
      <target dev='sdb' bus='scsi'/>
      <serial>second-scsi</serial>
      <address type='drive' controller='0' bus='0' target='0' unit='1'/>
    </disk>


2. Cold plug a scsi disk with the same address:
# cat /tmp/scsi-cdrom.xml
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <target dev='sdd' bus='scsi'/>
      <readonly/>
      <address type='drive' controller='0' bus='0' target='0' unit='1'/>
    </disk>

# virsh attach-device copy /tmp/scsi-cdrom.xml --config
error: Failed to attach device from /tmp/scsi-cdrom.xml
error: Requested operation is not valid: Domain already contains a disk with that address

Works as expected.

Comment 2 Han Han 2019-08-06 09:14:36 UTC
Although it works for this bug, I find that the following cases still show the problem:
- scsi disk cold update with an already used address
- sata disk cold plug with an already used address
- sata disk cold update with an already used address
- ide disk cold plug with an already used address
- ide disk cold update with an already used address

Michal, could you fix them all together?

Comment 4 Han Han 2019-09-18 07:15:22 UTC
Verified as in comment 1 on libvirt-5.6.0-5.module+el8.1.0+4229+2e4e348c.x86_64 qemu-kvm-4.1.0-10.module+el8.1.0+4234+33aa4f57.x86_64

Comment 5 Han Han 2019-09-18 09:33:14 UTC
For the issues in comment 2, how about solving them by calling virDomainDefValidate at the end of qemuDomainAttachDeviceConfig and qemuDomainUpdateDeviceConfig?
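
A rough sketch of what this suggestion could look like, under the assumption that the config-attach path works on a copy of the persistent definition; all helper names are hypothetical and validate_domain_def() merely stands in for virDomainDefValidate():

/* Sketch only; not libvirt's real code. */
typedef struct DomainDef DomainDef;
typedef struct DeviceDef DeviceDef;

extern DomainDef *copy_domain_def(const DomainDef *def);               /* hypothetical */
extern int add_device_to_def(DomainDef *def, DeviceDef *dev);          /* hypothetical */
extern int validate_domain_def(const DomainDef *def);                  /* stand-in for virDomainDefValidate() */
extern void commit_domain_def(DomainDef **current, DomainDef *updated);/* hypothetical */
extern void free_domain_def(DomainDef *def);                           /* hypothetical */

int
attach_device_config(DomainDef **persistent_def, DeviceDef *dev)
{
    DomainDef *copy = copy_domain_def(*persistent_def);

    if (!copy)
        return -1;

    if (add_device_to_def(copy, dev) < 0)
        goto error;

    /* The extra step suggested above: validate the whole resulting
     * definition before saving it, so a duplicate address for any device
     * type fails here instead of at the next 'virsh start'. */
    if (validate_domain_def(copy) < 0)
        goto error;

    commit_domain_def(persistent_def, copy);
    return 0;

 error:
    free_domain_def(copy);
    return -1;
}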

Comment 6 Michal Privoznik 2019-09-25 13:33:57 UTC
(In reply to Han Han from comment #5)
> For the issues of comment2, how about solve them by calling
> virDomainDefValidate at the end of qemuDomainAttachDeviceConfig and
> qemuDomainUpdateDeviceConfig?

Sorry for the late response. Yeah, that looks like a good solution. I'm currently swamped with other work, so if you want to cook up the patches I will happily review them.

Comment 8 errata-xmlrpc 2019-11-06 07:17:49 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:3723