Bug 1208009
| Field | Value |
|---|---|
| Summary | worldwide name (WWN) for storage needs to be uniquely identifiable |
| Product | Red Hat Enterprise Linux 7 |
| Component | libvirt |
| Version | 7.2 |
| Status | CLOSED ERRATA |
| Severity | medium |
| Priority | medium |
| Reporter | Sibiao Luo <sluo> |
| Assignee | Peter Krempa <pkrempa> |
| QA Contact | Virtualization Bugs <virt-bugs> |
| CC | chayang, dyuan, famz, gprocunier, juzhang, kwolf, lmen, michen, mkenneth, mzhan, pbonzini, pkrempa, pzhang, qzhang, rbalakri, rpacheco, virt-bugs, virt-maint, xfu, xuzhang, yanyang |
| Target Milestone | rc |
| Target Release | --- |
| Hardware | Unspecified |
| OS | Unspecified |
| Fixed In Version | libvirt-1.2.17-2.el7 |
| Doc Type | Bug Fix |
| Story Points | --- |
| Clone Of | 1207998 |
| Last Closed | 2015-11-19 06:26:37 UTC |
| Type | Bug |
| Regression | --- |
| Bug Depends On | 1207993, 1207998 |
Description
Sibiao Luo
2015-04-01 08:10:46 UTC
Host info:

```
# uname -r && rpm -q qemu-kvm
3.10.0-234.el7.x86_64
qemu-kvm-1.5.3-86.el7.x86_64
```

Moving to libvirt. QEMU doesn't do sanity checks on the virtual machine as a whole.

*** Bug 1207998 has been marked as a duplicate of this bug. ***

Fixed upstream:

```
commit 714b38cb232bcbbd7487af4c058fa6d0999b3326
Author: Peter Krempa <pkrempa>
Date:   Tue Apr 7 16:08:32 2015 +0200

    qemu: Enforce WWN to be unique among VM's disks

    Operating systems use the identifier to name the disks. As the name
    suggests the ID should be unique.

commit c35b011087548a9272c5339b6a88e4797536e093
Author: Peter Krempa <pkrempa>
Date:   Tue Apr 7 16:00:16 2015 +0200

    conf: ABI: Check WWN in disk abi stability check

    Since the WWN influences guest behavior in naming disks we should
    treat this as vm ABI.
```

v1.2.14-130-g714b38c

I can reproduce this issue, and verified it as follows.

Verified versions:
- libvirt-1.2.17-1.el7.x86_64
- qemu-kvm-rhev-2.3.0-7.el7.x86_64

Verification steps:

1. Check the documentation:

```
# firefox /usr/share/doc/libvirt-docs-1.2.17/html/formatdomain.html

wwn
    If present, this element specifies the WWN (World Wide Name) of a
    virtual hard disk or CD-ROM drive. It must be composed of 16
    hexadecimal digits and must be unique (at least among disks of a
    single domain).
```

2. Start a guest with multiple disks that use the same WWN:

```
# virsh edit testvm
......
<disk type='block' device='disk'>
  <driver name='qemu' type='raw'/>
  <source dev='/dev/sdb'/>
  <target dev='sda' bus='scsi'/>
  <wwn>0x5000c50015ea71aa</wwn>
</disk>
<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2'/>
  <source file='/var/lib/libvirt/images/qcow2.img'/>
  <target dev='sdb' bus='scsi'/>
  <wwn>0x5000c50015ea71aa</wwn>
</disk>
......
```
```
# virsh start testvm
error: Failed to start domain testvm
error: unsupported configuration: Disks 'sda' and 'sdb' have identical WWN
```

3. Change one SCSI disk to a cdrom and start the guest:

```
# virsh dumpxml testvm | grep disk -A 9
<disk type='file' device='cdrom'>
  <driver name='qemu' type='raw'/>
  <source startupPolicy='optional'/>
  <target dev='hda' bus='ide'/>
  <readonly/>
  <wwn>0x5000c50015ea71aa</wwn>
  <address type='drive' controller='0' bus='0' target='0' unit='1'/>
</disk>
<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2'/>
  <source file='/var/lib/libvirt/images/qcow2.img'/>
  <target dev='sde' bus='scsi'/>
  <wwn>0x5000c50015ea71aa</wwn>
  <address type='drive' controller='0' bus='0' target='0' unit='4'/>
</disk>

# virsh start testvm
error: Failed to start domain testvm
error: unsupported configuration: Disks 'hda' and 'sde' have identical WWN
```

4. ABI stability testing:

```
# virsh dumpxml testvm | grep disk -A 9
.....
<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2'/>
  <source file='/var/lib/libvirt/images/qcow2.img'/>
  <backingStore/>
  <target dev='sde' bus='scsi'/>
  <wwn>0x5000c50015ea71aa</wwn>
  <alias name='scsi0-0-0-4'/>
  <address type='drive' controller='0' bus='0' target='0' unit='4'/>
</disk>

# virsh dumpxml testvm > testvm.xml
```

Edit testvm.xml, changing `<wwn>0x5000c50015ea71aa</wwn>` to `<wwn>0x5000c50015ea71bb</wwn>`, then migrate:

```
# virsh migrate testvm --live qemu+ssh://$IP/system --verbose --xml testvm.xml
error: unsupported configuration: Target disk wwn '0x5000c50015ea71bb' does not match source '0x5000c50015ea71aa'
```

Hi Peter,

Since the WWN affects how the operating system names disks, I also tested hotplug and found the following issue. I wonder whether it also needs further modification in this bug. Thanks in advance.

1. Start a healthy guest with a disk that has a WWN, then attach disks using the same WWN that is already in use.
```
# virsh dumpxml testvm | grep disk -A 9
<disk type='block' device='disk'>
  <driver name='qemu' type='qcow2'/>
  <source dev='/dev/sdb'/>
  <target dev='sdb' bus='scsi'/>
  <wwn>0x5000c50015ea71aa</wwn>
  <address type='drive' controller='0' bus='0' target='0' unit='0'/>
</disk>
```

Log in to the guest and check:

```
# lsblk --scsi
NAME HCTL       TYPE VENDOR   MODEL            REV  TRAN
sda  0:0:0:1    disk QEMU     QEMU HARDDISK    2.3.
# ls -al /dev/disk/by-id/*
lrwxrwxrwx. 1 root root 9 Jul  7 16:22 /dev/disk/by-id/scsi-35000c50015ea71aa -> ../../sda
lrwxrwxrwx. 1 root root 9 Jul  7 16:22 /dev/disk/by-id/wwn-0x5000c50015ea71aa -> ../../sda
```

1.1 Attach the disks; both attach successfully:

```
# cat disk-wwn.xml
<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2'/>
  <source file='/var/lib/libvirt/images/qcow2.img'/>
  <backingStore/>
  <target dev='sde' bus='scsi'/>
  <wwn>0x5000c50015ea71aa</wwn>
  <alias name='scsi0-0-0-4'/>
  <address type='drive' controller='0' bus='0' target='0' unit='4'/>
</disk>

# virsh attach-device testvm disk-wwn.xml --persistent
Device attached successfully

# virsh attach-disk testvm /var/lib/libvirt/images/raw.img sdf --targetbus scsi --wwn 0x5000c50015ea71aa
Disk attached successfully

# virsh dumpxml testvm | grep disk -A 9
<disk type='block' device='disk'>
  <driver name='qemu' type='raw'/>
  <source dev='/dev/sdb'/>
  <backingStore/>
  <target dev='sdb' bus='scsi'/>
  <wwn>0x5000c50015ea71aa</wwn>
  <alias name='scsi0-0-0-1'/>
  <address type='drive' controller='0' bus='0' target='0' unit='1'/>
</disk>
<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2'/>
  <source file='/var/lib/libvirt/images/qcow2.img'/>
  <backingStore/>
  <target dev='sde' bus='scsi'/>
  <wwn>0x5000c50015ea71aa</wwn>
  <alias name='scsi0-0-0-4'/>
  <address type='drive' controller='0' bus='0' target='0' unit='4'/>
</disk>
<disk type='file' device='disk'>
  <driver name='qemu' type='raw'/>
  <source file='/var/lib/libvirt/images/raw.img'/>
  <backingStore/>
  <target dev='sdf' bus='scsi'/>
  <wwn>0x5000c50015ea71aa</wwn>
  <alias name='scsi0-0-0-5'/>
  <address type='drive' controller='0' bus='0' target='0' unit='5'/>
</disk>
```

Log in to the guest and check: in the /dev/disk/by-id/ directory only sdc exists:

```
# lsblk --scsi
NAME HCTL       TYPE VENDOR   MODEL            REV  TRAN
sda  0:0:0:1    disk QEMU     QEMU HARDDISK    2.3.
sdb  0:0:0:4    disk QEMU     QEMU HARDDISK    2.3.
sdc  0:0:0:5    disk QEMU     QEMU HARDDISK    2.3.
# ls -al /dev/disk/by-id/*
......
lrwxrwxrwx. 1 root root 9 Jul  7 16:32 /dev/disk/by-id/scsi-35000c50015ea71aa -> ../../sdc
lrwxrwxrwx. 1 root root 9 Jul  7 16:32 /dev/disk/by-id/wwn-0x5000c50015ea71aa -> ../../sdc
```

We definitely should make sure that the WWN check can't be bypassed by disk hotplug. I posted a patch upstream.

```
commit 780fe4e4baf7e2f10f65ba1a34f9274fc547cad2
Author: Peter Krempa <pkrempa>
Date:   Wed Jul 8 16:10:05 2015 +0200

    qemu: Check duplicate WWNs also for hotplugged disks

    In commit 714b38cb232bcbbd7487af4c058fa6d0999b3326 I tried to avoid
    having two disks with the same WWN in a VM. I forgot to check the
    hotplug paths though which make it possible to bypass that check.
    Reinforce the fix by checking the wwn when attaching the disk.
```

v1.2.17-75-g780fe4e

Verified versions:
- libvirt-1.2.17-2.el7.x86_64
- qemu-kvm-rhev-2.3.0-9.el7.x86_64

Steps:
1. Define and start a guest which has a disk with a WWN:

```
# virsh dumpxml testvm
<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2'/>
  <source file='/var/lib/libvirt/images/qcow2.img'/>
  <backingStore/>
  <target dev='sde' bus='scsi'/>
  <wwn>0x5000c50015ea71aa</wwn>
  <alias name='scsi0-0-0-4'/>
  <address type='drive' controller='0' bus='0' target='0' unit='4'/>
</disk>

# virsh attach-disk testvm /var/lib/libvirt/images/raw.img sdf --targetbus scsi --wwn 0x5000c50015ea71aa --live
error: Failed to attach disk
error: unsupported configuration: Domain already has a disk with wwn '0x5000c50015ea71aa'

# virsh attach-disk testvm /var/lib/libvirt/images/raw.img sdf --targetbus scsi --wwn 0x5000c50015ea71aa --config
error: Failed to attach disk
error: unsupported configuration: Domain already has a disk with wwn '0x5000c50015ea71aa'
```

Hotplugging a disk with the same WWN fails. According to comment 7 and the steps above, moving this bug to VERIFIED.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2202.html

I think this needs to be re-opened / reverted. This change prevents me from configuring multipath on guests when I do HBA passthrough of disks to guests via storage pools.
Given:

```
mpathc (360060e80105ade30056eafd300000005) dm-5 HITACHI ,DF600F
size=64G features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='service-time 0' prio=0 status=active
| |- 3:0:0:0 sde 8:64  active undef running
| `- 4:0:1:0 sdq 65:0  active undef running
```

Where:

```
[root@lgrhpar01 home]# sg_inq -p 0x83 /dev/sde
VPD INQUIRY: Device Identification page
  Designation descriptor number 1, descriptor length: 24
    designator_type: T10 vendor identification, code_set: ASCII
    associated with the addressed logical unit
      vendor id: HITACHI
      vendor specific: 911400510005
  Designation descriptor number 2, descriptor length: 6
    designator_type: vendor specific [0x0], code_set: ASCII
    associated with the target port
      vendor specific: 01
  Designation descriptor number 3, descriptor length: 20
    designator_type: NAA, code_set: Binary
    associated with the addressed logical unit
      NAA 6, IEEE Company_id: 0x60e8
      Vendor Specific Identifier: 0x105ade30
      Vendor Specific Identifier Extension: 0x56eafd300000005
      [0x60060e80105ade30056eafd300000005]

[root@lgrhpar01 home]# sg_inq -p 0x83 /dev/sdq
VPD INQUIRY: Device Identification page
  Designation descriptor number 1, descriptor length: 24
    designator_type: T10 vendor identification, code_set: ASCII
    associated with the addressed logical unit
      vendor id: HITACHI
      vendor specific: 911400510005
  Designation descriptor number 2, descriptor length: 6
    designator_type: vendor specific [0x0], code_set: ASCII
    associated with the target port
      vendor specific: 03
  Designation descriptor number 3, descriptor length: 20
    designator_type: NAA, code_set: Binary
    associated with the addressed logical unit
      NAA 6, IEEE Company_id: 0x60e8
      Vendor Specific Identifier: 0x105ade30
      Vendor Specific Identifier Extension: 0x56eafd300000005
      [0x60060e80105ade30056eafd300000005]
```

I create:

```
[root@lgrhpar01 home]# virsh pool-dumpxml pool-fc0p1
<pool type='scsi'>
  <name>pool-fc0p1</name>
  <uuid>9bcdbd52-231c-4172-825a-9684401f6dcd</uuid>
  <capacity unit='bytes'>137438953472</capacity>
  <allocation unit='bytes'>137438953472</allocation>
  <available unit='bytes'>0</available>
  <source>
    <adapter type='fc_host' wwnn='20000025b5000000' wwpn='20000025b50a0000'/>
  </source>
  <target>
    <path>/dev/disk/by-path</path>
    <permissions>
      <mode>0700</mode>
      <owner>0</owner>
      <group>0</group>
    </permissions>
  </target>
</pool>

[root@lgrhpar01 home]# virsh pool-dumpxml pool-fc1p1
<pool type='scsi'>
  <name>pool-fc1p1</name>
  <uuid>781c874a-6f10-4d9d-ac01-739b411a7c59</uuid>
  <capacity unit='bytes'>137438953472</capacity>
  <allocation unit='bytes'>137438953472</allocation>
  <available unit='bytes'>0</available>
  <source>
    <adapter type='fc_host' wwnn='20000025b5000000' wwpn='20000025b5000b00'/>
  </source>
  <target>
    <path>/dev/disk/by-path</path>
    <permissions>
      <mode>0700</mode>
      <owner>0</owner>
      <group>0</group>
    </permissions>
  </target>
</pool>
```

When I list the volumes in pool-fc0p1:

```
[root@lgrhpar01 home]# virsh vol-list pool-fc0p1 --details
Name        Path                                                            Type   Capacity   Allocation
---------------------------------------------------------------------------------------------------------
unit:0:0:0  /dev/disk/by-path/pci-0000:10:00.0-fc-0x50060e80105ade31-lun-0  block  64.00 GiB  64.00 GiB
unit:0:1:0  /dev/disk/by-path/pci-0000:10:00.0-fc-0x50060e80105ade3b-lun-0  block  64.00 GiB  64.00 GiB
```

I then present the disks to my guest:

```
<disk type='volume' device='disk'>
  <driver name='qemu' type='raw'/>
  <source pool='pool-fc0p1' volume='unit:0:0:0'/>
  <target dev='sda' bus='scsi'/>
  <wwn>056eafd300000005</wwn>
</disk>
<disk type='volume' device='disk'>
  <driver name='qemu' type='raw'/>
  <source pool='pool-fc0p1' volume='unit:0:1:0'/>
  <target dev='sdb' bus='scsi'/>
  <wwn>056eafd300000005</wwn>
</disk>
```

which then generates a parse failure because of this patch:

```
error: unsupported configuration: Disks 'sda' and 'sdb' have identical WWN
```

TL;DR: this patch fails to consider LUN passthrough from storage pools where the targets need to be managed by multipathd in the guest.
I agree that the patch should probably be reverted or, alternatively, libvirt should support QEMU's scsi-disk.port_index property and accept a unique (WWN, port_index) pair.

That said, I'm surprised that multipath works with this configuration. I would have expected that this is necessary:

```
<disk type='volume' device='lun'>
  <driver name='qemu' type='raw'/>
  <source pool='pool-fc0p1' volume='unit:0:0:0'/>
  <target dev='sda' bus='scsi'/>
</disk>
<disk type='volume' device='lun'>
  <driver name='qemu' type='raw'/>
  <source pool='pool-fc0p1' volume='unit:0:1:0'/>
  <target dev='sdb' bus='scsi'/>
</disk>
```

where device='lun' means that the WWN is passed through from the host without specifying it manually. Perhaps this configuration would suit your use case better anyway.

Sadly this is not the case, which led me to find this issue in the first place.

Within the guest:

```
[root@lvocrad01 ~]# for i in sd{a..d}; do printf "%s %s\n" $i "$(/lib/udev/scsi_id -g -p 0x83 /dev/$i)"; done
sda 0QEMU QEMU HARDDISK drive-scsi0-0-0-0
sdb 0QEMU QEMU HARDDISK drive-scsi0-0-0-3
sdc 0QEMU QEMU HARDDISK drive-scsi0-0-0-2
sdd 0QEMU QEMU HARDDISK drive-scsi0-0-0-1
```

Within the host:

```
[root@lgrhpar01 home]# for i in sd{e,q,f,n}; do printf "%s %s\n" $i "$(/lib/udev/scsi_id -g -p 0x83 /dev/$i)"; done
sde 360060e80105ade30056eafd300000005
sdq 360060e80105ade30056eafd300000005
sdf 360060e80105ade30056eafd300000005
sdn 360060e80105ade30056eafd300000005
```

Sorry, please disregard my last comment: setting device='lun' did indeed correctly pass the duplicate WWNs in, and multipath is working as expected. Thanks.

RHEV should use this configuration if you use DirectLUN together with virtio-scsi.
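The reason duplicate WWNs must remain representable can be modeled in a few lines. The sketch below is an illustration only (the function name `group_paths` is invented for it, and real multipathd reads VPD page 0x83 rather than a Python list): the guest's multipathd groups block devices reporting the same WWN into one multipath map, treating each as a path to the same LUN, so rejecting duplicate WWNs makes that grouping impossible.

```python
# Simplified model of multipathd's path grouping: devices that report
# the same WWN are paths to the same LUN and belong in one map.
from collections import defaultdict

# Host-side view from the report above: four SCSI devices, one backing LUN.
paths = [
    ('sde', '360060e80105ade30056eafd300000005'),
    ('sdq', '360060e80105ade30056eafd300000005'),
    ('sdf', '360060e80105ade30056eafd300000005'),
    ('sdn', '360060e80105ade30056eafd300000005'),
]

def group_paths(devices):
    """Map each WWN to the list of devices that are paths to it."""
    maps = defaultdict(list)
    for dev, wwn in devices:
        maps[wwn].append(dev)
    return dict(maps)

# One multipath map with four paths; forbidding the duplicate WWNs
# would leave the guest unable to build this map at all.
maps = group_paths(paths)
```

This is exactly the situation device='lun' passthrough preserves: all four guest disks report the host LUN's WWN, so the guest builds a single map.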
The results are not the same for coldplug and hotplug.

Versions:
- libvirt-daemon-3.2.0-10.el7.x86_64
- qemu-kvm-rhev-2.9.0-10.el7.x86_64

Steps:

1. Coldplug: start a guest with two disks:

```
<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2'/>
  <source file='/var/lib/libvirt/images/RHEL-7.4-x86_64-latest.qcow2'/>
  <target dev='hda' bus='ide'/>
  *** <wwn>0x5000c50015ea71aa</wwn> ***
  <address type='drive' controller='0' bus='0' target='0' unit='0'/>
</disk>
<disk type='file' device='disk'>
  <driver name='qemu' type='raw'/>
  <source file='/var/lib/libvirt/images/test1.img'/>
  <target dev='hdb' bus='ide'/>
  *** <wwn>0x5000c50015ea71aa</wwn> ***
</disk>

[root@lmen1 ~]# virsh destroy test; virsh start test
Domain test destroyed
Domain test started
```

In the guest there are two disks, and the second disk can be read and written.

2. Hotplug a disk with the same WWN. Start a guest with this XML:

```
<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2'/>
  <source file='/var/lib/libvirt/images/RHEL-7.4-x86_64-latest.qcow2'/>
  <target dev='hda' bus='ide'/>
  <wwn>0x5000c50015ea71aa</wwn>
  <address type='drive' controller='0' bus='0' target='0' unit='0'/>
</disk>

[root@lmen1 ~]# virsh destroy test; virsh start test
Domain test destroyed
Domain test started
[root@lmen1 ~]# virsh attach-disk test /var/lib/libvirt/images/test1.img sdf --targetbus scsi --wwn 0x5000c50015ea71aa --live
error: Failed to attach disk
error: unsupported configuration: Domain already has a disk with wwn '0x5000c50015ea71aa'
```

So my questions are:
1) Should the WWN be unique or not?
2) Should coldplug and hotplug behave the same for the WWN?

Yes, the coldplug behavior is expected. In fact the uniqueness check broke multipath configurations, so it was reverted:

```
commit 5da28cc3069b573f54f0bcaf8eb75476bcfdc6e9
Author: Peter Krempa <pkrempa>
Date:   Fri Jun 24 17:01:27 2016 +0200

    conf: Allow disks with identical WWN or serial

    Disallowing them broke a use case of testing multipath configurations
    for storage. Originally this was added as it was impossible to
    use certain /dev/disk-by... links but the disks worked properly.

    Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1349895
```

(In reply to Peter Krempa from comment #21)
> Yes this is expected. In fact it broke multipath configurations thus this
> was reverted:
>
> commit 5da28cc3069b573f54f0bcaf8eb75476bcfdc6e9
>     conf: Allow disks with identical WWN or serial
>
> Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1349895

But the behaviors of coldplug and hotplug are still not the same: coldplug allows two disks with the same WWN, but hotplug does not, as described in comment 20. Is that also expected?

Ah right, I forgot to revert 780fe4e4baf7e2f10f65ba1a34f9274fc547cad2. I'll post a patch.

(In reply to Peter Krempa from comment #23)
> Ah right, I forgot to revert 780fe4e4baf7e2f10f65ba1a34f9274fc547cad2.
>
> I'll post a patch.

I filed a new bug 1464975 to track the hotplug issue.
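For reference, the duplicate-WWN validation that this bug introduced (and that commit 5da28cc3 later relaxed) amounts to a simple uniqueness scan over the domain's disks. The sketch below is an illustrative Python re-implementation over domain XML, not libvirt's actual C code; the helper name `find_duplicate_wwns` and the embedded sample XML are invented for this sketch.

```python
# Illustrative re-implementation of the duplicate-WWN check: collect
# each disk's <wwn> from the domain XML and report the first pair of
# disks sharing one, mirroring libvirt's
# "Disks 'sda' and 'sdb' have identical WWN" error.
import xml.etree.ElementTree as ET

DOMAIN_XML = """
<domain type='kvm'>
  <devices>
    <disk type='block' device='disk'>
      <target dev='sda' bus='scsi'/>
      <wwn>0x5000c50015ea71aa</wwn>
    </disk>
    <disk type='file' device='disk'>
      <target dev='sdb' bus='scsi'/>
      <wwn>0x5000c50015ea71aa</wwn>
    </disk>
  </devices>
</domain>
"""

def find_duplicate_wwns(domain_xml):
    """Return (dev_a, dev_b, wwn) for the first pair of disks sharing a
    WWN, or None if all WWNs are unique. Disks without <wwn> are skipped."""
    seen = {}  # wwn -> target dev that first used it
    for disk in ET.fromstring(domain_xml).iter('disk'):
        wwn = disk.findtext('wwn')
        if wwn is None:
            continue
        dev = disk.find('target').get('dev')
        if wwn in seen:
            return (seen[wwn], dev, wwn)
        seen[wwn] = dev
    return None

clash = find_duplicate_wwns(DOMAIN_XML)
if clash:
    print("Disks '%s' and '%s' have identical WWN %s" % clash)
```

As the final comments note, after the revert this check no longer runs at domain start, and bug 1464975 tracks making the hotplug path consistent with that.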