Description ----------- libvirt's QEMU drive host alias generating code creates the same alias ('drive-virtio-disk1') for different two disk targets (vdb, vdb1). Daniel Berrangé points out that 'vdb1' is a partition number, and hot plugging it should not be allowed in most cases -- "except with Xen which allows that for paravirtualized disks. We should reject such disk names with an error right away in all other cases." - - - [Refer to some more analysis on this issue at the end with Laine Stump & Peter Krempa] Version ------- libvirt-daemon-kvm-1.3.5-1.fc24.x86_64 qemu-system-x86-2.6.0-3.fc24.x86_64 How reproducible: Consistently. Steps to reproduce ------------------ - On a guest that is shut down, attach two disks with labels 'vdb', and 'vdb1' $ sudo virsh attach-disk cvm1 \ /export/vmimages/1.raw vdb --config $ sudo virsh attach-disk cvm1 \ /export/vmimages/2.raw vdb1 --config $ virsh domblklist cvm1 Target Source ------------------------------------------------ vda /var/lib/libvirt/images/cirros-0.3.3-x86_64-disk.img vdb /export/vmimages/1.raw vdb1 /export/vmimages/2.raw - Which results in guest XML: [...] <disk type='file' device='disk'> <driver name='qemu' type='raw'/> <source file='/export/vmimages/1.raw'/> <target dev='vdb' bus='virtio'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/> </disk> <disk type='file' device='disk'> <driver name='qemu' type='raw'/> <source file='/export/vmimages/2.raw'/> <target dev='vdb1' bus='virtio'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/> </disk> [...] - Start the guest: $ virsh start cvm1 Actual results -------------- - Starting the guest (error message manually wrapped) fails with: $ virsh start cvm1 error: internal error: process exited while connecting to monitor: qemu-system-x86_64: -drive file=/export/vmimages/2.raw,format=raw,if=none,id=drive-virtio-disk1: Duplicate ID 'drive-virtio-disk1' for drive - It results in QEMU CLI (notice the 'drive-virtio-disk1'): [...] -drive file=/export/vmimages/1.raw,format=raw,if=none,id=drive-virtio-disk1 \ -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk1,id=virtio-disk1 \ -drive file=/export/vmimages/2.raw,format=raw,if=none,id=drive-virtio-disk1 \ -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x8,drive=drive-virtio-disk1,id=virtio-disk1 [...] Expected results ---------------- Diallow (and thrown an appropriate error) supplying invalid disk labels (e.g. partition numbers, like 'vdb1'). Discussion from IRC ------------------- [laine] The problem is that the "alias" libvirt generates based on the target device name should be unique, but it isn't. I can see why it's doing that. When putting the index of the disk it's processing within the entries into drive-virtio-disk%d, it learns the index by calling virDiskNameToIndex(). It's going through a list of disks, 0, 1, 2, 3, etc creating the alias. But at each index, it calls this other function to learn the index, and that function just searches for the first entry that matches the name. If they *really* don't know the index (because they only have a pointer to the entry) they should just search for the entry with the matching *address in memory* [pkrempa] The alias generator calls virDiskNameToIndex() which calls virDiskNameParse(). virDiskNameParse() parses the name including the partition number and returns it. virDiskNameToIndex() discards the partition number and returns the disk index. If we accept both 'sdb' and 'sdb1' but generate the same alias then that's a bug.
A comment from John Ferlan on #virt (OFTC) IRC channel: "The addendum would be that virDiskNameToIndex passes NULL to virDiskNameParse. There's no callers other than a test to virDiskNameParse, so the virNameToIndex probably should not accept numerical suffices. The libvirt disk backend has it's own very special/unique way to handle partitions." - - - And re-posting this comment (by John) from the related upstream thread: http://www.redhat.com/archives/libvir-list/2016-June/msg00890.html -- libvirt disk labels and QEMU drive host alias generation ------------------------------------------------------------------------- "oh - this brings back some shorter term bad memories. This all gets even more complicated with <hostdev>'s. Try to follow both virDomainDiskDefAssignAddress and virDomainHostdevAssignAddress especially when the <address> is not supplied. I do agree with Dan - there should have been some failure on the second command since the source file is duplicated. It seems the only check is for duplicated target dev (see virDomainDiskDefCheckDuplicateInfo). A quick scan finds no disk->src->path comparison for FILE type - not sure a vanilla comparison would work for all "src->type"'s... Beyond that - if one just "reads" the comments for virDiskNameToIndex (and virDiskNameParse) they may assume that the code is attempting to "serially assign an address" based on the target name. Thus "vdb" would equate to the "2nd" disk while "vdb1" would equate to the "27th" disk. However, that would only be "true" if the "partition" parameter was supplied (e.g. a disk partition). If it's not (which is the more general case), then the partition number is "dropped": /* Count the trailing digits. */ size_t n_digits = strspn(ptr, "0123456789"); if (ptr[n_digits] != '\0') return -1; *disk = idx; If the "digit" mattered, then the "n_digits" would have been used to alter "idx" by "idx += *ptr - '0'; similar to how the 'a' math was done. The order of the code perhaps tricks the eyes into believing that the idx was adjusted. In any case, for a non "disk partition" case, the second target perhaps should have been "sdbb" and not "sdb1" (look at tests/utiltest.c). Furthermore, maybe the "solution" for this is the XML catching that the supplied target has a numerically ending value, but does that work for all cases? If one looks at : tests/qemuxml2argvdata/qemuxml2argv-pci-many.xml one will see usage of "vdaa", "vdab", etc. and not "vda1", "vda2", etc. I did not dig for other tests for many disks per controller. Still even with all that "resolved", there is another issue. There's no code currently that makes the "post parse" check if there was some disk or hostdev somehow assigned the same address. I believe this is something determined during review of the series: http://www.redhat.com/archives/libvir-list/2015-July/msg00870.html which went into August, but perhaps most relevant is patch 12: http://www.redhat.com/archives/libvir-list/2015-August/msg00058.html Based on that I started writing a patch that does a very nasty post parse address duplication check for both "disks" and "hostdevs"; however, I believe I was "waiting for" something that allowed us to call it before domain start. IOW: Before what now is Peter's domain configuration validation series (essentially through commit id '5972f185e'). The problem being that prior Peter's series, the only way we could stop something with a duplicate address was at startup time. Now with Peter's series, it would be possible to check for duplicates prior to start and actually list them before actually attempting the start." -------------------------------------------------------------------------
I've pushed patches upstream: 2346b2f656 remove a now redundant call to virDiskNameToIndex() ca437d0603 qemu: Refuse partitions in disk targets v5.8.0-rc1-4-g2346b2f656