Description of problem: VM doesn't start with the following error in vdsm.log: libvirtError: unsupported configuration: Disks 'hdc' and 'hdd' have identical serial How reproducible: Start a VM with enabled Initial Run. Cause: There are two cdrom devices with identical (empty) serials in domxml: <disk device="cdrom" snapshot="no" type="file"> <source file="/var/run/vdsm/payload/2eaf9c8e-2123-4b48-9b62-f96168ac7a36.41f22cb1676858ad4e22e8440519032d.img" startupPolicy="optional"/> <target bus="ide" dev="hdd"/> <readonly/> <serial/> </disk> <disk device="cdrom" snapshot="no" type="file"> <source file="" startupPolicy="optional"/> <target bus="ide" dev="hdc"/> <readonly/> <serial/> </disk> The bug is related to https://bugzilla.redhat.com/show_bug.cgi?id=1245013 The following before_vm_start hook resolves the issue: ---------- cut here ---------- #!/usr/bin/python import hooking import uuid domxml = hooking.read_domxml() for disk in domxml.getElementsByTagName('disk'): if disk.getAttribute('device') == 'cdrom': for source in disk.getElementsByTagName('source'): if source.getAttribute('file').find('/payload/') > 0: for serial in disk.getElementsByTagName('serial'): if not serial.hasChildNodes(): serial.appendChild(domxml.createTextNode(str(uuid.uuid4()))) hooking.write_domxml(domxml) ---------- cut here ----------
(In reply to Pavel Gashev from comment #0) > Description of problem: > VM doesn't start with the following error in vdsm.log: > libvirtError: unsupported configuration: Disks 'hdc' and 'hdd' have > identical serial > > How reproducible: > Start a VM with enabled Initial Run. > > Cause: > There are two cdrom devices with identical (empty) serials in domxml: > > <disk device="cdrom" snapshot="no" type="file"> > <source > file="/var/run/vdsm/payload/2eaf9c8e-2123-4b48-9b62-f96168ac7a36. > 41f22cb1676858ad4e22e8440519032d.img" startupPolicy="optional"/> > <target bus="ide" dev="hdd"/> > <readonly/> > <serial/> > </disk> > <disk device="cdrom" snapshot="no" type="file"> > <source file="" startupPolicy="optional"/> > <target bus="ide" dev="hdc"/> > <readonly/> > <serial/> > </disk> > > The bug is related to https://bugzilla.redhat.com/show_bug.cgi?id=1245013 Indeed we send duplicate *empty* serial for both cdroms. Peter, is this covered by rhbz#1245013 or is a different case? In general, what does libvirt >= 1.3.2 do in the scenario above?
If this is oVirt's fault, could be fixed in few ways: - engine could just send fake imageId for the payload device, and everything should just work (tm) - Vdsm could inject a fake imageId, maybe just a new UUID each time. Payload devices are throwaway devices, so this should be OK.
I think the problem is in the approach libvirt uses to parse the <serial> element. If it's empty we apparently allocate an empty string rather than not storing it at all. Two disks with an empty string as serial are of course rejected for having the same serial rather than none. I'll look at the code whether we can reasonably fix the parser to avoid this weird semantics.
Okay, so while our code should not parse that as an empty string, using an empty <serial/> element does not conform to libvirt's XML schema as the serial is defined by the following regex "[A-Za-z0-9_\.\+\- ]+". I'm afraid that by enforcing that your existing configs would stop to work, so I suggest fixing that on oVirts side. I'll look into tweaking the code though.
(In reply to Peter Krempa from comment #4) > Okay, so while our code should not parse that as an empty string, using an > empty <serial/> element does not conform to libvirt's XML schema as the > serial is defined by the following regex "[A-Za-z0-9_\.\+\- ]+". > > I'm afraid that by enforcing that your existing configs would stop to work, > so I suggest fixing that on oVirts side. I'll look into tweaking the code > though. That (and Dan's remark) make me realize that we should handle also the migrations from 1.2.x to 1.3.x, fixing our XML in the process.
(In reply to Francesco Romani from comment #5) > (In reply to Peter Krempa from comment #4) > > Okay, so while our code should not parse that as an empty string, using an > > empty <serial/> element does not conform to libvirt's XML schema as the > > serial is defined by the following regex "[A-Za-z0-9_\.\+\- ]+". > > > > I'm afraid that by enforcing that your existing configs would stop to work, > > so I suggest fixing that on oVirts side. I'll look into tweaking the code > > though. > > That (and Dan's remark) make me realize that we should handle also the > migrations from 1.2.x to 1.3.x, fixing our XML in the process. While doing additional testing, it seems to me that libvirt 1.3.1 (rpm rebuilt from git sources) accepts inbound migration with multiple empty serial attributes (please check the attached libvirt logs). Peter, is this intentional, and should we rely on that?
Created attachment 1148177 [details] libvirt debug log demonstrating migrations with empty disk serial
(In reply to Francesco Romani from comment #6) > While doing additional testing, it seems to me that libvirt 1.3.1 (rpm > rebuilt from git sources) accepts inbound migration with multiple empty > serial attributes (please check the attached libvirt logs). > Peter, is this intentional, and should we rely on that? Yes. The code rejecting identical <serial> elements for multiple disks is executed only on fresh starts so that existing VMs still can be migrated even if they are configured in a way that would prevent to start them normally. Additionally, removing the disk serial at migration stage would make the migration fail since that would be a guest visible change and those are forbidden.
(In reply to Peter Krempa from comment #8) > (In reply to Francesco Romani from comment #6) > > While doing additional testing, it seems to me that libvirt 1.3.1 (rpm > > rebuilt from git sources) accepts inbound migration with multiple empty > > serial attributes (please check the attached libvirt logs). > > Peter, is this intentional, and should we rely on that? > > Yes. The code rejecting identical <serial> elements for multiple disks is > executed only on fresh starts so that existing VMs still can be migrated > even if they are configured in a way that would prevent to start them > normally. > > Additionally, removing the disk serial at migration stage would make the > migration fail since that would be a guest visible change and those are > forbidden. Thanks Peter, very good to know. Also tested just in case hibernation/dehibernation, works as expected. Moving to MODIFIED
I'm not sure it's already fixed here, but Direct LUNs have no serials as well. See https://bugzilla.redhat.com/1328383
*** Bug 1328383 has been marked as a duplicate of this bug. ***
Verified using rhel 7.2 host with vdsm-4.17.28-0.el7ev.noarch running on top of libvirt-1.3.1-1.el7.x86_64: [root@monique-vds01 x86_64]# rpm -qa | grep libvirt libvirt-daemon-driver-secret-1.3.1-1.el7.x86_64 libvirt-daemon-kvm-1.3.1-1.el7.x86_64 libvirt-daemon-driver-interface-1.3.1-1.el7.x86_64 libvirt-docs-1.3.1-1.el7.x86_64 libvirt-python-1.2.17-2.el7.x86_64 libvirt-daemon-1.3.1-1.el7.x86_64 libvirt-daemon-driver-nodedev-1.3.1-1.el7.x86_64 libvirt-daemon-driver-qemu-1.3.1-1.el7.x86_64 libvirt-devel-1.3.1-1.el7.x86_64 libvirt-lock-sanlock-1.3.1-1.el7.x86_64 libvirt-daemon-driver-network-1.3.1-1.el7.x86_64 libvirt-daemon-driver-storage-1.3.1-1.el7.x86_64 libvirt-daemon-config-nwfilter-1.3.1-1.el7.x86_64 libvirt-1.3.1-1.el7.x86_64 libvirt-login-shell-1.3.1-1.el7.x86_64 libvirt-daemon-driver-nwfilter-1.3.1-1.el7.x86_64 libvirt-daemon-config-network-1.3.1-1.el7.x86_64 libvirt-debuginfo-1.3.1-1.el7.x86_64 libvirt-client-1.3.1-1.el7.x86_64 libvirt-daemon-driver-lxc-1.3.1-1.el7.x86_64 libvirt-daemon-lxc-1.3.1-1.el7.x86_64 1. After installation of new libvirt and vdsm ran vdsm-tool configure, verified that virtlogd service is not running. 2. Installed host on 3.6.6 engine, started a new vm with cloud-init payload. serial tag was not included in disk attribute of domain's xml, both on cdrom device and the payload: <disk type='file' device='cdrom'> <driver name='qemu' type='raw'/> <source startupPolicy='optional'/> <backingStore/> <target dev='hdc' bus='ide'/> <readonly/> <alias name='ide0-1-0'/> <address type='drive' controller='0' bus='1' target='0' unit='0'/> </disk> <disk type='file' device='cdrom'> <driver name='qemu' type='raw'/> <source file='/var/run/vdsm/payload/5b1d76a5-d320-4825-8f98-47e0c1ab1586.c5a2f8bd8fe24053f78fa1b2458b6731.img' startupPolicy='optional'/> <backingStore/> <target dev='hdd' bus='ide'/> <readonly/> <alias name='ide0-1-1'/> <address type='drive' controller='0' bus='1' target='0' unit='1'/> </disk>