Bug 1397077
Summary: | [z-stream clone - 4.0.5] VM fails to start after changing IDE disk boot order | |
---|---|---|---
Product: | Red Hat Enterprise Virtualization Manager | Reporter: | rhev-integ
Component: | ovirt-engine | Assignee: | Fred Rolland <frolland>
Status: | CLOSED CURRENTRELEASE | QA Contact: | Kevin Alon Goldblatt <kgoldbla>
Severity: | medium | Docs Contact: |
Priority: | medium | |
Version: | 3.4.4 | CC: | amureini, bazulay, dyuan, eedri, frolland, fromani, iheim, jbuchta, jdenemar, jsuchane, kgoldbla, lpeer, lsurette, michal.skrivanek, mkalinin, mzhan, ppostler, ratamir, rbalakri, Rhev-m-bugs, rhodain, srevivo, tnisan, xuzhang, ykaul, ylavi
Target Milestone: | ovirt-4.0.5 | Keywords: | ZStream
Target Release: | --- | |
Hardware: | Unspecified | |
OS: | Unspecified | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | 1186817 | Environment: |
Last Closed: | 2016-11-24 09:40:23 UTC | Type: | ---
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | Storage | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | 1186817 | |
Bug Blocks: | | |
Comment 1
rhev-integ
2016-11-21 14:32:49 UTC
This might be a libvirt issue. I was able to reproduce this locally using the steps described, and this is what I got. The engine sends a create to VDSM with 3 IDE devices:

1. cdrom, with an address (assigned by libvirt from a previous run)
2. disk with an address (assigned by libvirt from a previous run)
3. disk without an address (the new disk), but requested to be bootable (boot_order=1)

```xml
<disk device="cdrom" snapshot="no" type="file">
    <address bus="1" controller="0" target="0" type="drive" unit="0"/>
    <source file="" startupPolicy="optional"/>
    <target bus="ide" dev="hdc"/>
    <readonly/>
    <serial/>
</disk>
<disk device="disk" snapshot="no" type="file">
    <source file="/rhev/data-center/.../..."/>
    <target bus="ide" dev="hda"/>
    <serial>483d2327-84bb-4de6-90a9-d282d5237ae1</serial>
    <boot order="1"/>
    <driver cache="none" error_policy="stop" io="threads" name="qemu" type="raw"/>
</disk>
<disk device="disk" snapshot="no" type="file">
    <address bus="0" controller="0" target="0" type="drive" unit="0"/>
    <source file="/rhev/data-center/.../..."/>
    <target bus="ide" dev="hdb"/>
    <serial>901e7fe4-f479-4c9c-871e-d18ac1317cad</serial>
    <driver cache="none" error_policy="stop" io="threads" name="qemu" type="raw"/>
</disk>
```

This results in:

```
Thread-97773::ERROR::2015-02-22 18:26:41,923::vm::2331::vm.Vm::(_startUnderlyingVm) vmId=`517dce2c-f140-4079-bf61-e7eb116887a1`::The vm start process failed
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/vm.py", line 2271, in _startUnderlyingVm
    self._run()
  File "/usr/share/vdsm/virt/vm.py", line 3335, in _run
    self._connection.createXML(domxml, flags),
  File "/usr/lib/python2.6/site-packages/vdsm/libvirtconnection.py", line 111, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 2665, in createXML
    if ret is None:raise libvirtError('virDomainCreateXML() failed', conn=self)
libvirtError: internal error process exited while connecting to monitor: qemu-kvm: -drive file=/rhev/data-center/00000002-0002-0002-0002-00000000033d/f23b9542-85ac-4957-84f4-a8ed99154502/images/901e7fe4-f479-4c9c-871e-d18ac1317cad/b7e35ad0-e7f8-4958-a44d-495fa19725e2,if=none,id=drive-ide0-0-0,format=raw,serial=901e7fe4-f479-4c9c-871e-d18ac1317cad,cache=none,werror=stop,rerror=stop,aio=threads: Duplicate ID 'drive-ide0-0-0' for drive
```

The error above refers to the older disk. I don't think the engine should have cleared the first disk's address, because we would like to have stable addresses between boot cycles. Michal, what do you think?

This comment was originally posted by ofrenkel

Looks like a libvirt bug in disk assignment?

This comment was originally posted by michal.skrivanek

```xml
...
<address bus="0" controller="0" target="0" type="drive" unit="0"/>
<source file="/rhev/data-center/.../..."/>
<target bus="ide" dev="hdb"/>
...
```

This doesn't make any sense. The address is for hda while the target is specified as hdb. If you then try to add an hda disk, libvirt will try to assign it the right address, which is currently occupied by this second disk. Whoever changed the target of the former hda to hdb should have also changed the address (or, alternatively, removed it).

This comment was originally posted by jdenemar

Could you attach the original domain XML and the XML vdsm generated after changing the boot order?

This comment was originally posted by jdenemar

I don't have the reproducer anymore, as I can't keep them for months.

This comment was originally posted by akotov

Omer, do you have the domain XMLs from the first and second runs?

This comment was originally posted by michal.skrivanek
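To make the conflict concrete: on an IDE bus, libvirt derives a drive's auto-assigned address from its target dev name (hda → bus 0/unit 0, hdb → bus 0/unit 1, hdc → bus 1/unit 0, hdd → bus 1/unit 1), which matches the cdrom's hdc address in the XML above. A minimal Python sketch of why the XML above blows up; this is illustrative only, not VDSM code, and the dict layout is an assumption:

```python
# Illustrative sketch of the address clash described above; not VDSM code.
# On IDE, libvirt derives the auto-assigned address from the target dev name.

def ide_address_for(dev):
    """Address libvirt would auto-assign for an IDE target dev (hda..hdd)."""
    index = ord(dev[-1]) - ord('a')            # hda -> 0, hdb -> 1, ...
    return {'bus': index // 2, 'unit': index % 2}

# State from the XML above: the former hda was renamed to hdb but kept
# hda's pinned address; the new hda carries no address at all.
disks = [
    {'dev': 'hdb', 'address': {'bus': 0, 'unit': 0}},  # stale pin: hda's slot
    {'dev': 'hda', 'address': None},                   # to be auto-assigned
]

def find_conflicts(disks):
    taken = [d['address'] for d in disks if d['address']]
    return [d['dev'] for d in disks
            if d['address'] is None and ide_address_for(d['dev']) in taken]

print(find_conflicts(disks))  # ['hda'] -> qemu: Duplicate ID 'drive-ide0-0-0'
```

With consistent XML, hdb would either carry unit=1 or no address at all, and the conflict list would come back empty.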
Let's hope the VDSM guys can provide the XMLs then. So far, looking at the only XML we got here (comment 2), the bug would be in VDSM, which provided the wrong XML.

This comment was originally posted by jdenemar

Created attachment 1000437 [details]
engine log of the scenario
This comment was originally posted by ofrenkel
Created attachment 1000439 [details]
vdsm log
This comment was originally posted by ofrenkel
Attached engine and vdsm logs of the scenario; please let me know if anything else is needed.

vm name: ideVm
vm id: 106acd37-f29b-468b-aa86-99db62c95070

First run on 2015-03-11 15:09:35, with a single IDE disk and an IDE CD-ROM. After this run, the engine db has:

```
engine=> select device,is_managed,boot_order,address from vm_device where vm_id in (select vm_guid from vm_static where vm_name ilike 'ideVm') and type='disk';
 device | is_managed | boot_order | address
--------+------------+------------+-----------------------------------------------------
 disk   | t          | 1          | {bus=0, controller=0, type=drive, target=0, unit=0}
 cdrom  | t          | 0          | {bus=1, controller=0, type=drive, target=0, unit=0}
```

After adding another disk and setting it to be bootable:

```
engine=> select device,is_managed,boot_order,address from vm_device where vm_id in (select vm_guid from vm_static where vm_name ilike 'ideVm') and type='disk';
 device | is_managed | boot_order | address
--------+------------+------------+-----------------------------------------------------
 disk   | t          | 0          | {bus=0, controller=0, type=drive, target=0, unit=0}
 disk   | t          | 1          |
 cdrom  | t          | 0          | {bus=1, controller=0, type=drive, target=0, unit=0}
```

Second run (attempt) time: 2015-03-11 15:18:16

This comment was originally posted by ofrenkel

Thanks, comment 5 is correct according to the vdsm.log. It seems the original hda disk was renamed to hdb but its address was left untouched, so it conflicts with the automatically assigned address of the new hda disk. VDSM should either add hdb without changing hda, or change or even remove the address of the original hda when it renames it to hdb.

This comment was originally posted by jdenemar

Yep. Taking that back, I suspect normalizeDrivesIndices and related code are not taking the existing drives into account, hence flagging storage.

This comment was originally posted by michal.skrivanek

*** Bug 1313406 has been marked as a duplicate of this bug. ***

This comment was originally posted by michal.skrivanek

oVirt 4.0 Alpha has been released, moving to oVirt 4.0 Beta target.

This comment was originally posted by ydary

Tested with the following code:

```
rhevm-4.0.5-0.1.el7ev.noarch
vdsm-4.18.15-1.el7ev.x86_64
```

Tested with the following scenario:

Steps to Reproduce:
1. Create a new VM with an IDE disk, verify it starts
2. Add a second IDE disk
3. Turn off the bootable flag on the 1st disk, make the 2nd disk bootable
4. Start the VM - now starts successfully

Actual results:
The VM now starts successfully

Expected results:

Moving to VERIFIED!

This comment was originally posted by kgoldbla

4.0.5 has been released, closing.

4.0.5 vdsm errata: https://access.redhat.com/errata/RHBA-2016:2787
4.0.5-1 vdsm errata: https://access.redhat.com/errata/RHBA-2016:2818

So, I guess, it is somewhere in there. 4.0.5-1 to be on the safe side.
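For reference, the fix direction suggested in the comments above (drop a pinned address whenever a drive's target dev is renamed, rather than carrying the old pin over) can be sketched as follows. This is a hypothetical helper, not the actual VDSM patch or API; the drive dict layout is assumed:

```python
def rename_drive(drive, new_dev):
    """Rename a drive's target dev, dropping its now-stale pinned address.

    Hypothetical helper, not VDSM's actual API: keeping the old address
    across a rename is exactly what produced the duplicate-ID error above.
    """
    if drive.get('dev') != new_dev:
        drive.pop('address', None)   # let libvirt pick the address for new_dev
        drive['dev'] = new_dev
    return drive

# The failing scenario: the old hda is renamed to hdb when the new bootable
# disk takes hda; its bus=0/unit=0 pin must be dropped along with the rename.
old_disk = {'dev': 'hda', 'address': {'bus': 0, 'unit': 0}}
rename_drive(old_disk, 'hdb')        # -> {'dev': 'hdb'} with no address
```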