Bug 1047356 - VM payload cdrom/floppy fail to mkdir/mount
Summary: VM payload cdrom/floppy fail to mkdir/mount
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 3.3.0
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 3.3.0
Assignee: Francesco Romani
QA Contact: Ilanit Stein
URL:
Whiteboard: virt
Duplicates: 1047360 1047944 1054201
Depends On:
Blocks: 990572 GSS_RHEV_33_BETA
 
Reported: 2013-12-30 16:13 UTC by Meni Yakove
Modified: 2014-01-21 16:26 UTC
CC List: 19 users

Fixed In Version: vdsm-4.13.2-0.6
Doc Type: Bug Fix
Doc Text:
VDSM was using a condition which was too broad, causing a device path misdetection when cloud-init was used. A CDROM with vmPayload was created as an empty CDROM because no path was detected, so virtual machines could not start when "initial init" was selected for the Run Once operation. This update tightens the prepareVolumePath check, so virtual machines with vmPayload can run as expected.
Clone Of:
Environment:
Last Closed: 2014-01-21 16:26:01 UTC
oVirt Team: ---
Target Upstream Version:
Embargoed:


Attachments
logs (1.81 MB, application/zip)
2013-12-30 16:13 UTC, Meni Yakove
no flags Details
supervdsm (40.83 KB, text/x-log)
2014-01-02 13:41 UTC, Meni Yakove
no flags Details


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2014:0040 0 normal SHIPPED_LIVE vdsm bug fix and enhancement update 2014-01-21 20:26:21 UTC
oVirt gerrit 22925 0 None MERGED vdsm: prepareVolumePath payload misdetection fix 2020-04-21 14:53:33 UTC
oVirt gerrit 22935 0 None MERGED vdsm: prepareVolumePath payload misdetection fix 2020-04-21 14:53:33 UTC

Description Meni Yakove 2013-12-30 16:13:59 UTC
Created attachment 843472 [details]
logs

Description of problem:
VM fails to start when the "initial init" checkbox is selected in the "Run Once" dialog.
error:
2013-12-30 18:01:37,562 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-50) Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VM CLOUD-INIT-VM is down. Exit message: internal error process exited while connecting to monitor: qemu-kvm: -drive if=none,media=cdrom,id=drive-ide0-1-1,readonly=on,format=raw,serial=: Duplicate ID 'drive-ide0-1-1' for drive

Version-Release number of selected component (if applicable):
rhevm: rhevm-3.3.0-0.42.el6ev.noarch
vdsm: vdsm-4.13.2-0.4.el6ev.x86_64
cloud-init: cloud-init.noarch 0:0.7.2-2.el6

How reproducible:
100%

Steps to Reproduce:
1. Create linux VM.
2. Open "run once" dialog and check "initial init" checkbox.
3. Start the VM.

Actual results:
VM fails to start.

Expected results:
VM starts.

Additional info:

Comment 1 Ilanit Stein 2014-01-01 13:44:48 UTC
Please see the regression "fail" test steps below.
They started to fail on version is28; on previous versions (is27 and below) they passed.

Payload_cdrom_via_create
=========================
018-Create VM from template to test payload type CDROM via CREATE - pass                
019-Start VM to test cdrom payload status - pass        
020-Wait for VM to become up to fetch the IP - pass
021-Mount cdrom in the VM and create a temp directory - fail        
022-Verify file existence and content in the VM  - fail        
023-umount cdrom and delete temp directory - pass        
024-Stop VM, so we can remove the VM; - fail        
025-Remove VM cd-payload - fail

Payload_floppy_via_create
=========================
038-Create VM from template to test payload type FLOPPY via CREATE - pass        
039-Start VM to test payload floppy status - pass        
040-Wait for VM to become up to fetch the IP - pass        
041-modprobe floppy and mount /dev/fd0 and create a temp directory - fail        
042-Verify file content and existence on the VM (floppy success) - fail        
043-umount /dev/fd0 and delete temp directory - pass        
044-Stop VM, so we can remove the VM; - fail                
045-Remove VM floppy - fail
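
(For context, a rough Python sketch of what the failing guest-side steps 021/041 and 022/042 do. This is not the actual test code; it assumes the payload CD-ROM appears as /dev/sr0 in the guest, uses /dev/fd0 for the floppy as in step 041, and the mount point and payload file name are placeholders.)

# Illustration only -- not the real automation. Mount a payload device,
# read one file from it, then clean up again.
import subprocess
import tempfile

def mount_and_read(device, payload_file):
    """Mount a payload device, read one file from it, then unmount."""
    mount_point = tempfile.mkdtemp(prefix="payload-")      # step 021/041: temp directory
    subprocess.check_call(["mount", device, mount_point])  # fails when the payload CDROM is empty
    try:
        with open("%s/%s" % (mount_point, payload_file)) as f:  # step 022/042: verify content
            return f.read()
    finally:
        subprocess.check_call(["umount", mount_point])      # step 023/043

# e.g. mount_and_read("/dev/sr0", "payload.txt") for the CD-ROM case,
#      mount_and_read("/dev/fd0", "payload.txt") for the floppy case.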

Comment 3 Ilanit Stein 2014-01-01 14:17:55 UTC
More info:
=========
1. vm payload auto test link: 
http://jenkins.qa.lab.tlv.redhat.com:8080/view/Compute/view/3.3-git/view/Virt/job/3.3-git-compute-virt-payloads-rest-nfs/

2. vdsm versions: 

is27 (payload worked OK): 
'vdsm': {'release': '0.2.rc.el6ev', 'buildtime': 1386683396L, 'version': '4.13.2'},

is28 (payload didn't work OK): 'vdsm': {'release': '0.3.el6ev', 'buildtime': 1387357537L, 'version': '4.13.2'},

Comment 4 Omer Frenkel 2014-01-01 14:23:38 UTC
Looks like a vdsm issue:
it seems that the 'payload' folder is missing from /var/run/vdsm,
and any attempt to use vm-payload from the engine makes vdsm behave very strangely (the next run of the VM fails, stopping it afterwards throws weird exceptions, and only a vdsmd restart helps).

from vdsm log:
vmCreate gets the payload correctly:

Thread-109::DEBUG::2013-12-30 17:59:15,515::BindingXMLRPC::991::vds::(wrapper) return vmCreate with {'status': {'message': 'Done', 'code': 0
}, 'vmList': {'status': 'WaitForLaunch', 'acpiEnable': 'true', 'emulatedMachine': 'rhel6.5.0', 'vmId': '15e1be9a-3432-4bbd-acd1-2bbc6bc60000
', 'pid': '0', 'memGuaranteedSize': 512, 'timeOffset': '-1', 'keyboardLayout': 'en-us', 'displayPort': '-1', 'displaySecurePort': '-1', 'spi
ceSslCipherSuite': 'DEFAULT', 'cpuType': 'Conroe', 'custom': {}, 'clientIp': '', 'nicModel': 'rtl8139,pv', 'smartcardEnable': 'false', 'kvmE
nable': 'true', 'pitReinjection': 'false', 'transparentHugePages': 'true', 'devices': [{'index': '3', 'iface': 'ide', 'specParams': {'vmPayl
oad': {'volId': 'config-2', 'file': {'openstack/latest/meta_data.json': 'ewogICJsYXVuY2hfaW5kZXgiIDogIjAiLAogICJhdmFpbGFiaWxpdHlfem9uZSIgOiA
ibm92YSIs\nCiAgInV1aWQiIDogIjhmODJkZDY3LTkyMTAtNGU4Ni04Yzc2LTkyZGU5YTRkZGYxMyIsCiAgIm1l\ndGEiIDogewogICAgImVzc2VudGlhbCIgOiAiZmFsc2UiLAogICA
gInJvbGUiIDogInNlcnZlciIs\nCiAgICAiZHNtb2RlIiA6ICJsb2NhbCIKICB9Cn0=\n', 'openstack/latest/user_data': 'I2Nsb3VkLWNvbmZpZwpvdXRwdXQ6CiAgYWxsO
iAnPj4gL3Zhci9sb2cvY2xvdWQtaW5pdC1vdXRw\ndXQubG9nJwp1c2VyOiByb290CnJ1bmNtZDoKLSAnc2VkIC1pICcnL15kYXRhc291cmNlX2xpc3Q6\nIC9kJycgL2V0Yy9jbG91Z
C9jbG91ZC5jZmc7IGVjaG8gJydkYXRhc291cmNlX2xpc3Q6CiAgWyJO\nb0Nsb3VkIiwgIkNvbmZpZ0RyaXZlIl0nJyA+PiAvZXRjL2Nsb3VkL2Nsb3VkLmNmZycK\n'}}}, 'readon
ly': 'true', 'deviceId': '754f34a3-1359-4588-a146-5baac58a4ac6', 'path': '', 'device': 'cdrom', 'shared': 'false', 'type': 'disk'}, {'device
': 'qxl', 'specParams': {'vram': '32768', 'ram': '65536', 'heads': '1'}, 'type': 'video', 'deviceId': 'd07ecc88-5d5c-494e-96f4-3e0a707da340'
, 'address': {'domain': '0x0000', '  slot': '0x02', '  function': '0x0', '  type': 'pci', ' bus': '0x00'}}, {'index': '2', 'iface': 'ide', '
address': {'bus': '1', '  target': '0', '  controller': '0', '  type': 'drive', ' unit': '1'}, 'specParams': {'path': ''}, 'readonly': 'true
', 'deviceId': 'a7d49b85-9243-4d72-b787-df1bf3e33ffc', 'path': '', 'device': 'cdrom', 'shared': 'false', 'type': 'disk'}, {'index': 0, 'ifac
e': 'virtio', 'format': 'cow', 'bootOrder': '1', 'poolID': 'bc126acc-82b0-4157-bbe5-6c03660520dc', 'volumeID': '6c0a1ebe-c8b3-4884-beef-ca06
92793614', 'imageID': '6e00d761-273e-4735-953d-504b931471ce', 'specParams': {}, 'readonly': 'false', 'domainID': 'a69080fa-5e68-4b2d-9999-34
4f0710fc87', 'optional': 'false', 'deviceId': '6e00d761-273e-4735-953d-504b931471ce', 'address': {'domain': '0x0000', '  slot': '0x06', '  f
unction': '0x0', '  type': 'pci', ' bus': '0x00'}, 'device': 'disk', 'shared': 'false', 'propagateErrors': 'off', 'type': 'disk'}, {'nicMode
l': 'pv', 'macAddr': '00:1a:4a:16:88:55', 'linkActive': 'true', 'network': 'rhevm', 'bootOrder': '2', 'filter': 'vdsm-no-mac-spoofing', 'spe
cParams': {}, 'deviceId': '3121005e-0cf0-48f0-83bb-cd5d8a5a9e64', 'address': {'domain': '0x0000', '  slot': '0x03', '  function': '0x0', '  
type': 'pci', ' bus': '0x00'}, 'device': 'bridge', 'type': 'interface'}, {'nicModel': 'pv', 'macAddr': '00:1a:4a:16:88:65', 'linkActive': 't
rue', 'network': 'rhevm', 'bootOrder': '3', 'filter': 'vdsm-no-mac-spoofing', 'specParams': {}, 'deviceId': 'b029b9a3-32b4-471e-a71e-18fdde2
33f4b', 'address': {'domain': '0x0000', '  slot': '0x05', '  function': '0x0', '  type': 'pci', ' bus': '0x00'}, 'device': 'bridge', 'type':
 'interface'}, {'device': 'memballoon', 'specParams': {'model': 'virtio'}, 'type': 'balloon', 'deviceId': '916e2571-5715-46e8-a447-9462f9d32
394'}, {'specParams': {}, 'deviceId': 'f09c336e-ee80-40dd-816e-de350ea99bab', 'address': {'domain': '0x0000', '  slot': '0x04', '  function'
: '0x0', '  type': 'pci', ' bus': '0x00'}, 'device': 'scsi', 'model': 'virtio-scsi', 'type': 'controller'}], 'smp': '1', 'vmType': 'kvm', 'm
emSize': 1024, 'displayIp': '0', 'spiceSecureChannels': 'smain,sinputs,scursor,splayback,srecord,sdisplay,susbredir,ssmartcard', 'smpCoresPe
rSocket': '1', 'vmName': 'CLOUD-INIT-VM', 'display': 'qxl', 'nice': '0'}}
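
(For clarity: the vmPayload 'file' entries in the parameters above are base64-encoded cloud-init config-drive files. A minimal sketch of decoding them, assuming `device` stands for the cdrom device dict carrying the 'vmPayload' specParam shown above:)

# Sketch only, not vdsm code: decode the base64 'file' entries of the vmPayload dict.
import base64

payload = device['specParams']['vmPayload']   # {'volId': 'config-2', 'file': {...}}
for name, blob in payload['file'].items():
    content = base64.b64decode(blob)
    print('%s (%d bytes)' % (name, len(content)))
    # the keys above are 'openstack/latest/meta_data.json' and 'openstack/latest/user_data'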


but the libvirt XML doesn't get any payload image (the source file attribute is empty):
                <disk device="cdrom" snapshot="no" type="file">
                        <source file="" startupPolicy="optional"/>
                        <target bus="ide" dev="hdd"/>
                        <readonly/>
                        <serial/>
                </disk>


and after the VM fails, vdsm misbehaves:

Thread-114::DEBUG::2013-12-30 17:59:18,100::BindingXMLRPC::984::vds::(wrapper) client [10.35.161.136]::call vmDestroy with ('15e1be9a-3432-4bbd-acd1-2bbc6bc60000',) {}
Thread-114::INFO::2013-12-30 17:59:18,100::API::324::vds::(destroy) vmContainerLock acquired by vm 15e1be9a-3432-4bbd-acd1-2bbc6bc60000
Thread-114::DEBUG::2013-12-30 17:59:18,100::vm::4477::vm.Vm::(destroy) vmId=`15e1be9a-3432-4bbd-acd1-2bbc6bc60000`::destroy Called
Thread-114::ERROR::2013-12-30 17:59:18,118::BindingXMLRPC::1003::vds::(wrapper) unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/BindingXMLRPC.py", line 989, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/BindingXMLRPC.py", line 211, in vmDestroy
    return vm.destroy()
  File "/usr/share/vdsm/API.py", line 329, in destroy
    res = v.destroy()
  File "/usr/share/vdsm/vm.py", line 4485, in destroy
    response = self.releaseVm()
  File "/usr/share/vdsm/vm.py", line 4417, in releaseVm
    supervdsm.getProxy().removeFs(drive.path)
  File "/usr/share/vdsm/supervdsm.py", line 50, in __call__
    return callMethod()
  File "/usr/share/vdsm/supervdsm.py", line 48, in <lambda>
    **kwargs)
  File "<string>", line 2, in removeFs
  File "/usr/lib64/python2.6/multiprocessing/managers.py", line 740, in _callmethod
    raise convert_to_error(kind, result)
Exception: Cannot remove Fs that does not exists in: /var/run/vdsm/payload

As mentioned above, only a vdsmd restart helps.
This only happens when trying to use a VM payload; a regular run works fine.
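
(For illustration, a simplified sketch of the kind of guard that produces the exception in the traceback above. This is not the actual supervdsm code; it only assumes, from the error message, that removal is refused for paths outside /var/run/vdsm/payload. With drive.path left empty because no payload image was created, the check fails and vmDestroy blows up, matching the traceback.)

# Simplified sketch of a payload-path guard; not the real removeFs implementation.
import os
import shutil

PAYLOAD_DIR = '/var/run/vdsm/payload'  # directory named in the traceback

def remove_fs(path):
    if not os.path.abspath(path).startswith(PAYLOAD_DIR):
        raise Exception('Cannot remove Fs that does not exists in: %s' % PAYLOAD_DIR)
    if os.path.isfile(path):
        os.remove(path)
    elif os.path.isdir(path):
        shutil.rmtree(path)

# remove_fs('') -> abspath('') is the current directory, the check fails, and the
# exception above is raised.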

Comment 5 Omer Frenkel 2014-01-02 09:12:30 UTC
*** Bug 1047360 has been marked as a duplicate of this bug. ***

Comment 6 Dan Kenigsberg 2014-01-02 13:36:22 UTC
Meni, could you add supervdsm.log, to see if vdsm attempted to create the payload cd image?

Comment 7 Meni Yakove 2014-01-02 13:41:25 UTC
Created attachment 844562 [details]
supervdsm

Comment 8 Francesco Romani 2014-01-02 14:38:08 UTC
This change seems to be the cause of the misbehaviour:

http://gerrit.ovirt.org/#/c/22324/3

It was merged between is27 and is28.

Comment 9 Michal Skrivanek 2014-01-02 15:12:25 UTC
The case is: the specParams vmPayload attribute has "file" (with content) but no "path", and the device's (drive's) "path" is set to ''.

The new code in is28 mistakenly does not take specParams' "file" into account, considers this a case of an empty CDROM, and creates it as such. It worked before because the old condition evaluated to false when there is no 'path' in drive['specParams'] (which is the case here), so the flow fell through to the next condition section (the right one, with payload handling).

Changing the order of the conditions would solve this, or adding a check for "file"...
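
(A minimal sketch of the ordering problem described above. This is not the actual prepareVolumePath code, and the helper below is a hypothetical stand-in; it only illustrates that in is28 the "cdrom with empty path" branch ran before the payload branch, so a drive whose specParams carry vmPayload content was built as an empty CDROM.)

# Illustration only; build_payload_image is a hypothetical stand-in.

def build_payload_image(vm_payload):
    """Hypothetical stand-in for the code that writes the payload image under
    /var/run/vdsm/payload and returns its path."""
    raise NotImplementedError


def prepare_volume_path(drive):
    spec_params = drive.get('specParams', {})

    # is28 (broken) ordering:
    #   if drive.get('device') == 'cdrom' and not drive.get('path'):
    #       return ''   # treated as an empty CDROM; the payload branch is never reached

    if 'vmPayload' in spec_params:                        # handle the payload first
        return build_payload_image(spec_params['vmPayload'])

    if drive.get('device') == 'cdrom' and not drive.get('path'):
        return ''                                         # a genuinely empty CDROM

    return drive['path']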

Comment 10 Michal Skrivanek 2014-01-03 14:44:05 UTC
Merged to the master and ovirt-3.3 branches.

Comment 12 Omer Frenkel 2014-01-05 08:30:00 UTC
*** Bug 1047944 has been marked as a duplicate of this bug. ***

Comment 14 Dan Kenigsberg 2014-01-09 01:05:57 UTC
Cheryn, again, this bug is a regression introduced during the rhev-3.3 development cycle. The bug was not visible to any user, and as such, it should not be documented.

Comment 16 Michal Skrivanek 2014-01-17 11:35:30 UTC
*** Bug 1054201 has been marked as a duplicate of this bug. ***

Comment 17 errata-xmlrpc 2014-01-21 16:26:01 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-0040.html

