Created attachment 795557 [details]
## Logs rhevm, vdsm, libvirt, thread dump, superVdsm

Description of problem:
Failed to run a VM on a GlusterFS Data Center.

Version-Release number of selected component (if applicable):
RHEVM 3.3 - IS13 environment:
RHEVM: rhevm-3.3.0-0.19.master.el6ev.noarch
PythonSDK: rhevm-sdk-python-3.3.0.13-1.el6ev.noarch
VDSM: vdsm-4.12.0-105.git0da1561.el6ev.x86_64
LIBVIRT: libvirt-0.10.2-18.el6_4.9.x86_64
QEMU & KVM: qemu-kvm-rhev-0.12.1.2-2.355.el6_4.7.x86_64
SANLOCK: sanlock-2.8-1.el6.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Create a VM on a GlusterFS Data Center
2. Power on the VM

Actual results:
The VM fails to run.

Expected results:
The VM runs successfully.

Impact on user:
Cannot run VMs.

Workaround:
none

Additional info:

/var/log/ovirt-engine/engine.log

2013-09-09 12:05:00,068 ERROR [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (DefaultQuartzScheduler_Worker-88) Rerun vm 1dc8b9cf-78c0-49b1-b473-aeaf5fb6e70b. Called from vds tigris01.scl.lab.tlv.redhat.com
2013-09-09 12:05:00,080 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (pool-5-thread-49) Correlation ID: 635a3f41, Job ID: 7b32d2c4-84d0-4808-bd7d-ee0d7272212d, Call Stack: null, Custom Event ID: -1, Message: Failed to run VM vm-2000 on Host tigris01.scl.lab.tlv.redhat.com.
2013-09-09 12:05:00,165 INFO [org.ovirt.engine.core.bll.RunVmCommand] (pool-5-thread-49) Lock Acquired to object EngineLock [exclusiveLocks= key: 1dc8b9cf-78c0-49b1-b473-aeaf5fb6e70b value: VM , sharedLocks= ]
2013-09-09 12:05:00,217 INFO [org.ovirt.engine.core.vdsbroker.IsVmDuringInitiatingVDSCommand] (pool-5-thread-49) START, IsVmDuringInitiatingVDSCommand( vmId = 1dc8b9cf-78c0-49b1-b473-aeaf5fb6e70b), log id: 3d0f83c8
2013-09-09 12:05:00,218 INFO [org.ovirt.engine.core.vdsbroker.IsVmDuringInitiatingVDSCommand] (pool-5-thread-49) FINISH, IsVmDuringInitiatingVDSCommand, return: false, log id: 3d0f83c8
2013-09-09 12:05:00,258 INFO [org.ovirt.engine.core.bll.RunVmCommand] (pool-5-thread-49) Running command: RunVmCommand internal: false. Entities affected : ID: 1dc8b9cf-78c0-49b1-b473-aeaf5fb6e70b Type: VM
2013-09-09 12:05:00,399 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.IsoPrefixVDSCommand] (pool-5-thread-49) START, IsoPrefixVDSCommand(HostName = tigris02.scl.lab.tlv.redhat.com, HostId = b158905b-54f4-4069-ac7c-0ac8791edd98, storagePoolId=89c81ea5-26db-4c9a-a993-4e9cc4a9ffda), log id: 2a306d66
2013-09-09 12:05:00,402 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.IsoPrefixVDSCommand] (pool-5-thread-49) FINISH, IsoPrefixVDSCommand, return: , log id: 2a306d66
2013-09-09 12:05:00,409 INFO [org.ovirt.engine.core.vdsbroker.CreateVmVDSCommand] (pool-5-thread-49) START, CreateVmVDSCommand(HostName = tigris02.scl.lab.tlv.redhat.com, HostId = b158905b-54f4-4069-ac7c-0ac8791edd98, vmId=1dc8b9cf-78c0-49b1-b473-aeaf5fb6e70b, vm=VM [vm-2000]), log id: 1749c7ad
2013-09-09 12:05:00,422 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.CreateVDSCommand] (pool-5-thread-49) START, CreateVDSCommand(HostName = tigris02.scl.lab.tlv.redhat.com, HostId = b158905b-54f4-4069-ac7c-0ac8791edd98, vmId=1dc8b9cf-78c0-49b1-b473-aeaf5fb6e70b, vm=VM [vm-2000]), log id: 13e831fc
2013-09-09 12:05:00,553 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.CreateVDSCommand] (pool-5-thread-49) org.ovirt.engine.core.vdsbroker.vdsbroker.CreateVDSCommand spiceSslCipherSuite=DEFAULT,memSize=1024,kvmEnable=true,smp=1,vmType=kvm,emulatedMachine=rhel6.4.0,keyboardLayout=en-us,memGuaranteedSize=1024,nice=0,display=qxl,smartcardEnable=false,smpCoresPerSocket=1,spiceSecureChannels=smain,sinputs,scursor,splayback,srecord,sdisplay,susbredir,ssmartcard,timeOffset=0,transparentHugePages=true,vmId=1dc8b9cf-78c0-49b1-b473-aeaf5fb6e70b,devices=[Ljava.util.HashMap;@25d4cc4a,acpiEnable=true,vmName=vm-2000,cpuType=Opteron_G3,custom={}
2013-09-09 12:05:00,554 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.CreateVDSCommand] (pool-5-thread-49) FINISH, CreateVDSCommand, log id: 13e831fc
2013-09-09 12:05:00,563 INFO [org.ovirt.engine.core.vdsbroker.CreateVmVDSCommand] (pool-5-thread-49) FINISH, CreateVmVDSCommand, return: WaitForLaunch, log id: 1749c7ad
2013-09-09 12:05:00,563 INFO [org.ovirt.engine.core.bll.RunVmCommand] (pool-5-thread-49) Lock freed to object EngineLock [exclusiveLocks= key: 1dc8b9cf-78c0-49b1-b473-aeaf5fb6e70b value: VM , sharedLocks= ]
2013-09-09 12:05:00,571 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (pool-5-thread-49) Correlation ID: 635a3f41, Job ID: 7b32d2c4-84d0-4808-bd7d-ee0d7272212d, Call Stack: null, Custom Event ID: -1, Message: VM vm-2000 was started by admin@internal (Host: tigris02.scl.lab.tlv.redhat.com).
2013-09-09 12:05:00,960 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (DefaultQuartzScheduler_Worker-17) START, DestroyVDSCommand(HostName = tigris02.scl.lab.tlv.redhat.com, HostId = b158905b-54f4-4069-ac7c-0ac8791edd98, vmId=1dc8b9cf-78c0-49b1-b473-aeaf5fb6e70b, force=false, secondsToWait=0, gracefully=false), log id: 34710d52
2013-09-09 12:05:00,977 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand] (DefaultQuartzScheduler_Worker-17) FINISH, DestroyVDSCommand, log id: 34710d52
2013-09-09 12:05:01,043 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler_Worker-17) Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VM vm-2000 is down. Exit message: Bad volume specification {'index': 0, 'iface': 'virtio', 'reqsize': '0', 'format': 'raw', 'type': 'disk', 'volumeID': '99702009-bcb4-494d-a396-d8e433bd7593', 'apparentsize': '21474836480', 'imageID': '83b6eecc-540f-4957-8d89-b8ae24bfbb95', 'specParams': {}, 'readonly': 'false', 'domainID': 'ad765e5d-b9e0-4993-a852-5d49fc4a61f6', 'deviceId': '83b6eecc-540f-4957-8d89-b8ae24bfbb95', 'truesize': '0', 'poolID': '89c81ea5-26db-4c9a-a993-4e9cc4a9ffda', 'device': 'disk', 'shared': 'false', 'propagateErrors': 'off', 'optional': 'false'}.
2013-09-09 12:05:01,043 INFO [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (DefaultQuartzScheduler_Worker-17) Running on vds during rerun failed vm: null
2013-09-09 12:05:01,044 INFO [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (DefaultQuartzScheduler_Worker-17) vm vm-2000 running in db and not running in vds - add to rerun treatment. vds tigris02.scl.lab.tlv.redhat.com
2013-09-09 12:05:01,053 ERROR [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (DefaultQuartzScheduler_Worker-17) Rerun vm 1dc8b9cf-78c0-49b1-b473-aeaf5fb6e70b. Called from vds tigris02.scl.lab.tlv.redhat.com
2013-09-09 12:05:01,059 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (pool-5-thread-50) Correlation ID: 635a3f41, Job ID: 7b32d2c4-84d0-4808-bd7d-ee0d7272212d, Call Stack: null, Custom Event ID: -1, Message: Failed to run VM vm-2000 on Host tigris02.scl.lab.tlv.redhat.com.
2013-09-09 12:05:01,124 INFO [org.ovirt.engine.core.bll.RunVmCommand] (pool-5-thread-50) Lock Acquired to object EngineLock [exclusiveLocks= key: 1dc8b9cf-78c0-49b1-b473-aeaf5fb6e70b value: VM , sharedLocks= ]

/var/log/vdsm/vdsm.log

Thread-43466::INFO::2013-09-09 10:05:02,774::image::215::Storage.Image::(getChain) sdUUID=ad765e5d-b9e0-4993-a852-5d49fc4a61f6 imgUUID=83b6eecc-540f-4957-8d89-b8ae24bfbb95 chain =[<storage.glusterVolume.GlusterVolume object at 0x1b92850>]
Thread-43466::DEBUG::2013-09-09 10:05:02,782::fileVolume::521::Storage.Volume::(validateVolumePath) validate path for 99702009-bcb4-494d-a396-d8e433bd7593
Thread-43466::ERROR::2013-09-09 10:05:02,814::task::850::TaskManager.Task::(_setError) Task=`06de0db8-0531-490d-b5e4-b68be92513a2`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 857, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 45, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 3236, in prepareImage
    'vmVolInfo': vol.getVmVolumeInfo()}
  File "/usr/share/vdsm/storage/glusterVolume.py", line 30, in getVmVolumeInfo
    volInfo = svdsmProxy.glusterVolumeInfo(volname, volfileServer)
  File "/usr/share/vdsm/supervdsm.py", line 50, in __call__
    return callMethod()
  File "/usr/share/vdsm/supervdsm.py", line 48, in <lambda>
    **kwargs)
  File "<string>", line 2, in glusterVolumeInfo
  File "/usr/lib64/python2.6/multiprocessing/managers.py", line 740, in _callmethod
    raise convert_to_error(kind, result)
OSError: [Errno 2] No such file or directory: gluster
Thread-43466::DEBUG::2013-09-09 10:05:02,821::task::869::TaskManager.Task::(_run) Task=`06de0db8-0531-490d-b5e4-b68be92513a2`::Task._run: 06de0db8-0531-490d-b5e4-b68be92513a2 ('ad765e5d-b9e0-4993-a852-5d49fc4a61f6', '89c81ea5-26db-4c9a-a993-4e9cc4a9ffda', '83b6eecc-540f-4957-8d89-b8ae24bfbb95', '99702009-bcb4-494d-a396-d8e433bd7593') {} failed - stopping task
Thread-43466::DEBUG::2013-09-09 10:05:02,821::task::1194::TaskManager.Task::(stop) Task=`06de0db8-0531-490d-b5e4-b68be92513a2`::stopping in state preparing (force False)
Thread-43466::DEBUG::2013-09-09 10:05:02,822::task::974::TaskManager.Task::(_decref) Task=`06de0db8-0531-490d-b5e4-b68be92513a2`::ref 1 aborting True
Thread-43466::INFO::2013-09-09 10:05:02,822::task::1151::TaskManager.Task::(prepare) Task=`06de0db8-0531-490d-b5e4-b68be92513a2`::aborting: Task is aborted: u'[Errno 2] No such file or directory: gluster' - code 100
Thread-43466::DEBUG::2013-09-09 10:05:02,822::task::1156::TaskManager.Task::(prepare) Task=`06de0db8-0531-490d-b5e4-b68be92513a2`::Prepare: aborted: [Errno 2] No such file or directory: gluster
Thread-43466::DEBUG::2013-09-09 10:05:02,822::task::974::TaskManager.Task::(_decref) Task=`06de0db8-0531-490d-b5e4-b68be92513a2`::ref 0 aborting True
Thread-43466::DEBUG::2013-09-09 10:05:02,823::task::909::TaskManager.Task::(_doAbort) Task=`06de0db8-0531-490d-b5e4-b68be92513a2`::Task._doAbort: force False
Thread-43466::DEBUG::2013-09-09 10:05:02,823::resourceManager::976::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-43466::DEBUG::2013-09-09 10:05:02,823::task::579::TaskManager.Task::(_updateState) Task=`06de0db8-0531-490d-b5e4-b68be92513a2`::moving from state preparing -> state aborting
Thread-43466::DEBUG::2013-09-09 10:05:02,823::task::534::TaskManager.Task::(__state_aborting) Task=`06de0db8-0531-490d-b5e4-b68be92513a2`::_aborting: recover policy none
Thread-43466::DEBUG::2013-09-09 10:05:02,823::task::579::TaskManager.Task::(_updateState) Task=`06de0db8-0531-490d-b5e4-b68be92513a2`::moving from state aborting -> state failed
Thread-43466::DEBUG::2013-09-09 10:05:02,824::resourceManager::939::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {'Storage.ad765e5d-b9e0-4993-a852-5d49fc4a61f6': < ResourceRef 'Storage.ad765e5d-b9e0-4993-a852-5d49fc4a61f6', isValid: 'True' obj: 'None'>}
Thread-43466::DEBUG::2013-09-09 10:05:02,824::resourceManager::976::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-43466::DEBUG::2013-09-09 10:05:02,824::resourceManager::615::ResourceManager::(releaseResource) Trying to release resource 'Storage.ad765e5d-b9e0-4993-a852-5d49fc4a61f6'
Thread-43466::DEBUG::2013-09-09 10:05:02,825::resourceManager::634::ResourceManager::(releaseResource) Released resource 'Storage.ad765e5d-b9e0-4993-a852-5d49fc4a61f6' (0 active users)
Thread-43466::DEBUG::2013-09-09 10:05:02,825::resourceManager::640::ResourceManager::(releaseResource) Resource 'Storage.ad765e5d-b9e0-4993-a852-5d49fc4a61f6' is free, finding out if anyone is waiting for it.
Thread-43466::DEBUG::2013-09-09 10:05:02,825::resourceManager::648::ResourceManager::(releaseResource) No one is waiting for resource 'Storage.ad765e5d-b9e0-4993-a852-5d49fc4a61f6', Clearing records.
Thread-43466::ERROR::2013-09-09 10:05:02,825::dispatcher::70::Storage.Dispatcher.Protect::(run) [Errno 2] No such file or directory: gluster
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/dispatcher.py", line 62, in run
    result = ctask.prepare(self.func, *args, **kwargs)
  File "/usr/share/vdsm/storage/task.py", line 1159, in prepare
    raise self.error
OSError: [Errno 2] No such file or directory: gluster
Thread-43466::DEBUG::2013-09-09 10:05:02,826::vm::2038::vm.Vm::(_startUnderlyingVm) vmId=`1dc8b9cf-78c0-49b1-b473-aeaf5fb6e70b`::_ongoingCreations released
Thread-43466::ERROR::2013-09-09 10:05:02,826::vm::2064::vm.Vm::(_startUnderlyingVm) vmId=`1dc8b9cf-78c0-49b1-b473-aeaf5fb6e70b`::The vm start process failed
Traceback (most recent call last):
  File "/usr/share/vdsm/vm.py", line 2024, in _startUnderlyingVm
    self._run()
  File "/usr/share/vdsm/vm.py", line 2833, in _run
    self.preparePaths(devices[DISK_DEVICES])
  File "/usr/share/vdsm/vm.py", line 2086, in preparePaths
    drive['path'] = self.cif.prepareVolumePath(drive, self.id)
  File "/usr/share/vdsm/clientIF.py", line 280, in prepareVolumePath
    raise vm.VolumeError(drive)
VolumeError: Bad volume specification {'index': 0, 'iface': 'virtio', 'reqsize': '0', 'format': 'raw', 'type': 'disk', 'volumeID': '99702009-bcb4-494d-a396-d8e433bd7593', 'apparentsize': '21474836480', 'imageID': '83b6eecc-540f-4957-8d89-b8ae24bfbb95', 'specParams': {}, 'readonly': 'false', 'domainID': 'ad765e5d-b9e0-4993-a852-5d49fc4a61f6', 'deviceId': '83b6eecc-540f-4957-8d89-b8ae24bfbb95', 'truesize': '0', 'poolID': '89c81ea5-26db-4c9a-a993-4e9cc4a9ffda', 'device': 'disk', 'shared': 'false', 'propagateErrors': 'off', 'optional': 'false'}
Thread-43466::DEBUG::2013-09-09 10:05:02,831::vm::2454::vm.Vm::(setDownStatus) vmId=`1dc8b9cf-78c0-49b1-b473-aeaf5fb6e70b`::Changed state to Down: Bad volume specification {'index': 0, 'iface': 'virtio', 'reqsize': '0', 'format': 'raw', 'type': 'disk', 'volumeID': '99702009-bcb4-494d-a396-d8e433bd7593', 'apparentsize': '21474836480', 'imageID': '83b6eecc-540f-4957-8d89-b8ae24bfbb95', 'specParams': {}, 'readonly': 'false', 'domainID': 'ad765e5d-b9e0-4993-a852-5d49fc4a61f6', 'deviceId': '83b6eecc-540f-4957-8d89-b8ae24bfbb95', 'truesize': '0', 'poolID': '89c81ea5-26db-4c9a-a993-4e9cc4a9ffda', 'device': 'disk', 'shared': 'false', 'propagateErrors': 'off', 'optional': 'false'}
Thread-43468::DEBUG::2013-09-09 10:05:03,134::task::579::TaskManager.Task::(_updateState) Task=`21e0a7f0-dfae-4f55-be62-9b6d00bcbae6`::moving from state init -> state preparing
Thread-43468::INFO::2013-09-09 10:05:03,135::logUtils::44::dispatcher::(wrapper) Run and protect: repoStats(options=None)
Does this qemu-kvm version support gluster?
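One rough way to check is to look for the gluster protocol in the qemu-img help text. This is a heuristic sketch, not an official probe: the binary name and the help-text grep are assumptions, and a definitive answer requires the package changelog or build flags.

```python
import subprocess

def qemu_mentions_gluster(qemu_img="qemu-img"):
    """Heuristic check: builds with gluster support usually mention the
    gluster protocol in their help/usage output. Assumption, not the
    official support matrix."""
    try:
        proc = subprocess.Popen([qemu_img, "--help"],
                                stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT)
        out = proc.communicate()[0]
    except OSError:
        return False  # binary not installed or not on PATH
    return b"gluster" in out
```

On RHEL the same question can also be answered from the package version (gluster support landed in specific qemu-kvm-rhev builds), which is why the version list above matters.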
I upgraded all relevant components and retested on RHEVM 3.3 - IS14:

Host OS: RHEL 6.5
RHEVM: rhevm-3.3.0-0.21.master.el6ev.noarch
PythonSDK: rhevm-sdk-python-3.3.0.13-1.el6ev.noarch
VDSM: vdsm-4.12.0-127.gitedb88bf.el6ev.x86_64
LIBVIRT: libvirt-0.10.2-23.el6.x86_64
QEMU & KVM: qemu-kvm-rhev-0.12.1.2-2.401.el6.x86_64
SANLOCK: sanlock-2.8-1.el6.x86_64

Running the VM still fails.
Created attachment 796865 [details]
## Logs rhevm, vdsm, libvirt, thread dump, superVdsm
The problem you're hitting now [1] has nothing to do with the original problem and is probably a duplicate of Bug 1006952 - Unable to start a QEMU process due to SELinux permission errors. Can you disable SELinux and test again?

Also, what vdsm version is this? (It looks like a custom build.)

[1] Thread-3342::DEBUG::2013-09-12 14:33:41,787::libvirtconnection::101::libvirtconnection::(wrapper) Unknown libvirterror: ecode: 1 edom: 10 level: 2 message: internal error process exited while connecting to monitor: qemu-kvm: -drive file=/var/run/vdsm/storage/ad765e5d-b9e0-4993-a852-5d49fc4a61f6/83b6eecc-540f-4957-8d89-b8ae24bfbb95/99702009-bcb4-494d-a396-d8e433bd7593,if=none,id=drive-virtio-disk0,format=raw,serial=83b6eecc-540f-4957-8d89-b8ae24bfbb95,cache=none,werror=stop,rerror=stop,aio=threads: could not open disk image /var/run/vdsm/storage/ad765e5d-b9e0-4993-a852-5d49fc4a61f6/83b6eecc-540f-4957-8d89-b8ae24bfbb95/99702009-bcb4-494d-a396-d8e433bd7593: Permission denied
The error in the vdsm log is less useful than it could be because the code seems to use a relative path instead of an absolute path:

OSError: [Errno 2] No such file or directory: gluster

Using an absolute path would make it easier to understand:

OSError: [Errno 2] No such file or directory: /path/to/gluster
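A minimal sketch of the suggested improvement (the helper name is hypothetical, not actual vdsm code): resolve the command against PATH first, so the exception carries an absolute path, or at least says that PATH was searched, instead of the bare name "gluster".

```python
import errno
import os

def resolve_command(name):
    """Return the absolute path of an executable, mimicking `which`.

    Hypothetical helper: raising ENOENT with the resolved (or searched)
    path instead of the bare command name makes the log self-explanatory.
    """
    if os.path.isabs(name):
        candidates = [name]
    else:
        candidates = [os.path.join(d, name)
                      for d in os.environ.get("PATH", "").split(os.pathsep) if d]
    for path in candidates:
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return path
    # Include the search context in the message, not just the bare name.
    raise OSError(errno.ENOENT,
                  "No such file or directory: %s (searched PATH)" % name)
```

Errno 2 on a bare command name like this typically means the `gluster` CLI binary itself was not found when supervdsm tried to execute it.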
1. All hosts are in Permissive mode.
2. This problem exists only with GlusterFS. On iSCSI, FCP, and NFS there is no problem powering on or running a VM on the same hosts.
3. VDSM: vdsm-4.12.0-127.gitedb88bf.el6ev.x86_64
(In reply to vvyazmin from comment #7)

> 1. All hosts have Permissive switch

Can you share the results of sestatus, please?

> 2. This problem exist only with GlusterFS. On iSCSI, FCP, NFS no problem
> power-on or run VM on same hosts.

What happens if you create a POSIX domain on top of the gluster storage? Does that work?
Vlad, what are the permissions on the files in the gluster mount? What about /var/run/vdsm?
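For reference, a quick way to collect that information (the mount path below is only an example; substitute the actual gluster mount point, and note that on an SELinux host `ls -lZ` additionally shows the security context):

```python
import os
import stat

def describe(path):
    # Report permission bits and ownership for a path.
    st = os.stat(path)
    return "%s mode=%04o uid=%d gid=%d" % (
        path, stat.S_IMODE(st.st_mode), st.st_uid, st.st_gid)

# Example paths only -- replace the second with the real gluster mount.
for p in ("/var/run/vdsm", "/rhev/data-center/mnt/glusterSD"):
    try:
        print(describe(p))
    except OSError as e:
        print("%s: %s" % (p, e.strerror))
```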
Using IS19, it works fine.