Bug 1190071 - [HE] ovirt-ha-agent daemon is passing wrong values in connectStorageServer
Summary: [HE] ovirt-ha-agent daemon is passing wrong values in connectStorageServer
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: oVirt
Classification: Retired
Component: ovirt-hosted-engine-ha
Version: 3.6
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: 3.6.0
Assignee: Simone Tiraboschi
QA Contact: Artyom
URL:
Whiteboard: sla
Depends On:
Blocks: 1234906
 
Reported: 2015-02-06 09:24 UTC by Sandro Bonazzola
Modified: 2016-02-10 19:42 UTC
CC List: 11 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2015-11-04 13:55:19 UTC
oVirt Team: SLA
Embargoed:




Links
oVirt gerrit 44306 (master, MERGED): Directly use vdscli to connectStorageServer
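
The merged patch ("Directly use vdscli to connectStorageServer") names the fix direction: hand vdsm a properly formed connection list through the vdscli client instead of mangling it on the agent side. A minimal sketch of that call path, assuming the vdscli XML-RPC client API shipped with vdsm of that generation (using connectionUUID from hosted-engine.conf as the 'id' is an assumption, not confirmed by the bug):

from vdsm import vdscli

# A well-formed connection list; values taken from hosted-engine.conf below.
# 'id' = connectionUUID is assumed; the remaining fields appear in the log.
conList = [{'id': 'c65694a0-6d9b-4bd4-af05-6720c07aa897',
            'connection': '192.168.1.107:/storage',
            'protocol_version': '3',
            'iqn': '', 'portal': '', 'user': '', 'password': '', 'port': ''}]

# vdscli.connect() returns an XML-RPC proxy to the local vdsmd.
server = vdscli.connect()

# domType=1 is NFS; spUUID as in the vdsm.log excerpt below.
response = server.connectStorageServer(
    1, 'b95765f8-c8d1-4641-9da8-7ce6e41ff059', conList)
print(response)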

Description Sandro Bonazzola 2015-02-06 09:24:26 UTC
The Hosted Engine VM does not start anymore.

The HA agent daemon log says:

MainThread::WARNING::2015-02-06 10:12:17,063::hosted_engine::501::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(_initialize_vdsm) Failed to connect storage, waiting '15' seconds before the next attempt

and looking at vdsm.log I can see:


Thread-97::DEBUG::2015-02-06 10:12:17,053::BindingXMLRPC::318::vds::(wrapper) client [127.0.0.1]
Thread-97::DEBUG::2015-02-06 10:12:17,053::task::592::Storage.TaskManager.Task::(_updateState) Task=`6164cf9c-29b9-43ce-881a-c683481b64c9`::moving from state init -> state preparing
Thread-97::INFO::2015-02-06 10:12:17,053::logUtils::48::dispatcher::(wrapper) Run and protect: connectStorageServer(domType=1, spUUID='b95765f8-c8d1-4641-9da8-7ce6e41ff059', conList=[{'connection': '192.168.1.107:/storage', 'iqn': ',', 'protocol_version': '3', 'kvm': 'password', '=': 'user', ',': '='}], options=None)
Thread-97::DEBUG::2015-02-06 10:12:17,054::hsm::2404::Storage.HSM::(__prefetchDomains) nfs local path: /rhev/data-center/mnt/192.168.1.107:_storage
Thread-97::DEBUG::2015-02-06 10:12:17,055::hsm::2428::Storage.HSM::(__prefetchDomains) Found SD uuids: (u'd326997c-b097-4e67-a1f4-758cfe96a339',)
Thread-97::DEBUG::2015-02-06 10:12:17,055::hsm::2484::Storage.HSM::(connectStorageServer) knownSDs: {d326997c-b097-4e67-a1f4-758cfe96a339: storage.nfsSD.findDomain}
Thread-97::ERROR::2015-02-06 10:12:17,055::task::863::Storage.TaskManager.Task::(_setError) Task=`6164cf9c-29b9-43ce-881a-c683481b64c9`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 870, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 49, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 2486, in connectStorageServer
    res.append({'id': conDef["id"], 'status': status})
KeyError: 'id'
Thread-97::DEBUG::2015-02-06 10:12:17,055::task::882::Storage.TaskManager.Task::(_run) Task=`6164cf9c-29b9-43ce-881a-c683481b64c9`::Task._run: 6164cf9c-29b9-43ce-881a-c683481b64c9 (1, 'b95765f8-c8d1-4641-9da8-7ce6e41ff059', [{'kvm': 'password', ',': '=', 'connection': '192.168.1.107:/storage', 'iqn': ',', 'protocol_version': '3', '=': 'user'}]) {} failed - stopping task
Thread-97::DEBUG::2015-02-06 10:12:17,055::task::1214::Storage.TaskManager.Task::(stop) Task=`6164cf9c-29b9-43ce-881a-c683481b64c9`::stopping in state preparing (force False)
Thread-97::DEBUG::2015-02-06 10:12:17,055::task::990::Storage.TaskManager.Task::(_decref) Task=`6164cf9c-29b9-43ce-881a-c683481b64c9`::ref 1 aborting True
Thread-97::INFO::2015-02-06 10:12:17,055::task::1168::Storage.TaskManager.Task::(prepare) Task=`6164cf9c-29b9-43ce-881a-c683481b64c9`::aborting: Task is aborted: u"'id'" - code 100
Thread-97::DEBUG::2015-02-06 10:12:17,055::task::1173::Storage.TaskManager.Task::(prepare) Task=`6164cf9c-29b9-43ce-881a-c683481b64c9`::Prepare: aborted: 'id'
Thread-97::DEBUG::2015-02-06 10:12:17,056::task::990::Storage.TaskManager.Task::(_decref) Task=`6164cf9c-29b9-43ce-881a-c683481b64c9`::ref 0 aborting True
Thread-97::DEBUG::2015-02-06 10:12:17,056::task::925::Storage.TaskManager.Task::(_doAbort) Task=`6164cf9c-29b9-43ce-881a-c683481b64c9`::Task._doAbort: force False
Thread-97::DEBUG::2015-02-06 10:12:17,056::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-97::DEBUG::2015-02-06 10:12:17,056::task::592::Storage.TaskManager.Task::(_updateState) Task=`6164cf9c-29b9-43ce-881a-c683481b64c9`::moving from state preparing -> state aborting
Thread-97::DEBUG::2015-02-06 10:12:17,056::task::547::Storage.TaskManager.Task::(__state_aborting) Task=`6164cf9c-29b9-43ce-881a-c683481b64c9`::_aborting: recover policy none
Thread-97::DEBUG::2015-02-06 10:12:17,056::task::592::Storage.TaskManager.Task::(_updateState) Task=`6164cf9c-29b9-43ce-881a-c683481b64c9`::moving from state aborting -> state failed
Thread-97::DEBUG::2015-02-06 10:12:17,056::resourceManager::940::Storage.ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
Thread-97::DEBUG::2015-02-06 10:12:17,056::resourceManager::977::Storage.ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-97::ERROR::2015-02-06 10:12:17,056::dispatcher::79::Storage.Dispatcher::(wrapper) 'id'
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/dispatcher.py", line 71, in wrapper
    result = ctask.prepare(func, *args, **kwargs)
  File "/usr/share/vdsm/storage/task.py", line 103, in wrapper
    return m(self, *a, **kw)
  File "/usr/share/vdsm/storage/task.py", line 1176, in prepare
    raise self.error
KeyError: 'id'


It looks like the HA agent daemon is passing wrong values in connectStorageServer:

conList=[{',': '=',
  '=': 'user',
  'connection': '192.168.1.107:/storage',
  'iqn': ',',
  'kvm': 'password',
  'protocol_version': '3'}]
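
For contrast, the shape vdsm expects is visible in the traceback above: hsm.connectStorageServer reads conDef["id"] for every entry of conList, so each connection dict must carry an 'id' key plus the actual connection parameters. A sketch of a well-formed NFS entry, illustrative only (field names come from the log and from hosted-engine.conf below; using connectionUUID as the 'id' is an assumption):

# Illustrative only: a sane conList entry for this NFS setup.
# vdsm does res.append({'id': conDef["id"], 'status': status}), so the
# missing 'id' key is exactly what raises the KeyError seen above.
conList = [{
    'id': 'c65694a0-6d9b-4bd4-af05-6720c07aa897',  # connectionUUID (assumed)
    'connection': '192.168.1.107:/storage',        # storage= from the conf
    'protocol_version': '3',                       # domainType=nfs3
    'iqn': '',                                     # iSCSI-only fields, empty on NFS
    'portal': '',
    'user': '',
    'password': '',
    'port': '',
}]

The mangled dict above instead carries keys like ',' and '=' and pairs 'kvm' with 'password', which points at key/value sequences being mis-paired on the agent side before the call ever reaches vdsm.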

 # rpm -qa |egrep "(vdsm|ovirt-hosted-engine-ha)"|sort
 ovirt-hosted-engine-ha-1.3.0-0.0.master.20150126112715.20150126112712.gita3c842d.el7.noarch
 vdsm-4.17.0-381.git160954e.el7.x86_64
 vdsm-cli-4.17.0-381.git160954e.el7.noarch
 vdsm-gluster-4.17.0-381.git160954e.el7.noarch
 vdsm-infra-4.17.0-381.git160954e.el7.noarch
 vdsm-jsonrpc-4.17.0-381.git160954e.el7.noarch
 vdsm-python-4.17.0-381.git160954e.el7.noarch
 vdsm-python-zombiereaper-4.16.12-0.el7.noarch
 vdsm-xmlrpc-4.17.0-381.git160954e.el7.noarch 
 vdsm-yajsonrpc-4.17.0-381.git160954e.el7.noarch


# cat /etc/ovirt-hosted-engine/hosted-engine.conf
fqdn=ovirt.home
vm_disk_id=20d55944-7177-4883-979f-a4c5ce789d8d
vmid=7e3d35ae-3611-4d50-a634-1bdbee81e67a
storage=192.168.1.107:/storage
conf=/etc/ovirt-hosted-engine/vm.conf
service_start_time=0
host_id=1
console=vnc
domainType=nfs3
spUUID=b95765f8-c8d1-4641-9da8-7ce6e41ff059
sdUUID=d326997c-b097-4e67-a1f4-758cfe96a339
connectionUUID=c65694a0-6d9b-4bd4-af05-6720c07aa897
ca_cert=/etc/pki/vdsm/libvirt-spice/ca-cert.pem
ca_subject="C=EN, L=Test, O=Test, CN=Test"
vdsm_use_ssl=true
gateway=192.168.1.1
bridge=ovirtmgmt
metadata_volume_UUID=82664f1b-cdfb-4854-9077-2cc6082fc87b
metadata_image_UUID=5d65488d-b5b4-4992-ad58-eb132125d99e
lockspace_volume_UUID=e364cba3-4dcc-4997-8973-534380a0c006
lockspace_image_UUID=007f210b-8de4-471e-bb76-417519423a8b

# The following are used only for iSCSI storage
iqn=
portal=
user=
password=
port=


# cat /etc/ovirt-hosted-engine/vm.conf
vmId=7e3d35ae-3611-4d50-a634-1bdbee81e67a
memSize=4096
display=vnc
devices={index:2,iface:ide,address:{ controller:0, target:0,unit:0, bus:1, type:drive},specParams:{},readonly:true,deviceId:e4c5cdbb-4511-4fd9-8c83-10d3d940a721,path:/var/tmp/RHEL-7.0-20140507.0-Server-x86_64-dvd1.iso,device:cdrom,shared:false,type:disk}
devices={index:0,iface:virtio,format:raw,poolID:00000000-0000-0000-0000-000000000000,volumeID:75d72f29-5ec4-498e-913e-f67b4fc72f9c,imageID:20d55944-7177-4883-979f-a4c5ce789d8d,specParams:{},readonly:false,domainID:d326997c-b097-4e67-a1f4-758cfe96a339,optional:false,deviceId:20d55944-7177-4883-979f-a4c5ce789d8d,address:{bus:0x00, slot:0x06, domain:0x0000, type:pci, function:0x0},device:disk,shared:exclusive,propagateErrors:off,type:disk,bootOrder:1}
devices={device:scsi,model:virtio-scsi,type:controller}
devices={nicModel:pv,macAddr:00:16:3e:45:e1:88,linkActive:true,network:ovirtmgmt,filter:vdsm-no-mac-spoofing,specParams:{},deviceId:97eef999-decd-4e2f-b7f2-7cc2fe3c40e6,address:{bus:0x00, slot:0x03, domain:0x0000, type:pci, function:0x0},device:bridge,type:interface}
devices={device:console,specParams:{},type:console,deviceId:59548720-db25-45c5-b076-78c8ea745098,alias:console0}
vmName=HostedEngine
spiceSecureChannels=smain,sdisplay,sinputs,scursor,splayback,srecord,ssmartcard,susbredir
smp=4
cpuType=SandyBridge
emulatedMachine=rhel6.5.0

Comment 1 Artyom 2015-09-02 12:14:45 UTC
Hi Simone, can you provide me with information on how to verify this bug?
I did a deployment with two hosts on NFS storage (apart from a problem with self.storageType, all works fine),
and a deployment with one host on iSCSI storage (there is a bug with adding an additional host on iSCSI, but all works fine).
Is that enough, or do you want some additional testing?

Comment 2 Simone Tiraboschi 2015-09-02 12:20:13 UTC
It's enough, thanks.

Comment 3 Artyom 2015-09-02 12:32:26 UTC
Verified on ovirt-hosted-engine-ha-1.3.0-0.3.beta.git183a4ff.el7ev.noarch

Comment 4 Sandro Bonazzola 2015-11-04 13:55:19 UTC
oVirt 3.6.0 was released on November 4th, 2015 and should fix this issue.
If problems still persist, please open a new BZ and reference this one.

