Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 849446

Summary: [engine-core] spmStart is sent when storage is not connected (pool is not recovered)
Product: Red Hat Enterprise Virtualization Manager Reporter: Ido Begun <ibegun>
Component: ovirt-engineAssignee: Greg Padgett <gpadgett>
Status: CLOSED CURRENTRELEASE QA Contact: vvyazmin <vvyazmin>
Severity: high Docs Contact:
Priority: unspecified    
Version: 3.1.0CC: abaron, amureini, dron, dyasny, hateya, iheim, lpeer, oramraz, Rhev-m-bugs, sgrinber, yeylon, ykaul
Target Milestone: ---   
Target Release: 3.1.0   
Hardware: x86_64   
OS: Linux   
Whiteboard: storage
Fixed In Version: SI19 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-12-04 20:06:57 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Storage RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
Logs none

Description Ido Begun 2012-08-19 13:48:35 UTC
Description of problem:
While running an automated test, which does the following:
-Create a template from a master VM.
-Create 7 VM's from the template.
-Run each VM until it gets an IP address, then move on to the next VM.

Due to a network problem, when trying to start the 4th VM, the VDSM service was terminated due to inaccessibility to the master domain. later on, we see that engine tries to start SPM on host without storage being connected, using the connectStorageServer command.


Version-Release number of selected component (if applicable):
3.1.0

Happened once so far.

Comment 1 Ido Begun 2012-08-19 13:49:25 UTC
Created attachment 605497 [details]
Logs

Comment 2 Haim 2012-08-19 15:11:23 UTC
just to put a focus around this issue; please start and take a look at vdsm.log.1.xz, and you can see that disconnectStorageServer was sent followed by  spmStart attempts failed on reading metadata which is pretty much obvious given the fact that underlying storage was disconnected intentionally by engine (take a look on the engine log to determine the reason, attached as well).

cat /tmp/log | egrep 'Run and|ERROR' | egrep 'connectStorageServer|connectStoragePool|deactivateStorage|spmStart|spmStop'
Thread-268::INFO::2012-08-19 10:24:21,403::logUtils::39::dispatcher::(wrapper) Run and protect: disconnectStorageServer, Return response: {'statuslist': [{'status': 0, 'id': '4db1c216-4d7e-4ee6-80f2-401b56e84628'}]}
Thread-274::INFO::2012-08-19 10:24:31,088::logUtils::37::dispatcher::(wrapper) Run and protect: spmStart(spUUID='b1ab4917-ffcc-49ea-8345-bb50c181d7b7', prevID=-1, prevLVER='1', recoveryMode=None, scsiFencing='false', maxHostID=250, domVersion='3', options=None)
Thread-274::INFO::2012-08-19 10:24:31,090::logUtils::39::dispatcher::(wrapper) Run and protect: spmStart, Return response: None
Thread-287::INFO::2012-08-19 10:24:42,450::logUtils::37::dispatcher::(wrapper) Run and protect: spmStop(spUUID='b1ab4917-ffcc-49ea-8345-bb50c181d7b7', options=None)
Thread-492::INFO::2012-08-19 10:29:42,063::logUtils::37::dispatcher::(wrapper) Run and protect: disconnectStorageServer(domType=1, spUUID='00000000-0000-0000-0000-000000000000', conList=[{'connection': '10.35.64.102:/volumes/wolf/Lime_ksm_0_nfs_20120806171531359900', 'iqn': '', 'portal': '', 'user': '', 'password': '******', 'id': '4db1c216-4d7e-4ee6-80f2-401b56e84628', 'port': ''}], options=None)
Thread-492::ERROR::2012-08-19 10:29:42,078::hsm::2049::Storage.HSM::(disconnectStorageServer) Could not disconnect from storageServer
Thread-492::INFO::2012-08-19 10:29:42,559::logUtils::39::dispatcher::(wrapper) Run and protect: disconnectStorageServer, Return response: {'statuslist': [{'status': 477, 'id': '4db1c216-4d7e-4ee6-80f2-401b56e84628'}]}
Thread-1293::INFO::2012-08-19 10:49:09,788::logUtils::37::dispatcher::(wrapper) Run and protect: connectStorageServer(domType=1, spUUID='00000000-0000-0000-0000-000000000000', conList=[{'connection': '10.35.64.102:/volumes/wolf/Lime_ksm_0_nfs_20120806171531359900', 'iqn': '', 'portal': '', 'user': '', 'password': '******', 'id': '4db1c216-4d7e-4ee6-80f2-401b56e84628', 'port': ''}], options=None)
Thread-1293::INFO::2012-08-19 10:49:09,865::logUtils::39::dispatcher::(wrapper) Run and protect: connectStorageServer, Return response: {'statuslist': [{'status': 0, 'id': '4db1c216-4d7e-4ee6-80f2-401b56e84628'}]}
Thread-1296::INFO::2012-08-19 10:49:10,084::logUtils::37::dispatcher::(wrapper) Run and protect: spmStart(spUUID='b1ab4917-ffcc-49ea-8345-bb50c181d7b7', prevID=-1, prevLVER='1', recoveryMode=None, scsiFencing='false', maxHostID=250, domVersion='3', options=None)
Thread-1296::INFO::2012-08-19 10:49:10,087::logUtils::39::dispatcher::(wrapper) Run and protect: spmStart, Return response: None
Thread-1303::INFO::2012-08-19 10:49:11,334::logUtils::37::dispatcher::(wrapper) Run and protect: connectStoragePool(spUUID='b1ab4917-ffcc-49ea-8345-bb50c181d7b7', hostID=1, scsiKey='b1ab4917-ffcc-49ea-8345-bb50c181d7b7', msdUUID='b9e831ca-a12e-46c6-b0b1-395f6542d6fc', masterVersion=1, options=None)
Thread-1303::INFO::2012-08-19 10:49:11,337::logUtils::39::dispatcher::(wrapper) Run and protect: connectStoragePool, Return response: None
Thread-1539::INFO::2012-08-19 10:53:36,481::logUtils::37::dispatcher::(wrapper) Run and protect: deactivateStorageDomain(sdUUID='b9e831ca-a12e-46c6-b0b1-395f6542d6fc', spUUID='b1ab4917-ffcc-49ea-8345-bb50c181d7b7', msdUUID='00000000-0000-0000-0000-000000000000', masterVersion=1, options=None)
Thread-1539::INFO::2012-08-19 10:53:36,489::logUtils::39::dispatcher::(wrapper) Run and protect: deactivateStorageDomain, Return response: None
Thread-1541::INFO::2012-08-19 10:53:36,542::logUtils::37::dispatcher::(wrapper) Run and protect: spmStop(spUUID='b1ab4917-ffcc-49ea-8345-bb50c181d7b7', options=None)
Thread-1541::INFO::2012-08-19 10:53:36,611::logUtils::39::dispatcher::(wrapper) Run and protect: spmStop, Return response: None
Thread-1542::INFO::2012-08-19 10:53:36,656::logUtils::37::dispatcher::(wrapper) Run and protect: disconnectStoragePool(spUUID='b1ab4917-ffcc-49ea-8345-bb50c181d7b7', hostID=1, scsiKey='b1ab4917-ffcc-49ea-8345-bb50c181d7b7', remove=False, options=None)
Thread-1542::INFO::2012-08-19 10:53:36,659::logUtils::39::dispatcher::(wrapper) Run and protect: disconnectStoragePool, Return response: True
Thread-1543::INFO::2012-08-19 10:53:36,682::logUtils::37::dispatcher::(wrapper) Run and protect: disconnectStorageServer(domType=1, spUUID='00000000-0000-0000-0000-000000000000', conList=[{'connection': '10.35.64.102:/volumes/wolf/Lime_ksm_0_nfs_20120806171531359900', 'iqn': '', 'portal': '', 'user': '', 'password': '******', 'id': '4db1c216-4d7e-4ee6-80f2-401b56e84628', 'port': ''}], options=None)
Thread-1543::INFO::2012-08-19 10:53:37,211::logUtils::39::dispatcher::(wrapper) Run and protect: disconnectStorageServer, Return response: {'statuslist': [{'status': 0, 'id': '4db1c216-4d7e-4ee6-80f2-401b56e84628'}]}
Thread-1560::INFO::2012-08-19 10:54:02,447::logUtils::37::dispatcher::(wrapper) Run and protect: connectStorageServer(domType=1, spUUID='00000000-0000-0000-0000-000000000000', conList=[{'connection': '10.35.64.102:/volumes/wolf/Lime_ksm_0_nfs_20120806171531359900', 'iqn': '', 'portal': '', 'user': '', 'password': '******', 'id': '4db1c216-4d7e-4ee6-80f2-401b56e84628', 'port': ''}], options=None)
Thread-1560::INFO::2012-08-19 10:54:02,501::logUtils::39::dispatcher::(wrapper) Run and protect: connectStorageServer, Return response: {'statuslist': [{'status': 0, 'id': '4db1c216-4d7e-4ee6-80f2-401b56e84628'}]}
Thread-1561::INFO::2012-08-19 10:54:02,522::logUtils::37::dispatcher::(wrapper) Run and protect: connectStoragePool(spUUID='b1ab4917-ffcc-49ea-8345-bb50c181d7b7', hostID=1, scsiKey='b1ab4917-ffcc-49ea-8345-bb50c181d7b7', msdUUID='b9e831ca-a12e-46c6-b0b1-395f6542d6fc', masterVersion=1, options=None)
Thread-1561::INFO::2012-08-19 10:54:03,276::logUtils::39::dispatcher::(wrapper) Run and protect: connectStoragePool, Return response: True
Thread-1567::INFO::2012-08-19 10:54:03,453::logUtils::37::dispatcher::(wrapper) Run and protect: spmStart(spUUID='b1ab4917-ffcc-49ea-8345-bb50c181d7b7', prevID=-1, prevLVER='2', recoveryMode=None, scsiFencing='false', maxHostID=250, domVersion='3', options=None)
Thread-1567::INFO::2012-08-19 10:54:03,455::logUtils::39::dispatcher::(wrapper) Run and protect: spmStart, Return response: None
Thread-1575::INFO::2012-08-19 10:54:08,080::logUtils::37::dispatcher::(wrapper) Run and protect: disconnectStorageServer(domType=1, spUUID='00000000-0000-0000-0000-000000000000', conList=[{'connection': '10.35.64.102:/volumes/wolf/Lime_ksm_0_nfs_20120806171531359900', 'iqn': '', 'portal': '', 'user': '', 'password': '******', 'id': '4db1c216-4d7e-4ee6-80f2-401b56e84628', 'port': ''}], options=None)
Thread-1575::INFO::2012-08-19 10:54:08,535::logUtils::39::dispatcher::(wrapper) Run and protect: disconnectStorageServer, Return response: {'statuslist': [{'status': 0, 'id': '4db1c216-4d7e-4ee6-80f2-401b56e84628'}]}
Thread-1576::INFO::2012-08-19 10:54:08,567::logUtils::37::dispatcher::(wrapper) Run and protect: disconnectStoragePool(spUUID='b1ab4917-ffcc-49ea-8345-bb50c181d7b7', hostID=1, scsiKey='b1ab4917-ffcc-49ea-8345-bb50c181d7b7', remove=False, options=None)
Thread-1576::INFO::2012-08-19 10:54:08,568::logUtils::39::dispatcher::(wrapper) Run and protect: disconnectStoragePool, Return response: None
Thread-1586::INFO::2012-08-19 10:54:20,336::logUtils::37::dispatcher::(wrapper) Run and protect: connectStorageServer(domType=1, spUUID='00000000-0000-0000-0000-000000000000', conList=[{'connection': '10.35.64.102:/volumes/wolf/Lime_ksm_0_nfs_20120806171531359900', 'iqn': '', 'portal': '', 'user': '', 'password': '******', 'id': '4db1c216-4d7e-4ee6-80f2-401b56e84628', 'port': ''}], options=None)
Thread-1586::INFO::2012-08-19 10:54:20,385::logUtils::39::dispatcher::(wrapper) Run and protect: connectStorageServer, Return response: {'statuslist': [{'status': 0, 'id': '4db1c216-4d7e-4ee6-80f2-401b56e84628'}]}
Thread-1591::INFO::2012-08-19 10:54:21,186::logUtils::37::dispatcher::(wrapper) Run and protect: disconnectStorageServer(domType=1, spUUID='00000000-0000-0000-0000-000000000000', conList=[{'connection': '10.35.64.102:/volumes/wolf/Lime_ksm_0_nfs_20120806171531359900', 'iqn': '', 'portal': '', 'user': '', 'password': '******', 'id': '4db1c216-4d7e-4ee6-80f2-401b56e84628', 'port': ''}], options=None)
Thread-1591::INFO::2012-08-19 10:54:21,653::logUtils::39::dispatcher::(wrapper) Run and protect: disconnectStorageServer, Return response: {'statuslist': [{'status': 0, 'id': '4db1c216-4d7e-4ee6-80f2-401b56e84628'}]}

Comment 6 vvyazmin@redhat.com 2012-10-09 11:49:02 UTC
Verified on RHEVM 3.1 - SI19.1

RHEVM: rhevm-3.1.0-18.el6ev.noarch
VDSM: vdsm-4.9.6-36.0.el6_3.x86_64
LIBVIRT: libvirt-0.9.10-21.el6_3.4.x86_64
QEMU & KVM: qemu-kvm-rhev-0.12.1.2-2.295.el6_3.2.x86_64
SANLOCK: sanlock-2.3-4.el6_3.x86_64