Bug 860229

Summary: When removing HSM host, SPM host is disconnected from storage
Product: Red Hat Enterprise Virtualization Manager
Component: ovirt-engine
Version: 3.1.0
Target Release: 3.1.0
Hardware: x86_64
OS: Linux
Status: CLOSED DUPLICATE
Severity: high
Priority: high
Keywords: Regression
Whiteboard: storage
oVirt Team: Storage
Doc Type: Bug Fix
Type: Bug
Reporter: Petr Dufek <pdufek>
Assignee: Nobody's working on this, feel free to take it <nobody>
CC: abaron, amureini, bugzilla-qe-tlv, dyasny, iheim, lpeer, Rhev-m-bugs, sgrinber, yeylon, ykaul
Last Closed: 2012-10-17 10:48:40 UTC
Attachments: logs

Description Petr Dufek 2012-09-25 10:50:58 UTC
Created attachment 616962 [details]
logs

Description of problem:
-----------------------
When removing HSM host, SPM host is disconnected from storage


Version-Release number of selected component (if applicable):
-------------------------------------------------------------
rhevm-3.1.0-16.el6ev.noarch
vdsm-4.9.6-34.0.el6_3.x86_64


Steps to Reproduce:
-------------------
1. Two hosts in the setup
2. Remove the HSM host (put it into maintenance, then remove the host: b5d11c56-06e4-11e2-955d-001a4a169750)
3. DisconnectStorageServerVDSCommand is applied to the second host (SPM: b8e475c8-06e4-11e2-85a6-001a4a169750); this can be seen in the attached engine.log (see the sketch after this list)
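
For illustration only, here is a minimal sketch of the expected behaviour in this flow (this is not the ovirt-engine source; the class and method names are hypothetical): the storage-server disconnect should target only the host being removed, never the pool's current SPM. The engine.log excerpt below shows DisconnectStorageServerVDSCommand being issued with the SPM's vdsId instead.

// Hypothetical sketch of the intended disconnect flow; this is NOT the
// ovirt-engine source. Host IDs are plain strings for brevity.
public class DisconnectOnHostRemovalSketch {

    // Disconnect storage-server connections only for the host being removed.
    // The regression reported here is that the command is sent with the
    // SPM's vdsId instead of the removed host's.
    static void disconnectOnRemoval(String removedVdsId, String spmVdsId) {
        if (removedVdsId.equals(spmVdsId)) {
            // Removing the SPM itself is a different flow (SPM stop / fail-over)
            // and is out of scope for this sketch.
            return;
        }
        sendDisconnectStorageServer(removedVdsId); // only the removed host
    }

    // Placeholder for the VDS broker call seen in the log
    // (DisconnectStorageServerVDSCommand).
    static void sendDisconnectStorageServer(String vdsId) {
        System.out.println("DisconnectStorageServerVDSCommand(vdsId = " + vdsId + ")");
    }

    public static void main(String[] args) {
        String removedHsm = "b5d11c56-06e4-11e2-955d-001a4a169750"; // HSM host from this report
        String spm = "b8e475c8-06e4-11e2-85a6-001a4a169750";        // SPM host from this report
        disconnectOnRemoval(removedHsm, spm); // expected: disconnect only the HSM
    }
}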


2012-09-25 10:21:22,587 INFO  [org.ovirt.engine.core.bll.MaintananceNumberOfVdssCommand] (ajp-/127.0.0.1:8009-8) [71ea295c] Running command: MaintananceNumberOfVdssCommand internal: false. Entities affected :  ID: b5d11c56-06e4-11e2-955d-001a4a169750 Type: VDS
2012-09-25 10:21:22,588 INFO  [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (ajp-/127.0.0.1:8009-8) [71ea295c] START, SetVdsStatusVDSCommand(vdsId = b5d11c56-06e4-11e2-955d-001a4a169750, status=PreparingForMaintenance, nonOperationalReason=NONE), log id: 3c7d9a05
2012-09-25 10:21:22,615 INFO  [org.ovirt.engine.core.vdsbroker.SetVdsStatusVDSCommand] (ajp-/127.0.0.1:8009-8) [71ea295c] FINISH, SetVdsStatusVDSCommand, log id: 3c7d9a05
2012-09-25 10:21:22,665 INFO  [org.ovirt.engine.core.bll.MaintananceVdsCommand] (ajp-/127.0.0.1:8009-8) [71ea295c] Running command: MaintananceVdsCommand internal: true. Entities affected :  ID: b5d11c56-06e4-11e2-955d-001a4a169750 Type: VDS
2012-09-25 10:21:22,727 INFO  [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (QuartzScheduler_Worker-73) vds::Updated vds status from Preparing for Maintenance to Maintenance in database,  vds = b5d11c56-06e4-11e2-955d-001a4a169750 : 10.35.160.97
2012-09-25 10:21:22,819 INFO  [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-73) Clearing cache of pool: b63f7208-603d-49ed-9b3e-0119e53a470f for problematic entities of VDS: 10.35.160.97.
2012-09-25 10:21:22,821 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.DisconnectStoragePoolVDSCommand] (QuartzScheduler_Worker-73) START, DisconnectStoragePoolVDSCommand(vdsId = b5d11c56-06e4-11e2-955d-001a4a169750, storagePoolId = b63f7208-603d-49ed-9b3e-0119e53a470f, vds_spm_id = 1, masterDomainId = 00000000-0000-0000-0000-000000000000, masterVersion = 0), log id: 3be5de49
2012-09-25 10:21:23,451 INFO  [org.ovirt.engine.core.bll.RemoveVdsCommand] (ajp-/127.0.0.1:8009-15) [496e001c] Lock Acquired to object EngineLock [exclusiveLocks= key: b5d11c56-06e4-11e2-955d-001a4a169750 value: VDS
, sharedLocks= ]
2012-09-25 10:21:23,457 INFO  [org.ovirt.engine.core.bll.RemoveVdsCommand] (ajp-/127.0.0.1:8009-15) [496e001c] Running command: RemoveVdsCommand internal: false. Entities affected :  ID: b5d11c56-06e4-11e2-955d-001a4a169750 Type: VDS
2012-09-25 10:21:23,475 INFO  [org.ovirt.engine.core.vdsbroker.RemoveVdsVDSCommand] (ajp-/127.0.0.1:8009-15) [496e001c] START, RemoveVdsVDSCommand(vdsId = b5d11c56-06e4-11e2-955d-001a4a169750), log id: fb82503
2012-09-25 10:21:24,853 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.DisconnectStoragePoolVDSCommand] (QuartzScheduler_Worker-73) FINISH, DisconnectStoragePoolVDSCommand, log id: 3be5de49
2012-09-25 10:21:24,868 INFO  [org.ovirt.engine.core.bll.storage.DisconnectHostFromStoragePoolServersCommand] (QuartzScheduler_Worker-73) [1e01a814] Running command: DisconnectHostFromStoragePoolServersCommand internal: true. Entities affected :  ID: b63f7208-603d-49ed-9b3e-0119e53a470f Type: StoragePool
2012-09-25 10:21:24,908 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.DisconnectStorageServerVDSCommand] (QuartzScheduler_Worker-73) [1e01a814] START, DisconnectStorageServerVDSCommand(vdsId = b8e475c8-06e4-11e2-85a6-001a4a169750, storagePoolId = b63f7208-603d-49ed-9b3e-0119e53a470f, storageType = NFS, connectionList = [{ id: f8c8d3f5-9604-4978-aaa8-58b0d757a1be, connection: 10.35.64.102:/volumes/wolf/jenkins-vm-01_nfs_2012092585016845257, iqn: null, vfsType: null, mountOptions: null, nfsVersion: null, nfsRetrans: null, nfsTimeo: null };]), log id: 54582bb
2012-09-25 10:21:25,499 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-79) IrsBroker::Failed::GetStoragePoolInfoVDS
2012-09-25 10:21:25,499 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (QuartzScheduler_Worker-79) Exception: IRSGenericException: IRSErrorException: IRSNoMasterDomainException: Cannot find master domain: 'spUUID=b63f7208-603d-49ed-9b3e-0119e53a470f, msdUUID=ca36fa96-f918-4881-92ce-18c867781c5f'


Attached are the vdsm and engine logs, plus the testing flow log.
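
As a rough aid for reading the attached engine.log, the sketch below (the log path and class name are assumptions, not part of the attached test flow) merely flags DisconnectStorageServerVDSCommand lines that carry the SPM's vdsId, which is the symptom reported here:

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Illustrative only: flag DisconnectStorageServerVDSCommand lines in engine.log
// that were issued with the SPM's vdsId (the symptom described in this report).
public class ScanEngineLog {
    public static void main(String[] args) throws IOException {
        String spmVdsId = "b8e475c8-06e4-11e2-85a6-001a4a169750"; // SPM from this report
        Path log = Path.of(args.length > 0 ? args[0] : "engine.log"); // assumed log path

        try (Stream<String> lines = Files.lines(log)) {
            lines.filter(line -> line.contains("DisconnectStorageServerVDSCommand"))
                 .filter(line -> line.contains("vdsId = " + spmVdsId))
                 .forEach(line -> System.out.println("Disconnect sent to SPM: " + line));
        }
    }
}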

Comment 2 Itamar Heim 2012-10-02 17:35:34 UTC
Can you please try to reproduce with si19? (A patch to change this flow was introduced in si19 for bug 821634.)

Comment 3 Petr Dufek 2012-10-11 13:26:19 UTC
verified si20

Comment 4 Ayal Baron 2012-10-14 06:30:34 UTC
(In reply to comment #3)
> verified si20

You mean it doesn't reproduce in SI20? If so, why not close the bug?

Comment 5 Petr Dufek 2012-10-15 06:55:12 UTC
Yes, I didn't reproduce it.
I didn't close it because it's in the NEW state.

Comment 6 Petr Dufek 2012-10-17 10:48:40 UTC

*** This bug has been marked as a duplicate of bug 821634 ***