Red Hat Bugzilla – Bug 895996
SPM doesn't switch to non-operational after block connectivity to storage
Last modified: 2016-02-10 14:43:06 EST
Created attachment 679539
Description of problem:
SPM doesn't become non-operational after disconnecting all hosts in the cluster from the storage
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. run 3 hosts in a cluster
2. block the connection to the storage from all hosts
Actual results:
SPM doesn't become non-operational,
eventually all the 3 hosts move to 'up' state.
Expected results:
SPM becomes non-operational
----- Original Message -----
> From: "Omer Frenkel" <firstname.lastname@example.org>
> To: "Michael Kublin" <email@example.com>
> Cc: "Arik Hadas" <firstname.lastname@example.org>
> Sent: Wednesday, January 16, 2013 2:29:33 PM
> Subject: Re: log for bug
> ----- Original Message -----
> > From: "Michael Kublin" <email@example.com>
> > To: "Arik Hadas" <firstname.lastname@example.org>, "Omer Frenkel"
> > <email@example.com>
> > Sent: Wednesday, January 16, 2013 2:04:25 PM
> > Subject: Re: log for bug
> > Hi, I took a look at the logs.
> > For some reason we did not do InitVdsOnUp after 12:27, but this is
> > less important for your case.
> the relevant initVdsOnUp was at 12:28:30
> > I took a look around 11:59.
> i hope it's the same scenario..
> > I think that the SPM was bamba.
> > During InitVdsOnUp we failed to connect the host to the pool because
> > the master domain was missing, so I triggered a reconstruct.
> > 2013-01-15 11:59:10,352 INFO
> > [org.ovirt.engine.core.bll.storage.ReconstructMasterDomainCommand]
> > (pool-11-thread-41) [43e1763b] Running command:
> > ReconstructMasterDomainCommand internal: true. Entities affected :
> > ID: 6ff7ee1a-eecd-4ef9-b303-d894d6f595e9 Type: Storage
> > (Thread name is changed because of using a queue)
> > Now, the master was not found so no reconstruct is done.
> > 2013-01-15 11:59:14,639 INFO
> > [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
> > (pool-11-thread-41) [43e1763b] No string for
> > RECONSTRUCT_MASTER_FAILED_NO_MASTER type. Use default Log
> > In such a case ReconstructMasterDomainCommand finished with success,
> > which means InitVdsOnUp succeeds - it is a bug.
> i am not sure how it's possible that the reconstruct has succeeded, the
> storage is disconnected
> (and there is only one domain in the pool)
The storage was disconnected because the wrong parameter was passed to ReconstructMasterDomainCommand:
ReconstructMasterParameters.isInActive == false. (I think this is wrong; this is my mistake.)
> if the reconstruct was really successful then maybe the host really
> should be up?
The command ReconstructMasterDomainCommand.isSuccessed == true because it is the last master, and it usually will be true.
This is the way it works now after the refactoring made by the storage team.
> > Thanks guys, it is mine. Can you please open a bug or, if you have
> > already opened one, assign it to me.
> > ----- Original Message -----
> > From: "Arik Hadas" <firstname.lastname@example.org>
> > To: "Michael Kublin" <email@example.com>
> > Sent: Wednesday, January 16, 2013 12:37:15 PM
> > Subject: log for bug
> > at 12:27 I blocked the connection to storage from all 3 hosts (knight,
> > honda, bamba)
> > knight was the SPM before the disconnection
> > knight moved to status 'connecting'
> > the storage domain moved to maintenance
> > knight moved to status 'up'
> > honda became SPM (the storage is not activated..)
> > honda is cleared from being SPM
> > -> all three hosts are in status 'up'
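
The failure mode discussed in the thread above can be illustrated with a small, self-contained sketch. It is not the oVirt engine code: every class and method name here (ReconstructSketch, reconstruct, initHostBuggy, initHostStrict) is hypothetical, and the stricter check is only one possible guard, not necessarily the fix that was shipped. It shows why a host can end up 'up' when a reconstruct command reports success although no master domain was found.

```java
// Hypothetical sketch only: none of these names are the real oVirt engine API.
public class ReconstructSketch {

    enum HostStatus { UP, NON_OPERATIONAL }

    /** Outcome of a (hypothetical) reconstruct attempt. */
    static final class ReconstructResult {
        final boolean commandSucceeded; // what the command reports to its caller
        final boolean masterFound;      // whether a usable master domain was found

        ReconstructResult(boolean commandSucceeded, boolean masterFound) {
            this.commandSucceeded = commandSucceeded;
            this.masterFound = masterFound;
        }
    }

    /**
     * Models the behaviour described in the thread: when no master is found,
     * nothing is reconstructed, yet the command still reports success.
     */
    static ReconstructResult reconstruct(boolean masterDomainReachable) {
        return new ReconstructResult(true, masterDomainReachable);
    }

    /** Caller that trusts only the success flag (the problematic pattern). */
    static HostStatus initHostBuggy(boolean storageReachable) {
        if (storageReachable) {
            return HostStatus.UP;
        }
        ReconstructResult result = reconstruct(false);
        return result.commandSucceeded ? HostStatus.UP : HostStatus.NON_OPERATIONAL;
    }

    /** Caller that also requires a master domain to have been found. */
    static HostStatus initHostStrict(boolean storageReachable) {
        if (storageReachable) {
            return HostStatus.UP;
        }
        ReconstructResult result = reconstruct(false);
        return (result.commandSucceeded && result.masterFound)
                ? HostStatus.UP
                : HostStatus.NON_OPERATIONAL;
    }

    public static void main(String[] args) {
        // Storage is blocked in both calls.
        System.out.println("buggy:  " + initHostBuggy(false));
        System.out.println("strict: " + initHostStrict(false));
    }
}
```

With storage blocked, the first caller leaves the host 'up' (the reported symptom), while the second moves it to non-operational (the expected result).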
sf5. Fixed. After blocking the connection from all hosts to the SD, the SPM becomes non-operational.
3.2 has been released