Bug 895996 - SPM doesn't switch to non-operational after block connectivity to storage
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.2.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 3.2.0
Assigned To: mkublin
vvyazmin@redhat.com
infra
Depends On:
Blocks: 915537
Reported: 2013-01-16 08:05 EST by Arik
Modified: 2016-02-10 14:43 EST
CC List: 14 users

See Also:
Fixed In Version: sf5
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Infra
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
engine log (159.62 KB, application/x-gzip)
2013-01-16 08:05 EST, Arik


External Trackers
Tracker ID Priority Status Summary Last Updated
oVirt gerrit 11100 None None None Never
oVirt gerrit 11101 None None None Never
oVirt gerrit 11102 None None None Never

Description Arik 2013-01-16 08:05:52 EST
Created attachment 679539
engine log

Description of problem:
The SPM doesn't become non-operational after all hosts in the cluster are disconnected from the storage.

Version-Release number of selected component (if applicable):


How reproducible:
100%

Steps to Reproduce:
1. Run 3 hosts in a cluster.
2. Block the connection to the storage from all hosts.
  
Actual results:
The SPM doesn't become non-operational;
eventually all 3 hosts move to the 'up' state.

Expected results:
The SPM becomes non-operational.

Additional info:
----- Original Message -----
> From: "Omer Frenkel" <ofrenkel@redhat.com>
> To: "Michael Kublin" <mkublin@redhat.com>
> Cc: "Arik Hadas" <ahadas@redhat.com>
> Sent: Wednesday, January 16, 2013 2:29:33 PM
> Subject: Re: log for bug
>
>
>
> ----- Original Message -----
> > From: "Michael Kublin" <mkublin@redhat.com>
> > To: "Arik Hadas" <ahadas@redhat.com>, "Omer Frenkel"
> > <ofrenkel@redhat.com>
> > Sent: Wednesday, January 16, 2013 2:04:25 PM
> > Subject: Re: log for bug
> >
> >
> > Hi, I took a look at the logs.
> > For some reason we did not do InitVdsOnUp after 12:27, but this is
> > less important for your case.
>
> the relevant initVdsOnUp was at 12:28:30
>
> > I took a look around 11:59.
>
> i hope it's the same scenario..
>
> > I think that the SPM was bamba.
> > During InitVdsOnUp we failed to connect the host to the pool because
> > the master domain was missing, so a reconstruct was triggered.
> > 2013-01-15 11:59:10,352 INFO
> >  [org.ovirt.engine.core.bll.storage.ReconstructMasterDomainCommand]
> > (pool-11-thread-41) [43e1763b] Running command:
> > ReconstructMasterDomainCommand internal: true. Entities affected :
> >  ID: 6ff7ee1a-eecd-4ef9-b303-d894d6f595e9 Type: Storage
> > (Thread name is changed because of using a queue)
> > Now, a master was not found, so no reconstruct was done.
> > 2013-01-15 11:59:14,639 INFO
> >  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
> > (pool-11-thread-41) [43e1763b] No string for
> > RECONSTRUCT_MASTER_FAILED_NO_MASTER type. Use default Log
> >
> > In such a case ReconstructMasterDomainCommand finishes with success,
> > which means InitVdsOnUp succeeds - this is a bug.
>
> I am not sure how it's possible that the reconstruct succeeded; the
> storage is disconnected
> (and there is only one domain in the pool).
>
The storage was disconnected because a wrong parameter was passed to ReconstructMasterDomainCommand:
ReconstructMasterParameters.isInActive == false. (I think this is wrong; this is my mistake.)

> if the reconstruct was really successful then maybe the host really
> should be up?
ReconstructMasterDomainCommand.isSuccessed == true because it is the last master, and it will usually be true.
This is how it works now, after the refactoring made by the storage team.

> > Thanks guys, it is mine. Can you please open a bug or, if you have
> > already opened one, assign it to me.
> >
> > ----- Original Message -----
> > From: "Arik Hadas" <ahadas@redhat.com>
> > To: "Michael Kublin" <mkublin@redhat.com>
> > Sent: Wednesday, January 16, 2013 12:37:15 PM
> > Subject: log for bug
> >
> > At 12:27 I blocked the connection to storage from all 3 hosts (knight,
> > honda, bamba)
> > knight was the SPM before the disconnection
> >
> > knight moved to status 'connecting'
> >
> > the storage domain moved to maintenance
> >
> > knight moved to status 'up'
> >
> > honda became SPM (the storage is not activated..)
> >
> > honda is cleared from being SPM
> >
> > -> all three hosts are in status 'up'
> >
>
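
The failure mode discussed in the thread above can be modelled in a few lines. This is only an illustrative sketch, not the engine code and not the actual fix: every name in it (HostStatus, ReconstructResult, reconstruct_master, init_host_on_up, the strict flag) is hypothetical. It assumes, as the thread describes, that host initialization falls back to a master reconstruct when the host cannot connect to the pool, and that the host's final status follows the reconstruct's reported success.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class HostStatus(Enum):
    UP = "up"
    NON_OPERATIONAL = "non-operational"


@dataclass
class ReconstructResult:
    succeeded: bool
    audit_event: Optional[str] = None


def reconstruct_master(new_master: Optional[str], strict: bool) -> ReconstructResult:
    """Hypothetical stand-in for the reconstruct step.

    new_master is None when no candidate master domain can be found.
    strict=False models the behaviour reported in this bug (success is
    still reported); strict=True models the expected behaviour.
    """
    if new_master is None:
        return ReconstructResult(
            succeeded=not strict,
            audit_event="RECONSTRUCT_MASTER_FAILED_NO_MASTER",
        )
    return ReconstructResult(succeeded=True)


def init_host_on_up(connected_to_pool: bool,
                    new_master: Optional[str],
                    strict: bool) -> HostStatus:
    """Hypothetical stand-in for host initialization on coming up."""
    if connected_to_pool:
        return HostStatus.UP
    # The host could not connect to the pool, so try a reconstruct and
    # let its reported result decide the host status.
    result = reconstruct_master(new_master, strict)
    return HostStatus.UP if result.succeeded else HostStatus.NON_OPERATIONAL


# Storage blocked, no master found: reported vs. expected outcome.
print(init_host_on_up(False, None, strict=False))
print(init_host_on_up(False, None, strict=True))
```

With strict=False the host ends up 'up' even though no master domain was found, which matches the actual results above; with strict=True it becomes non-operational, which matches the expected results.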
Comment 4 Leonid Natapov 2013-01-30 10:03:56 EST
Fixed in sf5. After blocking the connection from all hosts to the SD, the SPM becomes non-operational.
Comment 5 Itamar Heim 2013-06-11 04:33:02 EDT
3.2 has been released
Comment 6 Itamar Heim 2013-06-11 04:33:05 EDT
3.2 has been released
Comment 7 Itamar Heim 2013-06-11 04:33:58 EDT
3.2 has been released
Comment 8 Itamar Heim 2013-06-11 04:42:31 EDT
3.2 has been released