Bug 838206 - oVirt: manually shutting down the SPM host takes the entire data center down.
Status: CLOSED WONTFIX
Product: oVirt
Classification: Community
Component: ovirt-engine-core
Version: 3.1 RC
Hardware: x86_64 Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: 3.3.4
Assigned To: Nobody's working on this, feel free to take it
Whiteboard: storage
Keywords:
Depends On:
Blocks:
Reported: 2012-07-07 03:16 EDT by Robert Middleswarth
Modified: 2016-02-10 11:38 EST

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-03-11 17:58:28 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments: None
Description Robert Middleswarth 2012-07-07 03:16:37 EDT
Description of problem:
I came across this by accident: I shut down one of my hosts by mistake. It turned out that it was also the SPM host, and shutting it down took down the entire network. I was able to recover the data center / cluster, but there was a lot of downtime. This could be an issue in a production network.

Version-Release number of selected component (if applicable):
oVirt Engine Version: 3.1.0-3.9.el6 
vdsm-cli: 4.10.0-0.58.gita6f4929.el6


How reproducible:
I have done it 3 times in my test network.

Steps to Reproduce:
1. Build a 3-node network. Not sure if it makes a difference, but I am using GlusterFS.
2. Manually shut down the current SPM node.
3. Watch as everything crashes.
  
Actual results:
The primary data store goes down, taking the entire data center down with it.

Expected results:
All the VMs move and the data store continues to run in a degraded state.

Additional info:
Not sure if this is an engine or vdsm issue, or which logs are needed. Since it is very repeatable, what do you want me to do to help debug this very problematic issue?

Thanks
Robert
Comment 1 Itamar Heim 2012-07-07 09:01:17 EDT
Type of data center? (If posixfs/gluster, I assume this is a duplicate of the "can't elect SPM" bug.)
If not, logs...
Comment 2 Robert Middleswarth 2012-07-07 13:32:29 EDT
It is NFS.  Later today I will route out the logs and then force the event to happen again to generate logs.

Thanks
Robert
Comment 3 Itamar Heim 2013-03-11 17:58:28 EDT
Closing old bugs. If this issue is still relevant/important in the current version, please re-open the bug.
