Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1313593

Summary:	[HC] oVirt does not detect gluster peer down
Product:	[oVirt] ovirt-engine	Reporter:	Badalyan Vyacheslav <v.badalyan>
Component:	General	Assignee:	bugs <bugs>
Status:	CLOSED INSUFFICIENT_DATA	QA Contact:
Severity:	high	Docs Contact:
Priority:	unspecified
Version:	3.6.4	CC:	bugs, knarra, sabose, v.badalyan
Target Milestone:	---	Flags:	rule-engine: planning_ack? rule-engine: devel_ack? rule-engine: testing_ack?
Target Release:	---
Hardware:	x86_64
OS:	Linux
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2016-03-16 13:10:41 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	Gluster	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:

Description Badalyan Vyacheslav 2016-03-02 01:17:55 UTC

Description of problem:

If you reboot a gluster server, you get:
1. Read errors in dmesg
2. Many postgres zombie processes
3. A reboot of the hosted engine by the HA agent
4. The hosted engine running in paused mode because gluster is in a bad state

How reproducible:
Always

Steps to Reproduce:
1. Create a replica 3 gluster volume across 3 different servers
2. Install the hosted engine on gluster
3. Run and configure it
4. Stop one gluster server
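
After step 4, the peer state can be checked from one of the remaining nodes. The following is only a minimal sketch, not part of the original report: it assumes the usual `Hostname:` / `State:` layout printed by `gluster peer status`, and the hostnames and UUIDs in `SAMPLE` are made up for illustration.

```python
import subprocess


def parse_peer_status(text):
    """Return {hostname: state} parsed from `gluster peer status` output."""
    peers = {}
    host = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Hostname:"):
            host = line.split(":", 1)[1].strip()
        elif line.startswith("State:") and host is not None:
            peers[host] = line.split(":", 1)[1].strip()
            host = None
    return peers


def disconnected_peers(text):
    """List peers whose state does not end in '(Connected)'."""
    return [h for h, s in parse_peer_status(text).items()
            if "(Connected)" not in s]


def live_peer_status():
    """Run the gluster CLI on this node (requires gluster to be installed)."""
    result = subprocess.run(["gluster", "peer", "status"],
                            capture_output=True, text=True, check=True)
    return result.stdout


# Hypothetical output, used here instead of a live cluster.
SAMPLE = """Number of Peers: 2

Hostname: gluster2.example
Uuid: 00000000-0000-0000-0000-000000000002
State: Peer in Cluster (Connected)

Hostname: gluster3.example
Uuid: 00000000-0000-0000-0000-000000000003
State: Peer in Cluster (Disconnected)
"""

print(disconnected_peers(SAMPLE))
```

On a real node, `disconnected_peers(live_peer_status())` would report the stopped server, assuming the CLI output matches the layout above.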

Actual results:
Automated nightly maintenance of a gluster server stops the service. The hosted engine then has to be unpaused by hand.

Expected results:
The oVirt engine or the HA agent must see that one of the servers is down and:
Minimal) suspend the hosted VM and resume it after gluster is back up
Maximal) add a second master postgres server kept in memory and replicate it to a secondary on the gluster-backed disk. Everything the engine needs in order to work must be cloned in memory or on a RAM disk, to protect it from small network issues on storage domains.

Or gluster must work normally without one peer, with no speed degradation.


Additional info:

Comment 1 Sahina Bose 2016-03-09 07:31:21 UTC

Can you post the ha-agent.log, vdsm.log and gluster mount log from the node where hosted engine was running and then went to paused state?

Comment 2 Badalyan Vyacheslav 2016-03-13 16:57:25 UTC

Sorry, I can't. I removed all gluster and moved to NFS.

Comment 4 Sahina Bose 2016-03-14 08:47:28 UTC

Kasturi, can you check if you see this behaviour in your setup?

Otherwise we can close this as insufficient data.

Comment 5 RamaKasturi 2016-03-16 12:33:49 UTC

Sahina, I brought down the gluster server on one of the nodes in my cluster. As reported, the UI does not indicate that the gluster service is down on that node. It looks like a bug is already filed for this: https://bugzilla.redhat.com/show_bug.cgi?id=1262046

But I did not see any of my app VMs pause, and the hosted engine is up and running fine.

Comment 6 Sahina Bose 2016-03-16 13:10:41 UTC

Closing this because we could not reproduce it and the user could not provide logs.