Is the fencing of nodes causing quorum loss? Can you confirm that the customer has set the Gluster-related fencing policies at the cluster level (i.e. skip fencing if a brick is online, or if fencing could lead to quorum loss)? Also, can you confirm whether these are Gluster snapshots or QEMU snapshots on a Gluster volume?
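One way to answer the snapshot question above: Gluster volume snapshots are listed by the gluster CLI, while QEMU internal snapshots live inside the qcow2 image itself. A minimal sketch (the volume name comes from this bug; the image path is a placeholder, not an actual path from this case):

```shell
# Gluster volume-level snapshots (taken with the gluster CLI) show up here:
gluster snapshot list vmstore1

# QEMU internal snapshots are stored in the qcow2 image; list them with
# qemu-img (replace the path with the real VM disk image):
qemu-img snapshot -l /path/to/vm-disk.qcow2
```

If the first command shows the snapshots in question, they are Gluster snapshots; if they only appear in the qemu-img listing, they are QEMU snapshots on the Gluster volume.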
vmstore1 and vmstore2 are distributed-replicate volumes. Whenever concurrent deletes of VM snapshots occur, there is an I/O latency issue; the sanlock log shows: "2019-05-07 00:19:43 2301165 [25167]: s10 delta_renew long write time 43 sec". Krutika, could you check the logs to see whether any Gluster issue is causing this high latency?
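To gauge how widespread the lease-renewal slowness is, the sanlock log can be scanned for all "delta_renew long write time" warnings like the one quoted above. A minimal sketch (the 10-second threshold and the helper name are my own choices, not from sanlock):

```python
import re

# Matches sanlock "delta_renew long write time" warnings, e.g.:
#   2019-05-07 00:19:43 2301165 [25167]: s10 delta_renew long write time 43 sec
DELTA_RENEW_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) .*?"
    r"delta_renew long write time (?P<secs>\d+) sec"
)

def slow_renewals(lines, threshold_sec=10):
    """Return (timestamp, seconds) pairs for renewals at or above threshold_sec."""
    hits = []
    for line in lines:
        m = DELTA_RENEW_RE.search(line)
        if m and int(m.group("secs")) >= threshold_sec:
            hits.append((m.group("ts"), int(m.group("secs"))))
    return hits

if __name__ == "__main__":
    sample = [
        "2019-05-07 00:19:43 2301165 [25167]: s10 delta_renew long write time 43 sec",
        "2019-05-07 00:20:01 2301183 [25167]: s10 renewed 2301180",
    ]
    print(slow_renewals(sample))  # -> [('2019-05-07 00:19:43', 43)]
```

Correlating the timestamps this produces against the times of concurrent snapshot deletes would show whether the latency spikes line up with that workload.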
The needinfo request(s) on this closed bug have been removed, as they have been unresolved for 1000 days.