Bug 1663368

Summary: [RHV-RHGS] Deleting a 1TB image file leads to errors in RHV
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: SATHEESARAN <sasundar>
Component: rhhi Assignee: Sahina Bose <sabose>
Status: CLOSED WORKSFORME QA Contact: SATHEESARAN <sasundar>
Severity: high Docs Contact:
Priority: unspecified    
Version: rhgs-3.4 CC: kdhananj, rhs-bugs, sankarshan, storage-qa-internal
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1663367 Environment:
Last Closed: 2019-01-21 11:03:05 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1663367    
Bug Blocks:    

Description SATHEESARAN 2019-01-04 05:19:05 UTC
Description of problem:
-----------------------
When deleting a VM image file of size 1TB, a sequence of errors is seen in RHV Manager. The SPM host goes non-operational and reboots, and sanlock errors are seen. A possible cause is latency in the gluster storage domain.


Version-Release number of selected component (if applicable):
--------------------------------------------------------------
RHV 4.0
RHGS 3.4.2

How reproducible:
-----------------
Always

Steps to Reproduce:
--------------------
1. Create a gluster storage domain
2. Create a disk of size 1TB (either preallocate the disk, or thin-allocate it and write some data into the disk)
3. Delete the VM disk from RHV Manager UI
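
To check whether slow deletion on the gluster mount itself is a factor, independent of RHV Manager, the raw unlink latency can be timed directly. The following is only a rough sketch and not part of the original reproduction: the helper name time_unlink and the directory argument are hypothetical, and it measures a plain file delete on whatever directory it is given, not the RHV disk-removal flow.

```python
import os
import tempfile
import time


def time_unlink(directory, size_bytes, chunk=1024 * 1024):
    """Write size_bytes of zeros to a temp file in directory, then
    return how many seconds the unlink() call took.

    Hypothetical helper for illustration only; it does not involve
    RHV, vdsm, or sanlock.
    """
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        block = b"\0" * chunk
        remaining = size_bytes
        while remaining > 0:
            # os.write may write fewer bytes than requested
            written = os.write(fd, block[:min(chunk, remaining)])
            remaining -= written
        os.fsync(fd)  # make sure the data reached the storage
    finally:
        os.close(fd)

    start = time.monotonic()
    os.unlink(path)
    return time.monotonic() - start
```

Pointing directory at a path on the mounted gluster volume (site-specific) and increasing size_bytes step by step would show whether delete time grows with file size on that mount.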

Actual results:
---------------
On the Hosts tab, the host with the SPM role goes inactive. The Events tab shows that a sanlock error has occurred and that the vdsm heartbeat was exceeded on that host, and the SPM host then reboots. VMs running on the SPM host go to unknown state.

Expected results:
-----------------
No errors and healthy VMs

Comment 1 SATHEESARAN 2019-01-21 11:03:05 UTC
This issue is seen with RHGS 3.0 & RHV 4.0.7.

After updating to the latest RHGS 3.4.2 (glusterfs-3.12.2-32.el7rhgs) and RHV 4.2.7,
this issue is no longer seen.

I have discussed this with Sahina, and I'm closing this bug as the issue is not seen with the latest gluster builds.