Bug 1794260

Summary: Missing brick directory while deploying RHHI-V
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Gobinda Das <godas>
Component: gluster-ansible
Assignee: Gobinda Das <godas>
Status: CLOSED ERRATA
QA Contact: SATHEESARAN <sasundar>
Severity: urgent
Docs Contact:
Priority: urgent
Version: rhgs-3.5
CC: akrishna, mwaykole, pprakash, puebele, rhs-bugs, sabose, sasundar, sheggodu
Target Milestone: ---
Keywords: ZStream
Target Release: RHGS 3.5.z Batch Update 1
Hardware: x86_64
OS: Unspecified
Whiteboard:
Fixed In Version: gluster-ansible-infra-1.0.4-5
Doc Type: Bug Fix
Doc Text:
During RHHI-V re-deployment, some bricks fail to mount because systemd silently unmounts them while still returning success without an error. As a result, the brick is created directly on the root filesystem at /gluster_bricks/engine/engine. A later reload of systemd mounts the XFS filesystem back, so /gluster_bricks/engine becomes the mount point and is empty, killing the already running glusterfsd (brick) process. With this update, the systemd daemon is reloaded before the bricks are mounted, to ensure the bricks are available.
Story Points: ---
Clone Of: 1787001
Environment:
Last Closed: 2020-01-30 06:45:36 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1787001    

Description Gobinda Das 2020-01-23 05:56:39 UTC
Description of problem:
During hosted engine deployment with non-4K devices, the brick is killed and the directory '/gluster_bricks/engine/engine/' is missing.

Version-Release number of selected component (if applicable):
RHGS 3.5.1 (6.0-25)
RHVH-4.3.8


Steps to Reproduce:
step 1: Complete the gluster deployment with the engine brick on a 4K device and the other brick on a VDO 4K device.
step 2: Complete the hosted engine deployment.

Actual results:
The directory /gluster_bricks/engine/engine/ is missing and the brick is not online.
Expected results:

The directory "/gluster_bricks/engine/engine/" should be present 
should not prompt ant error on the terminal

--- Additional comment from RHEL Program Management on 2019-12-30 09:15:15 UTC ---

This bug is automatically being proposed for the RHHI-V 1.7 release of the Red Hat Hyperconverged Infrastructure for Virtualization product, by setting the release flag 'rhiv-1.7' to '?'.


This issue happens because of a RHEL 7 systemd bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1494014

So this issue typically happens in a redeployment scenario.
Suppose someone unmounts /gluster_bricks/engine and, in the attempt
to redeploy, tries to mount the engine brick again at the same location, /gluster_bricks/engine.
systemd then silently mounts and unmounts the brick, with rc=0.

This is evident from:
<snip>
Jan 22 08:19:24 rhsqa-grafton10-nic2 systemd: Unit gluster_bricks-engine.mount is bound to inactive unit dev-disk-by\x2duuid-41e87589\x2d905e\x2d4fea\x2daef2\x2dd4157129add3.device. Stopping, too.
Jan 22 08:19:24 rhsqa-grafton10-nic2 systemd: Stopped Migrate local SELinux policy changes from the old store structure to the new structure.
Jan 22 08:19:24 rhsqa-grafton10-nic2 systemd: Stopped target Local File Systems.
Jan 22 08:19:24 rhsqa-grafton10-nic2 systemd: Unmounting /gluster_bricks/engine...
Jan 22 08:19:24 rhsqa-grafton10-nic2 kernel: XFS (dm-13): Unmounting Filesystem
Jan 22 08:19:24 rhsqa-grafton10-nic2 systemd: Unmounted /gluster_bricks/engine.
Jan 22 08:19:30 rhsqa-grafton10-nic2 python: ansible-mount Invoked with src=UUID=7702205b-2c11-49b9-8a0e-fb3bbe6ed5db dump=None boot=True fstab=None passno=None fstype=xfs state=mounted path=/gluster_bricks/test backup=False opts=inode64,noatime,nodiratime
Jan 22 08:19:30 rhsqa-grafton10-nic2 kernel: XFS (dm-24): Mounting V5 Filesystem
Jan 22 08:19:30 rhsqa-grafton10-nic2 kernel: XFS (dm-24): Ending clean mount
Jan 22 08:19:30 rhsqa-grafton10-nic2 systemd: Unit gluster_bricks-test.mount is bound to inactive unit dev-disk-by\x2duuid-17f47f26\x2d61d8\x2d40f3\x2db47e\x2dd56512794a7c.device. Stopping, too.
</snip>

In this case, remember that /gluster_bricks/engine is not actually mounted,
but ansible is unaware of this. When the gluster volume is created, the brick is created
directly on the root filesystem at /gluster_bricks/engine/engine.

Now, when the deployment reaches the hosted engine (HE) deployment, some step calls 'systemctl daemon-reload',
which causes the mount point to reappear and be mounted over the existing
contents. This is evident from:

<snip>
Jan 22 08:22:26 rhsqa-grafton10-nic2 systemd: gluster_bricks-test.mount: Directory /gluster_bricks/test to mount over is not empty, mounting anyway.
Jan 22 08:22:26 rhsqa-grafton10-nic2 systemd: Mounting /gluster_bricks/test...
Jan 22 08:22:26 rhsqa-grafton10-nic2 systemd: gluster_bricks-engine.mount: Directory /gluster_bricks/engine to mount over is not empty, mounting anyway.
Jan 22 08:22:26 rhsqa-grafton10-nic2 systemd: Mounting /gluster_bricks/engine...
Jan 22 08:22:26 rhsqa-grafton10-nic2 kernel: XFS (dm-24): Mounting V5 Filesystem
Jan 22 08:22:26 rhsqa-grafton10-nic2 kernel: XFS (dm-13): Mounting V5 Filesystem
Jan 22 08:22:26 rhsqa-grafton10-nic2 kernel: XFS (dm-24): Ending clean mount
Jan 22 08:22:26 rhsqa-grafton10-nic2 kernel: XFS (dm-13): Ending clean mount
Jan 22 08:22:26 rhsqa-grafton10-nic2 systemd: Mounted /gluster_bricks/engine.
Jan 22 08:22:26 rhsqa-grafton10-nic2 systemd: Mounted /gluster_bricks/test.
</snip>

So now /gluster_bricks/engine becomes the mount point, and it is empty. This is how the 'engine' directory under /gluster_bricks/engine disappears.
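
A quick way to catch this state before the volume is created is to check that each brick path is a real mount point rather than a plain directory on the root filesystem. Below is a minimal diagnostic sketch in Ansible; it is hypothetical and not part of gluster-ansible, and the host group and brick paths are assumptions:

<snip>
---
# Hypothetical diagnostic play: 'mountpoint -q' exits non-zero when the path
# is only a directory on the root filesystem, so the play fails before any
# brick data can land in the wrong place.
- hosts: gluster_nodes
  become: true
  tasks:
    - name: Check that each brick path is a real mount point
      command: mountpoint -q {{ item }}
      changed_when: false
      loop:
        - /gluster_bricks/engine
        - /gluster_bricks/test
</snip>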


The workaround from that bug is to run 'systemctl daemon-reload' before mounting the bricks; a sketch of this approach follows below.
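
For illustration, a minimal sketch of that approach as an Ansible play (this is not the exact gluster-ansible-infra-1.0.4-5 change; the host group is an assumption and the UUID is a placeholder, while the mount options match the ansible-mount invocation in the log above):

<snip>
---
# Sketch: reload systemd first, so stale .mount units bound to inactive device
# units cannot silently undo the mount that follows.
- hosts: gluster_nodes
  become: true
  tasks:
    - name: Reload systemd units before mounting the bricks
      systemd:
        daemon_reload: yes

    - name: Mount the engine brick filesystem (UUID is a placeholder)
      mount:
        path: /gluster_bricks/engine
        src: UUID=<brick-filesystem-uuid>
        fstype: xfs
        opts: inode64,noatime,nodiratime
        state: mounted
</snip>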


Treating this bug as a BLOCKER, as it may affect customers while redeploying.

Comment 3 SATHEESARAN 2020-01-28 07:28:06 UTC
Verified with RHVH 4.3.8 + gluster-ansible-infra-1.0.4-5.el7rhgs

1. Create the gluster deployment as part of RHHI-V deployment.
2. After the gluster deployment is complete, clean up the gluster deployment using the same inventory file.
3. Re-create the gluster deployment.
4. Make sure all the bricks are mounted.

Repeat steps 1-4 many times, and make sure all the bricks (XFS) are mounted and available after gluster deployment/configuration; a hypothetical check play is sketched below.
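
One way to automate step 4 across repeated runs is a small check play such as the following (hypothetical, not part of the product or the verification tooling; host group and brick paths are assumptions):

<snip>
---
# Hypothetical post-deployment check: each brick path must be mounted and the
# engine brick directory created by the volume must still be present.
- hosts: gluster_nodes
  become: true
  tasks:
    - name: Confirm the brick paths are mounted
      command: findmnt {{ item }}
      changed_when: false
      loop:
        - /gluster_bricks/engine
        - /gluster_bricks/test

    - name: Confirm the engine brick directory survived the redeployment
      command: test -d /gluster_bricks/engine/engine
      changed_when: false
</snip>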

Comment 4 Anjana KD 2020-01-29 17:33:52 UTC
I have updated the doc text; kindly verify the updated doc text field.

Comment 6 errata-xmlrpc 2020-01-30 06:45:36 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0289

Comment 7 SATHEESARAN 2020-02-05 09:37:53 UTC
(In reply to Anjana KD from comment #4)
> Have updated the doc text, kindly verify the updated doc text field.

yes, that looks good