Description of problem:
During hosted engine deployment with non-4K devices, the brick is killed and the directory '/gluster_bricks/engine/engine/' is missing.

Version-Release number of selected component (if applicable):
RHGS 3.5.1 (6.0-25)
RHVH-4.3.8

Steps to Reproduce:
1. Complete the gluster deployment with the engine brick on a 4K device and the other brick on a VDO 4K device.
2. Complete the hosted engine deployment.

Actual results:
The directory /gluster_bricks/engine/engine/ is missing and the brick is not online.

Expected results:
The directory '/gluster_bricks/engine/engine/' should be present, and no error should be prompted on the terminal.

--- Additional comment from RHEL Program Management on 2019-12-30 09:15:15 UTC ---

This bug is automatically being proposed for the RHHI-V 1.7 release of the Red Hat Hyperconverged Infrastructure for Virtualization product, by setting the release flag 'rhiv-1.7' to '?'.

This issue happens because of a RHEL 7 systemd bug: https://bugzilla.redhat.com/show_bug.cgi?id=1494014

Typically, this issue happens in a redeployment scenario. If someone unmounts /gluster_bricks/engine and, while attempting to redeploy, tries to mount the engine brick again at the same location (/gluster_bricks/engine), systemd silently mounts and unmounts the brick with rc=0.

This is evident from:

<snip>
Jan 22 08:19:24 rhsqa-grafton10-nic2 systemd: Unit gluster_bricks-engine.mount is bound to inactive unit dev-disk-by\x2duuid-41e87589\x2d905e\x2d4fea\x2daef2\x2dd4157129add3.device. Stopping, too.
Jan 22 08:19:24 rhsqa-grafton10-nic2 systemd: Stopped Migrate local SELinux policy changes from the old store structure to the new structure.
Jan 22 08:19:24 rhsqa-grafton10-nic2 systemd: Stopped target Local File Systems.
Jan 22 08:19:24 rhsqa-grafton10-nic2 systemd: Unmounting /gluster_bricks/engine...
Jan 22 08:19:24 rhsqa-grafton10-nic2 kernel: XFS (dm-13): Unmounting Filesystem
Jan 22 08:19:24 rhsqa-grafton10-nic2 systemd: Unmounted /gluster_bricks/engine.
Jan 22 08:19:30 rhsqa-grafton10-nic2 python: ansible-mount Invoked with src=UUID=7702205b-2c11-49b9-8a0e-fb3bbe6ed5db dump=None boot=True fstab=None passno=None fstype=xfs state=mounted path=/gluster_bricks/test backup=False opts=inode64,noatime,nodiratime
Jan 22 08:19:30 rhsqa-grafton10-nic2 kernel: XFS (dm-24): Mounting V5 Filesystem
Jan 22 08:19:30 rhsqa-grafton10-nic2 kernel: XFS (dm-24): Ending clean mount
Jan 22 08:19:30 rhsqa-grafton10-nic2 systemd: Unit gluster_bricks-test.mount is bound to inactive unit dev-disk-by\x2duuid-17f47f26\x2d61d8\x2d40f3\x2db47e\x2dd56512794a7c.device. Stopping, too.
</snip>

In this case, remember that /gluster_bricks/engine is not yet mounted, but ansible is unaware of this. When the gluster volume is created, the brick is created directly on the root filesystem at /gluster_bricks/engine/engine.

When the deployment reaches the hosted engine (HE) deployment, some step there calls 'systemctl daemon-reload', which causes the mount point to appear out of nowhere and be mounted over the existing contents.

This is evident from:

<snip>
Jan 22 08:22:26 rhsqa-grafton10-nic2 systemd: gluster_bricks-test.mount: Directory /gluster_bricks/test to mount over is not empty, mounting anyway.
Jan 22 08:22:26 rhsqa-grafton10-nic2 systemd: Mounting /gluster_bricks/test...
Jan 22 08:22:26 rhsqa-grafton10-nic2 systemd: gluster_bricks-engine.mount: Directory /gluster_bricks/engine to mount over is not empty, mounting anyway.
Jan 22 08:22:26 rhsqa-grafton10-nic2 systemd: Mounting /gluster_bricks/engine...
Jan 22 08:22:26 rhsqa-grafton10-nic2 kernel: XFS (dm-24): Mounting V5 Filesystem
Jan 22 08:22:26 rhsqa-grafton10-nic2 kernel: XFS (dm-13): Mounting V5 Filesystem
Jan 22 08:22:26 rhsqa-grafton10-nic2 kernel: XFS (dm-24): Ending clean mount
Jan 22 08:22:26 rhsqa-grafton10-nic2 kernel: XFS (dm-13): Ending clean mount
Jan 22 08:22:26 rhsqa-grafton10-nic2 systemd: Mounted /gluster_bricks/engine.
Jan 22 08:22:26 rhsqa-grafton10-nic2 systemd: Mounted /gluster_bricks/test.
</snip>

Now /gluster_bricks/engine becomes the mount point, and it is empty. This is how the 'engine' directory under /gluster_bricks/engine disappears.

The workaround from the referenced bug is to run 'systemctl daemon-reload' before mounting the bricks.

Treating this bug as a BLOCKER, as it may affect customers while redeploying.
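For reference, a minimal Ansible sketch of that workaround, assuming the brick mounts are driven by Ansible's mount module as the ansible-mount log lines above suggest. The task names and the engine_brick_uuid variable are illustrative only; this is not the actual gluster-ansible-infra change:

<snip>
# Illustrative playbook snippet: reload the systemd manager configuration
# before mounting the bricks, so a stale gluster_bricks-*.mount unit bound
# to an inactive device unit does not silently undo the mount later.
- name: Reload systemd manager configuration before mounting bricks
  systemd:
    daemon_reload: yes

- name: Mount the engine brick
  mount:
    path: /gluster_bricks/engine
    src: "UUID={{ engine_brick_uuid }}"   # hypothetical variable holding the brick filesystem UUID
    fstype: xfs
    opts: inode64,noatime,nodiratime
    state: mounted
</snip>

The daemon-reload makes systemd re-read the current unit state before the mount is attempted, which is the essence of the workaround described above.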
Verified with RHVH 4.3.8 + gluster-ansible-infra-1.0.4-5.el7rhgs:

1. Created the gluster deployment as part of the RHHI-V deployment.
2. After the gluster deployment was complete, cleaned the gluster deployment using the same inventory file.
3. Re-created the gluster deployment.
4. Made sure all the bricks are mounted.

Repeated steps 1-4 many times, and made sure all the bricks (XFS) are mounted and available after gluster deployment/configuration (a sketch of such a check is shown below).
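A minimal sketch of the kind of check used in step 4, assuming the verification is also driven from Ansible. The brick paths listed are only the ones mentioned in this bug; an actual deployment's brick list may differ, and this task is illustrative, not part of gluster-ansible-infra:

<snip>
# Illustrative check: fail if any expected brick path is not a real mount point.
# 'mountpoint -q' exits non-zero when the path is not mounted, which fails the task.
- name: Verify that all gluster bricks are mounted
  command: mountpoint -q {{ item }}
  loop:
    - /gluster_bricks/engine
    - /gluster_bricks/test
  changed_when: false
</snip>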
I have updated the doc text; kindly verify the updated doc text field.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:0289
(In reply to Anjana KD from comment #4)
> I have updated the doc text; kindly verify the updated doc text field.

Yes, that looks good.