Description of problem:
.prob* file is found on one brick and missing on the other 2 bricks

Version-Release number of selected component (if applicable):
RHGS 3.5.1 (6.0-28)
RHVH-4.3.8

Steps to Reproduce:
1. Create a VM
2. Run I/O in the background
3. While running the I/O, kill one engine brick
4. Wait for 10 minutes
5. Restart glusterd

Actual results:
.prob file missing on 2 bricks

Expected results:
There should be no heal pending on the engine volume, and the .prob file should be present on all the engine bricks

Additional info:

[node1.example.com ~]# ls /gluster_bricks/engine/engine/ -a
.  ..  faf5b9c4-04b0-4ec2-9743-afa5207966fc  .glusterfs  .prob-ddb8b8b6-f2bd-42f1-b1b2-f8106ac78a0a  .shard
[node2.example.com ~]# ls /gluster_bricks/engine/engine/ -a
.  ..  faf5b9c4-04b0-4ec2-9743-afa5207966fc  .glusterfs  .shard
[node3.example.com ~]# ls /gluster_bricks/engine/engine/ -a
.  ..  faf5b9c4-04b0-4ec2-9743-afa5207966fc  .glusterfs  .shard
Ravi, what's next step on this bug?
On looking at the setup we found that the entry was not getting healed because the parent dir did not have any entry pending xattrs. The test (thanks Sas for the info) that writes to the prob file apparently unlinks the file before continuing to write to it, so maybe the expected result is that the file be _removed_ from all bricks, not that it is present on them:

------------------------
f = os.open(path, os.O_WRONLY | os.O_DIRECT | os.O_DSYNC | os.O_CREAT | os.O_EXCL,
            stat.S_IRUSR | stat.S_IWUSR)
#time.sleep(20)
os.unlink(path)
#time.sleep(20)
m = mmap.mmap(-1, 1024)
s = b' ' * 1024
m.write(s)
os.write(f, m)
os.close(f)
------------------------

So it looks like one of the bricks (engine-client-0) was killed at the time of the unlink of the prob file, so the unlink did not go through on it. But AFR should have marked pending xattrs during the post-op on the good bricks (so that self-heal later removes the prob file from this brick as well). I do not see any network errors in the client log that could explain a post-op failure, so I'm not sure what happened here. We need to see if this can be consistently recreated. Leaving a need-info on Milind for the same. We need the exact times the bricks were killed and restarted, to correlate them with the logs.
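For reference, a minimal, self-contained sketch of the unlink-before-write pattern the test relies on (the `path` below is a hypothetical stand-in; `O_DIRECT`/`O_DSYNC` are omitted so it runs on any local filesystem — the original test uses them to force aligned, synchronous writes through the gluster client):

```python
import mmap
import os
import stat
import tempfile

# Hypothetical stand-in for the .prob file path used by the test.
path = os.path.join(tempfile.mkdtemp(), ".prob-test")

# Create the file exclusively, as the test does.
f = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL,
            stat.S_IRUSR | stat.S_IWUSR)

# Unlink while the fd is still open: the directory entry disappears
# (this is the UNLINK fop the brick missed), but the open fd keeps the
# inode alive and writable.
os.unlink(path)
assert not os.path.exists(path)

# Write a 1 KiB page-aligned buffer via an anonymous mmap; os.write()
# accepts the mmap through the buffer protocol and writes the whole
# 1024-byte mapping.
m = mmap.mmap(-1, 1024)
m.write(b' ' * 1024)
os.write(f, m)

# The data went to the now-nameless inode, not to any visible file.
assert os.fstat(f).st_size == 1024
os.close(f)
```

This is why a brick that missed only the UNLINK ends up with a leftover empty .prob entry: the subsequent write lands on the open fd, not on the name.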
I have also seen the same behavior when upgrading from RHV 4.2.8 to RHV 4.3.8, and also from RHV 4.3.7 to RHV 4.3.8. During this upgrade, one of the bricks was killed, and the gluster software was upgraded from RHGS 3.4.4 (gluster-3.12.2-47.5) to RHGS 3.5.1 (gluster-6.0-29). After upgrading one of the nodes, the he.metadata and he.lockspace files were shown as pending heal, and that continued forever. On checking their GFIDs, they mismatched with the same files on the other 2 bricks, but self-heal was not happening because the changelog entry was missing in the parent directory.
So I am able to reproduce the issue fairly consistently:

1. Create a 1x3 volume with RHHI options enabled.
2. Create and write to a file from the mount.
3. Bring one brick down, then delete and re-create the file so that there is a pending (granular) entry heal.
4. With the brick still down, launch the index heal.

Even though there is nothing to be healed (since the sink brick is still down), index heal seems to be doing a no-op and resetting the parent dir's afr changelog xattrs, which is why the entry never gets healed.

In the QE setup also, this race is what is happening. Even before the upgraded node comes online, the shd does the entry heal described above. We can see messages like these in the shd log, where there is no 'source' and the good bricks are 'sinks':

[2020-02-10 05:57:55.847756] I [MSGID: 108026] [afr-self-heal-common.c:1750:afr_log_selfheal] 0-testvol-replicate-0: Completed entry selfheal on 77dd5a45-dbf5-4592-b31b-b440382302e9. sources= sinks=0 2

I need to check where the bug is in the code, whether it is specific to granular entry heal, and how to fix it.
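To spot such spurious heals in an shd log, one could grep for completion messages with an empty `sources=` field but non-empty `sinks=`. A hypothetical helper (the regex is based only on the log line format shown above, not on any guaranteed gluster log contract):

```python
import re

# Matches afr_log_selfheal completion lines; the format is taken from the
# sample shd log line quoted above.
HEAL_RE = re.compile(
    r"Completed (?P<kind>\w+) selfheal on (?P<gfid>[0-9a-f-]+)\. "
    r"sources=(?P<sources>[^ ]*) sinks=(?P<sinks>.*)"
)

def spurious_heals(lines):
    """Return (gfid, sinks) for heals that had sinks but no source brick."""
    hits = []
    for line in lines:
        m = HEAL_RE.search(line)
        if m and not m.group("sources").strip() and m.group("sinks").strip():
            hits.append((m.group("gfid"), m.group("sinks").strip()))
    return hits

log = [
    "[2020-02-10 05:57:55.847756] I [MSGID: 108026] "
    "[afr-self-heal-common.c:1750:afr_log_selfheal] 0-testvol-replicate-0: "
    "Completed entry selfheal on 77dd5a45-dbf5-4592-b31b-b440382302e9. "
    "sources= sinks=0 2",
]
print(spurious_heals(log))
```

A heal with no source and only sinks is exactly the no-op described above: nothing can be copied anywhere, yet the changelog xattrs get reset.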
(In reply to Ravishankar N from comment #8)
> I need to check where the bug is in the code, if it is specific to granular
> entry heal and how to fix it.

So the gfid split-brain will happen only if granular entry heal is enabled, but even otherwise, when only the two good bricks are up, spurious entry heals are triggered continuously, leading to multiple unnecessary network ops. I'm sending a fix upstream for review.
Upstream patch: https://review.gluster.org/#/c/glusterfs/+/24109/
[node1]# rpm -qa | grep -i glusterfs
glusterfs-libs-6.0-37.1.el8rhgs.x86_64
glusterfs-geo-replication-6.0-37.1.el8rhgs.x86_64
glusterfs-rdma-6.0-37.1.el8rhgs.x86_64
glusterfs-api-6.0-37.1.el8rhgs.x86_64
glusterfs-server-6.0-37.1.el8rhgs.x86_64
glusterfs-fuse-6.0-37.1.el8rhgs.x86_64
glusterfs-cli-6.0-37.1.el8rhgs.x86_64
glusterfs-events-6.0-37.1.el8rhgs.x86_64
glusterfs-6.0-37.1.el8rhgs.x86_64
glusterfs-client-xlators-6.0-37.1.el8rhgs.x86_64

[node1]# imgbase w
You are on rhvh-4.4.1.1-0.20200713.0+1

[node1]# rpm -qa | grep -i ansible
gluster-ansible-maintenance-1.0.1-9.el8rhgs.noarch
gluster-ansible-cluster-1.0-1.el8rhgs.noarch
ansible-2.9.10-1.el8ae.noarch
gluster-ansible-features-1.0.5-7.el8rhgs.noarch
gluster-ansible-roles-1.0.5-17.el8rhgs.noarch
ovirt-ansible-engine-setup-1.2.4-1.el8ev.noarch
gluster-ansible-infra-1.0.4-11.el8rhgs.noarch
ovirt-ansible-hosted-engine-setup-1.1.6-1.el8ev.noarch
gluster-ansible-repositories-1.0.1-2.el8rhgs.noarch

As I don't see any pending heals in the RHHI-V setup, I am marking this bug as verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (RHHI for Virtualization 1.8 bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2020:3314