Description of problem:
-------------------------
With RHHI-V 1.8, when hosts use 2 network interfaces with 2 FQDNs, heals remain pending on the volumes.

Version-Release number of selected component (if applicable):
-------------------------------------------------------------
RHHI-V 1.8
RHGS 3.5.2-async ( glusterfs-6.0-37.1.el8rhgs )

How reproducible:
-----------------
Always

Steps to Reproduce:
---------------------
1. Create a RHHI-V deployment with 3 hosts, using dedicated networks for gluster and ovirtmgmt
2. Kill one of the engine bricks, make sure there are pending entries, then force start the engine volume to bring the brick UP
3. Check for self-heal

Actual results:
---------------
Pending self-heal on the node

Expected results:
-----------------
No pending heals; even if there are entries, the 'heal' command should heal them.
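For step 3, the pending entries can be counted per brick from the output of `gluster volume heal <VOLNAME> info`. The following is only a minimal sketch: the function names, the sample hostnames and the brick paths are illustrative assumptions, not taken from the affected setup, and the parser assumes the usual "Brick ..." / "Number of entries: N" layout of that command's output.

```python
import subprocess


def parse_heal_info(text):
    """Map each brick to its pending-entry count.

    The count is None when the CLI does not print a number
    (for example "-" for a brick that is not connected).
    """
    counts = {}
    brick = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Brick "):
            brick = line[len("Brick "):]
        elif line.startswith("Number of entries:") and brick is not None:
            value = line.split(":", 1)[1].strip()
            counts[brick] = int(value) if value.isdigit() else None
    return counts


def pending_heals(volume):
    """Run the gluster CLI on this host and parse its heal info output."""
    result = subprocess.run(
        ["gluster", "volume", "heal", volume, "info"],
        capture_output=True, text=True, check=True,
    )
    return parse_heal_info(result.stdout)


# Illustrative sample only; hostnames and paths are hypothetical.
SAMPLE = """\
Brick host1.example.com:/gluster_bricks/engine/engine
/some/file
Status: Connected
Number of entries: 1

Brick host2.example.com:/gluster_bricks/engine/engine
Status: Transport endpoint is not connected
Number of entries: -
"""

if __name__ == "__main__":
    for brick, count in parse_heal_info(SAMPLE).items():
        print(brick, count)
```

On a real host, `pending_heals("engine")` would be polled after the brick is force started; any count that stays above zero corresponds to the pending self-heal reported here.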
Several issues are seen. Ravi and I debugged this issue, and Ravi came up with the following observations:

1. The afr healer threads were not present on the host. They should always be available on the node, but it is not clear why they were missing.
2. Restarting glustershd should have started the afr_healer thread, but even that didn't happen.
3. Changing the hostname of the host to the FQDN corresponding to the other network, and then triggering heal, settles the problem.

Thanks Ravi.

This issue was not seen with RHHI-V 1.7 with RHEL 7 server. This bug can be marked as known_issue for RHHI-V 1.8, as the initial suspicion is around RHEL 8 networking changes + how glusterd resolves the network names.
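Observation 3 suggests comparing the host's own name with the host part of the brick definitions. The sketch below is a diagnostic aid only; the helper names and example FQDNs are hypothetical, and the report does not establish that glustershd performs this exact comparison.

```python
import socket


def brick_hosts(bricks):
    """Extract the host part from 'host:/path' brick strings."""
    return {brick.split(":", 1)[0].lower() for brick in bricks}


def local_name_matches_bricks(bricks, local_names=None):
    """Return True if any local name equals a brick host name.

    By default the names come from socket.gethostname() and
    socket.getfqdn(); pass local_names explicitly to test other values.
    """
    if local_names is None:
        local_names = {socket.gethostname(), socket.getfqdn()}
    local = {name.lower() for name in local_names}
    return bool(local & brick_hosts(bricks))


if __name__ == "__main__":
    # Hypothetical brick list: bricks are defined with the gluster-network FQDN.
    bricks = ["gluster1.example.com:/gluster_bricks/engine/engine"]
    print(local_name_matches_bricks(bricks))
```

A `False` result on a host that carries one of the bricks would be consistent with the situation described above, where the hostname is set to the FQDN of the other network.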
Closing this, as the dependent bug is already closed.
This issue happens with 2 network interfaces and FQDNs corresponding to the backend network. This is not getting fixed in RHGS, so it's better to close this bug as WONTFIX. The known_issue for this bug still holds true, as the bug is not getting fixed.