Describe the issue:
-------------------
While upgrading to RHHI-V 1.8 from 1.7 (RHV 4.3.15), we hit an issue where the volume occasionally remains in a degraded state even after the node is upgraded and rebooted. During testing this issue was seen 2 out of 3 times. A solution is available to fix this problem. Adding this information to the troubleshooting section will greatly help customers who face this issue.

Describe the task you were trying to accomplish:
------------------------------------------------
Add 2 more scenarios to the troubleshooting section.

Suggestions for improvement:
------------------------------
Section '4.4.16. Troubleshooting' needs to include the information discussed in the 'Additional information' section of this bug.

Document URL:
---------------
https://access.redhat.com/documentation/en-us/red_hat_hyperconverged_infrastructure_for_virtualization/1.8/html-single/upgrading_red_hat_hyperconverged_infrastructure_for_virtualization/index#troubleshooting

Chapter/Section Number and Title:
----------------------------------
4.4.16. Troubleshooting

Product Version:
----------------
RHHI-V 1.8

Any other versions of this document that also need this update:
-----------------------------------------------------------------
No

Additional information:
-----------------------
1. The RHV Administration Portal shows the gluster volume in a degraded state, with one of the bricks on the upgraded node shown as 'down'.

Check 'gluster volume status' from the gluster command line on one of the hyperconverged hosts. If the brick entry corresponding to the node that was upgraded and rebooted is listed with the brick process as N/A and the port as N/A, kill the brick process and restart glusterd.
Notice that in the following example, 'gluster volume status' prints the result with no port info and no process info for the host 'rhvh2.example.com':

[root@rhvh1]# gluster volume status engine
Status of volume: engine
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick rhvh1.example.com:/gluster_bricks/eng
ine/engine                                  49158     0          Y       94365
Brick rhvh2.example.com:/gluster_bricks/eng
ine/engine                                  N/A       N/A        Y       11052
Brick rhvh3.example.com:/gluster_bricks/eng
ine/engine                                  49152     0          Y       31153
Self-heal Daemon on localhost               N/A       N/A        Y       128608
Self-heal Daemon on rhvh2.example.com       N/A       N/A        Y       11838
Self-heal Daemon on rhvh3.example.com       N/A       N/A        Y       9806

Task Status of Volume engine
------------------------------------------------------------------------------
There are no active volume tasks

To fix this problem, stop all glusterfsd brick processes and start them again by restarting glusterd:

# pkill glusterfsd
# systemctl restart glusterd

Check 'gluster volume status' once again to make sure every brick entry shows a brick process ID and port information in the output. Wait at least 2-3 minutes for this information to be reflected in the RHV Administration Portal.
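As a convenience, the check described above can be scripted. The following is a minimal sketch, assuming the column layout of the 'gluster volume status' output shown above (brick lines begin with "Brick" and the TCP port is the third field); the sample output is hard-coded here for illustration, and on a real host you would pipe the live command output into the awk filter instead.

```shell
#!/bin/sh
# Sample 'gluster volume status' brick lines (hard-coded for illustration;
# note the host:path field is shown unwrapped here for easier parsing).
STATUS_OUTPUT='Brick rhvh1.example.com:/gluster_bricks/engine/engine 49158 0 Y 94365
Brick rhvh2.example.com:/gluster_bricks/engine/engine N/A N/A Y 11052
Brick rhvh3.example.com:/gluster_bricks/engine/engine 49152 0 Y 31153'

# Print the host of every brick whose TCP port column reads N/A,
# i.e. the bricks that need glusterd restarted on that node.
echo "$STATUS_OUTPUT" | awk '
    $1 == "Brick" && $3 == "N/A" {
        split($2, parts, ":")   # host:path -> host
        print parts[1]
    }'
```

Running the sketch against the sample prints 'rhvh2.example.com', matching the degraded brick in the example; the same filter applied to live output would list each node where the pkill/restart fix is needed.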
Verified with the internal doc link. The additional troubleshooting information is now added to the upgrade guide.