Bug 1955660 - [upgrade guide] Add information about degraded volume status in troubleshooting section
Summary: [upgrade guide] Add information about degraded volume status in troubleshooting section
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: Documentation
Version: rhhiv-1.8
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: RHHI-V 1.8.z Batch Update 5
Assignee: Disha Walvekar
QA Contact: SATHEESARAN
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-04-30 15:45 UTC by SATHEESARAN
Modified: 2021-06-17 09:59 UTC
CC List: 3 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-06-17 09:59:45 UTC
Embargoed:



Description SATHEESARAN 2021-04-30 15:45:26 UTC
Describe the issue:
-------------------
While upgrading from RHHI-V 1.7 (RHV 4.3.15) to RHHI-V 1.8, we hit an issue where a gluster volume occasionally remains in a degraded state even after the node is upgraded and rebooted. During testing, this issue was seen 2 out of 3 times. A fix for this problem is available, and documenting it in the troubleshooting section will greatly help customers who hit the same issue.

Describe the task you were trying to accomplish:
------------------------------------------------
Add two more scenarios to the troubleshooting section

Suggestions for improvement:
------------------------------
Section '4.4.16. Troubleshooting' needs to include the information discussed in the 'Additional information' section of this bug

Document URL:
---------------
https://access.redhat.com/documentation/en-us/red_hat_hyperconverged_infrastructure_for_virtualization/1.8/html-single/upgrading_red_hat_hyperconverged_infrastructure_for_virtualization/index#troubleshooting

Chapter/Section Number and Title:
----------------------------------
4.4.16. Troubleshooting

Product Version:
----------------
RHHI-V 1.8

Any other versions of this document that also needs this update:
-----------------------------------------------------------------
No

Additional information:
-----------------------
1. The RHV Administration Portal shows the gluster volume in a degraded state, with one of the bricks on the upgraded node shown as 'down'

Check 'gluster volume status' from the gluster command line on one of the hyperconverged hosts. If the brick entry corresponding to the node that was upgraded and rebooted is listed with its TCP and RDMA ports as N/A, kill the brick processes and restart glusterd.

Notice that in the following example, 'gluster volume status' prints no port information for the brick on the host 'rhvh2.example.com':

[root@rhvh1]# gluster volume status engine
Status of volume: engine
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick rhvh1.example.com:/gluster_bricks/eng
ine/engine                                   49158     0          Y       94365
Brick rhvh2.example.com:/gluster_bricks/eng
ine/engine                                   N/A       N/A        Y       11052
Brick rhvh3.example.com:/gluster_bricks/eng
ine/engine                                   49152     0          Y       31153
Self-heal Daemon on localhost                N/A       N/A        Y       128608
Self-heal Daemon on rhvh2.example.com        N/A       N/A        Y       11838
Self-heal Daemon on rhvh3.example.com        N/A       N/A        Y       9806 
 
Task Status of Volume engine
------------------------------------------------------------------------------
There are no active volume tasks
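
As a quick check (a minimal sketch, not part of the original report; the volume name 'engine' is taken from the example above), bricks with missing port information can be spotted by filtering the status output. Self-heal Daemon entries always report N/A ports, so they are excluded:

     # gluster volume status engine | grep -v 'Self-heal' | grep 'N/A'

Any line printed here belongs to a brick entry that is not reporting its ports; no output means all bricks are reporting their ports.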

To fix this problem, stop all glusterfsd brick processes on the affected node and start them again by restarting glusterd:
     # pkill glusterfsd
     # systemctl restart glusterd
Run 'gluster volume status' once again to make sure all brick entries show a brick process ID as well as port information. Wait at least 2-3 minutes for this information to be reflected in the RHV Administration Portal.
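
Once glusterd has been restarted, recovery can also be confirmed from the command line before the portal catches up (again a sketch; 'engine' is the volume from the example above, and the same checks apply to any volume):

     # systemctl status glusterd
     # gluster volume status engine
     # gluster volume heal engine info

glusterd should be active, the status output should now list a TCP port and PID for every brick (the grep shown earlier returns nothing), and the heal info output should show the number of pending entries draining to zero as the restarted bricks catch up.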

Comment 3 SATHEESARAN 2021-05-18 00:55:37 UTC
Verified with the internal doc link.
The additional troubleshooting information is now added to the upgrade guide.

