Description of problem:
When the op-version of a gluster cluster is upgraded, the volfiles are regenerated. After this step, we need to restart the pods one at a time, waiting for self-heal to complete after each restart.
Two changes are required:
1. The second command in step 14 of section 6.4 needs to be updated to the command given below.
gluster --timeout=10800 volume set all cluster.op-version 31302
We could also add a short explanation like "Increasing the op-version of the cluster might take a long time depending on the number of volumes in the cluster. To prevent the CLI from timing out, we run the command with a very high timeout value of 10800 seconds."
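For reference, a minimal shell sketch of the bump followed by a read-back of the value. The function names current_op_version and bump_op_version are made up for this example, and the read-back assumes that `gluster volume get all cluster.op-version` prints the option name and its value as two columns on one row.

```shell
# Sketch only: helper names are hypothetical, not taken from the guide.

current_op_version() {
    # Print the value column of the cluster.op-version row.
    gluster volume get all cluster.op-version |
        awk '$1 == "cluster.op-version" { print $2 }'
}

bump_op_version() {
    # $1 = target op-version, for example 31302.
    # The long timeout keeps the CLI from giving up on clusters with many volumes.
    gluster --timeout=10800 volume set all cluster.op-version "$1" || return 1
    # Succeed only if the cluster now reports the requested value.
    [ "$(current_op_version)" = "$1" ]
}
```

Possible use from inside a gluster pod: `bump_op_version 31302 && echo "op-version updated"`.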
2. After step 14, we need another step added. This step will be identical to step 11 of the same section. Preface the step with the following explanation.
When the op-version of the cluster is changed, the pods must be restarted. Pods should be restarted one at a time, waiting for the restarted pod to be ready and for pending self-heals to complete before restarting the next pod.
<steps copied from step 11>
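The restart-and-wait part of this could be sketched roughly as below. This is an illustration under assumptions, not the procedure from the guide: the helper names are hypothetical, the label selector that matches the gluster pods has to be supplied by the caller, and readiness is judged only from the READY and STATUS columns of `oc get pods`. The sketch covers the delete and the wait for readiness; the self-heal check still has to be run before moving on to the next pod.

```shell
# Sketch only: helper names and the selector argument are hypothetical.

ready_pod_count() {
    # $1 = label selector matching the gluster pods.
    # Count pods whose STATUS is Running and whose READY column is n/n.
    oc get pods -l "$1" --no-headers |
        awk '{ split($2, r, "/"); if ($3 == "Running" && r[1] == r[2]) n++ }
             END { print n + 0 }'
}

restart_one_pod() {
    # $1 = pod name, $2 = label selector, $3 = number of pods expected to be ready.
    oc delete pod "$1" || return 1
    # Wait until the old pod is gone so that it is not counted while terminating.
    while oc get pod "$1" >/dev/null 2>&1; do sleep 5; done
    # Wait until the replacement pod is up and the ready count is back to normal.
    until [ "$(ready_pod_count "$2")" -eq "$3" ]; do sleep 10; done
}
```

The operator would call restart_one_pod for one pod, confirm that no self-heals are pending, and only then call it for the next pod.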
Step 16 says to restart the pods, but the text that immediately follows tells the reader to delete the pod. This might confuse the user or customer who reads it.
Can we say something like the following in point i of step 16?
Pods are restarted by deleting them. To delete the pods, execute the following command.
Moving this bug to the assigned state, as I would like the review comments in comment 11 to be addressed.
Clearing the needinfo, as I am moving the bug to the assigned state.
I have made the change suggested in comment 11.
The change can be seen in step 17.
Link to review - https://access.qa.redhat.com/documentation/en-us/red_hat_openshift_container_storage/3.10/html-single/deployment_guide/#chap-Documentation-Red_Hat_Gluster_Storage_Container_Native_with_OpenShift_Platform-Upgrade-Gluster_pods
Verified in the link provided in comment 13.
The changes look good to me.
However, one thing still needs to be corrected: the command that checks heal info for the volumes. Can you use the command below instead of what is currently in the doc?
for each_volume in `gluster volume list`; do gluster volume heal $each_volume info ; done | grep "Number of entries: [^0]$"
And point ii can be removed.
ii. Run the following command to obtain the volume names:
# gluster volume list
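One caveat with the one-liner above: the pattern `[^0]$` matches a single non-zero character at the end of the line, so a count with two or more digits, such as 10 or 25, is not reported. A hedged variant that reports every count other than exactly 0 could look like the sketch below. The function names are hypothetical, and the sketch assumes the heal info output contains lines of the form "Number of entries: N".

```shell
# Sketch only: helper names are hypothetical, not taken from the guide.

pending_heals() {
    # Read heal info output on stdin and print every "Number of entries:"
    # line whose value is not exactly 0.
    grep 'Number of entries:' | grep -Ev 'Number of entries: 0[[:space:]]*$'
}

check_all_volumes() {
    # Return 0 when no volume reports pending entries, 1 otherwise.
    pending=0
    for vol in $(gluster volume list); do
        if gluster volume heal "$vol" info | pending_heals >/dev/null; then
            echo "pending heals on $vol"
            pending=1
        fi
    done
    return "$pending"
}
```

Run inside a gluster pod, `check_all_volumes` prints the names of volumes that still have entries to heal and returns non-zero until there are none.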
I have incorporated the additional feedback given in comment 14.
Link to verify - https://access.qa.redhat.com/documentation/en-us/red_hat_openshift_container_storage/3.10/html-single/deployment_guide/#chap-Documentation-Red_Hat_Gluster_Storage_Container_Native_with_OpenShift_Platform-Upgrade-Gluster_pods
Verified in the link given in comment 15.
A new point 17 has been added to restart the pods one after the other after bumping the op-version:
Restart the pods after the op-version of the cluster is changed. Wait for the restarted pod to be ready and pending self heals to be completed before proceeding to restart the next pod. Ensure that all pods are restarted one at a time and not simultaneously.
Moving this to verified.