The documentation for HCI doesn't explain the replace-host workflow
Section Number and Name:
Describe the issue:
A host is replaced in two scenarios:
1. Proactive repair - when a customer/user decides to replace the host proactively.
2. On-repair - when a host has hit an issue that has left the node defunct, so that it requires replacement.
Suggestions for improvement:
Provide workflow guidance for replacing a host under the circumstances mentioned above.
I am working on the replace-host workflow. I will send it to everyone for review and then share the content for the doc.
Assigning this bug to Kasturi, as she knows the details of the replace-host workflow.
I reviewed the contents at the above link and I have a few review comments.
1) Can you change the brick paths in step 5 to /gluster_bricks/engine/engine, /gluster_bricks/data/data, and /gluster_bricks/vmstore/vmstore? gdeploy creates them that way, and the other nodes in the cluster have the same brick paths.
2) Can you remove step 6 and add the contents of step 5 of the doc https://docs.google.com/document/d/1wIvoSSQmXljh9M2waiR393Hh4qsi27ErxKnXz2Stvlw/edit# instead? What you have documented includes a step that kills all gluster processes, which will bring down the whole cluster.
3) After step 9, can you please change the step "In Red Hat Virtualization Manager, add the new host to the cluster" to "In Red Hat Virtualization Manager, add the new host to the cluster and make sure to deploy HostedEngine by clicking the HostedEngine tab and selecting Deploy."
4) Can you remove the title of step 11 and move this section before step 16?
5) In step 13, can you change steps i and ii to "In Red Hat Virtualization Manager, click the Hosts tab and select the volume, click the Bricks tab, replace the old brick that is down with the new one using the Replace Brick button, and verify that the heal completes successfully."
Please let me know if you have any questions.
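For the heal verification mentioned in point 5, a minimal shell sketch could look like the following. It assumes that `gluster volume heal <volname> info` prints one "Number of entries:" line per brick; the function name pending_heal_entries is hypothetical and not part of any Red Hat tooling.

```shell
#!/bin/sh
# Sum the "Number of entries:" values from heal info output read on stdin.
# Non-numeric values (for example "-" for a brick that is down) count as 0.
# pending_heal_entries is a hypothetical helper name.
pending_heal_entries() {
    awk -F': ' '/^Number of entries:/ { total += $2 } END { print total + 0 }'
}

# Usage on a cluster node (volume name "engine" is assumed from the brick path):
#   gluster volume heal engine info | pending_heal_entries
# A result of 0 suggests that no entries are waiting to be healed.
```

A result of 0 alone does not prove that every brick is online, so checking the brick status in the same output is still worthwhile.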
A few more comments :-)
1) In step 4, you said that the node replacement script has to be run from the Red Hat Virtualization Manager node, whereas it should be run from one of the hosts in the cluster.
2) Can you change the title of this section to "CONFIGURING TLS/SSL USING SELF-SIGNED / CA CERTIFICATES POST DEPLOYMENT"?
3) Please add this as a note at the beginning of the Replace gluster nodes section: "If the user has SSL configured on the setup and a CA is used, follow the steps in the RHGS admin guide section 9.4.1, “Expanding volumes using Certificate Signed with a Certificate Authority”."
6) In step 6 can you add the following command under the section "Include the new server in the value of the auth.ssl-allow volume option."
# gluster volume set <vol_name> auth.ssl-allow <gluster_ip1>,<gluster_ip2>,<gluster_newip>
Repeat the above step for the other two volumes.
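Repeating the command for each volume could be sketched as a small dry-run loop. The volume names (engine, data, vmstore) are assumed from the brick paths mentioned earlier in this bug, the addresses are placeholders, and ssl_allow_commands is a hypothetical helper that only prints the commands rather than running them.

```shell
#!/bin/sh
# Dry-run sketch: print the auth.ssl-allow command for each volume.
# ssl_allow_commands is a hypothetical helper; nothing is executed against gluster.
ssl_allow_commands() {
    allow_list=$1   # comma-separated addresses of all gluster nodes, including the new one
    shift
    for vol in "$@"; do
        printf 'gluster volume set %s auth.ssl-allow "%s"\n' "$vol" "$allow_list"
    done
}

# Placeholder addresses and assumed volume names.
ssl_allow_commands "192.0.2.11,192.0.2.12,192.0.2.13" engine data vmstore
```

Printing the commands first lets the list of addresses be reviewed before anything is changed on the cluster.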
I have yet to review replacing a host with the same FQDN. I will provide the details shortly.
1) I cannot access the links mentioned in point 2 of section 6.2 of the guide.
2) In step 3 of section 6.2, mention that gdeploy has to be run from one of the nodes in the cluster where it is installed.
3) Can you please replace step 4 of section 6.2 with step 3 from the doc below?
4) Can you please remove step 5 in section 6.2?
5) Change step 8 in section 6.2 to just check whether the nodes are connected. The glusterfsd processes won't be connected at this point if SSL is enabled, so heal does not happen.
6) After step 8 in section 6.2, can you please add the step "Follow the steps in Section 4.1, “Configuring TLS/SSL using self-signed certificates” to remount all gluster processes."
7) Once the above is done, verify that the glusterfsd processes are connected and that heal happens.
8) In section 6.1, after step 11, can you add a point to check whether the old host has been removed from gluster peer status? If it has not been removed, can you please add a step to run 'gluster peer detach <hostname> force' for it.
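The peer check in point 8 could be sketched roughly as below. It assumes that `gluster peer status` lists each peer on a line of the form "Hostname: <name>"; peer_listed and detach_command are hypothetical helper names, and old-host.example.com is a placeholder.

```shell
#!/bin/sh
# Succeeds if the given host appears in `gluster peer status` output read on stdin.
# peer_listed is a hypothetical helper name.
peer_listed() {
    grep -q -x -F "Hostname: $1"
}

# Prints (does not run) the detach command for a stale peer.
detach_command() {
    printf 'gluster peer detach %s force\n' "$1"
}

# Usage on a cluster node (old-host.example.com is a placeholder):
#   if gluster peer status | peer_listed old-host.example.com; then
#       detach_command old-host.example.com
#   fi
```

The detach command is only printed here, so that it can be reviewed before being run by hand.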
*** Bug 1431819 has been marked as a duplicate of this bug. ***
1) Laura, I feel we do not need the line "If a Red Hat Gluster Storage node needs to be replaced, there are two options for the replacement node".
2) replace_node_prep.conf has to be run from the node where gdeploy is installed and not from the Red Hat Virtualization Manager node.
3) Can you remove '3' from the line 'Change to 3 IP addresses of the network intended for gluster traffic' in the replace_node_prep.conf template?
4) In step 6, can you change it to 'Generate the private key and self-signed certificate on the new server' using the steps listed in the RHGS admin guide Section 9.1, “Prerequisites”?
5) Then you can add the line "If encryption using a Certificate Authority is enabled, follow the steps at the following link before continuing: https://access.redhat.com/documentation/en-us/red_hat_gluster_storage/3.2/html/administration_guide/ch22s04."
6) Can the line 'Add the new node to existing certificates' be changed to 'Add the new node's certificate to existing certificates'?
7) For details on adding a host to the cluster in section 5.1, we can provide the link for adding additional hosts: https://doc-stage.usersys.redhat.com/documentation/en-us/red_hat_gluster_storage/3.2/html-single/deploying_red_hats_hyper-converged_infrastructure/#add_remaining_virtualization_hosts_to_the_hosted_engine
8) We can remove step 8 and the line "Ensure that you select the new host as the deployment location for the Hosted Engine."
9) Point 14 of section 5.2 can be removed, and you can point readers to steps 4 and 5 in section 8.2 of the deployment guide.
Laura, can you please address the above review comments?
1) In section 5.1, enclose the nodes in double quotes.
s/gluster volume set <volname> auth.ssl-allow <old_node1>,<old_node2>,<new_node>/gluster volume set <volname> auth.ssl-allow "<old_node1>,<old_node2>,<new_node>"/
2) replace_node_prep.conf has to be run from the node where gdeploy is installed and not from the Red Hat Virtualization Manager node.
This will be the node that hosts the Hosted Engine; I have ensured that both instances of the replace_host script mention this.
Not necessarily. The gdeploy script should be run from a node where gdeploy is installed, and the node being replaced should have key-based authentication set up from this node. IMO, we should remove the line "Run the node replacement script from the Red Hat Virtualization Manager node using gdeploy".
The changes in the document look good. Moving this to verified state.
Fixed in RHGS 3.3 documentation.