Bug 1429419

Summary: [HCI] Document the procedure to enable RHGS SSL/TLS encryption post HC deployment
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: SATHEESARAN <sasundar>
Component: Documentation Assignee: Laura Bailey <lbailey>
Status: CLOSED CURRENTRELEASE QA Contact: RamaKasturi <knarra>
Severity: high Docs Contact:
Priority: unspecified    
Version: rhgs-3.2 CC: knarra, lbailey, rhs-bugs, sankarshan, sasundar, storage-doc, storage-qa-internal
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-08-29 04:10:56 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1277939    

Description SATHEESARAN 2017-03-06 11:06:30 UTC
Document URL: 
-------------
HC Doc

Describe the issue: 
-------------------
Customers may ask for RHGS encryption to be enabled after HC deployment, so there should be a procedure that explains how to enable SSL/TLS encryption on the data and management paths of Gluster post HC deployment.

Additional information: 
-----------------------
I will provide the information on enabling SSL/TLS encryption post HC deployment

Comment 3 SATHEESARAN 2017-03-09 04:39:07 UTC
Here are the steps to enable RHGS SSL/TLS encryption post HC deployment:

1. Stop all the virtual machines
2. Move all the storage domains (except the HE storage domain) to maintenance, which unmounts them from all 3 hypervisors.
   The HE storage domain remains mounted on all hypervisors.
2. Move the HE to global maintenance
   # hosted-engine --set-maintenance --mode=global
3. Stop the HE VM
   # hosted-engine --vm-shutdown
   Confirm that the hosted engine is down.
4. Stop the HA services - ovirt-ha-agent, ovirt-ha-broker
   # systemctl stop ovirt-ha-agent
   # systemctl stop ovirt-ha-broker
5. Unmount the hosted-engine storage domain from all the hypervisors
   # hosted-engine --disconnect-storage
   Confirm that all the gluster volumes are unmounted on all the hypervisors.
   The engine status should not be "unknown stale-data"; it should show the proper status.
6. Run the gdeploy conf file - GraftonEncryptionSetup.conf
7. Check the glusterd logs on all the machines for any SSL/TLS related errors.
8. Start ovirt HA services
   # systemctl start ovirt-ha-agent
   # systemctl start ovirt-ha-broker
9. Wait for all the nodes to sync
   # hosted-engine --vm-status 
   The above command should show the correct status on all the nodes.

10. Move the HE out of global maintenance
   # hosted-engine --set-maintenance --mode=none
11. The Hosted Engine should start automatically shortly.
    Check the HE VM status on all the nodes:
    # hosted-engine --vm-status

12. Activate the storage domains
13. Start the VMs
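
The host-side commands in the steps above (global maintenance through storage disconnect) could be wrapped roughly as follows. This is only a sketch: the `run` wrapper, the `DRY_RUN` variable, and the two function names are hypothetical helpers introduced for illustration, and the split between "run once" and "run on every hypervisor" is an assumption rather than something stated in this bug.

```shell
#!/bin/sh
# Sketch only: DRY_RUN, run(), and both function names are hypothetical helpers.
# With DRY_RUN=1 (the default here) commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Assumed to be run once, on the hypervisor currently hosting the HE VM.
stop_hosted_engine() {
    run hosted-engine --set-maintenance --mode=global
    run hosted-engine --vm-shutdown
}

# Assumed to be run on every hypervisor, after the HE VM is confirmed down.
stop_ha_and_disconnect() {
    run systemctl stop ovirt-ha-agent
    run systemctl stop ovirt-ha-broker
    run hosted-engine --disconnect-storage
}
```

Calling `stop_hosted_engine` with the default `DRY_RUN=1` only prints the two `hosted-engine` commands, which makes it easy to review the sequence before setting `DRY_RUN=0`.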

Comment 4 SATHEESARAN 2017-03-09 10:29:05 UTC
I missed including the gdeploy conf file in comment 3. Thanks, Kasturi, for pointing that out.

Step-6 in comment3 requires execution of the following gdeploy conf file:

# IPs that correspond to the Gluster Network
[hosts]
<Gluster_IP_Host1>
<Gluster_IP_Host2>
<Gluster_IP_Host3>

# STEP-1: Generate Keys, Certificates & CA files
# The following section generates the keys and certificates, creates
# the CA file, and distributes it to all the hosts
[volume1]
action=enable-ssl
volname=engine
ssl_clients=<Gluster_IP_Host1>,<Gluster_IP_Host2>,<Gluster_IP_Host3>
ignore_volume_errors=no

# As the certificates are already generated, it is enough to stop
# the rest of the volumes, set the SSL/TLS related volume options, and
# start the volumes

# STEP-2: Stop all the volumes
[volume2]
action=stop
volname=vmstore

[volume3]
action=stop
volname=data

# STEP-3: Set volume options on all the volumes to enable SSL/TLS on the volumes
[volume4]
action=set
volname=vmstore
key=client.ssl,server.ssl,auth.ssl-allow
value=on,on,"<Gluster_IP_Host1>:<Gluster_IP_Host2>:<Gluster_IP_Host3>"
ignore_volume_errors=no

[volume5]
action=set
volname=data
key=client.ssl,server.ssl,auth.ssl-allow
value=on,on,"<Gluster_IP_Host1>:<Gluster_IP_Host2>:<Gluster_IP_Host3>"
ignore_volume_errors=no

# STEP-4: Start all the volumes
[volume6]
action=start
volname=vmstore

[volume7]
action=start
volname=data
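
After the conf file above has been run, the glusterd log check from comment 3 could be scripted along these lines. This is a minimal sketch: the function name is hypothetical, the default log path is an assumption about where glusterd logs on these hosts, and the match relies on the usual gluster log layout in which the severity letter follows the bracketed timestamp.

```shell
#!/bin/sh
# Sketch only: ssl_errors_in_log is a hypothetical helper.
# Prints error-level ("E") log lines that mention SSL or TLS.
# Assumes log lines of the form "[timestamp] E [source] message".
# Exit status is 0 when at least one such line is found, 1 otherwise.
ssl_errors_in_log() {
    logfile="$1"
    grep -E '^\[[^]]*\] E ' "$logfile" | grep -iE 'ssl|tls'
}

# Assumed default path; adjust to wherever glusterd logs on your hosts.
GLUSTERD_LOG="${GLUSTERD_LOG:-/var/log/glusterfs/etc-glusterfs-glusterd.vol.log}"
```

On each host, `ssl_errors_in_log "$GLUSTERD_LOG"` would then list any matching lines; an empty result suggests no SSL/TLS errors were logged at error level.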

Comment 5 RamaKasturi 2017-03-09 10:35:58 UTC
sas, can we use ";" as the delimiter in "<Gluster_IP_Host1>:<Gluster_IP_Host2>:<Gluster_IP_Host3>"? We are going to remove support for the colon in RHGS 3.3.0 as part of this bug: https://bugzilla.redhat.com/show_bug.cgi?id=1430682

Comment 6 SATHEESARAN 2017-03-09 16:04:00 UTC
The content comes from my rough draft, which I understand well, but I am not sure it is clear for readers.

Please rephrase the content to suit the doc.
I have added a few more corrections below.

(In reply to SATHEESARAN from comment #3)

> Here are the steps to enable RHGS SSL/TLS encryption post HC deployment:
> 
> 1. Stop all the Virtual Machines
> 2. Move all the storage domains ( except HE storage domain ) to maintenance
> which unmounts all the storage domains from all the 3 hypervisors. 
>    HE storage domain remains mounted on all hypervisors. 
> 2. Moved the HE to global maintenance
>    # hosted-engine --set-maintenance --mode=global
> 3. Stopped the HE VM
>    # hosted-engine --vm-shutdown
>    Confirm that the hosted-engine is down.
> 4. Stopped HA services - ovirt-ha-agent, ovirt-ha-broker
>    # systemctl stop ovirt-ha-agent
>    # systemctl stop ovirt-ha-broker
> 5. Umounted the hosted-engine storage domain from all the hypervisors
>    # hosted-engine --disconnect-storage
>    confirm that all the gluster volumes are unmounted on all the hypervisors

> 6. Run the gdeploy conf file - GraftonEncryptionSetup.conf
The conf file is pasted in comment4

> 7. Check glusterd logs on all the machines, for any SSL/TLS related errors.
> 8. Start ovirt HA services
>    # systemctl start ovirt-ha-agent
>    # systemctl start ovirt-ha-broker
> 9. Wait for all the nodes to sync
>    # hosted-engine --vm-status 
> Above command on all the nodes,should reflect the correct status
Example output indicating the correct status on one node; run the same check on all the nodes:
# hosted-engine --vm-status | grep 'Engine status'
Engine status                      : {"health": "good", "vm": "up", "detail": "up"}
Engine status                      : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
Engine status                      : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}

If the engine status is 'unknown stale-data', the user needs to wait a few more minutes for the sync to happen.

> 
> 10. Move the HE out of global maintenance
>    # hosted-engine --set-maintenance --mode=none
> 11.Hosted Engine should start automatically shortly.
>     check for the HE VM status on all the nodes:
>     # hosted-engine --vm-status
> 
> 12. Activate the storage domains
> 13. Start the VMs

Comment 7 SATHEESARAN 2017-03-09 16:04:48 UTC
(In reply to RamaKasturi from comment #5)
> sas, can we use the delimiter as ; in between
> "<Gluster_IP_Host1>:<Gluster_IP_Host2>:<Gluster_IP_Host3>" as we are going
> to remove the support for colon in RHGS 3.3.0 as part of this bug
> https://bugzilla.redhat.com/show_bug.cgi?id=1430682

OK, I need to test this with ";" and will update.

Comment 8 SATHEESARAN 2017-03-14 17:25:18 UTC
(In reply to SATHEESARAN from comment #7)
> (In reply to RamaKasturi from comment #5)
> > sas, can we use the delimiter as ; in between
> > "<Gluster_IP_Host1>:<Gluster_IP_Host2>:<Gluster_IP_Host3>" as we are going
> > to remove the support for colon in RHGS 3.3.0 as part of this bug
> > https://bugzilla.redhat.com/show_bug.cgi?id=1430682
> 
> ok, I need to test for the same with ";" and will update it

That worked with ";" as the delimiter. @Laura, in comment 4, under [volume4] and [volume5], replace ":" with ";".
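
One way to apply that delimiter change to an existing copy of the conf file is sketched below. The function name is hypothetical, and the pattern assumes the `value=on,on,"..."` lines shown in comment 4 with IPv4 addresses or hostnames that contain no colons of their own.

```shell
#!/bin/sh
# Sketch only: use_semicolon_delimiter is a hypothetical helper.
# Reads a gdeploy conf on stdin and writes it to stdout, replacing ':' with ';'
# on the value= lines that carry the quoted auth.ssl-allow list.
# Assumes the hosts are IPv4 addresses or hostnames without colons.
use_semicolon_delimiter() {
    sed '/^value=on,on,"/ s/:/;/g'
}
```

For example, `use_semicolon_delimiter < GraftonEncryptionSetup.conf > GraftonEncryptionSetup.new.conf` leaves every other line untouched; the output file name here is only illustrative.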

I am also making a few small changes to the steps. Here are the final steps:

1. Stop all the virtual machines
2. Move all the storage domains (except the HE storage domain) to maintenance, which unmounts them from all 3 hypervisors.
    The HE storage domain remains mounted on all hypervisors.
3. Move the HE to global maintenance
    # hosted-engine --set-maintenance --mode=global
4. Stop the HE VM
    # hosted-engine --vm-shutdown
    Confirm that the hosted-engine is down.
    # hosted-engine --vm-status
5. Stop the HA services - ovirt-ha-agent, ovirt-ha-broker
    # systemctl stop ovirt-ha-agent
    # systemctl stop ovirt-ha-broker
6. Unmount the hosted-engine storage domain from all the hypervisors
    # hosted-engine --disconnect-storage
    Confirm that all the gluster volumes are unmounted on all the hypervisors.

7. Run the gdeploy conf file - GraftonEncryptionSetup.conf
    The conf file is pasted in comment 4.
    Check the glusterd logs on all the machines for any repeated SSL/TLS related errors.

8. Start ovirt HA services
    # systemctl start ovirt-ha-agent
    # systemctl start ovirt-ha-broker

9. Wait for all the nodes to sync
    # hosted-engine --vm-status 
    The above command should show a consistent status on all the nodes:
# hosted-engine --vm-status | grep 'Engine status'
Engine status                      : {"reason": "bad vm status", "health": "bad", "vm": "down", "detail": "down"}
Engine status                      : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}
Engine status                      : {"reason": "vm not running on this host", "health": "bad", "vm": "down", "detail": "unknown"}


If the engine status is 'unknown stale-data', the user needs to wait a few more minutes for the sync to happen.

10. Move the HE out of global maintenance
    # hosted-engine --set-maintenance --mode=none

11. The Hosted Engine should start automatically shortly.
     Check the HE VM status on all the nodes:
     # hosted-engine --vm-status
 
12. Activate the storage domains from the RHV UI
13. Start the VMs
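
The wait in step 9 could be automated roughly as below. This is a sketch under stated assumptions: both function names are hypothetical, the 30-second polling interval is arbitrary, and "synced" is taken to mean only that at least one 'Engine status' line is present and none of them contains 'stale-data'.

```shell
#!/bin/sh
# Sketch only: both function names are hypothetical helpers.
# Reads `hosted-engine --vm-status` output on stdin.
# Returns 0 when Engine status lines exist and none reports stale data.
engine_status_synced() {
    status_lines=$(grep 'Engine status') || return 1   # no status lines at all
    case "$status_lines" in
        *stale-data*) return 1 ;;
        *) return 0 ;;
    esac
}

# Poll until the status is no longer stale; the 30s interval is an assumption.
wait_for_engine_sync() {
    until hosted-engine --vm-status | engine_status_synced; do
        sleep 30
    done
}
```

Running `wait_for_engine_sync` on each node after starting the HA services would block until that node stops reporting 'unknown stale-data', after which global maintenance can be cleared as in step 10.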

Comment 12 RamaKasturi 2017-03-24 06:04:42 UTC
Laura, I have replied to all the questions since I had answers for them, so I am clearing the needinfo on sas. Going forward, for any bug you can set needinfo on both me and sas so that either of us can reply. I feel it would be a little difficult for sas to provide info for all the bugs.

In reply to your point 4 in comment 11: yesterday sas and I discussed enabling RHGS SSL/TLS on the HC stack and came up with the model in doc [1]. The other day I was trying to verify bug 1417210 and found that it is placed in the appendix section. Can we have a separate chapter for enabling RHGS SSL/TLS on the HC stack and add the contents below there? One script is missing; sas will be able to provide it.

[1]https://docs.google.com/document/d/1UibSh2047Uf0FRvh3W2AO04bVJ9rMxWt7cJ0M2CX9fQ/edit

Comment 16 RamaKasturi 2017-03-28 10:07:01 UTC
1) Can we change the section title to "Configuring TLS/SSL Post Deployment of HC"?

2) Can you replace steps 8 and 9 in section 4.1 with step 9 and step 10 in the doc below?

https://docs.google.com/document/d/1UibSh2047Uf0FRvh3W2AO04bVJ9rMxWt7cJ0M2CX9fQ/edit

All the remaining steps look good.

Comment 19 RamaKasturi 2017-04-04 05:47:26 UTC
All changes for enabling encryption post deployment in the doc look good to me. Moving this to verified state.

Comment 20 Laura Bailey 2017-08-29 04:10:56 UTC
Fixed in RHGS 3.3 documentation.