Senior TAM Ben Schmaus had a customer that identified some command fixes for the OSP9 Instance High Availability Guide:

https://access.redhat.com/documentation/en/red-hat-openstack-platform/9/single/high-availability-for-compute-instances/

== Item 1 ==

- In the first code box in the documentation, there are some typos. The right command to pull the host name is:

heat-admin@compute-n # openstack-config --get /etc/nova/nova.conf DEFAULT host

== Item 2 ==

- Step 8. OSP9 is not running keystone as a resource anymore. This is how we should stop all openstack services:

pcs resource disable openstack-core-clone

== Item 3 ==

- Step 9. There are some extra * characters in the command line that are confusing:

*heat-admin@controller-1 #* echo $controllers

It should be:

heat-admin@controller-1 # echo $controllers

== Item 4 ==

- Step 17. If we used the short names of the computes to create the compute resources, the computes are marked offline. If I used the FQDN they are marked online, but in either case nova-compute-checkevacuate sends an error about Invalid name. This is how I fixed it to force it to use the long names:

# On the compute nodes, replace this line:
vi /usr/lib/ocf/resource.d/openstack/nova-compute-wait

From:
short_host=$(uname -n | awk -F. '{print $1}')
To:
short_host=$(uname -n)

== Item 5 ==

- Step 17. Another problem is in setting up the stonith level: the "fence_computen" is a typo. I think you should add clearer instructions to replace it with the stonith device name of the compute, and also to repeat this step for each compute.

== Item 6 ==

- Step 18. OSP9 is not running keystone as a resource anymore. This is how we should start all openstack services:

pcs resource enable openstack-core-clone
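For reference on Item 4, the difference between the two short_host forms can be seen with a small sketch. The hostname overcloud-compute-0.localdomain is an assumed example, not taken from the customer's environment, and the function names are hypothetical helpers rather than anything in nova-compute-wait itself.

```shell
#!/bin/sh
# Assumed example FQDN; on a real compute node this would come from `uname -n`.
fqdn="overcloud-compute-0.localdomain"

# Original form: keep only the part of the name before the first dot.
short_name() {
    printf '%s\n' "$1" | awk -F. '{print $1}'
}

# Form proposed in Item 4: use the name unchanged.
long_name() {
    printf '%s\n' "$1"
}

echo "short: $(short_name "$fqdn")"
echo "long:  $(long_name "$fqdn")"
```

If the compute resources were created with FQDNs, only the second form yields a name that matches them, which would be consistent with the behavior described in Item 4.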
@ddomingo - I can make the edits for this guide if you like. Just need to know if there's anything special about the guide I need to be aware of (e.g. was it pulled from upstream?)
Ben,

I've queued publication of the updated 'High Availability for Compute Instances'. This should include all the prescribed corrections by Rackspace, except for the following:

(In reply to Dan Macpherson from comment #0)
<SNIP>
> == Item 4 ==
>
> - Step 17. If we used the short names of the computes to create the compute
> resources, the compute are marked offline. If I used the FQDN name they are
> marked online, but in any case the nova-compute-checkevacuate sends an error
> about Invalid name, this is how I fixed it to force it to use the long names:
>
> # On the compute nodes, replace this line:
> vi /usr/lib/ocf/resource.d/openstack/nova-compute-wait
>
> From:
> short_host=$(uname -n | awk -F. '{print $1}')
> To:
> short_host=$(uname -n)

Upon consultation, it turns out that this should not be a problem if you use resource-agents-3.9.5-54.el7_2.18, as BZ#1380314 includes a fix for it there. I then updated the required package version accordingly (was: resource-agents-3.9.5-40.el7_1.5).

In addition, I've backported the same fixes to the OSP8 version of the doc. The updated OSP8 and OSP9 versions should be published on the portal within a few hours. Let me know if we missed anything.
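A rough way to check whether a node already meets the package version named above might look like the following. This is only a sketch: version_at_least is a hypothetical helper, and GNU sort -V approximates but is not identical to RPM's own version comparison, so treat the result as a hint rather than a definitive answer.

```shell
#!/bin/sh
# Minimum version-release named in this comment.
required="3.9.5-54.el7_2.18"

# Succeeds when $1 sorts at or after $2 under GNU `sort -V`.
# Assumption: this ordering is close enough to RPM's for these strings.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Query the installed package only where rpm is available.
if command -v rpm >/dev/null 2>&1; then
    if installed="$(rpm -q --queryformat '%{VERSION}-%{RELEASE}' resource-agents 2>/dev/null)"; then
        if version_at_least "$installed" "$required"; then
            echo "resource-agents $installed: OK"
        else
            echo "resource-agents $installed: older than $required"
        fi
    else
        echo "resource-agents is not installed"
    fi
fi
```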
Changes are up now on the portal for the OSP9 version: https://access.redhat.com/documentation/en/red-hat-openstack-platform/9/single/high-availability-for-compute-instances
I've submitted a request to publish the OSP8 version of the doc now. Should be up in a few hours or so.
Hi,

I'd like to propose the updates below:

(a) Beginning of chapter 2, OSP8

The command is openstack-config, not openstack-service.

> To check a Compute node’s hostname:
> heat-admin@compute-n # sudo openstack-service --get /etc/nova/nova.conf DEFAULT host

To check a Compute node’s hostname:
heat-admin@compute-n # sudo openstack-config --get /etc/nova/nova.conf DEFAULT host

(b) In the step of creating fence-nova, Chapter 3, OSP8 and OSP9

I think the domain should be indicated explicitly in case CloudDomain is customized.

> heat-admin@controller-1 # sudo pcs stonith create fence-nova fence_compute \
> auth-url=$OS_AUTH_URL \
> login=$OS_USERNAME \
> passwd=$OS_PASSWORD \
> tenant-name=$OS_TENANT_NAME \
> record-only=1 action=off --force

heat-admin@controller-1 # sudo pcs stonith create fence-nova fence_compute \
auth-url=$OS_AUTH_URL \
login=$OS_USERNAME \
passwd=$OS_PASSWORD \
tenant-name=$OS_TENANT_NAME \
domain=localdomain \
record-only=1 action=off --force

(c) In the step of creating the stonith device on compute nodes, Chapter 3, OSP8 and OSP9

Should "ipmilan-overcloud-compute-0" and "overcloud-compute-0" be "ipmilan-overcloud-compute-N" and "overcloud-compute-N" respectively?

> heat-admin@controller-1 # sudo pcs stonith create ipmilan-overcloud-compute-0 fence_ipmilan pcmk_host_list=overcloud-compute-0 ipaddr=10.35.160.78 login=$IPMILAN_USERNAME passwd=$IPMILAN_PASSWORD lanplus=1 cipher=1 op monitor interval=60s;

heat-admin@controller-1 # sudo pcs stonith create ipmilan-overcloud-compute-N fence_ipmilan pcmk_host_list=overcloud-compute-N ipaddr=10.35.160.78 login=$IPMILAN_USERNAME passwd=$IPMILAN_PASSWORD lanplus=1 cipher=1 op monitor interval=60s;

(d) In the step of setting the stonith level, Chapter 3, OSP8 and OSP9

I think replacing "STONITHDEV" with "ipmilan-overcloud-compute-N", in sync with (c) above, is easier to understand.

> heat-admin@controller-1 # sudo pcs stonith level add 1 overcloud-compute-N STONITHDEV,fence-nova

heat-admin@controller-1 # sudo pcs stonith level add 1 overcloud-compute-N ipmilan-overcloud-compute-N,fence-nova
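Since the stonith level step has to be repeated for each compute node, a loop along these lines could generate the commands. This is a sketch only: the node list is hypothetical, the ipmilan-<node> device naming assumes the convention from (c) and (d), and the commands are printed for review rather than executed.

```shell
#!/bin/sh
# Hypothetical list of compute node names; replace with the real node names.
computes="overcloud-compute-0 overcloud-compute-1"

# Print (do not run) the stonith level command for one compute node,
# assuming its stonith device is named ipmilan-<node>.
stonith_level_cmd() {
    printf 'sudo pcs stonith level add 1 %s ipmilan-%s,fence-nova\n' "$1" "$1"
}

for node in $computes; do
    stonith_level_cmd "$node"
done
```

Printing the commands first makes it easy to check each device name against the output of the earlier stonith create step before running anything on the controller.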
Raoul, can you review Manabu's suggestions in the previous comment and let me know if they're good to go? I've already corrected the first item (i.e., s/openstack-service --get/openstack-config --get/).