Bug 1399908

Summary: Pacemaker remote / high availability guide for openstack does not set proper timeouts for remote systemd resources
Product: Red Hat OpenStack
Reporter: arkady kanevsky <arkady_kanevsky>
Component: openstack-tripleo
Assignee: James Slagle <jslagle>
Status: CLOSED CURRENTRELEASE
QA Contact: Arik Chernetsky <achernet>
Severity: high
Priority: unspecified
Version: 9.0 (Mitaka)
CC: abeekhof, arkady_kanevsky, aschultz, david_paterson, fdinitto, mburns, michele, morazi, randy_perryman, rhel-osp-director-maint, rhos-docs, rscarazz, sasha, srevivo
Target Milestone: ---
Keywords: Documentation, ZStream
Target Release: ---
Hardware: All
OS: Linux
Doc Type: If docs needed, set a value
Story Points: ---
Clone Of: 1386186
Last Closed: 2018-05-21 09:10:43 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
Category: ---
oVirt Team: ---
Cloudforms Team: ---
Bug Depends On: 1386186
Bug Blocks: 1305654

Description arkady kanevsky 2016-11-30 04:07:03 UTC
+++ This bug was initially created as a clone of Bug #1386186 +++

Description of problem:
Currently, when the docs create the resources that need to run on the compute
nodes, they do not set the timeout to 200s as we do for the systemd resources
deployed on the controllers. (The reason for 200s is to give systemd enough
time to forcibly stop a service without us resorting to fencing.)

So from the documentation:
https://access.redhat.com/documentation/en/red-hat-openstack-platform/9/paged/high-availability-for-compute-instances/chapter-3-installation

Section 13.
heat-admin@controller-1 # sudo pcs resource create neutron-openvswitch-agent-compute systemd:neutron-openvswitch-agent --clone interleave=true --disabled --force

Section 13.a
heat-admin@controller-1 # sudo pcs resource create libvirtd-compute systemd:libvirtd --clone interleave=true --disabled --force

Section 13.b
heat-admin@controller-1 # sudo pcs resource create ceilometer-compute systemd:openstack-ceilometer-compute --clone interleave=true --disabled --force

Section 13.c
heat-admin@controller-1 # sudo pcs resource create ceilometer-compute systemd:openstack-ceilometer-compute --clone interleave=true --disabled --force


All of the above commands need an added "op start timeout=200s stop timeout=200s" parameter. For example:
sudo pcs resource create neutron-openvswitch-agent-compute systemd:neutron-openvswitch-agent op start timeout=200s stop timeout=200s --clone interleave=true --disabled --force
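
The fix described above can be sketched as a small dry-run script. It only prints the create commands with the start and stop timeouts added and does not run pcs. The function name build_create_cmd and the TIMEOUT variable are hypothetical helpers introduced for illustration. The name=value form of the timeout option follows the note later in this bug that the "=" is required. The resource names and systemd units are the three distinct pairs quoted in this description.

```shell
#!/bin/sh
# Dry-run sketch: print the create commands with op timeouts added.
# Nothing here calls pcs; review the output before running it by hand.

# Timeout value requested in this bug (assumed to apply to both start and stop).
TIMEOUT="200s"

# Hypothetical helper.
# Usage: build_create_cmd <resource-name> <systemd-unit>
build_create_cmd() {
    printf 'sudo pcs resource create %s systemd:%s op start timeout=%s stop timeout=%s --clone interleave=true --disabled --force\n' \
        "$1" "$2" "$TIMEOUT" "$TIMEOUT"
}

# Resource name / systemd unit pairs taken from the commands quoted above.
build_create_cmd neutron-openvswitch-agent-compute neutron-openvswitch-agent
build_create_cmd libvirtd-compute libvirtd
build_create_cmd ceilometer-compute openstack-ceilometer-compute
```

Printing the commands rather than executing them keeps the sketch safe to run on any host, since creating cluster resources needs a live Pacemaker cluster.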

--- Additional comment from Andrew Beekhof on 2016-10-18 21:27:15 EDT ---



--- Additional comment from Don Domingo on 2016-10-24 23:42:56 EDT ---

Thanks Michele, I updated the doc per your correction. The changes should be up in a few hours on the portal (for both OSP8 and OSP9 versions of the doc).

--- Additional comment from Randy Perryman on 2016-11-08 07:39:09 EST ---

Question: what is the command to modify the existing services, e.g. Rabbit?

--- Additional comment from Randy Perryman on 2016-11-08 08:43:54 EST ---

Answered my own question:

pcs resource update rabbitmq op add stop timeout=200s


It needs the equals sign ("=").
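
The point about the equals sign can be illustrated with a minimal dry-run sketch that refuses to build an update command unless the operation option is written as name=value. The helper names is_name_value and build_update_cmd are hypothetical. The printed command uses the "op <action> <name>=<value>" form of pcs resource update, which is an assumption and differs slightly from the command quoted above, so check it against the pcs version on your cluster before running it.

```shell
#!/bin/sh
# Dry-run sketch: validate an operation option, then print (not run) the command.

# Hypothetical helper: succeed only if the argument looks like name=value.
is_name_value() {
    case "$1" in
        ?*=?*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Hypothetical helper.
# Usage: build_update_cmd <resource> <action> <name=value>
build_update_cmd() {
    if ! is_name_value "$3"; then
        printf 'error: operation option "%s" must be name=value\n' "$3" >&2
        return 1
    fi
    # Assumed command form; verify against your local pcs documentation.
    printf 'sudo pcs resource update %s op %s %s\n' "$1" "$2" "$3"
}

# Accepted: option written with "=".
build_update_cmd rabbitmq stop timeout=200s

# Rejected: option written without "=" (error goes to stderr).
build_update_cmd rabbitmq stop "timeout 200s" || true
```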

--- Additional comment from David Paterson on 2016-11-16 17:05:21 EST ---

On our weekly call on 11/10 we discussed this one, and Mike O mentioned that it should be more than a doc fix: the script should be modified as well.

Comment 1 Andrew Beekhof 2018-05-21 08:21:42 UTC
Is this still a concern?

Comment 2 Andrew Beekhof 2018-05-21 09:10:43 UTC
Answering my own question: the official way to install IHA is with the Ansible scripts, which take care of this. Additionally, design changes mean we don't create cluster resources for these services anymore.

On that basis, I'm closing. Please reopen if you feel there is still a problem.