Bug 1386186

Summary:	Pacemaker remote / high availability guide for openstack does not set proper timeouts for remote systemd resources
Product:	Red Hat OpenStack	Reporter:	Michele Baldessari <michele>
Component:	documentation	Assignee:	Don Domingo <ddomingo>
Status:	CLOSED CURRENTRELEASE	QA Contact:	RHOS Documentation Team <rhos-docs>
Severity:	high	Docs Contact:
Priority:	high
Version:	10.0 (Newton)	CC:	arkady_kanevsky, david_paterson, ddomingo, fdinitto, mburns, michele, morazi, randy_perryman, rscarazz, sasha, smerrow, srevivo
Target Milestone:	---	Keywords:	Documentation, ZStream
Target Release:	---
Hardware:	All
OS:	Linux
Whiteboard:
Fixed In Version:		Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:
Clones:	1399908		Environment:
Last Closed:	2017-05-15 20:08:47 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks:	1335596, 1356451, 1399908

Description Michele Baldessari 2016-10-18 11:26:13 UTC

Description of problem:
Currently, when the docs create the resources that need to run on the compute
nodes, they do not set the timeout to 200s as we do for the systemd resources
deployed on the controllers. (The reason for 200s is to give systemd enough
time to forcibly stop a service without us resorting to fencing.)

So from the documentation:
https://access.redhat.com/documentation/en/red-hat-openstack-platform/9/paged/high-availability-for-compute-instances/chapter-3-installation

Section 13.
heat-admin@controller-1 # sudo pcs resource create neutron-openvswitch-agent-compute systemd:neutron-openvswitch-agent --clone interleave=true --disabled --force

Section 13.a
heat-admin@controller-1 # sudo pcs resource create libvirtd-compute systemd:libvirtd --clone interleave=true --disabled --force

Section 13.b
heat-admin@controller-1 # sudo pcs resource create ceilometer-compute systemd:openstack-ceilometer-compute --clone interleave=true --disabled --force

Section 13.c
heat-admin@controller-1 # sudo pcs resource create ceilometer-compute systemd:openstack-ceilometer-compute --clone interleave=true --disabled --force


All the above commands need an additional "op start timeout=200s stop timeout=200s" parameter. So for example:
sudo pcs resource create neutron-openvswitch-agent-compute systemd:neutron-openvswitch-agent op start timeout=200s stop timeout=200s --clone interleave=true --disabled --force
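For anyone applying this by hand, here is a minimal dry-run sketch that only prints the create commands with the timeouts added. It does not call pcs. The resource names and systemd units are the ones quoted above; build_create_cmd is a hypothetical helper, and the timeout=200s (key=value) spelling is an assumption based on the usual pcs operation-option syntax, as also noted in Comment 5.

```shell
#!/bin/sh
# Dry-run sketch: print (do not execute) the pcs create commands with
# explicit start/stop timeouts. Review the output before running anything.

TIMEOUT="200s"

# build_create_cmd is a hypothetical helper, not part of pcs.
# $1: pacemaker resource name, $2: systemd unit name
build_create_cmd() {
    printf 'sudo pcs resource create %s systemd:%s op start timeout=%s stop timeout=%s --clone interleave=true --disabled --force\n' \
        "$1" "$2" "$TIMEOUT" "$TIMEOUT"
}

build_create_cmd neutron-openvswitch-agent-compute neutron-openvswitch-agent
build_create_cmd libvirtd-compute libvirtd
build_create_cmd ceilometer-compute openstack-ceilometer-compute
```

Printing first keeps the sketch safe to run anywhere; the printed lines can then be pasted on a controller node once they have been checked against the guide.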

Comment 1 Andrew Beekhof 2016-10-19 01:27:15 UTC

*** Bug 1383780 has been marked as a duplicate of this bug. ***

Comment 2 Don Domingo 2016-10-25 03:42:56 UTC

Thanks Michele, I updated the doc per your correction. The changes should be up in a few hours on the portal (for both OSP8 and OSP9 versions of the doc).

Comment 4 Randy Perryman 2016-11-08 12:39:09 UTC

Question: what is the command to modify the existing services, e.g. Rabbit?

Comment 5 Randy Perryman 2016-11-08 13:43:54 UTC

Answered my own question:

pcs resource update rabbitmq op add stop timeout=200s


It needs the equals sign ("=").
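For several existing resources, a dry-run sketch along these lines may help. It assumes the `pcs resource update <resource id> op <action> <name>=<value>` form described in the pcs man page, and sets both start and stop to 200s as requested in the bug description. print_timeout_update is a hypothetical helper, and the resource IDs passed on the command line are whatever exists on your cluster (rabbitmq is the only one named in this bug).

```shell
#!/bin/sh
# Dry-run sketch: print a pcs update command for each resource ID given
# as an argument. Nothing is executed against the cluster.

# print_timeout_update is a hypothetical helper, not part of pcs.
# $1: existing pacemaker resource ID (for example rabbitmq)
print_timeout_update() {
    printf 'pcs resource update %s op start timeout=200s stop timeout=200s\n' "$1"
}

for res in "$@"; do
    print_timeout_update "$res"
done
```

Saved under a name of your choice and run with one or more resource IDs as arguments, it prints one update line per resource for review.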

Comment 7 David Paterson 2016-11-16 22:05:21 UTC

On our weekly call on 11/10 we discussed this one, and Mike O mentioned it should be more than just a doc fix: the script should be modified as well.

Comment 9 Sean Merrow 2016-12-09 15:12:40 UTC

Clarification from Mike O:

Correct. I think I probably worded it poorly on the call. I believe the intent here is to confirm that the docs on manual setup have been updated correctly, and then make sure that the JS scripts that implement instance HA are updated accordingly. Apologies if that was confusing on the call.

Comment 10 Don Domingo 2017-01-17 01:22:52 UTC

Changes were published with OSP10 GA:
https://access.redhat.com/documentation/en/red-hat-openstack-platform/10/single/high-availability-for-compute-instances

Setting to VERIFIED. If any further changes are required, please re-set to ASSIGNED and advise what needs to be changed.

Comment 11 Michele Baldessari 2017-05-15 11:40:05 UTC

LGTM, I guess we can close this one out?