Bug 1123314 - Rubygem-staypuft: HA: Relax the openstack-heat-engine: op monitor interval to 60 seconds.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-foreman-installer
Version: 5.0 (RHEL 7)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ga
Target Release: Installer
Assignee: Crag Wolfe
QA Contact: Leonid Natapov
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2014-07-25 09:51 UTC by Leonid Natapov
Modified: 2016-04-26 20:46 UTC (History)

Fixed In Version: openstack-foreman-installer-2.0.19-1.el6ost
Doc Type: Bug Fix
Doc Text:
Previously, the openstack-heat-engine monitor interval was set too low for Galera (MariaDB) deployments, so Pacemaker could not reliably identify that the Orchestration service was running. This caused the Orchestration service to restart at random intervals. With this bug fix, the monitor interval has been increased to 60s and, as a result, Pacemaker no longer restarts the Orchestration service unnecessarily.
Clone Of:
Environment:
Last Closed: 2014-08-21 18:06:25 UTC
Target Upstream Version:
Embargoed:


Links
System: Red Hat Product Errata
ID: RHBA-2014:1090
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Red Hat Enterprise Linux OpenStack Platform Enhancement Advisory
Last Updated: 2014-08-22 15:28:08 UTC

Description Leonid Natapov 2014-07-25 09:51:00 UTC
Rubygem-staypuft: HA: Relax the openstack-heat-engine: op monitor interval to 60 seconds.

The current setting is openstack-heat-engine: op monitor interval=30s

This appears to be too resource-consuming in some cases (like openstack-heat-engine) and causes false positives.

Comment 1 Fabio Massimo Di Nitto 2014-07-25 14:09:59 UTC
This bug causes the heat service to go up/down at random intervals. I consider this a blocker.

Comment 2 Crag Wolfe 2014-08-04 23:21:09 UTC
Patch posted: https://github.com/redhat-openstack/astapor/pull/338

Comment 3 Crag Wolfe 2014-08-04 23:23:05 UTC
It looks like the heat how-to should be updated as well.

Comment 4 Fabio Massimo Di Nitto 2014-08-05 03:51:33 UTC
(In reply to Crag Wolfe from comment #3)
> It looks like the heat how-to should be updated as well.

I don't understand the point here.

The heat how-to uses the default monitor op of 60 seconds by not specifying any value.

http://rhel-ha.etherpad.corp.redhat.com/RHOS-RHEL-HA-how-to-mrgcloud-rhos5-on-rhel7-heat

line 61:

pcs resource create heat-engine systemd:openstack-heat-engine op monitor start-delay=10s

default is 60.

Comment 6 Crag Wolfe 2014-08-05 16:09:44 UTC
My mistake, I did not realize the 60s was global across all services.  Should we open another bug to change the interval to 60s for all services (right now the default is 30s as deployed by puppet)?

Comment 7 Fabio Massimo Di Nitto 2014-08-06 06:58:02 UTC
(In reply to Crag Wolfe from comment #6)
> My mistake, I did not realize the 60s was global across all services. 
> Should we open another bug to change the interval to 60s for all services
> (right now the default is 30s as deployed by puppet)?

It's a bit tricky here. I think the default should be 60s and match the Pacemaker default, but we probably want the ability to configure that value.

Now, I experienced those timeouts because the hardware was "questionable" and slow. Customers might not experience the same and perhaps want faster failure detection.
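
The configurability discussed here could be sketched as a small shell helper. This is only an illustration under stated assumptions: the function name `heat_monitor_cmd` is hypothetical, the resource name `openstack-heat-engine` is the one used in this deployment, and the helper only prints a `pcs resource update` command rather than running it, so it can be reviewed before touching a cluster.

```shell
# Hypothetical helper (not part of the installer): print the pcs command
# that would set the monitor interval for the heat-engine resource.
# The interval defaults to 60s and can be overridden by the first argument.
heat_monitor_cmd() {
  interval="${1:-60s}"
  printf 'pcs resource update openstack-heat-engine op monitor interval=%s\n' "$interval"
}

# Example: show the command for the relaxed default.
heat_monitor_cmd
```

Passing a shorter value, for example `heat_monitor_cmd 30s`, would print the command for faster failure detection on hardware that can sustain it.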

Comment 8 Jason Guiditta 2014-08-12 15:42:21 UTC
Merged

Comment 11 Leonid Natapov 2014-08-18 09:36:06 UTC
openstack-foreman-installer-2.0.20-1.el6ost


[root@mac047d7b627d5a haproxy]# pcs resource  show openstack-heat-engine
 Resource: openstack-heat-engine (class=systemd type=openstack-heat-engine)
  Attributes: start-delay=10s 
  Operations: monitor interval=60s (openstack-heat-engine-monitor-interval-60s)
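
The manual check above could be scripted along these lines. This is a rough sketch, not part of the fix: the function names `monitor_interval` and `check_heat_interval` are hypothetical, and the parsing assumes the `Operations: monitor interval=...` line format shown in the output above.

```shell
# Hypothetical helpers: read `pcs resource show` output on stdin.

# Print the first monitor interval found in the input (e.g. "60s").
monitor_interval() {
  sed -n 's/.*monitor interval=\([0-9][0-9]*s\{0,1\}\).*/\1/p' | head -n 1
}

# Succeed only if the monitor interval equals the expected value
# (first argument, defaulting to 60s).
check_heat_interval() {
  expected="${1:-60s}"
  actual="$(monitor_interval)"
  [ "$actual" = "$expected" ]
}

# Usage on a cluster node (assumes pcs is installed):
#   pcs resource show openstack-heat-engine | check_heat_interval 60s && echo OK
```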

Comment 12 errata-xmlrpc 2014-08-21 18:06:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-1090.html

