Bug 1262425 - memcached needs the interleave=true pacemaker attribute
Status: CLOSED ERRATA
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 7.0 (Kilo)
Hardware/OS: All Linux
Priority: high
Severity: high
Target Milestone: y2
Target Release: 7.0 (Kilo)
Assigned To: Giulio Fidente
QA Contact: Asaf Hirshberg
Keywords: Triaged
Depends On:
Blocks: 1262263
Reported: 2015-09-11 11:34 EDT by Michele Baldessari
Modified: 2015-12-21 11:49 EST
CC List: 12 users

See Also:
Fixed In Version: openstack-tripleo-heat-templates-0.8.6-72.el7ost
Doc Type: Bug Fix
Doc Text:
Previously, the interleave property was not enabled for the Pacemaker memcached clone set. As a result, Pacemaker resources that depend on memcached had to wait for all copies of memcached to be running before they could start. With this update, the memcached clone set is configured with the interleave property enabled, so Pacemaker resources that depend on memcached can start as soon as one copy from the clone set becomes available.
Last Closed: 2015-12-21 11:49:01 EST
Type: Bug

External Trackers
Tracker | ID | Priority | Status | Summary | Last Updated
OpenStack gerrit | 236990 | None | None | None | Never
Red Hat Product Errata | RHSA-2015:2650 | normal | SHIPPED_LIVE | Moderate: Red Hat Enterprise Linux OpenStack Platform 7 director update | 2015-12-21 16:44:54 EST

Description Michele Baldessari 2015-09-11 11:34:16 EDT
Description of problem:
osp-d creates the memcached resource as follows (CIB dump, because pcs status does not show meta attributes):
<clone id="memcached-clone">
  <primitive class="systemd" id="memcached" type="memcached">
    <instance_attributes id="memcached-instance_attributes"/>
    <operations>
      <op id="memcached-start-timeout-60s" interval="0s" name="start" timeout="60s"/>
      <op id="memcached-monitor-interval-60s" interval="60s" name="monitor"/>
    </operations>
  </primitive>
  <meta_attributes id="memcached-clone-meta"/>
</clone>   


Whereas the osp-ha reference architecture sets interleave=true:
<clone id="memcached-clone">
  <primitive class="systemd" id="memcached" type="memcached">
    <instance_attributes id="memcached-instance_attributes"/>
    <operations>
      <op id="memcached-monitor-interval-60s" interval="60s" name="monitor"/>
    </operations>
  </primitive>
  <meta_attributes id="memcached-clone-meta">
    <nvpair id="memcached-interleave" name="interleave" value="true"/>
  </meta_attributes>
</clone>
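
The difference between the two dumps above can be checked mechanically. The following is a minimal sketch, assuming the clone fragment is available as a string (for example, cut out of a CIB dump); the function names `clone_meta` and `has_interleave` are hypothetical helpers, not part of pcs or Pacemaker.

```python
import xml.etree.ElementTree as ET


def clone_meta(cib_fragment: str) -> dict:
    """Return the meta attributes set directly on a <clone> element."""
    clone = ET.fromstring(cib_fragment)
    # Only <meta_attributes> that are direct children of <clone> are read;
    # meta attributes on the inner <primitive> are deliberately ignored.
    return {
        nv.get("name"): nv.get("value")
        for meta in clone.findall("meta_attributes")
        for nv in meta.findall("nvpair")
    }


def has_interleave(cib_fragment: str) -> bool:
    """True if the clone sets interleave=true in its meta attributes."""
    return clone_meta(cib_fragment).get("interleave") == "true"


# Fragment as created by osp-d (from the description above).
OSPD = """
<clone id="memcached-clone">
  <primitive class="systemd" id="memcached" type="memcached">
    <instance_attributes id="memcached-instance_attributes"/>
    <operations>
      <op id="memcached-start-timeout-60s" interval="0s" name="start" timeout="60s"/>
      <op id="memcached-monitor-interval-60s" interval="60s" name="monitor"/>
    </operations>
  </primitive>
  <meta_attributes id="memcached-clone-meta"/>
</clone>
"""

# Fragment from the osp-ha reference architecture (from the description above).
REFERENCE = """
<clone id="memcached-clone">
  <primitive class="systemd" id="memcached" type="memcached">
    <instance_attributes id="memcached-instance_attributes"/>
    <operations>
      <op id="memcached-monitor-interval-60s" interval="60s" name="monitor"/>
    </operations>
  </primitive>
  <meta_attributes id="memcached-clone-meta">
    <nvpair id="memcached-interleave" name="interleave" value="true"/>
  </meta_attributes>
</clone>
"""

print(has_interleave(OSPD))       # False
print(has_interleave(REFERENCE))  # True
```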
Comment 3 chris alfonso 2015-09-14 12:10:50 EDT
What is the overall impact of this bug? Is it just an info display issue, or does it cause other issues?
Comment 4 Michele Baldessari 2015-09-14 12:24:24 EDT
Mainly speed of starting all the services on a controller. Without interleave=true, the cascade of services depending on memcached (keystone and services depending on keystone) will all need to wait for memcached to be started on *all* nodes before starting themselves.

E.g. with memcached interleave=true, keystone on node A can start as soon as memcached on node A is started and does not need to wait for memcached to be started on node B and C.

Fabio, anything I missed above?
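
The start-up effect described in this comment can be illustrated with a toy model. This is a simplified sketch and not Pacemaker's actual scheduler: the function name `dependent_start_times` and the timings are hypothetical, and the model only captures the rule stated above (without interleave, a dependent waits for memcached on all nodes; with interleave, only for the local copy).

```python
def dependent_start_times(memcached_ready: dict, interleave: bool) -> dict:
    """Earliest time a service depending on memcached can start on each node.

    memcached_ready maps node name -> time (seconds) at which memcached
    is up on that node.
    """
    if interleave:
        # Each node only waits for its own memcached copy.
        return dict(memcached_ready)
    # Without interleave, every node waits for the slowest memcached copy.
    latest = max(memcached_ready.values())
    return {node: latest for node in memcached_ready}


# Hypothetical timings: memcached comes up at different times on nodes A, B, C.
ready = {"A": 5, "B": 20, "C": 60}

print(dependent_start_times(ready, interleave=False))  # {'A': 60, 'B': 60, 'C': 60}
print(dependent_start_times(ready, interleave=True))   # {'A': 5, 'B': 20, 'C': 60}
```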
Comment 5 Fabio Massimo Di Nitto 2015-09-14 12:53:28 EDT
(In reply to Michele Baldessari from comment #4)
> Mainly speed of starting all the services on a controller. Without
> interleave=true
> the cascade of services depending on memcached (keystone and services
> depending on keystone) will all need to wait for memcached to be started on
> *all* nodes before starting themselves.
> 
> E.g. with memcached interleave=true, keystone on node A can start as soon as
> memcached on node A is started and does not need to wait for memcached to be
> started on node B and C.
> 
> Fabio, anything I missed above?

That is correct. Use of interleave=true decreases the recovery time of services in case of some faults. It is already used for many OpenStack services, but for some reason OSPd-based deployments didn't have it.
Comment 6 Michele Baldessari 2015-09-16 16:24:38 EDT
I need to partially backpedal on my comment #4. The issue here is *not* simply a startup-speed problem (although that still holds true). The real problem is that whenever a controller joins a cluster (say, after a reboot), Pacemaker treats memcached as a single unit across all nodes, so it will restart keystone on every node. So this one is more important than initially thought.

Putting Andrew in CC, as he provided this feedback in today's call.
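
The restart behaviour described in this comment can be sketched the same way. This is a deliberately simplified model of the rule stated above, not Pacemaker's real decision logic; the function name `keystone_restarts` and the node names are hypothetical.

```python
def keystone_restarts(nodes, rejoining: str, interleave: bool) -> set:
    """Nodes on which keystone is (re)started when memcached comes up on `rejoining`.

    Simplified model: without interleave the memcached clone is treated as a
    single unit, so a change on one node affects dependents on every node.
    """
    if rejoining not in nodes:
        raise ValueError(f"unknown node: {rejoining}")
    if interleave:
        # Only the dependent on the rejoining node itself is affected.
        return {rejoining}
    return set(nodes)


NODES = ["A", "B", "C"]  # hypothetical three-controller cluster

# Controller A rejoins the cluster after a reboot.
print(sorted(keystone_restarts(NODES, "A", interleave=False)))  # ['A', 'B', 'C']
print(sorted(keystone_restarts(NODES, "A", interleave=True)))   # ['A']
```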
Comment 10 Asaf Hirshberg 2015-12-01 05:14:41 EST
verified on RHEL-OSP director 7.2 puddle - 2015-11-25.2

using cibadmin --query --local:
      <clone id="memcached-clone">
        <primitive class="systemd" id="memcached" type="memcached">
          <instance_attributes id="memcached-instance_attributes"/>
          <operations>
            <op id="memcached-start-interval-0s" interval="0s" name="start" timeout="100s"/>
            <op id="memcached-stop-interval-0s" interval="0s" name="stop" timeout="100s"/>
            <op id="memcached-monitor-interval-60s" interval="60s" name="monitor"/>
          </operations>
        </primitive>
        <meta_attributes id="memcached-clone-meta_attributes">
          <nvpair id="memcached-interleave" name="interleave" value="true"/>
        </meta_attributes>
      </clone>

Info:
rpm: openstack-tripleo-heat-templates-0.8.6-85.el7ost.noarch
HA environment: 3 controllers, 3 computes
Comment 13 errata-xmlrpc 2015-12-21 11:49:01 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2015:2650
