Bug 1235408 - HAProxy should use clustercheck for galera nodes health checks
Summary: HAProxy should use clustercheck for galera nodes health checks
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 7.0 (Kilo)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: beta
Target Release: Director
Assignee: Giulio Fidente
QA Contact: Marius Cornea
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2015-06-24 17:43 UTC by Marius Cornea
Modified: 2015-08-05 13:55 UTC

Fixed In Version: openstack-tripleo-heat-templates-0.8.6-17.el7ost
Doc Type: Bug Fix
Doc Text:
HAProxy did not use clustercheck to check the status of the MariaDB backends. As a result, HAProxy forwarded requests to MariaDB nodes that responded to the TCP check but were not synchronized with the Galera cluster. With this fix, HAProxy uses clustercheck to check the status of the MariaDB backends and forwards requests to MariaDB nodes correctly.
Clone Of:
Environment:
Last Closed: 2015-08-05 13:55:49 UTC
Target Upstream Version:


Links
System | ID | Priority | Status | Summary | Last Updated
OpenStack gerrit | 194960 | None | None | None | Never
OpenStack gerrit | 195365 | None | None | None | Never
Red Hat Product Errata | RHEA-2015:1549 | normal | SHIPPED_LIVE | Red Hat Enterprise Linux OpenStack Platform director Release | 2015-08-05 17:49:10 UTC

Description Marius Cornea 2015-06-24 17:43:22 UTC
Description of problem:
HAProxy should use clustercheck for the Galera node health checks so that it gets a valid status for the database servers.
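
The difference between the two kinds of check can be illustrated with a small Python sketch. It is not the HAProxy or clustercheck implementation: the local `FakeUnsyncedNode` server is a hypothetical stand-in for a clustercheck endpoint on a node that is up but not synced, and the 503 status and response text are assumptions based on how clustercheck is commonly described. The point is only that a plain TCP connect succeeds against such a node while an HTTP-level check fails.

```python
import http.server
import socket
import threading
import urllib.error
import urllib.request


def tcp_check(host, port, timeout=2.0):
    """Layer-4 check: passes as soon as the port accepts a connection."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def http_check(host, port, timeout=2.0):
    """Layer-7 check: passes only if the HTTP request succeeds.

    urllib raises HTTPError (a URLError subclass) for 4xx/5xx responses,
    so a 503 from the endpoint is reported as unhealthy.
    """
    try:
        with urllib.request.urlopen(f"http://{host}:{port}/", timeout=timeout):
            return True
    except (urllib.error.URLError, OSError):
        return False


class FakeUnsyncedNode(http.server.BaseHTTPRequestHandler):
    """Hypothetical stand-in for a health endpoint on an unsynced node."""

    def do_GET(self):
        # Assumed behaviour: answer 503 while the node is not synced.
        self.send_response(503)
        self.end_headers()
        self.wfile.write(b"Galera cluster node is not synced.\n")

    def log_message(self, *args):
        pass  # keep the demo quiet


def demo():
    """Run both checks against the fake node; returns (tcp_ok, http_ok)."""
    server = http.server.HTTPServer(("127.0.0.1", 0), FakeUnsyncedNode)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    try:
        return tcp_check(host, port), http_check(host, port)
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

Here the TCP check reports the node as healthy while the HTTP check does not, which is the gap this bug asks HAProxy to close for the Galera backends.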

Version-Release number of selected component (if applicable):
openstack-puppet-modules-2015.1.7-2.el7ost.noarch

Additional info:
https://review.openstack.org/#/c/194960/2

Comment 3 Omri Hochman 2015-06-24 19:57:07 UTC
In an HA environment, galera_start failed after rebooting controller_0:


pcs status : 
-------------
Failed actions:
    openstack-cinder-volume_start_0 on overcloud-controller-2 'not running' (7): call=314, status=complete, exit-reason='none', last-rc-change='Wed Jun 24 15:50:47 2015', queued=2001ms, exec=4ms
    galera_start_0 on overcloud-controller-0 'unknown error' (1): call=216, status=Timed Out, exit-reason='none', last-rc-change='Wed Jun 24 15:47:30 2015', queued=0ms, exec=120003ms
    redis_start_0 on overcloud-controller-0 'unknown error' (1): call=219, status=complete, exit-reason='none', last-rc-change='Wed Jun 24 15:49:35 2015', queued=0ms, exec=21910ms
    openstack-nova-scheduler_start_0 on overcloud-controller-0 'not running' (7): call=236, status=complete, exit-reason='none', last-rc-change='Wed Jun 24 15:50:30 2015', queued=2001ms, exec=2ms
    openstack-nova-consoleauth_start_0 on overcloud-controller-0 'not running' (7): call=238, status=complete, exit-reason='none', last-rc-change='Wed Jun 24 15:50:34 2015', queued=2001ms, exec=5ms
    openstack-cinder-api_start_0 on overcloud-controller-0 'not running' (7): call=242, status=complete, exit-reason='none', last-rc-change='Wed Jun 24 15:50:39 2015', queued=2002ms, exec=5ms
    neutron-server_start_0 on overcloud-controller-0 'not running' (7): call=246, status=complete, exit-reason='none', last-rc-change='Wed Jun 24 15:50:46 2015', queued=2001ms, exec=3ms
    openstack-cinder-volume_start_0 on overcloud-controller-1 'not running' (7): call=325, status=complete, exit-reason='none', last-rc-change='Wed Jun 24 15:50:41 2015', queued=2001ms, exec=2ms


PCSD Status:
  overcloud-controller-0: Online
  overcloud-controller-1: Online
  overcloud-controller-2: Online

Comment 6 Giulio Fidente 2015-06-25 15:10:11 UTC
The puppet-tripleo change should be included in openstack-puppet-modules-2015.1.7-5.el7ost

Comment 8 Omri Hochman 2015-06-26 21:33:03 UTC
Verified: openstack-tripleo-heat-templates-0.8.6-19.el7ost.noarch

From /etc/haproxy/haproxy.cfg (opened with sudo vi):

listen cinder
  bind 192.168.0.6:8776
  option httpchk GET /
  server overcloud-controller-0 192.168.0.11:8776 check fall 5 inter 2000 rise 2
  server overcloud-controller-1 192.168.0.12:8776 check fall 5 inter 2000 rise 2
  server overcloud-controller-2 192.168.0.10:8776 check fall 5 inter 2000 rise 2


[heat-admin@overcloud-controller-1 ~]$ sudo grep httpchk /etc/haproxy/haproxy.cfg
  option httpchk GET /
  option httpchk GET /
  option httpchk GET /
  option httpchk GET /
  option httpchk GET /
  option httpchk GET /
  option httpchk GET /
  option httpchk GET /
  option httpchk
  option httpchk GET /
  option httpchk GET /
  option httpchk GET /
  option httpchk GET /
  option httpchk GET /
  option httpchk GET /info
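
The grep output above lists the `option httpchk` lines without the section each one belongs to. A rough Python sketch such as the following could map each `listen` section to its check line. The function name `sections_with_httpchk` is made up for illustration, and the `mysql` section in `SAMPLE` is a hypothetical example (including port 9200 for the check), not content copied from the verified system; only the `cinder` lines come from the excerpt above.

```python
SECTION_KEYWORDS = ("global", "defaults", "frontend", "backend", "listen")

# The cinder lines mirror the excerpt above; the mysql section is hypothetical.
SAMPLE = """\
listen cinder
  bind 192.168.0.6:8776
  option httpchk GET /
  server overcloud-controller-0 192.168.0.11:8776 check fall 5 inter 2000 rise 2

listen mysql
  bind 192.168.0.6:3306
  option httpchk
  server overcloud-controller-0 192.168.0.11:3306 check port 9200
"""


def sections_with_httpchk(cfg_text):
    """Map each `listen` section name to its `option httpchk` line, or None."""
    result = {}
    current = None  # name of the listen section being read, if any
    for raw in cfg_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        words = line.split()
        if words[0] in SECTION_KEYWORDS:
            # A new section starts; only named listen sections are tracked.
            if words[0] == "listen" and len(words) > 1:
                current = words[1]
                result[current] = None
            else:
                current = None
            continue
        if current is not None and words[:2] == ["option", "httpchk"]:
            result[current] = line
    return result


if __name__ == "__main__":
    for name, check in sections_with_httpchk(SAMPLE).items():
        print(f"{name}: {check or 'no httpchk'}")
```

Against the real file, the text would be read from /etc/haproxy/haproxy.cfg instead of `SAMPLE`; a section that maps to `None` would still be relying on the default TCP check.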

Comment 10 errata-xmlrpc 2015-08-05 13:55:49 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2015:1549

