Bug 1319916

Summary: All Redis HAProxy backends are marked as down
Product: Red Hat OpenStack
Component: openstack-puppet-modules
Version: 8.0 (Liberty)
Status: CLOSED ERRATA
Severity: urgent
Priority: unspecified
Target Milestone: ga
Target Release: 8.0 (Liberty)
Hardware: Unspecified
OS: Unspecified
Reporter: Marius Cornea <mcornea>
Assignee: Giulio Fidente <gfidente>
QA Contact: Marius Cornea <mcornea>
CC: bperkins, dbecker, gfidente, jcoufal, jguiditt, mabaakou, mburns, morazi, rhel-osp-director-maint, yeylon, yprokule
Fixed In Version: openstack-puppet-modules-7.0.17-1.el7ost
Doc Type: Bug Fix
Clone Of:
: 1320036 1320251 (view as bug list) Environment:
Last Closed: 2016-04-07 21:35:34 UTC
Type: Bug
Bug Blocks: 1320036

Description Marius Cornea 2016-03-21 19:48:24 UTC
Description of problem:
With the latest puddle, HAProxy marks all Redis backends as down, so no connections that reach the Redis VIP are routed to the backends. As a result, Ceilometer is unable to communicate with Redis.

I suspect this is because Redis now requires authentication, but the HAProxy checks do not authenticate.

Version-Release number of selected component (if applicable):
openstack-tripleo-heat-templates-0.8.12-1.el7ost.noarch
openstack-puppet-modules-7.0.15-1.el7ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Deploy overcloud 
2. Check /var/log/ceilometer/central.log

Actual results:
ToozConnectionError: Error while reading from socket: ('Connection closed by server.',)

Expected results:
Ceilometer is able to communicate with Redis.

Additional info:

HAProxy redis config:
listen redis
  bind 172.16.20.11:6379 transparent
  balance first
  option tcp-check
  tcp-check send info\ replication\r\n
  tcp-check expect string role:master
  server overcloud-controller-0 172.16.20.12:6379 check fall 5 inter 2000 rise 2
  server overcloud-controller-1 172.16.20.13:6379 check fall 5 inter 2000 rise 2
  server overcloud-controller-2 172.16.20.14:6379 check fall 5 inter 2000 rise 2

HAProxy log:
Server redis/overcloud-controller-0 is DOWN, reason: Layer4 connection problem, info: "Connection refused at step 1 of tcp-check (send)", check duration: 0ms.
Server redis/overcloud-controller-1 is DOWN, reason: Layer4 connection problem, info: "Connection refused at step 1 of tcp-check (send)", check duration: 1ms.
Server redis/overcloud-controller-2 is DOWN, reason: Layer4 connection problem, info: "Connection refused at step 1 of tcp-check (send)", check duration: 0ms.

Try connecting to the VIP:
[root@overcloud-controller-0 heat-admin]# nc 172.16.20.11 6379
AUTH vmygq8ybbrYDn4YvwHWURJpKt
Ncat: Broken pipe.

Try connecting to the backend:
[root@overcloud-controller-0 heat-admin]# nc 172.16.20.12 6379
AUTH vmygq8ybbrYDn4YvwHWURJpKt
+OK

Comment 2 Marius Cornea 2016-03-21 19:59:17 UTC
Adjusting the HAProxy config as below gets the check to work, so it detects the master:

listen redis
  bind 172.16.20.11:6379 transparent
  balance first
  option tcp-check
  tcp-check send AUTH\ vmygq8ybbrYDn4YvwHWURJpKt\r\n
  tcp-check send info\ replication\r\n
  tcp-check expect string role:master
  server overcloud-controller-0 172.16.20.12:6379 check fall 5 inter 2000 rise 2
  server overcloud-controller-1 172.16.20.13:6379 check fall 5 inter 2000 rise 2
  server overcloud-controller-2 172.16.20.14:6379 check fall 5 inter 2000 rise 2
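
For illustration only, here is a minimal Python sketch of how the authenticated check lines above could be rendered from a password. The function names `haproxy_escape` and `redis_tcp_check_lines` are hypothetical and are not taken from openstack-puppet-modules. The escaping assumes HAProxy's backslash rule for spaces, `#` and backslashes in unquoted configuration words; it does not handle quote characters.

```python
def haproxy_escape(value):
    """Backslash-escape characters that would end or alter an unquoted HAProxy word.

    Assumption: only backslash, space and '#' need escaping here.
    """
    out = []
    for ch in value:
        if ch in ("\\", " ", "#"):
            out.append("\\" + ch)
        else:
            out.append(ch)
    return "".join(out)


def redis_tcp_check_lines(password):
    """Return the tcp-check lines from the adjusted config, with AUTH sent first."""
    return [
        "option tcp-check",
        # "\\r\\n" emits the literal characters \r\n, which HAProxy turns into CRLF.
        "tcp-check send AUTH\\ " + haproxy_escape(password) + "\\r\\n",
        "tcp-check send info\\ replication\\r\\n",
        "tcp-check expect string role:master",
    ]


if __name__ == "__main__":
    for line in redis_tcp_check_lines("vmygq8ybbrYDn4YvwHWURJpKt"):
        print("  " + line)
```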

But Ceilometer now shows another error in the log, as if it didn't send the authentication string:

/var/log/ceilometer/central.log:
ToozError: NOAUTH Authentication required.

/etc/ceilometer/ceilometer.conf does have the right password set for the redis connection:
backend_url = redis://172.16.20.11:6379/?password=vmygq8ybbrYDn4YvwHWURJpKt

Comment 3 Mehdi ABAAKOUK 2016-03-22 07:42:58 UTC
The haproxy configuration is not enough to work correctly in case of failover; you should try:

  tcp-check connect
  tcp-check send AUTH\ vmygq8ybbrYDn4YvwHWURJpKt\r\n
  tcp-check send PING\r\n
  tcp-check expect string +PONG
  tcp-check send info\ replication\r\n
  tcp-check expect string role:master
  tcp-check send QUIT\r\n
  tcp-check expect string +OK

(that comes from https://bugzilla.redhat.com/show_bug.cgi?id=1299833)
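
To make the check sequence above easier to reason about, here is a rough Python sketch that replays the same steps over an already-connected socket. It is not HAProxy's implementation: the function name `redis_master_check`, the timeout handling, and the choice to discard the reply buffer after each matched token are assumptions made for illustration.

```python
import socket


def redis_master_check(sock, password, timeout=2.0):
    """Replay AUTH / PING / info replication / QUIT and report whether
    the peer answered like a Redis master. Returns True or False."""
    sock.settimeout(timeout)
    buf = b""

    def send(command):
        sock.sendall(command.encode("ascii") + b"\r\n")

    def expect(token):
        # Read until the token shows up, the peer closes, or the read fails.
        nonlocal buf
        needle = token.encode("ascii")
        while needle not in buf:
            try:
                chunk = sock.recv(4096)
            except OSError:  # includes socket timeouts and resets
                return False
            if not chunk:
                return False
            buf += chunk
        # Drop everything up to and including the match, so an earlier
        # "+OK" (the AUTH reply) cannot satisfy the final QUIT check.
        buf = buf.split(needle, 1)[1]
        return True

    send("AUTH " + password)
    send("PING")
    if not expect("+PONG"):
        return False
    send("info replication")
    if not expect("role:master"):
        return False
    send("QUIT")
    return expect("+OK")
```

A caller would connect first, for example with `socket.create_connection(("172.16.20.12", 6379))`, and pass the resulting socket in. A node that answers `role:slave`, or that rejects the AUTH, makes the function return False, which corresponds to HAProxy marking that server as down.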

?password= doesn't exist for the redis driver. Can you try this URL instead:

backend_url = redis://:vmygq8ybbrYDn4YvwHWURJpKt@172.16.20.11:6379/

Comment 4 Marius Cornea 2016-03-22 07:48:30 UTC
OK, setting backend_url = redis://:vmygq8ybbrYDn4YvwHWURJpKt@172.16.20.11:6379/
cleared the error in central.log.

I'm going to clone this BZ so we can keep track of building the backend_url with the proper format.
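
As a rough sketch of what building the backend_url in the proper format could look like, the Python snippet below puts the password in the userinfo part of the URL (between `:` and `@`) instead of a `?password=` query parameter. The helper name `build_redis_backend_url` is hypothetical, and percent-encoding the password is an assumption about what the consumer expects. The snippet only uses the standard library `urllib.parse` to show how each form is parsed.

```python
from urllib.parse import quote, urlsplit


def build_redis_backend_url(host, port, password):
    """Return a redis:// URL with an empty user name and the password as userinfo."""
    # Percent-encode the password so characters such as '@' or '/' cannot
    # be mistaken for URL delimiters (assumption: the consumer decodes it).
    return "redis://:{}@{}:{}/".format(quote(password, safe=""), host, port)


if __name__ == "__main__":
    good = build_redis_backend_url("172.16.20.11", 6379, "vmygq8ybbrYDn4YvwHWURJpKt")
    bad = "redis://172.16.20.11:6379/?password=vmygq8ybbrYDn4YvwHWURJpKt"

    # In the userinfo form, the URL parser exposes the password directly.
    print(good, urlsplit(good).password)
    # In the query form, the parser sees no password at all; the value only
    # appears in the query string, which a driver has to read explicitly.
    print(bad, urlsplit(bad).password, urlsplit(bad).query)
```

This matches the symptom in the thread: a driver that only looks at the URL's password field finds nothing in the `?password=` form and connects unauthenticated, which Redis answers with NOAUTH.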

Comment 5 Giulio Fidente 2016-03-22 16:36:02 UTC
*** Bug 1320251 has been marked as a duplicate of this bug. ***

Comment 7 errata-xmlrpc 2016-04-07 21:35:34 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-0603.html