Bug 1486037

Summary:	rhosp-director: composable roles deployment gets stuck due to the clustercheck container not being in the database role
Product:	Red Hat OpenStack	Reporter:	Alexander Chuzhoy <sasha>
Component:	openstack-tripleo-heat-templates	Assignee:	Michele Baldessari <michele>
Status:	CLOSED ERRATA	QA Contact:	Alexander Chuzhoy <sasha>
Severity:	high	Docs Contact:
Priority:	high
Version:	12.0 (Pike)	CC:	aherr, aschultz, bperkins, chjones, dbecker, dprince, jjoyce, jschluet, mburns, mcornea, michele, morazi, rhel-osp-director-maint, tvignaud
Target Milestone:	beta	Keywords:	TestBlocker, Triaged
Target Release:	12.0 (Pike)
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:	openstack-tripleo-heat-templates-7.0.0-0.20170913050524.0rc2.el7ost	Doc Type:	If docs needed, set a value
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2017-12-13 21:58:11 UTC	Type:	Bug
Bug Blocks:	1482087

Description Alexander Chuzhoy 2017-08-28 19:57:57 UTC

rhosp-director: composable roles deployment gets stuck: error: Couldn't expand haproxy-bundle_stop_0 to haproxy-bundle_stopped_0 in haproxy-bundle



Environment:
puppet-haproxy-1.5.0-0.20170728184739.6ffcb07.el7ost.noarch
haproxy-1.5.18-6.el7.x86_64

instack-undercloud-7.2.1-0.20170729010706.el7ost.noarch
openstack-tripleo-heat-templates-7.0.0-0.20170805163048.el7ost.noarch
openstack-puppet-modules-10.0.0-0.20170315222135.0333c73.el7.1.noarch


Steps to reproduce:
Attempt to deploy an overcloud (OC) with composable roles.

Deployment command: openstack overcloud deploy \
--templates /usr/share/openstack-tripleo-heat-templates \
--libvirt-type kvm \
--ntp-server clock.redhat.com \
-r /home/stack/roles_data.yaml \
-e /home/stack/virt/internal.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
-e /home/stack/virt/network/network-environment.yaml \
-e /home/stack/virt/hostnames.yml \
-e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/docker.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/docker-ha.yaml \
-e /home/stack/virt/debug.yaml \
-e /home/stack/virt/nodes_data.yaml \
-e /home/stack/virt/docker-images.yaml


(undercloud) [stack@undercloud-0 ~]$ cat roles_data.yaml
###############################################################################
# File generated by TripleO
###############################################################################
###############################################################################
# Role: ControllerOpenstack                                                   #
###############################################################################
- name: Controller
  description: |
    Controller role that does not contain the database, messaging and networking
    components. Use in combination with the Database, Messaging and Networker
    roles.
  tags:
    - primary
    - controller
  networks:
    - External
    - InternalApi
    - Storage
    - StorageMgmt
    - Tenant
  HostnameFormatDefault: '%stackname%-controller-%index%'
  ServicesDefault:
    - OS::TripleO::Services::AodhApi
    - OS::TripleO::Services::AodhEvaluator
    - OS::TripleO::Services::AodhListener
    - OS::TripleO::Services::AodhNotifier
    - OS::TripleO::Services::AuditD
    - OS::TripleO::Services::BarbicanApi
    - OS::TripleO::Services::CACerts
    - OS::TripleO::Services::CeilometerAgentCentral
    - OS::TripleO::Services::CeilometerAgentNotification
    - OS::TripleO::Services::CeilometerApi
    - OS::TripleO::Services::CeilometerExpirer
    - OS::TripleO::Services::CephExternal
    - OS::TripleO::Services::CephMds
    - OS::TripleO::Services::CephMon
    - OS::TripleO::Services::CephRbdMirror
    - OS::TripleO::Services::CephRgw
    - OS::TripleO::Services::CinderApi
    - OS::TripleO::Services::CinderBackup
    - OS::TripleO::Services::CinderHPELeftHandISCSI
    - OS::TripleO::Services::CinderScheduler
    - OS::TripleO::Services::CinderVolume
    - OS::TripleO::Services::Collectd
    - OS::TripleO::Services::Congress
    - OS::TripleO::Services::Clustercheck
    - OS::TripleO::Services::Docker
    - OS::TripleO::Services::Ec2Api
    - OS::TripleO::Services::Etcd
    - OS::TripleO::Services::FluentdClient
    - OS::TripleO::Services::GlanceApi
    - OS::TripleO::Services::GnocchiApi
    - OS::TripleO::Services::GnocchiMetricd
    - OS::TripleO::Services::GnocchiStatsd
    - OS::TripleO::Services::HAproxy
    - OS::TripleO::Services::HeatApi
    - OS::TripleO::Services::HeatApiCfn
    - OS::TripleO::Services::HeatApiCloudwatch
    - OS::TripleO::Services::HeatEngine
    - OS::TripleO::Services::Horizon
    - OS::TripleO::Services::IronicApi
    - OS::TripleO::Services::IronicConductor
    - OS::TripleO::Services::Iscsid
    - OS::TripleO::Services::Keepalived
    - OS::TripleO::Services::Kernel
    - OS::TripleO::Services::Keystone
    - OS::TripleO::Services::ManilaApi
    - OS::TripleO::Services::ManilaBackendCephFs
    - OS::TripleO::Services::ManilaBackendGeneric
    - OS::TripleO::Services::ManilaBackendNetapp
    - OS::TripleO::Services::ManilaScheduler
    - OS::TripleO::Services::ManilaShare
    - OS::TripleO::Services::Memcached
    - OS::TripleO::Services::MongoDb
    - OS::TripleO::Services::MySQLClient
    - OS::TripleO::Services::NovaApi
    - OS::TripleO::Services::NovaConductor
    - OS::TripleO::Services::NovaConsoleauth
    - OS::TripleO::Services::NovaIronic
    - OS::TripleO::Services::NovaMetadata
    - OS::TripleO::Services::NovaPlacement
    - OS::TripleO::Services::NovaScheduler
    - OS::TripleO::Services::NovaVncProxy
    - OS::TripleO::Services::Ntp
    - OS::TripleO::Services::OctaviaApi
    - OS::TripleO::Services::OctaviaHealthManager
    - OS::TripleO::Services::OctaviaHousekeeping
    - OS::TripleO::Services::OctaviaWorker
    - OS::TripleO::Services::OpenDaylightApi
    - OS::TripleO::Services::OpenDaylightOvs
    - OS::TripleO::Services::OVNDBs
    - OS::TripleO::Services::OVNController
    - OS::TripleO::Services::Pacemaker
    - OS::TripleO::Services::PankoApi
    - OS::TripleO::Services::Redis
    - OS::TripleO::Services::SaharaApi
    - OS::TripleO::Services::SaharaEngine
    - OS::TripleO::Services::SensuClient
    - OS::TripleO::Services::Snmp
    - OS::TripleO::Services::Sshd
    - OS::TripleO::Services::SwiftProxy
    - OS::TripleO::Services::SwiftRingBuilder
    - OS::TripleO::Services::SwiftStorage
    - OS::TripleO::Services::Tacker
    - OS::TripleO::Services::Timezone
    - OS::TripleO::Services::TripleoFirewall
    - OS::TripleO::Services::TripleoPackages
    - OS::TripleO::Services::Tuned
    - OS::TripleO::Services::Vpp
    - OS::TripleO::Services::Zaqar

###############################################################################
# Role: Database                                                              #
###############################################################################
- name: Database
  description: |
    Standalone database role with the database being managed via Pacemaker
  networks:
    - InternalApi
  HostnameFormatDefault: '%stackname%-database-%index%'
  ServicesDefault:
    - OS::TripleO::Services::AuditD
    - OS::TripleO::Services::CACerts
    - OS::TripleO::Services::Collectd
    - OS::TripleO::Services::FluentdClient
    - OS::TripleO::Services::Kernel
    - OS::TripleO::Services::MySQL
    - OS::TripleO::Services::MySQLClient
    - OS::TripleO::Services::Ntp
    - OS::TripleO::Services::Pacemaker
    - OS::TripleO::Services::SensuClient
    - OS::TripleO::Services::Snmp
    - OS::TripleO::Services::Timezone
    - OS::TripleO::Services::TripleoFirewall
    - OS::TripleO::Services::TripleoPackages
    - OS::TripleO::Services::Tuned
    - OS::TripleO::Services::Docker
    - OS::TripleO::Services::Sshd
###############################################################################
# Role: Messaging                                                             #
###############################################################################
- name: Messaging
  description: |
    Standalone messaging role with RabbitMQ being managed via Pacemaker
  networks:
    - InternalApi
  HostnameFormatDefault: '%stackname%-messaging-%index%'
  ServicesDefault:
    - OS::TripleO::Services::AuditD
    - OS::TripleO::Services::CACerts
    - OS::TripleO::Services::Collectd
    - OS::TripleO::Services::FluentdClient
    - OS::TripleO::Services::Kernel
    - OS::TripleO::Services::Ntp
    - OS::TripleO::Services::Pacemaker
    - OS::TripleO::Services::RabbitMQ
    - OS::TripleO::Services::SensuClient
    - OS::TripleO::Services::Snmp
    - OS::TripleO::Services::Timezone
    - OS::TripleO::Services::TripleoFirewall
    - OS::TripleO::Services::TripleoPackages
    - OS::TripleO::Services::Tuned
    - OS::TripleO::Services::Sshd
    - OS::TripleO::Services::Docker

###############################################################################
# Role: Networker                                                             #
###############################################################################
- name: Networker
  description: |
    Standalone networking role to run Neutron services their own. Includes
    Pacemaker integration via PacemakerRemote
  networks:
    - InternalApi
    - External
  HostnameFormatDefault: '%stackname%-networker-%index%'
  ServicesDefault:
    - OS::TripleO::Services::AuditD
    - OS::TripleO::Services::CACerts
    - OS::TripleO::Services::Collectd
    - OS::TripleO::Services::FluentdClient
    - OS::TripleO::Services::Kernel
    - OS::TripleO::Services::MySQLClient
    - OS::TripleO::Services::NeutronApi
    - OS::TripleO::Services::NeutronBgpVpnApi
    - OS::TripleO::Services::NeutronCorePlugin
    - OS::TripleO::Services::NeutronDhcpAgent
    - OS::TripleO::Services::NeutronL2gwAgent
    - OS::TripleO::Services::NeutronL2gwApi
    - OS::TripleO::Services::NeutronL3Agent
    - OS::TripleO::Services::NeutronLbaasv2Agent
    - OS::TripleO::Services::NeutronMetadataAgent
    - OS::TripleO::Services::NeutronML2FujitsuCfab
    - OS::TripleO::Services::NeutronML2FujitsuFossw
    - OS::TripleO::Services::NeutronOvsAgent
    - OS::TripleO::Services::NeutronVppAgent
    - OS::TripleO::Services::Ntp
    - OS::TripleO::Services::OpenDaylightOvs
    - OS::TripleO::Services::PacemakerRemote
    - OS::TripleO::Services::SensuClient
    - OS::TripleO::Services::Snmp
    - OS::TripleO::Services::Timezone
    - OS::TripleO::Services::TripleoFirewall
    - OS::TripleO::Services::TripleoPackages
    - OS::TripleO::Services::Tuned
    - OS::TripleO::Services::Docker
    - OS::TripleO::Services::Sshd
###############################################################################
# Role: Compute                                                               #
###############################################################################
- name: Compute
  description: |
    Basic Compute Node role
  CountDefault: 1
  networks:
    - InternalApi
    - Tenant
    - Storage
  HostnameFormatDefault: '%stackname%-novacompute-%index%'
  disable_upgrade_deployment: True
  ServicesDefault:
    - OS::TripleO::Services::AuditD
    - OS::TripleO::Services::CACerts
    - OS::TripleO::Services::CephClient
    - OS::TripleO::Services::CephExternal
    - OS::TripleO::Services::CertmongerUser
    - OS::TripleO::Services::Collectd
    - OS::TripleO::Services::ComputeCeilometerAgent
    - OS::TripleO::Services::ComputeNeutronCorePlugin
    - OS::TripleO::Services::ComputeNeutronL3Agent
    - OS::TripleO::Services::ComputeNeutronMetadataAgent
    - OS::TripleO::Services::ComputeNeutronOvsAgent
    - OS::TripleO::Services::Docker
    - OS::TripleO::Services::FluentdClient
    - OS::TripleO::Services::Iscsid
    - OS::TripleO::Services::Kernel
    - OS::TripleO::Services::MySQLClient
    - OS::TripleO::Services::NeutronLinuxbridgeAgent
    - OS::TripleO::Services::NeutronSriovAgent
    - OS::TripleO::Services::NeutronVppAgent
    - OS::TripleO::Services::NovaCompute
    - OS::TripleO::Services::NovaLibvirt
    - OS::TripleO::Services::NovaMigrationTarget
    - OS::TripleO::Services::Ntp
    - OS::TripleO::Services::OpenDaylightOvs
    - OS::TripleO::Services::Securetty
    - OS::TripleO::Services::SensuClient
    - OS::TripleO::Services::Snmp
    - OS::TripleO::Services::Sshd
    - OS::TripleO::Services::Timezone
    - OS::TripleO::Services::TripleoFirewall
    - OS::TripleO::Services::TripleoPackages
    - OS::TripleO::Services::Tuned
    - OS::TripleO::Services::Vpp
    - OS::TripleO::Services::OVNController
###############################################################################
# Role: CephStorage                                                           #
###############################################################################
- name: CephStorage
  description: |
    Ceph OSD Storage node role
  networks:
    - Storage
    - StorageMgmt
  ServicesDefault:
    - OS::TripleO::Services::AuditD
    - OS::TripleO::Services::CACerts
    - OS::TripleO::Services::CephOSD
    - OS::TripleO::Services::CertmongerUser
    - OS::TripleO::Services::Collectd
    - OS::TripleO::Services::Docker
    - OS::TripleO::Services::FluentdClient
    - OS::TripleO::Services::Kernel
    - OS::TripleO::Services::MySQLClient
    - OS::TripleO::Services::Ntp
    - OS::TripleO::Services::Securetty
    - OS::TripleO::Services::SensuClient
    - OS::TripleO::Services::Snmp
    - OS::TripleO::Services::Sshd
    - OS::TripleO::Services::Timezone
    - OS::TripleO::Services::TripleoFirewall
    - OS::TripleO::Services::TripleoPackages
    - OS::TripleO::Services::Tuned




Result:

The deployment gets stuck.


Looking at where it is stuck:

(undercloud) [stack@undercloud-0 ~]$ heat resource-list -n5 overcloud|grep -v COMPLE
WARNING (shell) "heat resource-list" is deprecated, please use "openstack stack resource list" instead
+----------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------+--------------------+----------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------+
| resource_name                                | physical_resource_id                                                                                                                                                                 | resource_type                                                                                                                  | resource_status    | updated_time         | stack_name                                                                                                                                               |
+----------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------+--------------------+----------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------+
| AllNodesDeploySteps                          | 0cc75327-de9c-4e5f-b55b-2192ddead33b                                                                                                                                                 | OS::TripleO::PostDeploySteps                                                                                                   | CREATE_IN_PROGRESS | 2017-08-28T18:55:19Z | overcloud                                                                                                                                                |
| ControllerDeployment_Step3                   | e0b575e8-9f1c-42bb-b199-4fd90b13b2fd                                                                                                                                                 | OS::Heat::StructuredDeploymentGroup                                                                                            | CREATE_IN_PROGRESS | 2017-08-28T19:11:45Z | overcloud-AllNodesDeploySteps-mnw7npmaa72j                                                                                                               |
| 0                                            | 12b034ee-e017-4c05-ab92-8c3e344e9b4e                                                                                                                                                 | OS::Heat::StructuredDeployment                                                                                                 | CREATE_IN_PROGRESS | 2017-08-28T19:27:24Z | overcloud-AllNodesDeploySteps-mnw7npmaa72j-ControllerDeployment_Step3-vfhc4blmshe5                                                                       |
+----------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------+--------------------+----------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------+


(undercloud) [stack@undercloud-0 ~]$ heat deployment-show 12b034ee-e017-4c05-ab92-8c3e344e9b4e
WARNING (shell) "heat deployment-show" is deprecated, please use "openstack software deployment show" instead
{
  "status": "IN_PROGRESS", 
  "server_id": "7d3d7442-9278-4e0d-abe5-ed21609cf554", 
  "config_id": "9fe0f172-fc74-475b-9c60-1f60a95cbe5d", 
  "output_values": null, 
  "creation_time": "2017-08-28T19:27:27Z", 
  "input_values": {
    "update_identifier": "1503946498", 
    "docker_puppet_debug": "", 
    "role_name": "Controller", 
    "step": 3, 
    "bootstrap_server_id": "7d3d7442-9278-4e0d-abe5-ed21609cf554"
  }, 
  "action": "CREATE", 
  "status_reason": "Deploy data available", 
  "id": "12b034ee-e017-4c05-ab92-8c3e344e9b4e"
}




(undercloud) [stack@undercloud-0 ~]$ nova list
+--------------------------------------+-----------------------+--------+------------+-------------+------------------------+
| ID                                   | Name                  | Status | Task State | Power State | Networks               |
+--------------------------------------+-----------------------+--------+------------+-------------+------------------------+
| e6dce180-53dd-427e-9ac9-e0a92dd8d511 | ceph-0                | ACTIVE | -          | Running     | ctlplane=192.168.24.9  |
| fcfde688-40ab-4dfd-b152-5f1a718d551b | ceph-1                | ACTIVE | -          | Running     | ctlplane=192.168.24.10 |
| f9bbd65b-c781-4e6b-9912-dc536addfd6a | ceph-2                | ACTIVE | -          | Running     | ctlplane=192.168.24.18 |
| 69938854-4baa-47cd-9932-ca2afbc0b2e6 | compute-0             | ACTIVE | -          | Running     | ctlplane=192.168.24.11 |
| 7d3d7442-9278-4e0d-abe5-ed21609cf554 | controller-0          | ACTIVE | -          | Running     | ctlplane=192.168.24.8  |
| a03a4294-2460-4cbd-a50d-f16c91f4329a | controller-1          | ACTIVE | -          | Running     | ctlplane=192.168.24.7  |
| e4b7b58d-ea00-4533-bf07-7cbf5b844d74 | controller-2          | ACTIVE | -          | Running     | ctlplane=192.168.24.22 |
| d91f4c00-a6c7-4b95-831e-6c4415ce0e5b | overcloud-database-0  | ACTIVE | -          | Running     | ctlplane=192.168.24.6  |
| e1c71bdf-43e4-48a1-b89a-5473a8818ab0 | overcloud-database-1  | ACTIVE | -          | Running     | ctlplane=192.168.24.15 |
| 58aae23d-1088-48bb-8023-6b0c4caf0183 | overcloud-database-2  | ACTIVE | -          | Running     | ctlplane=192.168.24.12 |
| 7009b922-6e9f-440d-973a-9403abd159e8 | overcloud-messaging-0 | ACTIVE | -          | Running     | ctlplane=192.168.24.13 |
| c9f3f417-680d-4b05-96dd-d0c8f0da92c1 | overcloud-messaging-1 | ACTIVE | -          | Running     | ctlplane=192.168.24.17 |
| ad7e4458-c53e-4c01-8005-0c71cd103278 | overcloud-messaging-2 | ACTIVE | -          | Running     | ctlplane=192.168.24.24 |
| b7aedbfb-08c2-42db-a598-ebe0a93a0de6 | overcloud-networker-0 | ACTIVE | -          | Running     | ctlplane=192.168.24.16 |
| 51cfc60f-ca1b-47ea-8b02-8363d7f04ce7 | overcloud-networker-1 | ACTIVE | -          | Running     | ctlplane=192.168.24.19 |
+--------------------------------------+-----------------------+--------+------------+-------------+------------------------+
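
The server_id reported by "heat deployment-show" can be matched against the ID column of the "nova list" table to find the node that the stuck deployment targets. A minimal sketch, assuming the table text has been captured into a string; the function name and the SAMPLE_TABLE variable are hypothetical and only two rows are reproduced here:

```python
from typing import Optional

# Two rows copied from the "nova list" output in this report (assumption:
# the real input would be the full captured table).
SAMPLE_TABLE = """\
+--------------------------------------+-----------------------+--------+------------+-------------+------------------------+
| ID                                   | Name                  | Status | Task State | Power State | Networks               |
+--------------------------------------+-----------------------+--------+------------+-------------+------------------------+
| 7d3d7442-9278-4e0d-abe5-ed21609cf554 | controller-0          | ACTIVE | -          | Running     | ctlplane=192.168.24.8  |
| d91f4c00-a6c7-4b95-831e-6c4415ce0e5b | overcloud-database-0  | ACTIVE | -          | Running     | ctlplane=192.168.24.6  |
+--------------------------------------+-----------------------+--------+------------+-------------+------------------------+
"""


def find_server_name(table_text: str, server_id: str) -> Optional[str]:
    """Return the Name cell of the row whose ID cell equals server_id."""
    for line in table_text.splitlines():
        # Data and header rows start with "|"; border rows start with "+".
        if not line.startswith("|"):
            continue
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        if len(cells) >= 2 and cells[0] == server_id:
            return cells[1]
    return None


if __name__ == "__main__":
    print(find_server_name(SAMPLE_TABLE, "7d3d7442-9278-4e0d-abe5-ed21609cf554"))
```
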


The server_id above matches controller-0 in the nova list output, so the stuck task is on controller-0.
Checking that node for errors shows many repeated error messages:


Aug 28 19:55:00 controller-0 pengine[21737]:    error: Couldn't expand haproxy-bundle_stop_0 to haproxy-bundle_stopped_0 in haproxy-bundle
Aug 28 19:55:00 controller-0 pengine[21737]:    error: Couldn't expand haproxy-bundle_stop_0 to haproxy-bundle_stopped_0 in haproxy-bundle
Aug 28 19:56:00 controller-0 pengine[21737]:    error: Couldn't expand haproxy-bundle_stop_0 to haproxy-bundle_stopped_0 in haproxy-bundle
Aug 28 19:56:00 controller-0 pengine[21737]:    error: Couldn't expand haproxy-bundle_stop_0 to haproxy-bundle_stopped_0 in haproxy-bundle
Aug 28 19:56:00 controller-0 pengine[21737]:    error: Couldn't expand haproxy-bundle_stop_0 to haproxy-bundle_stopped_0 in haproxy-bundle
Aug 28 19:56:00 controller-0 pengine[21737]:    error: Couldn't expand haproxy-bundle_stop_0 to haproxy-bundle_stopped_0 in haproxy-bundle
Aug 28 19:56:00 controller-0 pengine[21737]:    error: Couldn't expand haproxy-bundle_stop_0 to haproxy-bundle_stopped_0 in haproxy-bundle
Aug 28 19:56:00 controller-0 pengine[21737]:    error: Couldn't expand haproxy-bundle_stop_0 to haproxy-bundle_stopped_0 in haproxy-bundle
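
Since the same pengine message repeats, collapsing the log into unique messages with counts makes it easier to see whether anything else is being logged. A rough sketch, assuming syslog-style lines of the form shown above; the regular expression, the function name and the SAMPLE_LOG variable are illustrative assumptions, not part of any tool used in this report:

```python
import re
from collections import Counter

# Assumed line shape: "<Mon> <day> <time> <host> <process>[<pid>]: <message>"
LINE_RE = re.compile(
    r"^\w{3}\s+\d+\s+[\d:]+\s+(\S+)\s+([^\[\s]+)\[\d+\]:\s+(.*)$"
)

# Three of the lines quoted above.
SAMPLE_LOG = """\
Aug 28 19:55:00 controller-0 pengine[21737]:    error: Couldn't expand haproxy-bundle_stop_0 to haproxy-bundle_stopped_0 in haproxy-bundle
Aug 28 19:55:00 controller-0 pengine[21737]:    error: Couldn't expand haproxy-bundle_stop_0 to haproxy-bundle_stopped_0 in haproxy-bundle
Aug 28 19:56:00 controller-0 pengine[21737]:    error: Couldn't expand haproxy-bundle_stop_0 to haproxy-bundle_stopped_0 in haproxy-bundle
"""


def count_messages(lines):
    """Count occurrences of each (host, process, message) triple."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            counts[match.groups()] += 1
    return counts


if __name__ == "__main__":
    for (host, proc, msg), n in count_messages(SAMPLE_LOG.splitlines()).most_common():
        print(f"{n:4d}  {host} {proc}: {msg}")
```
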

Comment 1 Dan Prince 2017-09-06 18:51:36 UTC

I was able to debug this environment a bit today with Sasha. It appears that database syncs are failing.

It looks to me like MySQL is running. I was able to attach to MySQL via localhost (mysql -u root) and verify that all of the databases are getting created. But some of the services' db syncs are timing out.

I manually tried to connect and got this on the command line:

mysql -u heat -h 172.17.1.13 -p3UazsaeTC64V9UvEcJ3GZ9rbd
ERROR 2013 (HY000): Lost connection to MySQL server at 'reading initial communication packet', system error: 0

I double-checked the HAProxy config and confirmed that this is the correct VIP for MySQL. This could be related to firewall rules, given that this deployment has the controller and database servers split out.

On the controller I see:

[root@controller-0 containers]# iptables-save | grep 3306
-A INPUT -p tcp -m multiport --dports 3306 -m state --state NEW -m comment --comment "100 mysql_haproxy ipv4" -j ACCEPT

[root@overcloud-database-0 ~]# iptables-save | grep 3306
-A INPUT -p tcp -m multiport --dports 873,3123,3306,4444,4567,4568,9200 -m state --state NEW -m comment --comment "104 mysql galera-bundle ipv4" -j ACCEPT

It would seem that both services are accepting 3306 traffic.

It would be good to have someone from the HA team review these configs and confirm that they line up correctly.

Comment 2 Michele Baldessari 2017-09-07 07:39:19 UTC

I am travelling, so I am a bit slow to respond, but do we have sosreports for this or a live env? From Dan's initial analysis in c#1 and the error messages, I would guess that haproxy is having some sort of issue (maybe the bundle is constantly restarting or whatnot). Sasha, if you can send me an env login or some sosreports, I can investigate a bit more.

(NB: I deploy composable HA on a daily basis with galera/rabbit split out to 6 separate nodes, so my best guess without more data is that we do not yet have a new pacemaker build with all the needed bundle fixes, but I'd like to take a deeper look in any case.)

Comment 4 Michele Baldessari 2017-09-08 10:24:05 UTC

Thanks Sasha!

So the issue is that the clustercheck container is erroring out when talking to mysql, and hence haproxy refuses connections on 3306 because it considers all three backends down.
A) Clustercheck not working:
[root@controller-2 log]# docker exec -it clustercheck /bin/bash
()[mysql@controller-2 /]$ mysql -h 127.0.0.1 -u clustercheck -pdrwh87rmM8KzWxyGcJWZ2TbGC
ERROR 2003 (HY000): Can't connect to MySQL server on '127.0.0.1' (111)

B) Haproxy refusing connections:
[root@controller-2 log]# mysql -u heat -h 172.17.1.13 -p3UazsaeTC64V9UvEcJ3GZ9rbd
ERROR 2013 (HY000): Lost connection to MySQL server at 'reading initial communication packet', system error: 0

C) Connections straight to mysql work correctly:
[root@controller-2 log]# mysql -u heat -h 172.17.1.22 -p3UazsaeTC64V9UvEcJ3GZ9rbd
MariaDB [(none)]> Bye                         
[root@controller-2 log]# mysql -u heat -h 172.17.1.21 -p3UazsaeTC64V9UvEcJ3GZ9rbd
MariaDB [(none)]> Bye                         
[root@controller-2 log]# mysql -u heat -h 172.17.1.16 -p3UazsaeTC64V9UvEcJ3GZ9rbd
MariaDB [(none)]> Bye                         

The reason this is not working in this environment is that the clustercheck container must always be deployed on the database role. I will make sure this gets fixed upstream. For the time being, you can add OS::TripleO::Services::Clustercheck to your Database role and remove it from the ControllerOpenstack role.
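
The workaround can be checked before deploying by verifying that Clustercheck sits in the same role as MySQL. A minimal sketch, assuming roles_data.yaml has already been parsed into a list of dicts (for example with a YAML library, which is not shown here); the function name and the trimmed role lists are hypothetical, and treating "same role as MySQL" as the rule is an interpretation of the comment above:

```python
MYSQL = "OS::TripleO::Services::MySQL"
CLUSTERCHECK = "OS::TripleO::Services::Clustercheck"


def misplaced_clustercheck(roles):
    """Return one message per role where MySQL and Clustercheck are split."""
    problems = []
    for role in roles:
        # Exact-name membership, so MySQLClient is not mistaken for MySQL.
        services = set(role.get("ServicesDefault", []))
        has_mysql = MYSQL in services
        has_check = CLUSTERCHECK in services
        if has_mysql and not has_check:
            problems.append(f"{role['name']}: MySQL without Clustercheck")
        elif has_check and not has_mysql:
            problems.append(f"{role['name']}: Clustercheck without MySQL")
    return problems


# Heavily trimmed versions of the roles from this report.
BROKEN_ROLES = [
    {"name": "Controller",
     "ServicesDefault": [CLUSTERCHECK, "OS::TripleO::Services::MySQLClient"]},
    {"name": "Database",
     "ServicesDefault": [MYSQL, "OS::TripleO::Services::Pacemaker"]},
]
FIXED_ROLES = [
    {"name": "Controller",
     "ServicesDefault": ["OS::TripleO::Services::MySQLClient"]},
    {"name": "Database",
     "ServicesDefault": [CLUSTERCHECK, MYSQL, "OS::TripleO::Services::Pacemaker"]},
]

if __name__ == "__main__":
    for problem in misplaced_clustercheck(BROKEN_ROLES):
        print(problem)
```
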

Comment 5 Alexander Chuzhoy 2017-09-08 13:04:58 UTC

Hi Michele,
Since roles_data.yaml was generated with "openstack overcloud roles generate", I added this comment to the respective bug: https://bugzilla.redhat.com/show_bug.cgi?id=1485108#c9
Thanks.

Comment 6 Alexander Chuzhoy 2017-09-08 14:41:24 UTC

I confirm that the deployment succeeded once I moved "OS::TripleO::Services::Clustercheck" from the Controller role to the Database role.

Comment 7 Michele Baldessari 2017-09-09 06:37:51 UTC

Pike review merged; moving to POST and linking the right review.

Comment 9 Alexander Chuzhoy 2017-09-26 17:48:19 UTC

Verified:
Environment:
openstack-tripleo-heat-templates-7.0.1-0.20170919183703.el7ost.noarch

Clustercheck is now added to the Database role by default:

###############################################################################
# Role: Database                                                              #
###############################################################################
- name: Database
  description: |
    Standalone database role with the database being managed via Pacemaker
  networks:
    - InternalApi
  HostnameFormatDefault: '%stackname%-database-%index%'
  ServicesDefault:
    - OS::TripleO::Services::AuditD
    - OS::TripleO::Services::CACerts
    - OS::TripleO::Services::CertmongerUser
    - OS::TripleO::Services::Collectd
    - OS::TripleO::Services::Clustercheck
    - OS::TripleO::Services::Docker
    - OS::TripleO::Services::FluentdClient
    - OS::TripleO::Services::Kernel
    - OS::TripleO::Services::MySQL
    - OS::TripleO::Services::MySQLClient
    - OS::TripleO::Services::Ntp
    - OS::TripleO::Services::ContainersLogrotateCrond
    - OS::TripleO::Services::Pacemaker
    - OS::TripleO::Services::SensuClient
    - OS::TripleO::Services::Snmp
    - OS::TripleO::Services::Timezone
    - OS::TripleO::Services::TripleoFirewall
    - OS::TripleO::Services::TripleoPackages
    - OS::TripleO::Services::Tuned

Comment 13 errata-xmlrpc 2017-12-13 21:58:11 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:3462