Bug 1707148

Summary: Overcloud deployment using composable roles times out because database traffic is not allowed on the controller nodes where haproxy loadbalancer is running
Product: Red Hat OpenStack
Reporter: Marius Cornea <mcornea>
Component: openstack-tripleo-heat-templates
Assignee: Michele Baldessari <michele>
Status: CLOSED ERRATA
QA Contact: Sasha Smolyak <ssmolyak>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 15.0 (Stein)
CC: aherr, bperkins, dbecker, emacchi, jcoufal, lmiccini, mburns, michele, morazi, pkomarov
Target Milestone: beta
Keywords: Triaged
Target Release: 15.0 (Stein)
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: openstack-tripleo-heat-templates-10.5.1-0.20190521220357.dd20049.el8ost
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-09-21 11:21:45 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Marius Cornea 2019-05-06 22:17:48 UTC
Description of problem:

Overcloud deployment using composable roles times out because database traffic (TCP port 3306) is not allowed on the controller nodes where the haproxy load balancer is running.


On controller-0:

[stack@controller-0 ~]$ sudo podman ps -a | grep Exited\ \(1
06475f42fc29  192.168.24.145:8787/rhosp15/openstack-aodh-api:20190423.1                 dumb-init --singl...  23 minutes ago  Exited (1) 22 minutes ago         aodh_db_sync

[stack@controller-0 ~]$ sudo tail -f /var/log/containers/aodh/aodh-dbsync.log 
2019-05-06 21:48:22.268 13 ERROR aodh   File "/usr/lib64/python3.6/site-packages/sqlalchemy/engine/default.py", line 411, in connect
2019-05-06 21:48:22.268 13 ERROR aodh     return self.dbapi.connect(*cargs, **cparams)
2019-05-06 21:48:22.268 13 ERROR aodh   File "/usr/lib/python3.6/site-packages/pymysql/__init__.py", line 90, in Connect
2019-05-06 21:48:22.268 13 ERROR aodh     return Connection(*args, **kwargs)
2019-05-06 21:48:22.268 13 ERROR aodh   File "/usr/lib/python3.6/site-packages/pymysql/connections.py", line 699, in __init__
2019-05-06 21:48:22.268 13 ERROR aodh     self.connect()
2019-05-06 21:48:22.268 13 ERROR aodh   File "/usr/lib/python3.6/site-packages/pymysql/connections.py", line 967, in connect
2019-05-06 21:48:22.268 13 ERROR aodh     raise exc
2019-05-06 21:48:22.268 13 ERROR aodh oslo_db.exception.DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '172.17.1.40' (timed out)") (Background on this error at: http://sqlalche.me/e/e3q8)
2019-05-06 21:48:22.268 13 ERROR aodh 


[stack@controller-0 ~]$ curl --connect-timeout 1 http://172.17.1.40:3306
curl: (28) Connection timed out after 1000 milliseconds
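
The curl check above only tests whether a TCP connection to the VIP can be established within one second. A minimal Python sketch of the same probe follows; the function name `can_connect` is hypothetical and not part of any TripleO tooling, and the address in the usage comment is the one from this report.

```python
import socket


def can_connect(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) completes within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # OSError covers both a timeout (packets silently dropped by a firewall)
        # and an active refusal (nothing listening on the port).
        return False


# Example call, using the internal API VIP from this report:
#   can_connect("172.17.1.40", 3306, timeout=1.0)
```

A timeout, as seen here, is more typical of a DROP rule than of a missing listener, which would normally produce an immediate "connection refused".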

[stack@controller-0 ~]$ sudo iptables -nL
Chain INPUT (policy ACCEPT)
target     prot opt source               destination         
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            state RELATED,ESTABLISHED /* 000 accept related established rules ipv4 */
ACCEPT     icmp --  0.0.0.0/0            0.0.0.0/0            state NEW /* 001 accept all icmp ipv4 */
ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0            state NEW /* 002 accept all to lo interface ipv4 */
ACCEPT     tcp  --  192.168.24.0/24      0.0.0.0/0            multiport dports 22 state NEW /* 003 accept ssh from controlplane ipv4 */
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            multiport dports 3124,6379,26379 state NEW /* 108 redis-bundle ipv4 */
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            multiport dports 6789,3300 state NEW /* 110 ceph_mon ipv4 */
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            multiport dports 5000,13000,35357 state NEW /* 111 keystone ipv4 */
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            multiport dports 9292,13292 state NEW /* 112 glance_api ipv4 */
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            multiport dports 6800:7300 state NEW /* 113 ceph_mgr ipv4 */
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            multiport dports 8774,13774 state NEW /* 113 nova_api ipv4 */
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            multiport dports 9696,13696 state NEW /* 114 neutron api ipv4 */
ACCEPT     udp  --  0.0.0.0/0            0.0.0.0/0            multiport dports 4789 state NEW /* 118 neutron vxlan networks ipv4 */
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            multiport dports 8776,13776 state NEW /* 119 cinder ipv4 */
ACCEPT     udp  --  0.0.0.0/0            0.0.0.0/0            multiport dports 6081 state NEW /* 119 neutron geneve networks ipv4 */
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            multiport dports 3260 state NEW /* 120 iscsi initiator ipv4 */
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            multiport dports 6641,6642 state NEW /* 121 OVN DB server ports ipv4 */
ACCEPT     tcp  --  172.17.1.0/24        0.0.0.0/0            multiport dports 11211 state NEW /* 121 memcached 172.17.1.0/24 ipv4 */
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            multiport dports 8080,13808 state NEW /* 122 swift proxy ipv4 */
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            multiport dports 873,6000,6001,6002 state NEW /* 123 swift storage ipv4 */
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            multiport dports 8004,13004 state NEW /* 125 heat_api ipv4 */
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            multiport dports 8000,13800 state NEW /* 125 heat_cfn ipv4 */
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            multiport dports 80,443 state NEW /* 126 horizon ipv4 */
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            multiport dports 80,443 state NEW /* 127 horizon ipv4 */
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            multiport dports 8042,13042 state NEW /* 128 aodh-api ipv4 */
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            multiport dports 8041,13041 state NEW /* 129 gnocchi-api ipv4 */
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            multiport dports 2224,3121,21064 state NEW /* 130 pacemaker tcp ipv4 */
ACCEPT     udp  --  0.0.0.0/0            0.0.0.0/0            multiport dports 5405 state NEW /* 131 pacemaker udp ipv4 */
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            multiport dports 6080,13080 state NEW /* 137 nova_vnc_proxy ipv4 */
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            multiport dports 8778,13778 state NEW /* 138 nova_placement ipv4 */
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            multiport dports 8775,13775 state NEW /* 139 nova_metadata ipv4 */
ACCEPT     udp  --  0.0.0.0/0            0.0.0.0/0            multiport dports 8125 state NEW /* 140 gnocchi-statsd ipv4 */
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            multiport dports 8977,13977 state NEW /* 140 panko-api ipv4 */
LOG        all  --  0.0.0.0/0            0.0.0.0/0            state NEW limit: avg 20/min burst 15 /* 998 log all ipv4 */ LOG flags 0 level 4
DROP       all  --  0.0.0.0/0            0.0.0.0/0            state NEW /* 999 drop all ipv4 */

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination         

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination  
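
The listing above has no ACCEPT rule covering port 3306, so new connections to the database fall through to the "999 drop all" rule. As a rough sketch, the check can be automated by scanning `iptables -nL` output for multiport rules that cover a given port. The helper names and the `SAMPLE` excerpt below are illustrative assumptions. Rules without a port match are deliberately ignored, because `-nL` without `-v` hides interface matches such as the loopback rule, and source restrictions are not evaluated.

```python
import re

# One ACCEPT line of `iptables -nL` output that carries a multiport match.
RULE_RE = re.compile(
    r"^ACCEPT\s+(?P<proto>\S+)\s+\S+\s+(?P<src>\S+)\s+(?P<dst>\S+)\s+"
    r".*?multiport dports (?P<ports>[\d,:]+)"
)

# Short excerpt of the INPUT chain shown in this report.
SAMPLE = """\
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            multiport dports 5000,13000,35357 state NEW /* 111 keystone ipv4 */
ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0            multiport dports 6800:7300 state NEW /* 113 ceph_mgr ipv4 */
ACCEPT     udp  --  0.0.0.0/0            0.0.0.0/0            multiport dports 4789 state NEW /* 118 neutron vxlan networks ipv4 */
DROP       all  --  0.0.0.0/0            0.0.0.0/0            state NEW /* 999 drop all ipv4 */
"""


def port_in_spec(port, spec):
    """Check a port against a multiport spec such as '80,443' or '6800:7300'."""
    for item in spec.split(","):
        if ":" in item:
            low, high = item.split(":")
            if int(low) <= port <= int(high):
                return True
        elif int(item) == port:
            return True
    return False


def accepting_rules(listing, port, proto="tcp"):
    """Return the ACCEPT lines whose multiport match covers `port` for `proto`."""
    matches = []
    for line in listing.splitlines():
        m = RULE_RE.match(line.strip())
        if m and m.group("proto") == proto and port_in_spec(port, m.group("ports")):
            matches.append(line.strip())
    return matches
```

With the excerpt above, `accepting_rules(SAMPLE, 3306)` returns an empty list, which matches the observed timeout.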


After allowing DB traffic on the controller node holding the internal API VIP (controller-2), I am able to connect:

[root@controller-2 stack]# iptables -I INPUT -p tcp -m multiport --dports 3306 -m state --state NEW -m comment --comment "db on controller" -j ACCEPT

[stack@controller-0 ~]$ curl --output - --connect-timeout 1 http://172.17.1.40:3306
Y
5.5.5-10.3.11-MariaDB7<j"heEeO���;yoB;>hot{.mmysql_native_password!��#08S01Got packets out of order
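
The unreadable output above is the server's binary greeting, which confirms that a MariaDB server is now reachable through the VIP. As a hedged sketch, the start of such a greeting can be decoded as follows, assuming the standard MySQL initial handshake layout: a 3-byte little-endian payload length, a 1-byte sequence number, a 1-byte protocol version, then a NUL-terminated server version string. The function name `parse_greeting` is hypothetical.

```python
def parse_greeting(data: bytes):
    """Decode the first fields of a MySQL/MariaDB initial handshake packet.

    Returns (payload_length, sequence_id, protocol_version, server_version).
    """
    if len(data) < 6:
        raise ValueError("greeting too short")
    payload_length = int.from_bytes(data[0:3], "little")  # 3-byte packet length
    sequence_id = data[3]                                  # packet sequence number
    protocol_version = data[4]                             # 10 for current servers
    end = data.index(b"\x00", 5)                           # version string is NUL-terminated
    server_version = data[5:end].decode("ascii", errors="replace")
    return payload_length, sequence_id, protocol_version, server_version
```

Under this layout, the leading "Y" in the output is the length byte (0x59, that is 89) and the line break after it is the protocol version byte (0x0a, that is 10), followed by the version string 5.5.5-10.3.11-MariaDB.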


Version-Release number of selected component (if applicable):
openstack-tripleo-heat-templates-10.5.1-0.20190423085106.3f148c4.el8ost.noarch

How reproducible:


Steps to Reproduce:
1. Deploy overcloud on pre-deployed servers with the ControllerDeployedServer, ComputeDeployedServer, NetworkerDeployedServer, DatabaseDeployedServer, CephDeployedServer, MessagingDeployedServer roles.


Actual results:
Deployment times out.

Expected results:
Deployment succeeds.

Additional info:
Attaching templates

Comment 11 errata-xmlrpc 2019-09-21 11:21:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:2811