Bug 1539192 - radosgw binds on *:8080 and prevents haproxy from starting
Summary: radosgw binds on *:8080 and prevents haproxy from starting
Keywords:
Status: CLOSED DUPLICATE of bug 1509584
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo
Version: 12.0 (Pike)
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Assignee: Raoul Scarazzini
QA Contact: Arik Chernetsky
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-01-26 21:17 UTC by Gonéri Le Bouder
Modified: 2018-01-31 19:00 UTC
CC List: 7 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-01-31 19:00:37 UTC
Target Upstream Version:
Embargoed:



Description Gonéri Le Bouder 2018-01-26 21:17:43 UTC
Description of problem:


I was able to do a full OSP12 deployment without Ceph; now I am retrying with
Ceph enabled. The deployment fails because cinder-manage db sync reaches
a timeout.

pcs status gives me this:

Failed Actions:
* haproxy-bundle-docker-0_start_0 on overcloud-controller-0 'unknown error' (1): call=105, status=complete, exitreason='Newly created docker container exited after start',
    last-rc-change='Fri Jan 26 19:52:07 2018', queued=0ms, exec=1313ms
* haproxy-bundle-docker-2_start_0 on overcloud-controller-0 'unknown error' (1): call=109, status=complete, exitreason='Newly created docker container exited after start',
    last-rc-change='Fri Jan 26 19:52:10 2018', queued=0ms, exec=1298ms
* haproxy-bundle-docker-1_start_0 on overcloud-controller-0 'unknown error' (1): call=107, status=complete, exitreason='Newly created docker container exited after start',
    last-rc-change='Fri Jan 26 19:52:09 2018', queued=0ms, exec=1363ms
* haproxy-bundle-docker-0_start_0 on overcloud-controller-2 'unknown error' (1): call=107, status=complete, exitreason='Newly created docker container exited after start',
    last-rc-change='Fri Jan 26 19:52:10 2018', queued=0ms, exec=1318ms
* haproxy-bundle-docker-1_start_0 on overcloud-controller-2 'unknown error' (1): call=109, status=complete, exitreason='Newly created docker container exited after start',
    last-rc-change='Fri Jan 26 19:52:12 2018', queued=0ms, exec=1323ms
* haproxy-bundle-docker-2_start_0 on overcloud-controller-2 'unknown error' (1): call=105, status=complete, exitreason='Newly created docker container exited after start',
    last-rc-change='Fri Jan 26 19:52:07 2018', queued=0ms, exec=1326ms
* haproxy-bundle-docker-2_start_0 on overcloud-controller-1 'unknown error' (1): call=109, status=complete, exitreason='Newly created docker container exited after start',
    last-rc-change='Fri Jan 26 19:52:12 2018', queued=0ms, exec=1322ms
* haproxy-bundle-docker-1_start_0 on overcloud-controller-1 'unknown error' (1): call=105, status=complete, exitreason='Newly created docker container exited after start',
    last-rc-change='Fri Jan 26 19:52:07 2018', queued=0ms, exec=1328ms
* haproxy-bundle-docker-0_start_0 on overcloud-controller-1 'unknown error' (1): call=107, status=complete, exitreason='Newly created docker container exited after start',
    last-rc-change='Fri Jan 26 19:52:09 2018', queued=0ms, exec=1310m


[root@overcloud-controller-0 log]# pcs resource show haproxy-bundle
 Bundle: haproxy-bundle
  Docker: image=192.168.249.8:5000/rhosp12/openstack-haproxy:pcmklatest network=host options="--user=root --log-driver=journald -e KOLLA_CONFIG_STRATEGY=COPY_ALWAYS" replicas=3 run-command="/bin/bash /usr/local/bin/kolla_start"
  Storage Mapping:
   options=ro source-dir=/var/lib/kolla/config_files/haproxy.json target-dir=/var/lib/kolla/config_files/config.json (haproxy-cfg-files)
   options=ro source-dir=/var/lib/config-data/puppet-generated/haproxy/ target-dir=/var/lib/kolla/config_files/src (haproxy-cfg-data)
   options=ro source-dir=/etc/hosts target-dir=/etc/hosts (haproxy-hosts)
   options=ro source-dir=/etc/localtime target-dir=/etc/localtime (haproxy-localtime)
   options=ro source-dir=/etc/pki/ca-trust/extracted target-dir=/etc/pki/ca-trust/extracted (haproxy-pki-extracted)
   options=ro source-dir=/etc/pki/tls/certs/ca-bundle.crt target-dir=/etc/pki/tls/certs/ca-bundle.crt (haproxy-pki-ca-bundle-crt)
   options=ro source-dir=/etc/pki/tls/certs/ca-bundle.trust.crt target-dir=/etc/pki/tls/certs/ca-bundle.trust.crt (haproxy-pki-ca-bundle-trust-crt)
   options=ro source-dir=/etc/pki/tls/cert.pem target-dir=/etc/pki/tls/cert.pem (haproxy-pki-cert)
   options=rw source-dir=/dev/log target-dir=/dev/log (haproxy-dev-log)
[root@overcloud-controller-0 log]# docker logs haproxy-bundle
Error: No such container: haproxy-bundle

There is nothing in /var/log/containers:

[root@overcloud-controller-0 containers]# find /var/log/containers/ -type f
/var/log/containers/horizon/horizon.log
/var/log/containers/memcached/memcached.log
[root@overcloud-controller-0 containers]# cat /var/log/containers/horizon/horizon.log
[root@overcloud-controller-0 containers]# cat /var/log/containers/memcached/memcached.log


[root@overcloud-controller-0 containers]# docker ps -a
CONTAINER ID        IMAGE                                                      COMMAND                  CREATED             STATUS                         PORTS               NAMES
f2e1cdba9d8c        192.168.249.8:5000/rhosp12/openstack-redis:pcmklatest      "/bin/bash /usr/local"   About an hour ago   Up About an hour                                   redis-bundle-docker-0
46ca7063f3a2        192.168.249.8:5000/rhosp12/openstack-haproxy:latest        "/bin/bash -c 'cp -a "   About an hour ago   Exited (0) About an hour ago                       haproxy_init_bundle
209388242c1b        192.168.249.8:5000/rhosp12/openstack-redis:latest          "/bin/bash -c 'cp -a "   About an hour ago   Exited (0) About an hour ago                       redis_init_bundle
3d46fed9ec53        192.168.249.8:5000/rhosp12/openstack-mariadb:pcmklatest    "/bin/bash /usr/local"   About an hour ago   Up About an hour                                   galera-bundle-docker-0
6dce70e18d54        192.168.249.8:5000/rhosp12/openstack-rabbitmq:pcmklatest   "/bin/bash /usr/local"   About an hour ago   Up About an hour (healthy)                         rabbitmq-bundle-docker-0
3a03dbbc8a81        192.168.249.8:5000/rhosp12/openstack-mariadb:latest        "/bin/bash -c 'cp -a "   About an hour ago   Exited (0) About an hour ago                       mysql_init_bundle
e46016adccff        192.168.249.8:5000/rhosp12/openstack-mariadb:latest        "kolla_start"            About an hour ago   Up About an hour                                   clustercheck
5f2cbff5bafe        192.168.249.8:5000/rhosp12/openstack-aodh-api:latest       "/bin/bash -c 'chown "   About an hour ago   Exited (0) About an hour ago                       aodh_init_log
7aa812aa4511        192.168.249.8:5000/rhosp12/openstack-panko-api:latest      "/bin/bash -c 'chown "   About an hour ago   Exited (0) About an hour ago                       panko_init_log
bec9d5d9805c        192.168.249.8:5000/rhosp12/openstack-horizon:latest        "/bin/bash -c 'touch "   About an hour ago   Exited (0) About an hour ago                       horizon_fix_perms
d875072cbe0b        192.168.249.8:5000/rhosp12/openstack-keystone:latest       "/bin/bash -c 'chown "   About an hour ago   Exited (0) About an hour ago                       keystone_init_log
6e0ba98027eb        192.168.249.8:5000/rhosp12/openstack-glance-api:latest     "/bin/bash -c 'chown "   About an hour ago   Exited (0) About an hour ago                       glance_init_logs
88fc6f1844eb        192.168.249.8:5000/rhosp12/openstack-heat-engine:latest    "/bin/bash -c 'chown "   About an hour ago   Exited (0) About an hour ago                       heat_init_log
6fcf97a39c83        192.168.249.8:5000/rhosp12/openstack-nova-api:latest       "/bin/bash -c 'chown "   About an hour ago   Exited (0) About an hour ago                       nova_init_logs
454c3bdc3c17        192.168.249.8:5000/rhosp12/openstack-rabbitmq:latest       "/bin/bash -c 'cp -a "   About an hour ago   Exited (0) About an hour ago                       rabbitmq_init_bundle
b9cd2596f74e        192.168.249.8:5000/rhosp12/openstack-gnocchi-api:latest    "/bin/bash -c 'chown "   About an hour ago   Exited (0) About an hour ago                       gnocchi_init_log
4a2b2cbb4e3c        registry.access.redhat.com/rhceph/rhceph-2-rhel7:latest    "/entrypoint.sh"         About an hour ago   Up About an hour                                   ceph-rgw-overcloud-controller-0
acd69513e728        registry.access.redhat.com/rhceph/rhceph-2-rhel7:latest    "/entrypoint.sh"         About an hour ago   Up About an hour                                   ceph-mon-overcloud-controller-0
8371566df993        192.168.249.8:5000/rhosp12/openstack-mariadb:latest        "/bin/bash -c '/usr/b"   About an hour ago   Exited (0) About an hour ago                       mysql_image_tag
fb2ab111c595        192.168.249.8:5000/rhosp12/openstack-memcached:latest      "/bin/bash -c 'source"   About an hour ago   Up About an hour                                   memcached
8af6da99bfad        192.168.249.8:5000/rhosp12/openstack-haproxy:latest        "/bin/bash -c '/usr/b"   About an hour ago   Exited (0) 16 minutes ago                          haproxy_image_tag
1be6aa4060dc        192.168.249.8:5000/rhosp12/openstack-mariadb:latest        "bash -ecx 'if [ -e /"   About an hour ago   Exited (0) About an hour ago                       mysql_bootstrap
a2b23bc1e701        192.168.249.8:5000/rhosp12/openstack-redis:latest          "/bin/bash -c '/usr/b"   About an hour ago   Exited (0) About an hour ago                       redis_image_tag
43027667f6a9        192.168.249.8:5000/rhosp12/openstack-rabbitmq:latest       "/bin/bash -c '/usr/b"   About an hour ago   Exited (0) About an hour ago                       rabbitmq_image_tag
3734fa7c3ef2        192.168.249.8:5000/rhosp12/openstack-rabbitmq:latest       "kolla_start"            About an hour ago   Exited (0) About an hour ago                       rabbitmq_bootstrap
cb7f05492c04        192.168.249.8:5000/rhosp12/openstack-memcached:latest      "/bin/bash -c 'source"   About an hour ago   Exited (0) About an hour ago                       memcached_init_logs
3a69051db3c7        192.168.249.8:5000/rhosp12/openstack-mariadb:latest        "chown -R mysql: /var"   About an hour ago   Exited (0) About an hour ago                       mysql_data_ownership

"docker logs haproxy_init_bundle" shows the log of a successful puppet run.

[root@overcloud-controller-0 log]# cat /var/log/cinder/cinder-manage.log 
2018-01-26 19:54:03.876 122212 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. -1 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.140.252' ([Errno 113] No route to host)")
2018-01-26 19:54:16.895 122212 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. -2 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.140.252' ([Errno 113] No route to host)")
2018-01-26 19:54:29.911 122212 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. -3 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.140.252' ([Errno 113] No route to host)")
2018-01-26 19:54:42.928 122212 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. -4 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.140.252' ([Errno 113] No route to host)")
2018-01-26 19:54:55.945 122212 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. -5 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.140.252' ([Errno 113] No route to host)")
2018-01-26 19:55:08.963 122212 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. -6 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.140.252' ([Errno 113] No route to host)")
2018-01-26 19:55:21.979 122212 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. -7 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.140.252' ([Errno 113] No route to host)")
2018-01-26 19:55:34.996 122212 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. -8 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.140.252' ([Errno 113] No route to host)")
2018-01-26 19:55:48.011 122212 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. -9 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.140.252' ([Errno 113] No route to host)")
2018-01-26 19:56:01.027 122212 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. -10 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.140.252' ([Errno 113] No route to host)")
2018-01-26 19:56:14.043 122212 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. -11 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.140.252' ([Errno 113] No route to host)")
2018-01-26 19:56:27.059 122212 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. -12 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.140.252' ([Errno 113] No route to host)")
2018-01-26 19:56:40.076 122212 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. -13 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.140.252' ([Errno 113] No route to host)")
2018-01-26 19:56:53.092 122212 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. -14 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.140.252' ([Errno 113] No route to host)")
2018-01-26 19:57:06.099 122212 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. -15 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.140.252' ([Errno 113] No route to host)")
2018-01-26 19:57:19.115 122212 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. -16 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.140.252' ([Errno 113] No route to host)")
2018-01-26 19:57:32.131 122212 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. -17 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.140.252' ([Errno 113] No route to host)")
2018-01-26 19:57:45.148 122212 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. -18 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.140.252' ([Errno 113] No route to host)")
2018-01-26 19:57:58.163 122212 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. -19 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.140.252' ([Errno 113] No route to host)")
2018-01-26 19:58:10.514 122212 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. -20 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.140.252' ([Errno 113] No route to host)")
2018-01-26 19:58:23.522 122212 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. -21 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.140.252' ([Errno 113] No route to host)")
2018-01-26 19:58:36.540 122212 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. -22 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.140.252' ([Errno 113] No route to host)")
2018-01-26 19:58:49.548 122212 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. -23 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.140.252' ([Errno 113] No route to host)")
[root@overcloud-controller-0 log]# ping -c1 192.168.140.252
PING 192.168.140.252 (192.168.140.252) 56(84) bytes of data.
^C
--- 192.168.140.252 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms

[root@overcloud-controller-0 log]# ip a |grep 192.168.140
    inet 192.168.140.152/24 brd 192.168.140.255 scope global vlan140

Any advice would be welcome.

Comment 1 James Slagle 2018-01-29 17:55:03 UTC
This needs investigation by PIDONE as to why pacemaker failed to configure/start haproxy.

Comment 2 Raoul Scarazzini 2018-01-29 18:06:08 UTC
It is not clear whether we are talking about an overcloud update here. Was this an overcloud in state CREATE_COMPLETE that was then updated to add Ceph and failed, or not? Is it reproducible? Do we have sosreports from the nodes?

Comment 3 Gonéri Le Bouder 2018-01-29 18:54:45 UTC
The overcloud deployment failed with this on the #1 controller:

        "Error: /Stage[main]/Cinder::Db::Sync/Exec[cinder-manage db_sync]: Failed to call refresh: Command exceeded timeout",
        "Error: /Stage[main]/Cinder::Db::Sync/Exec[cinder-manage db_sync]: Command exceeded timeout",

Comment 4 Gonéri Le Bouder 2018-01-30 13:59:48 UTC
Do you have any idea where I should start investigating?

Comment 5 Gonéri Le Bouder 2018-01-31 14:12:05 UTC
I have managed to reproduce the same problem with puddle 2018-01-26.2.

Comment 9 Michele Baldessari 2018-01-31 16:01:15 UTC
After digging around in the journal long enough, we can see the following:
<7>haproxy-systemd-wrapper: executing /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -Ds 
[ALERT] 030/153522 (27) : Starting proxy ceph_rgw: cannot bind socket [192.168.170.251:8080]
[ALERT] 030/153522 (27) : Starting proxy ceph_rgw: cannot bind socket [192.168.249.251:8080]
<5>haproxy-systemd-wrapper: exit, haproxy RC=1

The problem is that haproxy is configured to proxy rados gw connections via:
listen ceph_rgw
  bind 192.168.170.251:8080 transparent
  bind 192.168.249.251:8080 transparent
  http-request set-header X-Forwarded-Proto https if { ssl_fc }
  http-request set-header X-Forwarded-Proto http if !{ ssl_fc }
  option httpchk HEAD /
  server overcloud-controller-0.storage.fv3.net 192.168.170.158:8080 check fall 5 inter 2000 rise 2
  server overcloud-controller-1.storage.fv3.net 192.168.170.161:8080 check fall 5 inter 2000 rise 2
  server overcloud-controller-2.storage.fv3.net 192.168.170.157:8080 check fall 5 inter 2000 rise 2

But radosgw is binding to those same two IPs (192.168.170.251:8080 and 192.168.249.251:8080). In fact it is listening on all IPs, which kind of violates network isolation:
[root@overcloud-controller-0 audit]# ss -tnlp | grep 8080
LISTEN     0      128          *:8080                     *:*                   users:(("radosgw",pid=56628,fd=70))
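The conflict can be reproduced outside the deployment: once a wildcard listener owns a port (here standing in for radosgw on *:8080), a later bind to a specific address on the same port fails with EADDRINUSE, which is exactly the error haproxy reports above. A minimal sketch (hypothetical demonstration, not TripleO code):

```python
# A wildcard listener blocks any later specific-address bind on the
# same port: the same situation as radosgw on *:8080 vs haproxy's
# "bind 192.168.170.251:8080".
import errno
import socket

wildcard = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
wildcard.bind(("0.0.0.0", 0))          # let the kernel pick a free port
port = wildcard.getsockname()[1]
wildcard.listen(1)                     # now listening on *:<port>

specific = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    specific.bind(("127.0.0.1", port))  # what haproxy's bind attempts
    conflict = False
except OSError as exc:
    conflict = exc.errno == errno.EADDRINUSE
finally:
    specific.close()
    wildcard.close()
```

The specific bind raises EADDRINUSE, so `conflict` ends up True; haproxy aborts with its "cannot bind socket" ALERT for the same reason.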

I am no Ceph expert, but the ceph.conf that is mapped into the ceph rgw container seems to imply that radosgw should be binding to a specific IP (see the rgw frontends lines below):
[global]
cluster network = 192.168.180.0/24
fsid = 785dbf60-05f2-11e8-918f-5254008eef51
journal_collocation = False
journal_size = 10000
mon host = 192.168.170.157,192.168.170.158,192.168.170.161
mon initial members = overcloud-controller-2,overcloud-controller-0,overcloud-controller-1
osd_pool_default_pg_num = 128
osd_pool_default_pgp_num = 128
osd_pool_default_size = 3
public network = 192.168.170.0/24
raw_multi_journal = True
rgw_keystone_accepted_roles = Member, _member_, admin
rgw_keystone_admin_domain = default
rgw_keystone_admin_password = QceDmgCgm7wNw7BEgyEWAzXZj
rgw_keystone_admin_project = service
rgw_keystone_admin_user = swift
rgw_keystone_api_version = 3
rgw_keystone_url = http://192.168.140.252:5000
rgw_s3_auth_use_keystone = true
   
[client.rgw.overcloud-controller-2]
host = overcloud-controller-2
keyring = /var/lib/ceph/radosgw/ceph-rgw.overcloud-controller-2/keyring
log file = /var/log/ceph/ceph-rgw-overcloud-controller-2.log
rgw frontends = civetweb port=192.168.170.157:8080 num_threads=100
   
[client.rgw.overcloud-controller-1]
host = overcloud-controller-1
keyring = /var/lib/ceph/radosgw/ceph-rgw.overcloud-controller-1/keyring
log file = /var/log/ceph/ceph-rgw-overcloud-controller-1.log
rgw frontends = civetweb port=192.168.170.161:8080 num_threads=100
   
[client.rgw.overcloud-controller-0]
host = overcloud-controller-0
keyring = /var/lib/ceph/radosgw/ceph-rgw.overcloud-controller-0/keyring
log file = /var/log/ceph/ceph-rgw-overcloud-controller-0.log
rgw frontends = civetweb port=192.168.170.158:8080 num_threads=100


Moving to the Ceph DFG, as this binding seems to be the root cause here.
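For illustration only, the overlap check done above can be scripted. This hypothetical helper (the name `haproxy_bind_conflicts` and the sample data are mine, not part of any OSP tooling) takes haproxy `bind` lines plus the addresses already in LISTEN state, and flags the binds that cannot succeed:

```python
# Hypothetical helper: flag haproxy bind addresses that collide with
# existing listeners (e.g. a wildcard *:8080 held by radosgw).
import re

def haproxy_bind_conflicts(haproxy_cfg, listeners):
    """listeners: iterable of (address, port) pairs already in LISTEN
    state, where address '*', '0.0.0.0' or '::' means a wildcard bind.
    Returns the (address, port) binds from the config that would fail."""
    listeners = list(listeners)
    binds = re.findall(r'^\s*bind\s+(\S+):(\d+)', haproxy_cfg, re.M)
    wildcard_ports = {int(p) for a, p in listeners
                      if a in ('*', '0.0.0.0', '::')}
    taken = {(a, int(p)) for a, p in listeners}
    conflicts = []
    for addr, port in binds:
        port = int(port)
        # A wildcard listener on the port, or an exact duplicate,
        # makes this bind fail with EADDRINUSE.
        if port in wildcard_ports or (addr, port) in taken:
            conflicts.append((addr, port))
    return conflicts
```

Fed the ceph_rgw stanza above together with radosgw's wildcard listener, it reports both frontend VIP binds as doomed; with radosgw bound only to its own storage IP (as ceph.conf intends), it reports none.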

Comment 11 John Fulton 2018-01-31 19:00:37 UTC

*** This bug has been marked as a duplicate of bug 1509584 ***

