Bug 1539192
| Summary: | radosgw binds on *:8080 and prevents haproxy from starting | | |
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Gonéri Le Bouder <goneri> |
| Component: | openstack-tripleo | Assignee: | Raoul Scarazzini <rscarazz> |
| Status: | CLOSED DUPLICATE | QA Contact: | Arik Chernetsky <achernet> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 12.0 (Pike) | CC: | bperkins, emacchi, goneri, johfulto, mburns, michele, rhel-osp-director-maint |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2018-01-31 19:00:37 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
This needs investigation by PIDONE as to why pacemaker failed to configure/start haproxy.

It is not clear whether we are talking about an overcloud update or not. Was this an overcloud in state CREATE_COMPLETE that was then updated to add Ceph and failed, or not? Is it reproducible? Do we have sosreports of the nodes?

The overcloud deployment has failed with this on the #1 controller:
"Error: /Stage[main]/Cinder::Db::Sync/Exec[cinder-manage db_sync]: Failed to call refresh: Command exceeded timeout",
"Error: /Stage[main]/Cinder::Db::Sync/Exec[cinder-manage db_sync]: Command exceeded timeout",
Do you have any idea where I should start investigating?

I have managed to reproduce the same problem with puddle 2018-01-26.2.

After digging around in the journal enough, we can see the following:
<7>haproxy-systemd-wrapper: executing /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -Ds
[ALERT] 030/153522 (27) : Starting proxy ceph_rgw: cannot bind socket [192.168.170.251:8080]
[ALERT] 030/153522 (27) : Starting proxy ceph_rgw: cannot bind socket [192.168.249.251:8080]
<5>haproxy-systemd-wrapper: exit, haproxy RC=1
The problem is that haproxy is configured to proxy rados gw connections via:
listen ceph_rgw
bind 192.168.170.251:8080 transparent
bind 192.168.249.251:8080 transparent
http-request set-header X-Forwarded-Proto https if { ssl_fc }
http-request set-header X-Forwarded-Proto http if !{ ssl_fc }
option httpchk HEAD /
server overcloud-controller-0.storage.fv3.net 192.168.170.158:8080 check fall 5 inter 2000 rise 2
server overcloud-controller-1.storage.fv3.net 192.168.170.161:8080 check fall 5 inter 2000 rise 2
server overcloud-controller-2.storage.fv3.net 192.168.170.157:8080 check fall 5 inter 2000 rise 2
But radosgw is itself binding to those same two IPs (192.168.170.251:8080 and 192.168.249.251:8080). In fact, it is listening on all IPs, which arguably violates network isolation:
[root@overcloud-controller-0 audit]# ss -tnlp | grep 8080
LISTEN 0 128 *:8080 *:* users:(("radosgw",pid=56628,fd=70))
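The conflict can be reproduced in isolation: once a process holds the wildcard socket *:PORT (as radosgw does here), any later attempt to bind a specific address on the same port fails with EADDRINUSE, which is exactly what haproxy's "cannot bind socket" alerts mean. A minimal sketch (illustrative only; the function name is made up and this is not part of any TripleO tooling):

```python
import errno
import socket

def bind_conflict_demo():
    # Wildcard bind on an ephemeral port, mimicking radosgw's *:8080 listener.
    wildcard = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    wildcard.bind(("0.0.0.0", 0))
    wildcard.listen(1)
    port = wildcard.getsockname()[1]

    # Now try the haproxy-style bind of a specific address on the same port.
    specific = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        specific.bind(("127.0.0.1", port))
        return None  # no conflict (would not happen in this scenario)
    except OSError as exc:
        return exc.errno  # EADDRINUSE: the same failure haproxy reports
    finally:
        specific.close()
        wildcard.close()
```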
I am no Ceph expert, but the ceph.conf that is mapped into the ceph rgw container seems to imply that it should be binding to a specific IP (see the rgw frontends snippets):
[global]
cluster network = 192.168.180.0/24
fsid = 785dbf60-05f2-11e8-918f-5254008eef51
journal_collocation = False
journal_size = 10000
mon host = 192.168.170.157,192.168.170.158,192.168.170.161
mon initial members = overcloud-controller-2,overcloud-controller-0,overcloud-controller-1
osd_pool_default_pg_num = 128
osd_pool_default_pgp_num = 128
osd_pool_default_size = 3
public network = 192.168.170.0/24
raw_multi_journal = True
rgw_keystone_accepted_roles = Member, _member_, admin
rgw_keystone_admin_domain = default
rgw_keystone_admin_password = QceDmgCgm7wNw7BEgyEWAzXZj
rgw_keystone_admin_project = service
rgw_keystone_admin_user = swift
rgw_keystone_api_version = 3
rgw_keystone_url = http://192.168.140.252:5000
rgw_s3_auth_use_keystone = true
[client.rgw.overcloud-controller-2]
host = overcloud-controller-2
keyring = /var/lib/ceph/radosgw/ceph-rgw.overcloud-controller-2/keyring
log file = /var/log/ceph/ceph-rgw-overcloud-controller-2.log
rgw frontends = civetweb port=192.168.170.157:8080 num_threads=100
[client.rgw.overcloud-controller-1]
host = overcloud-controller-1
keyring = /var/lib/ceph/radosgw/ceph-rgw.overcloud-controller-1/keyring
log file = /var/log/ceph/ceph-rgw-overcloud-controller-1.log
rgw frontends = civetweb port=192.168.170.161:8080 num_threads=100
[client.rgw.overcloud-controller-0]
host = overcloud-controller-0
keyring = /var/lib/ceph/radosgw/ceph-rgw.overcloud-controller-0/keyring
log file = /var/log/ceph/ceph-rgw-overcloud-controller-0.log
rgw frontends = civetweb port=192.168.170.158:8080 num_threads=100
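As I understand it, civetweb only pins a specific address when the port option is written as IP:PORT; a bare port number means a wildcard bind. A hypothetical helper (not part of Ceph or ceph-ansible) to check which form a ceph.conf section actually requests:

```python
import configparser

def rgw_bind_address(ceph_conf_text, section):
    """Return (ip, port) that the section's 'rgw frontends' line asks
    civetweb to bind; '0.0.0.0' means a wildcard bind."""
    parser = configparser.ConfigParser(strict=False)
    parser.read_string(ceph_conf_text)
    frontends = parser.get(section, "rgw frontends")
    for token in frontends.split():
        if token.startswith("port="):
            value = token[len("port="):]
            if ":" in value:
                ip, port = value.rsplit(":", 1)
                return ip, int(port)       # pinned to a specific IP
            return "0.0.0.0", int(value)   # bare port: wildcard bind
    return None
```

Run against the [client.rgw.*] sections above, this would report the specific storage-network IPs, so the *:8080 listener seen in practice does not match what this file requests.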
Moving to the Ceph DFG, as this binding seems to be the root cause here.
*** This bug has been marked as a duplicate of bug 1509584 ***
Description of problem: I was able to do a full OSP12 deployment without Ceph; now I am retrying with Ceph enabled. My deployment fails because cinder-manage db sync reaches a timeout. pcs status gives me this:

Failed Actions:
* haproxy-bundle-docker-0_start_0 on overcloud-controller-0 'unknown error' (1): call=105, status=complete, exitreason='Newly created docker container exited after start', last-rc-change='Fri Jan 26 19:52:07 2018', queued=0ms, exec=1313ms
* haproxy-bundle-docker-2_start_0 on overcloud-controller-0 'unknown error' (1): call=109, status=complete, exitreason='Newly created docker container exited after start', last-rc-change='Fri Jan 26 19:52:10 2018', queued=0ms, exec=1298ms
* haproxy-bundle-docker-1_start_0 on overcloud-controller-0 'unknown error' (1): call=107, status=complete, exitreason='Newly created docker container exited after start', last-rc-change='Fri Jan 26 19:52:09 2018', queued=0ms, exec=1363ms
* haproxy-bundle-docker-0_start_0 on overcloud-controller-2 'unknown error' (1): call=107, status=complete, exitreason='Newly created docker container exited after start', last-rc-change='Fri Jan 26 19:52:10 2018', queued=0ms, exec=1318ms
* haproxy-bundle-docker-1_start_0 on overcloud-controller-2 'unknown error' (1): call=109, status=complete, exitreason='Newly created docker container exited after start', last-rc-change='Fri Jan 26 19:52:12 2018', queued=0ms, exec=1323ms
* haproxy-bundle-docker-2_start_0 on overcloud-controller-2 'unknown error' (1): call=105, status=complete, exitreason='Newly created docker container exited after start', last-rc-change='Fri Jan 26 19:52:07 2018', queued=0ms, exec=1326ms
* haproxy-bundle-docker-2_start_0 on overcloud-controller-1 'unknown error' (1): call=109, status=complete, exitreason='Newly created docker container exited after start', last-rc-change='Fri Jan 26 19:52:12 2018', queued=0ms, exec=1322ms
* haproxy-bundle-docker-1_start_0 on overcloud-controller-1 'unknown error' (1): call=105, status=complete, exitreason='Newly created docker container exited after start', last-rc-change='Fri Jan 26 19:52:07 2018', queued=0ms, exec=1328ms
* haproxy-bundle-docker-0_start_0 on overcloud-controller-1 'unknown error' (1): call=107, status=complete, exitreason='Newly created docker container exited after start', last-rc-change='Fri Jan 26 19:52:09 2018', queued=0ms, exec=1310m

[root@overcloud-controller-0 log]# pcs resource show haproxy-bundle
Bundle: haproxy-bundle
 Docker: image=192.168.249.8:5000/rhosp12/openstack-haproxy:pcmklatest network=host options="--user=root --log-driver=journald -e KOLLA_CONFIG_STRATEGY=COPY_ALWAYS" replicas=3 run-command="/bin/bash /usr/local/bin/kolla_start"
 Storage Mapping:
  options=ro source-dir=/var/lib/kolla/config_files/haproxy.json target-dir=/var/lib/kolla/config_files/config.json (haproxy-cfg-files)
  options=ro source-dir=/var/lib/config-data/puppet-generated/haproxy/ target-dir=/var/lib/kolla/config_files/src (haproxy-cfg-data)
  options=ro source-dir=/etc/hosts target-dir=/etc/hosts (haproxy-hosts)
  options=ro source-dir=/etc/localtime target-dir=/etc/localtime (haproxy-localtime)
  options=ro source-dir=/etc/pki/ca-trust/extracted target-dir=/etc/pki/ca-trust/extracted (haproxy-pki-extracted)
  options=ro source-dir=/etc/pki/tls/certs/ca-bundle.crt target-dir=/etc/pki/tls/certs/ca-bundle.crt (haproxy-pki-ca-bundle-crt)
  options=ro source-dir=/etc/pki/tls/certs/ca-bundle.trust.crt target-dir=/etc/pki/tls/certs/ca-bundle.trust.crt (haproxy-pki-ca-bundle-trust-crt)
  options=ro source-dir=/etc/pki/tls/cert.pem target-dir=/etc/pki/tls/cert.pem (haproxy-pki-cert)
  options=rw source-dir=/dev/log target-dir=/dev/log (haproxy-dev-log)

[root@overcloud-controller-0 log]# docker logs haproxy-bundle
Error: No such container: haproxy-bundle

I've nothing in /var/log/containers:

[root@overcloud-controller-0 containers]# find /var/log/containers/ -type f
/var/log/containers/horizon/horizon.log
/var/log/containers/memcached/memcached.log
[root@overcloud-controller-0 containers]# cat /var/log/containers/horizon/horizon.log
[root@overcloud-controller-0 containers]# cat /var/log/containers/memcached/memcached.log
[root@overcloud-controller-0 containers]# docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
f2e1cdba9d8c 192.168.249.8:5000/rhosp12/openstack-redis:pcmklatest "/bin/bash /usr/local" About an hour ago Up About an hour redis-bundle-docker-0
46ca7063f3a2 192.168.249.8:5000/rhosp12/openstack-haproxy:latest "/bin/bash -c 'cp -a " About an hour ago Exited (0) About an hour ago haproxy_init_bundle
209388242c1b 192.168.249.8:5000/rhosp12/openstack-redis:latest "/bin/bash -c 'cp -a " About an hour ago Exited (0) About an hour ago redis_init_bundle
3d46fed9ec53 192.168.249.8:5000/rhosp12/openstack-mariadb:pcmklatest "/bin/bash /usr/local" About an hour ago Up About an hour galera-bundle-docker-0
6dce70e18d54 192.168.249.8:5000/rhosp12/openstack-rabbitmq:pcmklatest "/bin/bash /usr/local" About an hour ago Up About an hour (healthy) rabbitmq-bundle-docker-0
3a03dbbc8a81 192.168.249.8:5000/rhosp12/openstack-mariadb:latest "/bin/bash -c 'cp -a " About an hour ago Exited (0) About an hour ago mysql_init_bundle
e46016adccff 192.168.249.8:5000/rhosp12/openstack-mariadb:latest "kolla_start" About an hour ago Up About an hour clustercheck
5f2cbff5bafe 192.168.249.8:5000/rhosp12/openstack-aodh-api:latest "/bin/bash -c 'chown " About an hour ago Exited (0) About an hour ago aodh_init_log
7aa812aa4511 192.168.249.8:5000/rhosp12/openstack-panko-api:latest "/bin/bash -c 'chown " About an hour ago Exited (0) About an hour ago panko_init_log
bec9d5d9805c 192.168.249.8:5000/rhosp12/openstack-horizon:latest "/bin/bash -c 'touch " About an hour ago Exited (0) About an hour ago horizon_fix_perms
d875072cbe0b 192.168.249.8:5000/rhosp12/openstack-keystone:latest "/bin/bash -c 'chown " About an hour ago Exited (0) About an hour ago keystone_init_log
6e0ba98027eb 192.168.249.8:5000/rhosp12/openstack-glance-api:latest "/bin/bash -c 'chown " About an hour ago Exited (0) About an hour ago glance_init_logs
88fc6f1844eb 192.168.249.8:5000/rhosp12/openstack-heat-engine:latest "/bin/bash -c 'chown " About an hour ago Exited (0) About an hour ago heat_init_log
6fcf97a39c83 192.168.249.8:5000/rhosp12/openstack-nova-api:latest "/bin/bash -c 'chown " About an hour ago Exited (0) About an hour ago nova_init_logs
454c3bdc3c17 192.168.249.8:5000/rhosp12/openstack-rabbitmq:latest "/bin/bash -c 'cp -a " About an hour ago Exited (0) About an hour ago rabbitmq_init_bundle
b9cd2596f74e 192.168.249.8:5000/rhosp12/openstack-gnocchi-api:latest "/bin/bash -c 'chown " About an hour ago Exited (0) About an hour ago gnocchi_init_log
4a2b2cbb4e3c registry.access.redhat.com/rhceph/rhceph-2-rhel7:latest "/entrypoint.sh" About an hour ago Up About an hour ceph-rgw-overcloud-controller-0
acd69513e728 registry.access.redhat.com/rhceph/rhceph-2-rhel7:latest "/entrypoint.sh" About an hour ago Up About an hour ceph-mon-overcloud-controller-0
8371566df993 192.168.249.8:5000/rhosp12/openstack-mariadb:latest "/bin/bash -c '/usr/b" About an hour ago Exited (0) About an hour ago mysql_image_tag
fb2ab111c595 192.168.249.8:5000/rhosp12/openstack-memcached:latest "/bin/bash -c 'source" About an hour ago Up About an hour memcached
8af6da99bfad 192.168.249.8:5000/rhosp12/openstack-haproxy:latest "/bin/bash -c '/usr/b" About an hour ago Exited (0) 16 minutes ago haproxy_image_tag
1be6aa4060dc 192.168.249.8:5000/rhosp12/openstack-mariadb:latest "bash -ecx 'if [ -e /" About an hour ago Exited (0) About an hour ago mysql_bootstrap
a2b23bc1e701 192.168.249.8:5000/rhosp12/openstack-redis:latest "/bin/bash -c '/usr/b" About an hour ago Exited (0) About an hour ago redis_image_tag
43027667f6a9 192.168.249.8:5000/rhosp12/openstack-rabbitmq:latest "/bin/bash -c '/usr/b" About an hour ago Exited (0) About an hour ago rabbitmq_image_tag
3734fa7c3ef2 192.168.249.8:5000/rhosp12/openstack-rabbitmq:latest "kolla_start" About an hour ago Exited (0) About an hour ago rabbitmq_bootstrap
cb7f05492c04 192.168.249.8:5000/rhosp12/openstack-memcached:latest "/bin/bash -c 'source" About an hour ago Exited (0) About an hour ago memcached_init_logs
3a69051db3c7 192.168.249.8:5000/rhosp12/openstack-mariadb:latest "chown -R mysql: /var" About an hour ago Exited (0) About an hour ago mysql_data_ownership

"docker logs haproxy_init_bundle" shows me the log of a successful run of puppet.

[root@overcloud-controller-0 log]# cat /var/log/cinder/cinder-manage.log
2018-01-26 19:54:03.876 122212 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. -1 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.140.252' ([Errno 113] No route to host)")
2018-01-26 19:54:16.895 122212 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. -2 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.140.252' ([Errno 113] No route to host)")
2018-01-26 19:54:29.911 122212 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. -3 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.140.252' ([Errno 113] No route to host)")
2018-01-26 19:54:42.928 122212 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. -4 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.140.252' ([Errno 113] No route to host)")
2018-01-26 19:54:55.945 122212 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. -5 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.140.252' ([Errno 113] No route to host)")
2018-01-26 19:55:08.963 122212 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. -6 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.140.252' ([Errno 113] No route to host)")
2018-01-26 19:55:21.979 122212 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. -7 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.140.252' ([Errno 113] No route to host)")
2018-01-26 19:55:34.996 122212 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. -8 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.140.252' ([Errno 113] No route to host)")
2018-01-26 19:55:48.011 122212 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. -9 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.140.252' ([Errno 113] No route to host)")
2018-01-26 19:56:01.027 122212 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. -10 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.140.252' ([Errno 113] No route to host)")
2018-01-26 19:56:14.043 122212 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. -11 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.140.252' ([Errno 113] No route to host)")
2018-01-26 19:56:27.059 122212 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. -12 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.140.252' ([Errno 113] No route to host)")
2018-01-26 19:56:40.076 122212 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. -13 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.140.252' ([Errno 113] No route to host)")
2018-01-26 19:56:53.092 122212 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. -14 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.140.252' ([Errno 113] No route to host)")
2018-01-26 19:57:06.099 122212 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. -15 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.140.252' ([Errno 113] No route to host)")
2018-01-26 19:57:19.115 122212 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. -16 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.140.252' ([Errno 113] No route to host)")
2018-01-26 19:57:32.131 122212 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. -17 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.140.252' ([Errno 113] No route to host)")
2018-01-26 19:57:45.148 122212 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. -18 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.140.252' ([Errno 113] No route to host)")
2018-01-26 19:57:58.163 122212 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. -19 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.140.252' ([Errno 113] No route to host)")
2018-01-26 19:58:10.514 122212 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. -20 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.140.252' ([Errno 113] No route to host)")
2018-01-26 19:58:23.522 122212 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. -21 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.140.252' ([Errno 113] No route to host)")
2018-01-26 19:58:36.540 122212 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. -22 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.140.252' ([Errno 113] No route to host)")
2018-01-26 19:58:49.548 122212 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. -23 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '192.168.140.252' ([Errno 113] No route to host)")

[root@overcloud-controller-0 log]# ping -c1 192.168.140.252
PING 192.168.140.252 (192.168.140.252) 56(84) bytes of data.
^C
--- 192.168.140.252 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms
[root@overcloud-controller-0 log]# ip a |grep 192.168.140
    inet 192.168.140.152/24 brd 192.168.140.255 scope global vlan140

Any advice would be welcome.
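The oslo.db retry loop above just repeats "No route to host" for minutes before db_sync finally times out. A quicker way to confirm the same symptom is a short-timeout TCP probe of the VIP (a sketch only; the address 192.168.140.252 comes from the log above, and 3306 is assumed to be the proxied MySQL port):

```python
import socket

def probe(host, port, timeout=2.0):
    """Attempt a TCP connect and classify the outcome."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        sock.connect((host, port))
        return "open"
    except socket.timeout:
        return "timeout"
    except OSError as exc:
        # EHOSTUNREACH (113) here matches the cinder-manage.log errors.
        return "unreachable (errno %s)" % exc.errno
    finally:
        sock.close()

# e.g. probe("192.168.140.252", 3306) on a controller would report
# "unreachable (errno 113)" while the haproxy bundle is down.
```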