Bug 1488601
Summary: | Overcloud deploy fails on a timeout | ||
---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Dan Yasny <dyasny> |
Component: | openstack-tripleo-heat-templates | Assignee: | Juan Antonio Osorio <josorior> |
Status: | CLOSED DUPLICATE | QA Contact: | Gurenko Alex <agurenko> |
Severity: | urgent | Docs Contact: | |
Priority: | high | ||
Version: | 12.0 (Pike) | CC: | aschultz, dyasny, mburns, nkinder, racedoro, rhel-osp-director-maint |
Target Milestone: | --- | Keywords: | Triaged |
Target Release: | --- | ||
Hardware: | x86_64 | ||
OS: | Linux | ||
Whiteboard: | | ||
Fixed In Version: | | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2017-09-07 16:44:57 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Dan Yasny
2017-09-05 19:36:03 UTC
Please provide templates & sosreport for nodes. This bug report is not sufficient to determine what is happening.

(undercloud) [stack@undercloud-0 ~]$ cat oc_ironic.yaml
parameter_defaults:
  NtpServer: ["clock.redhat.com","clock2.redhat.com"]
  IronicEnabledDrivers:
    - pxe_ipmitool
    - fake
  NovaSchedulerDefaultFilters:
    - RetryFilter
    - AggregateInstanceExtraSpecsFilter
    - AvailabilityZoneFilter
    - RamFilter
    - DiskFilter
    - ComputeFilter
    - ComputeCapabilitiesFilter
    - ImagePropertiesFilter
  IronicCleaningDiskErase: metadata
  IronicIPXEEnabled: true
  GlanceBackend: "file"
  IronicCleaningNetwork: baremetal

The rest are standard THT and IR2 default templates.
sosreports at http://rhos-release.virt.bos.redhat.com/log/bz1488601
//if you ping me during EST hours I'll simply give you access to the setup

I had a quick look at the sosreports, no conclusion but could be a network issue or a problem bringing up the VIP for the Internal network:

I can see lots of these in controller-0:

var/log/messages:Sep 5 21:21:15 localhost journal: 2017-09-06 01:21:15.726 11 WARNING oslo_db.sqlalchemy.engines [-] SQL connection failed. -2924 attempts left.: DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '172.17.1.12' ([Errno 113] No route to host)")

That's VLAN 20 for the Internal network.
Then I can see this:

$ cat ip_neigh_show
fe80::5054:ff:fe2a:4290 dev br-ex lladdr 52:54:00:2a:42:90 router STALE
192.168.24.1 dev eth0 lladdr 52:54:00:3a:a6:f9 REACHABLE
172.17.1.14 dev vlan20 lladdr 7a:ed:8b:f9:db:11 STALE
10.0.0.1 dev br-ex lladdr 52:54:00:2a:42:90 STALE
172.17.1.12 dev vlan20 FAILED
10.0.0.111 dev br-ex lladdr 52:54:00:05:fe:73 STALE
192.168.24.7 dev eth0 lladdr 52:54:00:7c:7e:57 STALE
172.17.2.16 dev vlan50 lladdr fe:04:a0:8f:e0:ef STALE
172.17.3.14 dev vlan30 lladdr b6:03:82:a2:97:76 STALE
172.17.4.10 dev vlan40 lladdr 2e:87:84:fa:33:94 STALE
172.17.1.21 dev vlan20 lladdr 22:67:ed:c8:7a:58 REACHABLE
172.17.1.15 dev vlan20 lladdr 22:67:ed:c8:7a:58 STALE
172.17.4.19 dev vlan40 lladdr 2e:87:84:fa:33:94 STALE
172.17.3.13 dev vlan30 lladdr e6:42:ad:03:14:23 STALE
172.17.4.13 dev vlan40 lladdr 02:35:56:0c:c8:aa STALE
172.17.3.15 dev vlan30 lladdr b6:03:82:a2:97:76 STALE
10.0.0.101 dev br-ex lladdr 52:54:00:05:fe:73 STALE
172.17.2.15 dev vlan50 lladdr 46:fe:c1:6f:9c:b6 STALE
172.17.3.11 dev vlan30 lladdr ba:04:8e:35:ad:c8 STALE
172.17.1.20 dev vlan20 lladdr fa:85:17:3f:c6:32 REACHABLE
172.17.2.13 dev vlan50 lladdr c6:8d:be:3a:6d:4c STALE
10.0.0.105 dev br-ex lladdr 52:54:00:40:c7:9b STALE

It looks like a networking problem.
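For anyone repeating this check on other sosreports, the scan of the neighbour table above can be sketched in Python. This is only an illustration: the function name `unresolved_neighbours` and the `SAMPLE` constant are made up for this example, and the sample lines are a small subset copied from the controller-0 output. It assumes the plain `ip neigh show` text layout seen in this report, where the state is the last field on each line.

```python
# Sample lines copied from the controller-0 ip_neigh_show file quoted above.
SAMPLE = """\
192.168.24.1 dev eth0 lladdr 52:54:00:3a:a6:f9 REACHABLE
172.17.1.14 dev vlan20 lladdr 7a:ed:8b:f9:db:11 STALE
172.17.1.12 dev vlan20 FAILED
fe80::5054:ff:fe2a:4290 dev br-ex lladdr 52:54:00:2a:42:90 router STALE
"""

# Neighbour states in which no link-layer address was resolved.
UNRESOLVED = {"FAILED", "INCOMPLETE"}


def unresolved_neighbours(text):
    """Return (address, device, state) for entries without a resolved MAC."""
    found = []
    for line in text.splitlines():
        fields = line.split()
        # Expected shape: <address> dev <device> [lladdr <mac>] [router] <STATE>
        if len(fields) < 4 or fields[1] != "dev":
            continue
        state = fields[-1]
        if state in UNRESOLVED:
            found.append((fields[0], fields[2], state))
    return found


if __name__ == "__main__":
    for address, device, state in unresolved_neighbours(SAMPLE):
        print(f"{address} on {device}: {state}")
```

On the sample it reports only 172.17.1.12 on vlan20, which is the Internal network VIP the database clients cannot reach.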
In controller-1:

$ cat ip_neigh_show
fe80::5054:ff:fe2a:4290 dev br-ex lladdr 52:54:00:2a:42:90 router STALE
172.17.1.15 dev vlan20 lladdr 22:67:ed:c8:7a:58 STALE
10.0.0.1 dev br-ex lladdr 52:54:00:2a:42:90 REACHABLE
172.17.2.21 dev vlan50 lladdr c6:fd:c8:f4:ee:85 STALE
172.17.4.10 dev vlan40 lladdr 2e:87:84:fa:33:94 STALE
172.17.4.19 dev vlan40 lladdr 2e:87:84:fa:33:94 STALE
172.17.1.21 dev vlan20 lladdr 22:67:ed:c8:7a:58 REACHABLE
172.17.1.12 dev vlan20 INCOMPLETE
172.17.4.11 dev vlan40 lladdr 06:e1:0e:5f:fb:15 STALE
192.168.24.7 dev eth0 lladdr 52:54:00:87:d8:55 STALE
192.168.24.1 dev eth0 lladdr 52:54:00:3a:a6:f9 REACHABLE
172.17.3.22 dev vlan30 lladdr 7a:fd:3c:24:93:06 STALE
10.0.0.104 dev br-ex lladdr 52:54:00:cc:dc:ce STALE
172.17.1.16 dev vlan20 lladdr e6:be:7b:0c:1d:ad REACHABLE

And the same in controller-2.

Also, pcs_status says the VIP is stopped:

$ grep 172.17.1.12 pcs_status
ip-172.17.1.12 (ocf::heartbeat:IPaddr2): Stopped

Then I also see this:

sos_commands/logs/journalctl_--no-pager_--boot:Sep 05 14:28:38 controller-2 puppet-user[34344]: [ALERT] 247/142838 (339) : parsing [/etc/haproxy/haproxy.cfg20170905-12-11vkb1s:132] : 'bind 172.17.1.12:443' : unable to load SSL private key from PEM file '/etc/pki/tls/private/overcloud_endpoint.pem'.

And this:

sos_commands/logs/journalctl_--no-pager_--boot:Sep 05 14:28:38 controller-2 puppet-user[34344]: (/Stage[main]/Haproxy/Haproxy::Instance[haproxy]/Haproxy::Config[haproxy]/Concat[/etc/haproxy/haproxy.cfg]/File[/etc/haproxy/haproxy.cfg]/content) [ALERT] 247/142838 (339) : Proxy 'horizon': no SSL certificate specified for bind '172.17.1.12:443' at [/etc/haproxy/haproxy.cfg20170905-12-11vkb1s:132] (use 'crt').

Not sure if that's a problem though. Could it be worth trying without SSL and comparing, to help isolate the issue? I can't find anything in corosync saying why that IP can't be brought up though.
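To see at a glance which frontends HAProxy is complaining about, the "no SSL certificate specified" alerts quoted above can be grouped by proxy name. The following is a rough sketch, not a tool used in this bug: `binds_missing_cert`, `ALERT_RE` and `SAMPLE` are names invented here, the sample is an abridged copy of the controller-2 alert, and the regular expression assumes the alert wording stays exactly as it appears in these logs.

```python
import re
from collections import defaultdict

# Matches the HAProxy alert wording seen in the journal and messages excerpts.
ALERT_RE = re.compile(
    r"Proxy '(?P<proxy>[^']+)': no SSL certificate specified "
    r"for bind '(?P<bind>[^']+)'"
)

# Abridged copy of the controller-2 alert quoted above.
SAMPLE = (
    "Sep 05 14:28:38 controller-2 puppet-user[34344]: [ALERT] 247/142838 (339) : "
    "Proxy 'horizon': no SSL certificate specified for bind '172.17.1.12:443' "
    "at [/etc/haproxy/haproxy.cfg20170905-12-11vkb1s:132] (use 'crt').\n"
)


def binds_missing_cert(log_text):
    """Map each proxy name to the sorted bind addresses reported without a certificate."""
    result = defaultdict(set)
    for match in ALERT_RE.finditer(log_text):
        result[match.group("proxy")].add(match.group("bind"))
    return {proxy: sorted(binds) for proxy, binds in result.items()}


if __name__ == "__main__":
    for proxy, binds in sorted(binds_missing_cert(SAMPLE).items()):
        print(proxy, ", ".join(binds))
```

Using a set per proxy means repeated log entries for the same bind address are counted once.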
If this is being deployed with SSL we might be hitting https://bugs.launchpad.net/tripleo/+bug/1715132

(In reply to Alex Schultz from comment #4)
> If this is being deployed with SSL we might be hitting
> https://bugs.launchpad.net/tripleo/+bug/1715132

Yes, I am about to start a non-ssl reproducer, will confirm once it's done

Looking in the logs, it does appear to be the upstream bug.

./messages:Sep 5 10:28:54 localhost journal: [ALERT] 247/142854 (339) : Proxy 'panko': no SSL certificate specified for bind '10.0.0.101:13779' at [/etc/haproxy/haproxy.cfg20170905-12-1ksguai:250] (use 'crt').
./messages:Sep 5 10:28:54 localhost journal: [ALERT] 247/142854 (339) : Proxy 'swift_proxy_server': no SSL certificate specified for bind '10.0.0.101:13808' at [/etc/haproxy/haproxy.cfg20170905-12-1ksguai:275] (use 'crt').
./messages:Sep 5 10:34:45 localhost journal: [ALERT] 247/143445 (3273) : Proxy 'aodh': no SSL certificate specified for bind '10.0.0.101:13042' at [/etc/haproxy/haproxy.cfg20170905-9-e6vwjg:26] (use 'crt').
./messages:Sep 5 10:34:45 localhost journal: [ALERT] 247/143445 (3273) : Proxy 'cinder': no SSL certificate specified for bind '10.0.0.101:13776' at [/etc/haproxy/haproxy.cfg20170905-9-e6vwjg:39] (use 'crt').
./messages:Sep 5 10:34:45 localhost journal: [ALERT] 247/143445 (3273) : Proxy 'glance_api': no SSL certificate specified for bind '10.0.0.101:13292' at [/etc/haproxy/haproxy.cfg20170905-9-e6vwjg:52] (use 'crt').
./messages:Sep 5 10:34:45 localhost journal: [ALERT] 247/143445 (3273) : Proxy 'gnocchi': no SSL certificate specified for bind '10.0.0.101:13041' at [/etc/haproxy/haproxy.cfg20170905-9-e6vwjg:65] (use 'crt').
./messages:Sep 5 10:34:45 localhost journal: [ALERT] 247/143445 (3273) : Proxy 'heat_api': no SSL certificate specified for bind '10.0.0.101:13004' at [/etc/haproxy/haproxy.cfg20170905-9-e6vwjg:85] (use 'crt').
./messages:Sep 5 10:34:45 localhost journal: [ALERT] 247/143445 (3273) : Proxy 'heat_cfn': no SSL certificate specified for bind '10.0.0.101:13005' at [/etc/haproxy/haproxy.cfg20170905-9-e6vwjg:100] (use 'crt').
./messages:Sep 5 10:34:45 localhost journal: [ALERT] 247/143445 (3273) : Proxy 'heat_cloudwatch': no SSL certificate specified for bind '10.0.0.101:13003' at [/etc/haproxy/haproxy.cfg20170905-9-e6vwjg:115] (use 'crt').
./messages:Sep 5 10:34:45 localhost journal: [ALERT] 247/143445 (3273) : Proxy 'horizon': no SSL certificate specified for bind '10.0.0.101:443' at [/etc/haproxy/haproxy.cfg20170905-9-e6vwjg:130] (use 'crt').
./messages:Sep 5 10:34:45 localhost journal: [ALERT] 247/143445 (3273) : Proxy 'horizon': no SSL certificate specified for bind '172.17.1.12:443' at [/etc/haproxy/haproxy.cfg20170905-9-e6vwjg:132] (use 'crt').
./messages:Sep 5 10:34:45 localhost journal: [ALERT] 247/143445 (3273) : Proxy 'ironic': no SSL certificate specified for bind '10.0.0.101:13385' at [/etc/haproxy/haproxy.cfg20170905-9-e6vwjg:147] (use 'crt').
./messages:Sep 5 10:34:45 localhost journal: [ALERT] 247/143445 (3273) : Proxy 'keystone_public': no SSL certificate specified for bind '10.0.0.101:13000' at [/etc/haproxy/haproxy.cfg20170905-9-e6vwjg:167] (use 'crt').
./messages:Sep 5 10:34:45 localhost journal: [ALERT] 247/143445 (3273) : Proxy 'neutron': no SSL certificate specified for bind '10.0.0.101:13696' at [/etc/haproxy/haproxy.cfg20170905-9-e6vwjg:192] (use 'crt').
./messages:Sep 5 10:34:45 localhost journal: [ALERT] 247/143445 (3273) : Proxy 'nova_novncproxy': no SSL certificate specified for bind '10.0.0.101:13080' at [/etc/haproxy/haproxy.cfg20170905-9-e6vwjg:212] (use 'crt').
./messages:Sep 5 10:34:45 localhost journal: [ALERT] 247/143445 (3273) : Proxy 'nova_osapi': no SSL certificate specified for bind '10.0.0.101:13774' at [/etc/haproxy/haproxy.cfg20170905-9-e6vwjg:224] (use 'crt').
./messages:Sep 5 10:34:45 localhost journal: [ALERT] 247/143445 (3273) : Proxy 'nova_placement': no SSL certificate specified for bind '10.0.0.101:13778' at [/etc/haproxy/haproxy.cfg20170905-9-e6vwjg:237] (use 'crt').
./messages:Sep 5 10:34:45 localhost journal: [ALERT] 247/143445 (3273) : Proxy 'panko': no SSL certificate specified for bind '10.0.0.101:13779' at [/etc/haproxy/haproxy.cfg20170905-9-e6vwjg:250] (use 'crt').
./messages:Sep 5 10:34:45 localhost journal: [ALERT] 247/143445 (3273) : Proxy 'swift_proxy_server': no SSL certificate specified for bind '10.0.0.101:13808' at [/etc/haproxy/haproxy.cfg20170905-9-e6vwjg:275] (use 'crt').
./messages:Sep 5 10:34:45 localhost journal: [ALERT] 247/143445 (3273) : Proxy 'aodh': no SSL certificate specified for bind '10.0.0.101:13042' at [/etc/haproxy/haproxy.cfg20170905-9-e6vwjg:26] (use 'crt').
./messages:Sep 5 10:34:45 localhost journal: [ALERT] 247/143445 (3273) : Proxy 'cinder': no SSL certificate specified for bind '10.0.0.101:13776' at [/etc/haproxy/haproxy.cfg20170905-9-e6vwjg:39] (use 'crt').
./messages:Sep 5 10:34:45 localhost journal: [ALERT] 247/143445 (3273) : Proxy 'glance_api': no SSL certificate specified for bind '10.0.0.101:13292' at [/etc/haproxy/haproxy.cfg20170905-9-e6vwjg:52] (use 'crt').
./messages:Sep 5 10:34:45 localhost journal: [ALERT] 247/143445 (3273) : Proxy 'gnocchi': no SSL certificate specified for bind '10.0.0.101:13041' at [/etc/haproxy/haproxy.cfg20170905-9-e6vwjg:65] (use 'crt').
./messages:Sep 5 10:34:45 localhost journal: [ALERT] 247/143445 (3273) : Proxy 'heat_api': no SSL certificate specified for bind '10.0.0.101:13004' at [/etc/haproxy/haproxy.cfg20170905-9-e6vwjg:85] (use 'crt').
./messages:Sep 5 10:34:45 localhost journal: [ALERT] 247/143445 (3273) : Proxy 'heat_cfn': no SSL certificate specified for bind '10.0.0.101:13005' at [/etc/haproxy/haproxy.cfg20170905-9-e6vwjg:100] (use 'crt').
./messages:Sep 5 10:34:45 localhost journal: [ALERT] 247/143445 (3273) : Proxy 'heat_cloudwatch': no SSL certificate specified for bind '10.0.0.101:13003' at [/etc/haproxy/haproxy.cfg20170905-9-e6vwjg:115] (use 'crt').
./messages:Sep 5 10:34:45 localhost journal: [ALERT] 247/143445 (3273) : Proxy 'horizon': no SSL certificate specified for bind '10.0.0.101:443' at [/etc/haproxy/haproxy.cfg20170905-9-e6vwjg:130] (use 'crt').
./messages:Sep 5 10:34:45 localhost journal: [ALERT] 247/143445 (3273) : Proxy 'horizon': no SSL certificate specified for bind '172.17.1.12:443' at [/etc/haproxy/haproxy.cfg20170905-9-e6vwjg:132] (use 'crt').
./messages:Sep 5 10:34:45 localhost journal: [ALERT] 247/143445 (3273) : Proxy 'ironic': no SSL certificate specified for bind '10.0.0.101:13385' at [/etc/haproxy/haproxy.cfg20170905-9-e6vwjg:147] (use 'crt').
./messages:Sep 5 10:34:45 localhost journal: [ALERT] 247/143445 (3273) : Proxy 'keystone_public': no SSL certificate specified for bind '10.0.0.101:13000' at [/etc/haproxy/haproxy.cfg20170905-9-e6vwjg:167] (use 'crt').
./messages:Sep 5 10:34:45 localhost journal: [ALERT] 247/143445 (3273) : Proxy 'neutron': no SSL certificate specified for bind '10.0.0.101:13696' at [/etc/haproxy/haproxy.cfg20170905-9-e6vwjg:192] (use 'crt').
./messages:Sep 5 10:34:45 localhost journal: [ALERT] 247/143445 (3273) : Proxy 'nova_novncproxy': no SSL certificate specified for bind '10.0.0.101:13080' at [/etc/haproxy/haproxy.cfg20170905-9-e6vwjg:212] (use 'crt').
./messages:Sep 5 10:34:45 localhost journal: [ALERT] 247/143445 (3273) : Proxy 'nova_osapi': no SSL certificate specified for bind '10.0.0.101:13774' at [/etc/haproxy/haproxy.cfg20170905-9-e6vwjg:224] (use 'crt').
./messages:Sep 5 10:34:45 localhost journal: [ALERT] 247/143445 (3273) : Proxy 'nova_placement': no SSL certificate specified for bind '10.0.0.101:13778' at [/etc/haproxy/haproxy.cfg20170905-9-e6vwjg:237] (use 'crt').
./messages:Sep 5 10:34:45 localhost journal: [ALERT] 247/143445 (3273) : Proxy 'panko': no SSL certificate specified for bind '10.0.0.101:13779' at [/etc/haproxy/haproxy.cfg20170905-9-e6vwjg:250] (use 'crt').
./messages:Sep 5 10:34:45 localhost journal: [ALERT] 247/143445 (3273) : Proxy 'swift_proxy_server': no SSL certificate specified for bind '10.0.0.101:13808' at [/etc/haproxy/haproxy.cfg20170905-9-e6vwjg:275] (use 'crt').

This has merged upstream in master, but it is still pending review for stable/pike: https://review.openstack.org/#/c/501127

Confirmed, with SSL disabled, I am able to deploy an overcloud, with the ironic bits enabled.

It merged in stable/pike. Please make sure that you're also not overwriting the value of DeployedSSLCertificatePath; just let it use the default value.

This is really a duplicate of bug 1486363, which is on track for inclusion in OSP12. Closing this as a duplicate.

*** This bug has been marked as a duplicate of bug 1486363 ***
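As a quick way to follow the advice about DeployedSSLCertificatePath, an environment file can be scanned for an explicit override before deploying. The sketch below is an assumption-laden illustration rather than anything from the fix itself: `overrides_cert_path` is a made-up helper, it does a plain line-based text match instead of real YAML parsing, and the path in `WITH_OVERRIDE` is a hypothetical placeholder.

```python
import re

# Matches the parameter when it is set on its own line; a line that starts
# with '#' (a commented-out setting) does not match.
OVERRIDE_RE = re.compile(r"^\s*DeployedSSLCertificatePath\s*:", re.MULTILINE)

# Abridged from the oc_ironic.yaml quoted earlier in this report.
REPORTED_ENV = """\
parameter_defaults:
  IronicCleaningDiskErase: metadata
  IronicIPXEEnabled: true
  GlanceBackend: "file"
"""

# Hypothetical environment file; the path is a placeholder for illustration.
WITH_OVERRIDE = """\
parameter_defaults:
  DeployedSSLCertificatePath: /some/custom/location.pem
"""


def overrides_cert_path(env_text):
    """Return True if the environment text sets DeployedSSLCertificatePath."""
    return OVERRIDE_RE.search(env_text) is not None


if __name__ == "__main__":
    print("reported env overrides path:", overrides_cert_path(REPORTED_ENV))
    print("example env overrides path:", overrides_cert_path(WITH_OVERRIDE))
```

A text match like this can miss overrides set through other mechanisms, so it is only a first-pass check.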