Bug 1398013
Summary: | [Backwards Compatibility] UC10-OC9 deploy completes successfully, but the controllers are unreachable post-deployment and the entire setup is non-op | | |
---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Dan Yasny <dyasny> |
Component: | rhosp-director | Assignee: | Angus Thomas <athomas> |
Status: | CLOSED DUPLICATE | QA Contact: | Dan Yasny <dyasny> |
Severity: | urgent | Docs Contact: | |
Priority: | high | | |
Version: | 9.0 (Mitaka) | CC: | apetrich, beagles, ccamacho, dbecker, dyasny, jcoufal, jschluet, jslagle, mandreou, mburns, morazi, nyechiel, ohochman, rhel-osp-director-maint, sasha |
Target Milestone: | ga | Keywords: | ZStream |
Target Release: | 9.0 (Mitaka) | | |
Hardware: | x86_64 | | |
OS: | Linux | | |
Whiteboard: | | | |
Fixed In Version: | | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | | Environment: | |
Last Closed: | 2016-11-30 23:59:12 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Description
Dan Yasny
2016-11-23 21:09:26 UTC
I'm looking at this also.

Happens to me on clean OSP9.0 deployment:
Environment:
openstack-puppet-modules-8.1.8-3.el7ost.noarch
instack-undercloud-4.0.0-15.el7ost.noarch
openstack-tripleo-heat-templates-liberty-2.0.0-40.el7ost.noarch
openstack-tripleo-heat-templates-2.0.0-40.el7ost.noarch

The controllers became unreachable after a reboot.

(In reply to Alexander Chuzhoy from comment #4)
> Happens to me on clean OSP9.0 deployment:
> Environment:
> openstack-puppet-modules-8.1.8-3.el7ost.noarch
> instack-undercloud-4.0.0-15.el7ost.noarch
> openstack-tripleo-heat-templates-liberty-2.0.0-40.el7ost.noarch
> openstack-tripleo-heat-templates-2.0.0-40.el7ost.noarch
>
>
> The controllers became unreachable after a reboot.

Confirmed in my env - this is reproducing when I reboot the nodes

@jarda as discussed just now - reminder that you want to move to DFG:DF (happens on OSP9 deployment?) I believe some combination of dan/sasha will provide the env too

Additional findings:
1. As was recommended, I tried to restart the network service on the controllers. That made the controllers reachable, but trying to contact the overcloud endpoints produced a 503 error
2. Rebooted the controllers again - they are unreachable again

On a side note, issuing the "reboot" command on the controllers made them hang (probably some service or process holding everything back), had to do a powercycle instead.
[root@overcloud-controller-0 ~]# ovs-vsctl show
150c5a41-48b6-4e54-8fbf-58874b11a578
    Bridge br-ex
        Port "eth0"
            Interface "eth0"
        Port br-ex
            Interface br-ex
                type: internal
    Bridge br-int
        fail_mode: secure
        Port "tapd48b845e-02"
            tag: 2
            Interface "tapd48b845e-02"
                type: internal
        Port "tapadbb4a12-31"
            tag: 3
            Interface "tapadbb4a12-31"
                type: internal
        Port "ha-fb411560-ee"
            tag: 4
            Interface "ha-fb411560-ee"
                type: internal
        Port int-br-ex
            Interface int-br-ex
                type: patch
                options: {peer=phy-br-ex}
        Port "qg-8f932a91-18"
            tag: 5
            Interface "qg-8f932a91-18"
                type: internal
        Port "tap3ee36087-11"
            tag: 1
            Interface "tap3ee36087-11"
                type: internal
        Port patch-tun
            Interface patch-tun
                type: patch
                options: {peer=patch-int}
        Port "qr-bd5fdd91-cb"
            tag: 3
            Interface "qr-bd5fdd91-cb"
                type: internal
        Port br-int
            Interface br-int
                type: internal
        Port "qr-ea15e493-86"
            tag: 1
            Interface "qr-ea15e493-86"
                type: internal
        Port "qr-582362df-cb"
            tag: 2
            Interface "qr-582362df-cb"
                type: internal
    Bridge br-tun
        fail_mode: secure
        Port "gre-c0a8960d"
            Interface "gre-c0a8960d"
                type: gre
                options: {df_default="true", in_key=flow, local_ip="192.168.150.12", out_key=flow, remote_ip="192.168.150.13"}
        Port patch-int
            Interface patch-int
                type: patch
                options: {peer=patch-tun}
        Port "vxlan-c0a8960d"
            Interface "vxlan-c0a8960d"
                type: vxlan
                options: {df_default="true", in_key=flow, local_ip="192.168.150.12", out_key=flow, remote_ip="192.168.150.13"}
        Port "vxlan-c0a8960a"
            Interface "vxlan-c0a8960a"
                type: vxlan
                options: {df_default="true", in_key=flow, local_ip="192.168.150.12", out_key=flow, remote_ip="192.168.150.10"}
        Port "vxlan-c0a8960b"
            Interface "vxlan-c0a8960b"
                type: vxlan
                options: {df_default="true", in_key=flow, local_ip="192.168.150.12", out_key=flow, remote_ip="192.168.150.11"}
        Port "gre-c0a8960a"
            Interface "gre-c0a8960a"
                type: gre
                options: {df_default="true", in_key=flow, local_ip="192.168.150.12", out_key=flow, remote_ip="192.168.150.10"}
        Port br-tun
            Interface br-tun
                type: internal
        Port "gre-c0a8960b"
            Interface "gre-c0a8960b"
                type: gre
                options: {df_default="true", in_key=flow, local_ip="192.168.150.12", out_key=flow, remote_ip="192.168.150.11"}
    ovs_version: "2.5.0"

[root@overcloud-controller-0 ~]# ping 192.0.2.1 -c1
PING 192.0.2.1 (192.0.2.1) 56(84) bytes of data.

--- 192.0.2.1 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms

[root@overcloud-controller-0 ~]# ifdown br-ex; ifup br-ex; ifup eth0
[root@overcloud-controller-0 ~]# ping 192.0.2.1 -c1
PING 192.0.2.1 (192.0.2.1) 56(84) bytes of data.
64 bytes from 192.0.2.1: icmp_seq=1 ttl=64 time=2.38 ms

--- 192.0.2.1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 2.388/2.388/2.388/0.000 ms

Restarting network service works, but once the nodes are rebooted the issue is back.

But if one runs "ifdown br-ex; ifup br-ex; ifup eth0" then the nodes are reachable after reboot.

This reminds me of:
https://bugzilla.redhat.com/show_bug.cgi?id=1394890

Is this bug a duplicate of ^

Environment
openstack-neutron-common-8.1.2-12.el7ost.noarch
openstack-neutron-ml2-8.1.2-12.el7ost.noarch
openstack-neutron-openvswitch-8.1.2-12.el7ost.noarch
openstack-neutron-8.1.2-12.el7ost.noarch

#10 workaround seems to work for me also

(In reply to Alexander Chuzhoy from comment #10)
> Restarting network service works, but once the nodes are rebooted the issue
> is back.
>
> But if one runs "ifdown br-ex; ifup br-ex; ifup eth0" then the nodes are
> reachable after reboot.
>
>
> This reminds me of:
> https://bugzilla.redhat.com/show_bug.cgi?id=1394890
>
> Is this bug a duplicate of ^

It appears to be. Setting to DFG:Networking.

Issue gone with the current OSP9 images in place setting to verified

*** This bug has been marked as a duplicate of bug 1394890 ***
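
For anyone still on affected images, the manual workaround from comment #10 could be wrapped roughly as follows. This is only a sketch, not something taken from the bug: the function names are hypothetical, the defaults (192.0.2.1, br-ex, eth0) are simply the values seen in the session above, and it assumes the iputils `ping` (`-W` timeout in seconds) and the `ifdown`/`ifup` network-scripts present on the controllers.

```shell
#!/bin/sh
# Sketch only: re-apply the manual workaround from this bug when the
# gateway does not answer a ping. All names below are illustrative.

# Assumed defaults, copied from the session in this report.
GATEWAY="${GATEWAY:-192.0.2.1}"   # address pinged from the controller
BRIDGE="${BRIDGE:-br-ex}"         # external OVS bridge
NIC="${NIC:-eth0}"                # physical port attached to the bridge

# Return 0 if the given address answers a single ping within 2 seconds.
gateway_reachable() {
    ping -c1 -W2 "$1" >/dev/null 2>&1
}

# Same sequence as the manual workaround: ifdown br-ex; ifup br-ex; ifup eth0
bounce_bridge() {
    ifdown "$1"
    ifup "$1"
    ifup "$2"
}

# Bounce the bridge only when the gateway is unreachable, then re-check.
# The return status reflects whether the gateway answers afterwards.
fix_if_unreachable() {
    if gateway_reachable "$GATEWAY"; then
        echo "gateway $GATEWAY reachable, nothing to do"
        return 0
    fi
    echo "gateway $GATEWAY unreachable, bouncing $BRIDGE and $NIC"
    bounce_bridge "$BRIDGE" "$NIC"
    gateway_reachable "$GATEWAY"
}
```

Running `fix_if_unreachable` as root on a controller console would perform the same three commands only when the ping fails. It does not address the underlying defect tracked in bug 1394890.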