Bug 1388546
Summary: | Upgrade of openvswitch-2.4.0-1.el7 makes ip disappears. (osp8) | | |
---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Omri Hochman <ohochman> |
Component: | openstack-tripleo-heat-templates | Assignee: | Marios Andreou <mandreou> |
Status: | CLOSED ERRATA | QA Contact: | Alexander Chuzhoy <sasha> |
Severity: | urgent | Docs Contact: | |
Priority: | medium | | |
Version: | 8.0 (Liberty) | CC: | achernet, alan_bishop, aloughla, apevec, arkady_kanevsky, audra_cooper, cdevine, christopher_dearborn, chrisw, david_paterson, jcoufal, John_walsh, kazen, kbader, kurt_hey, lbezdick, lhh, mandreou, markmc, mburns, randy_perryman, rhel-osp-director-maint, rhos-maint, rsussman, sathlang, srevivo |
Target Milestone: | async | Keywords: | Reopened, Triaged, ZStream |
Target Release: | 8.0 (Liberty) | | |
Hardware: | Unspecified | | |
OS: | Unspecified | | |
Whiteboard: | | | |
Fixed In Version: | openstack-tripleo-heat-templates-0.8.14-23.el7ost | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | 1364540 | | |
: | 1394322 | Environment: | |
Last Closed: | 2017-01-05 14:37:02 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Bug Depends On: | 1364540 | | |
Bug Blocks: | 1261979, 1305654, 1337794, 1388543, 1394322, 1406478 | | |

Comment 1
Marios Andreou
2016-10-31 14:40:36 UTC
We need to make sure that OSP8 running on RHEL 7.2 can be updated to the latest OSP8 running on RHEL 7.3 without losing the IP on the overcloud during the upgrade of openvswitch.

Environment:
openstack-tripleo-heat-templates-0.8.14-23.el7ost.noarch

Deployed 8.0GA with rhel7.2.
Updated to latest OSP8 with rhel7.3.
The osp version is: openvswitch-2.4.0-2.el7_2.x86_64
The controllers were reachable.
Rebooted all the nodes in the setup and verified that all nodes are reachable via the ctlplane network after reboot.

Marios, is this enough to verify this bug, or would you like me to check something else?

Slight correction to comment #13:
The openvswitch version is not osp of course, meant to write "ovs".

Note that after a minor update of 8.0, it's not 2.5.

(In reply to Alexander Chuzhoy from comment #14)
> Slight correction to comment #13;
> The openvswitch version is not osp of course, meant to write "ovs".
>
> Note, that after minor update of 8.0 - it's not 2.5

Right, this BZ and the 'fix' special-case handling we carry in the review tracker are about upgrading openvswitch 2.4 to 2.5, so if you're not getting to openvswitch 2.5 then it's not verifying here imo. Maybe we need to ping mburns about the downstream build/status of openvswitch 2.5 on 8.

FailedQA:

Environment:
openvswitch-2.5.0-14.git20160727.el7fdp.x86_64
instack-undercloud-2.2.7-8.el7ost.noarch

Deployed OSP8.0GA:
instack-undercloud-2.2.7-4.el7ost.noarch
openvswitch-2.4.0-2.el7_2.x86_64

Checked the status of services:
● ovirt-guest-agent.service loaded failed failed oVirt Guest Agent

Checked the IP with: ip a s dev br-ctlplane
5: br-ctlplane: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
    link/ether 00:18:86:d3:7b:59 brd ff:ff:ff:ff:ff:ff
    inet 192.0.2.1/24 brd 192.0.2.255 scope global br-ctlplane
       valid_lft forever preferred_lft forever
    inet6 fe80::218:86ff:fed3:7b59/64 scope link
       valid_lft forever preferred_lft forever

Ran yum update on the undercloud. Relevant rpm versions:
openvswitch-2.5.0-14.git20160727.el7fdp.x86_64
instack-undercloud-2.2.7-8.el7ost.noarch

List of failed services:
● httpd.service loaded failed failed The Apache HTTP Server
● openstack-ceilometer-api.service loaded failed failed OpenStack ceilometer API service
● openstack-heat-api-cfn.service loaded failed failed Openstack Heat CFN-compatible API Service
● openstack-heat-api-cloudwatch.service loaded failed failed OpenStack Heat CloudWatch API Service
● openstack-heat-api.service loaded failed failed OpenStack Heat API Service
● openstack-ironic-api.service loaded failed failed OpenStack Ironic API service
● ovirt-guest-agent.service loaded failed failed oVirt Guest Agent
● rabbitmq-server.service loaded failed failed RabbitMQ broker

The IP is gone from the br-ctlplane interface. Output from ip a s dev br-ctlplane:
12: br-ctlplane: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN
    link/ether 00:18:86:d3:7b:59 brd ff:ff:ff:ff:ff:ff

Note: If I reboot the undercloud, the IP "returns" after reboot and the list of failed services is reduced to
● ovirt-guest-agent.service loaded failed failed oVirt Guest Agent
which was exactly the case before the update.

[stack@instack ~]$ ip a s dev br-ctlplane
8: br-ctlplane: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN qlen 1000
    link/ether 00:18:86:d3:7b:59 brd ff:ff:ff:ff:ff:ff
    inet 192.0.2.1/24 brd 192.0.2.255 scope global br-ctlplane
       valid_lft forever preferred_lft forever
    inet6 fe80::218:86ff:fed3:7b59/64 scope link
       valid_lft forever preferred_lft forever

The subsequent overcloud update failed after a long time.

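The before/after comparison of `ip a s dev br-ctlplane` above can be turned into a scriptable check. The following is only a minimal sketch: the function name `iface_has_ipv4` is hypothetical (it is not part of any OSP or openvswitch tooling), and it only inspects the text that `ip` prints, looking for an `inet A.B.C.D/NN` entry while ignoring `inet6` entries.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the bug report): reads the output of
# `ip a s dev <iface>` on stdin and returns 0 if it contains an IPv4
# address entry ("inet A.B.C.D/NN"), non-zero otherwise.
# "inet6" entries do not match because the pattern requires a space
# directly after "inet".
iface_has_ipv4() {
    grep -Eq '(^|[[:space:]])inet [0-9]+(\.[0-9]+){3}/[0-9]+'
}

# Example use on a live undercloud (commented out so that sourcing this
# file has no side effects):
#   ip a s dev br-ctlplane | iface_has_ipv4 \
#       || echo "br-ctlplane has no IPv4 address"
```

Running the check before and after `yum update` on the undercloud would flag the state shown in the FailedQA output, where the interface is DOWN and carries only a link/ether line.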
Checking where the update took place:
[stack@instack ~]$ for i in 192.0.2.{7..12}; do echo $i; ssh heat-admin@$i "hostname; sudo grep -i update /var/log/yum.log"; done
192.0.2.7
overcloud-controller-2.localdomain
192.0.2.8
overcloud-compute-1.localdomain
192.0.2.9
overcloud-compute-0.localdomain
192.0.2.10
overcloud-controller-0.localdomain
192.0.2.11
overcloud-cephstorage-0.localdomain
Dec 20 21:49:19 Updated: 1:openstack-puppet-modules-7.1.5-1.el7ost.noarch
192.0.2.12
overcloud-controller-1.localdomain

(In reply to Alexander Chuzhoy from comment #16)
> FailedQA:
> Ran yum update on the undercloud:
> relevant rpms version:
> openvswitch-2.5.0-14.git20160727.el7fdp.x86_64
> instack-undercloud-2.2.7-8.el7ost.noarch
>

@Sasha as mentioned on irc, losing the IP after upgrading openvswitch on the *undercloud* is a different (obviously related) issue, and for 9->10 we added the explicit

sudo systemctl stop openvswitch

before the undercloud upgrade. Can you please check if it works as a workaround here too? If not, then we probably need to reach out to the ovs guys for more debugging here (i.e. file a distinct bug for the undercloud openvswitch 2.4 to 2.5 upgrade on OSP8).

WRT the overcloud update as per comment #18, I'd rather first make sure we have a good setup (undercloud upgrade completed fine with the workaround) and then see if the overcloud update fails.

Verified:

Environment:
openvswitch-2.5.0-14.git20160727.el7fdp.x86_64

Following comment #23 I reran the update on OSP8 with the following procedure:
1) On the undercloud node, stop all services starting with:
   openstack-*
   neutron-*
   openvswitch.service
2) Make sure the updates are available (take care of missing repos if needed)
3) openstack undercloud upgrade

Then I ran the overcloud update normally and it completed successfully.
I was able to ping all OC nodes.

Steps I am taking:
1. pcs cluster stop
2. systemctl stop openvswitch.service
3. yum update openvswitch*
4. systemctl start openvswitch.service
5. ip a - validate all IPs
6. pcs cluster start
7. pcs status until all nodes are back in cluster
repeat on next controller
---
computes
1. systemctl stop neutron/openstack/openvswitch
2. yum update openvswitch*
3. systemctl start openvswitch
4. ip a - validate IPs
5. systemctl start neutron/openstack

*** Bug 1406478 has been marked as a duplicate of this bug. ***

Note: On a setup with a successful minor update, I get the following openstack-tripleo-heat-templates version after updating the undercloud:
openstack-tripleo-heat-templates-0.8.14-24.el7ost.noarch

Hi,

Continuing the discussion from https://bugzilla.redhat.com/show_bug.cgi?id=1406478.

> I see that nopostrun is not part of Liberty tag, but the Mitaka
> branch has it.

I made a typo in the original comment. The command to run is

grep -r postun ~/pilot/templates

But according to your previous comment, you don't have the latest version of the tht package. The code was backported downstream only, as Liberty was EOL at that time and the code couldn't be pushed upstream.

The rpm that holds the necessary code is openstack-tripleo-heat-templates-0.8.14-24.el7ost.noarch.rpm

Could you try again after having upgraded the openstack-tripleo-heat-templates package?

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2017-0026.html
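
The manual per-controller steps listed earlier in this thread ("Steps I am taking") could be wrapped in a small script so that the order is not left to memory. This is a sketch under stated assumptions, not the procedure shipped in the erratum: the names `run` and `update_ovs_on_controller` and the `DRY_RUN` variable are hypothetical, the commands are exactly the ones quoted in the comment, and `pcs status` is run once rather than polled until all nodes rejoin.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: print each command, and execute it only when
# DRY_RUN=0. The default (DRY_RUN unset or 1) only prints, so sourcing
# and calling this on the wrong host changes nothing.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# Controller steps as quoted in the comment, in the same order.
# The chain stops at the first command that fails.
update_ovs_on_controller() {
    run pcs cluster stop &&
    run systemctl stop openvswitch.service &&
    run yum update 'openvswitch*' &&
    run systemctl start openvswitch.service &&
    run ip a &&
    run pcs cluster start &&
    run pcs status
}

# Example (commented out): preview first, then execute as root.
#   update_ovs_on_controller
#   DRY_RUN=0 update_ovs_on_controller
```

The operator still has to inspect the `ip a` output and repeat `pcs status` until the node is back in the cluster before moving to the next controller; the sketch does not automate either check.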