Bug 2151958 - Data plane disruption during update from 16.2.1, 16.2.0, or any 16.1 release to 16.2.2 or later in ML2/OVN deployments
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 16.2 (Train)
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: z5
Target Release: 16.2 (Train on RHEL 8.4)
Assignee: Sofer Athlan-Guyot
QA Contact: Archana Singh
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-12-08 17:44 UTC by Sofer Athlan-Guyot
Modified: 2023-08-21 15:53 UTC (History)

Fixed In Version: openstack-tripleo-heat-templates-11.6.1-2.20230225005015.2e5f254.el8ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-04-26 12:17:18 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker OSP-20822 0 None None None 2022-12-08 17:50:05 UTC
Red Hat Knowledge Base (Solution) 6990594 0 None None None 2022-12-15 16:41:43 UTC
Red Hat Product Errata RHBA-2023:1763 0 None None None 2023-04-26 12:17:51 UTC

Description Sofer Athlan-Guyot 2022-12-08 17:44:31 UTC
This bug was initially created as a copy of Bug #2094265

I am copying this bug because: The code provided in 2094265 fixes *deployment* but not update. Creating this bugzilla to track the fix for update.

Description of problem:
The customer upgraded an RHOSP environment from 16.2.1 to 16.2.2, and during the procedure the VMs running there were impacted. The impact appears to have occurred during the ovn-controller container refresh, when at least some of the VMs experienced connection timeouts.

The issue appears to have been triggered exactly while following the step below. As mentioned, everything recovered automatically after 60-90 seconds.

https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.2/html-single/keeping_red_hat_openstack_platform_updated/index#proc_updating-ovn-controller-container_updating-overcloud

Version-Release number of selected component (if applicable):
RHOSP 16.2.1

How reproducible:
Upgrade from RHOSP 16.2.1 to 16.2.2

Steps to Reproduce:
1. Upgrade 16.2.1 to 16.2.2



Actual results:
The upgrade completed successfully, but VM connectivity dropped for 60-90 seconds during the data plane upgrade.
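
One way to quantify a connectivity drop like the one reported above is to probe a VM at a fixed interval while the ovn-controller container is updated, then compute the longest run of failed probes. The following is a minimal sketch, not part of the original report or of the KCS fix. The function names (`outage_windows`, `longest_outage`, `ping_once`, `monitor`) are hypothetical, and `ping_once` assumes the Linux iputils `ping` command with its `-c` and `-W` options.

```python
import subprocess
import time


def outage_windows(samples):
    """Return (start, end) pairs for each run of consecutive failed probes.

    samples is an iterable of (timestamp, ok) tuples in time order.
    A window starts at the first failed probe and ends at the next
    successful one (or at the last sample if connectivity never returns).
    """
    windows = []
    start = None
    last_ts = None
    for ts, ok in samples:
        if not ok and start is None:
            start = ts  # first failure of a new outage
        elif ok and start is not None:
            windows.append((start, ts))  # connectivity is back
            start = None
        last_ts = ts
    if start is not None:
        windows.append((start, last_ts))  # still down at end of run
    return windows


def longest_outage(samples):
    """Length in seconds of the longest outage window, or 0.0 if none."""
    return max((end - start for start, end in outage_windows(samples)), default=0.0)


def ping_once(host, timeout_s=1):
    """Send one ICMP echo request (assumes Linux iputils ping flags)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def monitor(host, duration_s, interval_s=1.0):
    """Probe host every interval_s seconds for roughly duration_s seconds."""
    samples = []
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        samples.append((time.monotonic(), ping_once(host)))
        time.sleep(interval_s)
    return samples
```

For example, calling `longest_outage(monitor("192.0.2.10", 600))` from a host that can reach the VM, started just before the ovn-controller update step, would report the longest observed gap in seconds. The address is a placeholder, and the result is only as precise as the probe interval.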

Expected results:
Successful upgrade without any data plane downtime.

Additional info:

Comment 3 Sofer Athlan-Guyot 2022-12-15 16:53:29 UTC
Hi,

I added the KCS link.

On the documentation side, @kgilliga, we need a new entry under "Known issues that might block an update" in https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.2/html-single/keeping_red_hat_openstack_platform_updated/index#assembly_updating-the-overcloud_keeping-updated . Something along these lines:

    "A data plane cut might be experienced when using OVN"

    To prevent this cut, run the code associated with KCS 6990594 before running the ovn-controller update.

Comment 22 errata-xmlrpc 2023-04-26 12:17:18 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 16.2.5 (Train) bug fix and enhancement advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:1763

