Bug 1867458 - [RHOSP 13 to 16.1 Upgrades][OvS Offload] Baremetal node hangs on reboot during leapp upgrade
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 16.1 (Train)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: z1
Target Release: 16.1 (Train on RHEL 8.2)
Assignee: Saravanan KR
QA Contact: David Rosenfeld
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-08-10 04:01 UTC by Saravanan KR
Modified: 2023-09-14 06:05 UTC (History)

Fixed In Version:
Doc Type: Known Issue
Doc Text:
A Leapp issue causes fast forward upgrades from Red Hat OpenStack Platform (RHOSP) 13 to RHOSP 16.1 to fail. + A Leapp upgrade from RHEL 7 to RHEL 8 removes all older RHOSP packages, performs an operating system upgrade, and reboots the node. Because Leapp installs the os-net-config package at the "overcloud upgrade run" stage, the os-net-config-sriov executable is not available for the sriov_config service to configure virtual functions (VFs) and switchdev mode after the reboot. As a result, VFs are not configured and switchdev mode is not applied on the physical function (PF) interfaces. + As a workaround, manually create the VFs, apply switchdev mode to the VF interface, and restart the VF interface.
Clone Of:
Environment:
Last Closed: 2020-12-03 10:38:01 UTC
Target Upstream Version:
Embargoed:



Description Saravanan KR 2020-08-10 04:01:09 UTC
This bug was initially created as a copy of Bug #1866372

I am copying this bug because: A feature-specific analysis (NIC partitioning and OvS offload) has to be done for FFU 13 to 16.1

Description of problem:

With the kernel args issue fixed (BZ #1858673), the ComputeOvsDpdkSriov node hangs on the reboot during the leapp upgrade. The node is not recoverable even after a manual reboot (login via ssh and via the virtual console both fail).

Comment 2 Saravanan KR 2020-08-10 06:58:26 UTC
For NIC partitioning, the VFs are used for an OpenStack network (InternalApi, Storage, Tenant). When the ifcfg scripts for the respective network run, the underlying VF interface is not available, which causes network.service to fail. In addition, to restore the networks that use VFs, os-net-config has to be run without "--no-activate".

The same applies to the offload feature. The ifcfg run expects the interface to already be in switchdev mode, but because the sriov_config service did not run, the mode is not applied. The network is established in legacy mode, and the mode then has to be changed to switchdev, which also requires openvswitch to be restarted.
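
The manual recovery implied above (create the VFs, switch the PF to switchdev mode, restart openvswitch) can be sketched as follows. This is only a rough illustration, not a tested procedure from this bug: the helper name, the interface name, the PCI address, and the VF count are all hypothetical, and vendor-specific steps (for example, unbinding VF drivers before changing the eswitch mode) are omitted. The helper only builds the command strings; it does not execute anything.

```python
from typing import List


def switchdev_workaround_commands(pf: str, pci_addr: str, num_vfs: int) -> List[str]:
    """Build (but do not run) shell commands for a manual switchdev recovery.

    pf, pci_addr and num_vfs are placeholders; real values depend on the
    node's hardware and on the original os-net-config settings.
    """
    if num_vfs < 1:
        raise ValueError("num_vfs must be at least 1")

    # Standard kernel sysfs knob for the number of VFs on a PF.
    sriov_numvfs = f"/sys/class/net/{pf}/device/sriov_numvfs"

    return [
        # 1. Create the VFs on the physical function.
        f"echo {num_vfs} > {sriov_numvfs}",
        # 2. Move the PF's embedded switch from legacy to switchdev mode.
        f"devlink dev eswitch set pci/{pci_addr} mode switchdev",
        # 3. Restart openvswitch so it picks up the mode change.
        "systemctl restart openvswitch",
    ]


if __name__ == "__main__":
    # Example values only; replace with the node's actual PF and PCI address.
    for cmd in switchdev_workaround_commands("ens1f0", "0000:03:00.0", 4):
        print(cmd)
```

Printing the commands rather than running them keeps the sketch safe to review on a node that may already be in a degraded network state.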

Comment 10 Saravanan KR 2020-12-03 10:38:01 UTC
Splitting this BZ into 2:
(1) OvS HW Offload - this BZ; it will be closed as WONTFIX, as we don't have an active use case to support it
(2) NIC Partitioning - open a new BZ to continue work on it

Comment 11 Red Hat Bugzilla 2023-09-14 06:05:31 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days

