Bug 2249024 - RHOSP16.2 to 17.1 upgrade: During Leapp upgrade steps the network interface names are not preserved
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 17.1 (Wallaby)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: z3
Target Release: 17.1
Assignee: Sergii Golovatiuk
QA Contact: Archana Singh
URL:
Whiteboard:
Duplicates: 2263838 2314924
Depends On:
Blocks: 2263838
 
Reported: 2023-11-10 09:43 UTC by Shravan Kumar Tiwari
Modified: 2024-10-08 12:22 UTC

Fixed In Version: openstack-tripleo-heat-templates-14.3.1-17.1.20231103010835.el9ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2024-05-22 20:42:17 UTC
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Issue Tracker | OSP-30349 | 0 | None | None | None | 2023-11-10 09:45:51 UTC
Red Hat Product Errata | RHSA-2024:2736 | 0 | None | None | None | 2024-05-22 20:42:24 UTC

Description Shravan Kumar Tiwari 2023-11-10 09:43:59 UTC
Description of problem:
One of our telco customers is performing a RHOSP 16.2 to 17.1 upgrade. After the OSP upgrade completed successfully, they have to perform the Leapp upgrade of the node from RHEL 8 to RHEL 9 (following the documentation section https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/17.1/html-single/framework_for_upgrades_16.2_to_17.1/index#upgrading-the-undercloud-operating-system).

The problem they experience is that NICs eno49 and eno50 belong to bond0, and after the Leapp upgrade both eno49 and eno50 are represented with the same MAC address.
The customer believes this is due to a bug in the Ansible code that generates this file: it writes the same MAC for eno49 and eno50, which are part of the same bond.
The two interfaces report the same MAC only because they are in an LACP bond, so for eno50 the code should take its permaddr instead.
When this is changed manually (for eno50, to the address ending in :61), it works perfectly fine after a server reboot.
If it is not modified, the interfaces get badly mixed up.

Problematic code part:
------------------------
The faulty code can be found in the rendered Ansible code, for example:
- name: Keep nics with prefix in NICsPrefixesToUdev from renaming
  vars:
    nics_prefixes_to_keep: {get_attr: [RoleParametersValue, value, 'nics_prefixes_to_keep']}
  # (.ifname | test("^.*\\..*$") | not) removes vlan nics like ens1.1
  # (.ifname | test("^.*v[0-9]*$") | not) removes virtual function nics ens1v1
  # (.ifname | test("^.*_[0-9]*$") | not) also removes virtual function nics ens1_1
  shell: >
    ip -j link show | jq -r --arg prefix "{{ item }}" '.[] | select((.ifname | startswith($prefix)) and (.ifname | test("^.*v[0-9]*$")|not) and (.ifname | test("^.*_[0-9]*$") | not) and (.ifname | test("^.*\\..*$") | not)) | "SUBSYSTEM==\"net\",ACTION==\"add\",DRIVERS==\"?*\"," + "NAME=\"" + .ifname +"\" ,ATTR{address}==\"" + .address + "\""' >> /etc/udev/rules.d/70-rhosp-persistent-net.rules
  loop: "{{ nics_prefixes_to_keep|list }}"

Workaround the customer tried:
-------------------------------
Modifying the code as follows, so that the permanent address (permaddr) is used instead of the duplicated bond address, does the trick:
ip -j link show | jq -r --arg prefix "en" '.[] | select((.ifname | startswith($prefix)) and (.ifname | test("^.*v[0-9]*$")|not) and (.ifname | test("^.*_[0-9]*$") | not) and (.ifname | test("^.*\\..*$") | not)) | if .permaddr? then .address=.permaddr else . end | "SUBSYSTEM==\"net\",ACTION==\"add\",DRIVERS==\"?*\"," + "NAME=\"" + .ifname +"\" ,ATTR{address}==\"" + .address + "\""'
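
To make the effect of the workaround easier to follow, here is a minimal Python sketch of the same selection logic. It is not the code shipped in the errata; it only mirrors the jq filter above. The function name udev_rules, the SAMPLE_LINKS data and the MAC addresses in it are hypothetical (the MACs come from a documentation range), and the sketch assumes that `ip -j link show` reports a permaddr key only for interfaces whose current address differs from the permanent one, as the customer's workaround implies.

```python
import re

# Same exclusions as the jq filter: virtual functions (ens1v1, ens1_1)
# and VLAN interfaces (ens1.1).
_SKIP_PATTERNS = [re.compile(p) for p in (r"^.*v[0-9]*$", r"^.*_[0-9]*$", r"^.*\..*$")]


def udev_rules(links, prefix):
    """Build one udev rule per matching NIC, preferring the permanent MAC.

    `links` is the parsed output of `ip -j link show` (a list of dicts).
    """
    rules = []
    for link in links:
        name = link["ifname"]
        if not name.startswith(prefix):
            continue
        if any(p.search(name) for p in _SKIP_PATTERNS):
            continue
        # A bond member reports the bond's MAC in "address"; fall back to
        # "address" only when no "permaddr" is present.
        mac = link.get("permaddr") or link["address"]
        rules.append(
            'SUBSYSTEM=="net",ACTION=="add",DRIVERS=="?*",'
            f'NAME="{name}" ,ATTR{{address}}=="{mac}"'
        )
    return rules


# Hypothetical sample: eno49 and eno50 are bond members sharing one MAC.
SAMPLE_LINKS = [
    {"ifname": "bond0", "address": "00:00:5e:00:53:60"},
    {"ifname": "eno49", "address": "00:00:5e:00:53:60"},
    {"ifname": "eno50", "address": "00:00:5e:00:53:60", "permaddr": "00:00:5e:00:53:61"},
    {"ifname": "eno49.100", "address": "00:00:5e:00:53:60"},
    {"ifname": "eno49v1", "address": "00:00:5e:00:53:70"},
]

if __name__ == "__main__":
    for rule in udev_rules(SAMPLE_LINKS, "en"):
        print(rule)
```

With the sample data, eno49 and eno50 get rules with distinct MAC addresses, while bond0, the VLAN interface and the virtual function are skipped.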




Expected results:
- The deployment scripts used for the RHOSP upgrade/Leapp should handle this, and the upgrade should not fail.


Additional info:

1- This issue affects all roles. It is reported here for director, but all other roles are affected by the same code bug.
2- If the first upgrade attempt fails and the upgrade is run a second time, /etc/udev/rules.d/70-rhosp-persistent-net.rules ends up with duplicate entries. It seems that on relaunch the task does not clean the udev rules file; it just appends to it.
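
The duplicate entries in point 2 come from appending (>>) on every run. One possible way to make the step safe to re-run is sketched below in Python. This is an illustration only, not the fix delivered in the advisory; merge_rules is a hypothetical helper that drops exact duplicate lines while keeping the original order.

```python
def merge_rules(existing_text, new_rules):
    """Return rules file content with exact duplicate lines removed.

    `existing_text` is the current content of the rules file (may be empty);
    `new_rules` is an iterable of rule lines generated by the current run.
    """
    seen = set()
    merged = []
    for line in existing_text.splitlines() + list(new_rules):
        line = line.strip()
        # Skip blank lines and anything already written by an earlier run.
        if line and line not in seen:
            seen.add(line)
            merged.append(line)
    return "\n".join(merged) + ("\n" if merged else "")
```

Note that this only removes identical lines. If an earlier failed run wrote a rule with the wrong MAC for an interface, that stale rule would survive, so truncating the file before regenerating it may be the simpler option.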

Comment 9 Juan Badia Payno 2024-02-13 14:20:56 UTC
*** Bug 2263838 has been marked as a duplicate of this bug. ***

Comment 24 errata-xmlrpc 2024-05-22 20:42:17 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: openstack-tripleo-heat-templates and tripleo-ansible update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2024:2736

Comment 25 Juan Badia Payno 2024-10-08 12:22:22 UTC
*** Bug 2314924 has been marked as a duplicate of this bug. ***

