Bug 2142760

Summary: Upgrade [OSP16.2 -> OSP17.1] Network is failing after upgrade from rhel8.4 to rhel 9.x
Product: Red Hat OpenStack
Reporter: Juan Badia Payno <jbadiapa>
Component: leapp-repository-openstack
Assignee: Sergii Golovatiuk <sgolovat>
Status: CLOSED ERRATA
QA Contact: Archana Singh <arcsingh>
Severity: high
Docs Contact:
Priority: medium
Version: 17.1 (Wallaby)
CC: arcsingh, drosenfe, ekuris, jkreger, jpretori, jschluet, lbezdick, mburns, mciecier, pgrist, sgolovat
Target Milestone: ga
Keywords: Triaged
Target Release: 17.1
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: openstack-tripleo-heat-templates-14.3.1-17.1.20230624230958.3c59c35.el9ost leapp-repository-openstack-0.0.7-1.el8ost
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2023-08-16 00:17:32 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 2016660

Description Juan Badia Payno 2022-11-15 07:59:37 UTC
In an infrared development environment, after upgrading controller-0 from RHEL 8.4 to RHEL 9.1, the network on controller-0 did not work.

To make the network work again, we first had to enable the console for use with "virsh console" and then enable and start network.service.

The network.service started with errors.

    [root@controller-0 network-scripts]#
     
    [root@controller-0 network-scripts]# journalctl -xeu network.service
    Nov 14 16:51:07 controller-0 network[367744]: See all profiles with `nmcli connection`.
    Nov 14 16:51:07 controller-0 network[367744]: Reload files from disk with `nmcli connection reload`
    Nov 14 16:51:07 controller-0 network[367744]: Activate the desired profile with `nmcli connection up \"$NAME\"`
    Nov 14 16:51:07 controller-0 network[367070]: [FAILED]
    Nov 14 16:51:08 controller-0 network[367070]: Bringing up interface vlan20:  [  OK  ]
    Nov 14 16:51:08 controller-0 ovs-vsctl[367998]: ovs|00001|vsctl|INFO|Called as ovs-vsctl -t 10 -- --if-exists del-port br-tenant vlan30 -- add-port br-tenant vlan30 tag=30 -- set Interface vlan30 type=internal
    Nov 14 16:51:10 controller-0 network[367070]: Bringing up interface vlan30:  [  OK  ]
    Nov 14 16:51:11 controller-0 network[367070]: Bringing up interface vlan40:  [  OK  ]
    Nov 14 16:51:11 controller-0 ovs-vsctl[368417]: ovs|00001|vsctl|INFO|Called as ovs-vsctl -t 10 -- --if-exists del-port br-tenant vlan50 -- add-port br-tenant vlan50 tag=50 -- set Interface vlan50 type=internal
    Nov 14 16:51:12 controller-0 network[367070]: Bringing up interface vlan50:  [  OK  ]
    Nov 14 16:51:12 controller-0 network[368502]: RTNETLINK answers: File exists
    Nov 14 16:51:12 controller-0 network[368503]: RTNETLINK answers: File exists
    Nov 14 16:51:12 controller-0 network[368504]: RTNETLINK answers: File exists
    Nov 14 16:51:12 controller-0 network[368505]: RTNETLINK answers: File exists
    Nov 14 16:51:12 controller-0 network[368506]: RTNETLINK answers: File exists
    Nov 14 16:51:12 controller-0 network[368507]: RTNETLINK answers: File exists
    Nov 14 16:51:12 controller-0 network[368508]: RTNETLINK answers: File exists
    Nov 14 16:51:12 controller-0 network[368509]: RTNETLINK answers: File exists
    Nov 14 16:51:12 controller-0 network[368510]: RTNETLINK answers: File exists
    Nov 14 16:51:12 controller-0 systemd[1]: network.service: Control process exited, code=exited, status=1/FAILURE
     Subject: Unit process exited
     Defined-By: systemd
     Support: https://access.redhat.com/support
     
     An ExecStart= process belonging to unit network.service has exited.
     
     The process' exit code is 'exited' and its exit status is 1.
    Nov 14 16:51:12 controller-0 systemd[1]: network.service: Failed with result 'exit-code'.
     Subject: Unit failed
     Defined-By: systemd
     Support: https://access.redhat.com/support
     
     The unit network.service has entered the 'failed' state with result 'exit-code'.
    Nov 14 16:51:12 controller-0 systemd[1]: network.service: Unit process 368236 (ifup-eth) remains running after unit stopped.
    Nov 14 16:51:12 controller-0 systemd[1]: network.service: Unit process 368346 (ifup-eth) remains running after unit stopped.
    Nov 14 16:51:12 controller-0 systemd[1]: network.service: Unit process 368348 (sleep) remains running after unit stopped.
    Nov 14 16:51:12 controller-0 systemd[1]: network.service: Unit process 368446 (arping) remains running after unit stopped.
    Nov 14 16:51:12 controller-0 systemd[1]: network.service: Unit process 368457 (ifup-eth) remains running after unit stopped.
    Nov 14 16:51:12 controller-0 systemd[1]: network.service: Unit process 368459 (sleep) remains running after unit stopped.
    Nov 14 16:51:12 controller-0 systemd[1]: Failed to start LSB: Bring up/down networking.
     Subject: A start job for unit network.service has failed
     Defined-By: systemd
     Support: https://access.redhat.com/support
     
     A start job for unit network.service has finished with a failure.
     
     The job identifier is 348476 and the job result is failed.
    Nov 14 16:51:12 controller-0 systemd[1]: network.service: Consumed 1.975s CPU time.
     Subject: Resources consumed by unit runtime
     Defined-By: systemd
     Support: https://access.redhat.com/support
     
     The unit network.service completed and consumed the indicated resources
     
     
    [root@controller-0 network-scripts]# nmcli connection
     
    [root@controller-0 network-scripts]# 
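
The empty "nmcli connection" output above may be explained by ifcfg files that opt out of NetworkManager. A minimal sketch for listing them follows. list_unmanaged is a hypothetical helper name, and the default directory is assumed from the shell prompt shown in this report.

```shell
#!/bin/sh
# list_unmanaged (hypothetical helper): print the names of the ifcfg-* files
# in a directory that set NM_CONTROLLED=no, quoted or unquoted.
list_unmanaged() {
    # Assumed default location, taken from the prompt in this report.
    dir=${1:-/etc/sysconfig/network-scripts}
    for f in "$dir"/ifcfg-*; do
        # Skip the unexpanded glob when no ifcfg files exist.
        [ -f "$f" ] || continue
        if grep -Eq '^NM_CONTROLLED="?no"?$' "$f"; then
            basename "$f"
        fi
    done
}
```

Running list_unmanaged with no argument on the affected node would show which interface files NetworkManager is told to ignore.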

As the interfaces seemed to be unmanaged by NetworkManager, we also tried to enable them with:
    
    [root@controller-0 network-scripts]# sed -i -E 's/NM_CONTROLLED=no/NM_CONTROLLED=yes/g' *
    [root@controller-0 network-scripts]# nmcli connection reload
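
Note that the sed command above edits every file in the directory, including helper scripts such as ifup-eth. A narrower variant, sketched here under the assumption that only ifcfg-* files need the change, touches just those files and keeps a .bak copy of each one it modifies. set_nm_controlled is a hypothetical helper name, not something shipped in any package.

```shell
#!/bin/sh
# set_nm_controlled (hypothetical helper): flip NM_CONTROLLED=no to
# NM_CONTROLLED=yes in ifcfg-* files only, leaving a .bak copy of each
# file that is changed. Relies on "sed -i" as provided by GNU sed.
set_nm_controlled() {
    # Assumed default location, taken from the prompt in this report.
    dir=${1:-/etc/sysconfig/network-scripts}
    for f in "$dir"/ifcfg-*; do
        [ -f "$f" ] || continue
        # Do not rewrite backups left by an earlier run.
        case $f in *.bak) continue ;; esac
        if grep -q '^NM_CONTROLLED=no$' "$f"; then
            sed -i.bak 's/^NM_CONTROLLED=no$/NM_CONTROLLED=yes/' "$f"
            echo "updated $f"
        fi
    done
}
```

As with the original command, "nmcli connection reload" would still be needed afterwards so that NetworkManager rereads the files.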

Then we restarted the service:

    [root@controller-0 network-scripts]# systemctl restart network
    Job for network.service failed because the control process exited with error code.
    See "systemctl status network.service" and "journalctl -xeu network.service" for details.
    
But we hit a very similar issue:

[root@controller-0 network-scripts]# journalctl -xeu network.service
░░                                                                         
░░ The job identifier is 4779538.
Nov 15 07:56:56 controller-0 network[897663]: WARN      : [network] You are using 'network' service provided by 'network-scripts', which are now deprecated.
Nov 15 07:56:56 controller-0 network[897702]: You are using 'network' service provided by 'network-scripts', which are now deprecated.
Nov 15 07:56:56 controller-0 network[897663]: WARN      : [network] 'network-scripts' will be removed from distribution in near future.
Nov 15 07:56:56 controller-0 network[897703]: 'network-scripts' will be removed from distribution in near future.
Nov 15 07:56:56 controller-0 network[897663]: WARN      : [network] It is advised to switch to 'NetworkManager' instead for network management.
Nov 15 07:56:56 controller-0 network[897704]: It is advised to switch to 'NetworkManager' instead for network management.
Nov 15 07:56:56 controller-0 network[897663]: Bringing up loopback interface:  [  OK  ]
Nov 15 07:56:56 controller-0 network[897663]: Bringing up interface br-ex:  [  OK  ]
Nov 15 07:56:57 controller-0 network[897663]: Bringing up interface br-tenant:  [  OK  ]
Nov 15 07:56:57 controller-0 network[897663]: Bringing up interface ens3:  [  OK  ]
Nov 15 07:56:57 controller-0 ovs-vsctl[898030]: ovs|00001|vsctl|INFO|Called as ovs-vsctl -t 10 -- --if-exists del-port br-tenant ens4 -- add-port br-tenant ens4
Nov 15 07:56:57 controller-0 network[897663]: Bringing up interface ens4:
Nov 15 07:56:57 controller-0 network[898049]: Error: unknown connection '/etc/sysconfig/network-scripts/ifcfg-br-tenant'.
Nov 15 07:56:57 controller-0 network[898057]: Failure to activate file "br-tenant"!
Nov 15 07:56:57 controller-0 network[898057]: See all profiles with `nmcli connection`.
Nov 15 07:56:57 controller-0 network[898057]: Reload files from disk with `nmcli connection reload`
Nov 15 07:56:57 controller-0 network[898057]: Activate the desired profile with `nmcli connection up \"$NAME\"`
Nov 15 07:56:57 controller-0 network[897663]: [FAILED]
Nov 15 07:56:57 controller-0 ovs-vsctl[898205]: ovs|00001|vsctl|INFO|Called as ovs-vsctl -t 10 -- --if-exists del-port br-ex ens5 -- add-port br-ex ens5
Nov 15 07:56:57 controller-0 network[897663]: Bringing up interface ens5:
Nov 15 07:56:57 controller-0 network[898211]: Error: unknown connection '/etc/sysconfig/network-scripts/ifcfg-br-ex'.
Nov 15 07:56:57 controller-0 network[898215]: Failure to activate file "br-ex"!
Nov 15 07:56:57 controller-0 network[898215]: See all profiles with `nmcli connection`.
Nov 15 07:56:57 controller-0 network[898215]: Reload files from disk with `nmcli connection reload`
Nov 15 07:56:57 controller-0 network[898215]: Activate the desired profile with `nmcli connection up \"$NAME\"`
Nov 15 07:56:57 controller-0 network[897663]: [FAILED]
Nov 15 07:56:57 controller-0 network[897663]: Bringing up interface vlan20:  [  OK  ]
Nov 15 07:56:57 controller-0 ovs-vsctl[898289]: ovs|00001|vsctl|INFO|Called as ovs-vsctl -t 10 -- --if-exists del-port br-tenant vlan30 -- add-port br-tenant vlan30 tag=30 -- set Interface vlan30 type=internal
Nov 15 07:56:59 controller-0 network[897663]: Bringing up interface vlan30:  [  OK  ]
Nov 15 07:56:59 controller-0 network[897663]: Bringing up interface vlan40:  [  OK  ]
Nov 15 07:56:59 controller-0 ovs-vsctl[898489]: ovs|00001|vsctl|INFO|Called as ovs-vsctl -t 10 -- --if-exists del-port br-tenant vlan50 -- add-port br-tenant vlan50 tag=50 -- set Interface vlan50 type=internal
Nov 15 07:57:00 controller-0 network[897663]: Bringing up interface vlan50:  [  OK  ]
Nov 15 07:57:00 controller-0 network[898582]: RTNETLINK answers: File exists
Nov 15 07:57:00 controller-0 network[898583]: RTNETLINK answers: File exists
Nov 15 07:57:00 controller-0 network[898584]: RTNETLINK answers: File exists
Nov 15 07:57:00 controller-0 network[898585]: RTNETLINK answers: File exists
Nov 15 07:57:00 controller-0 network[898586]: RTNETLINK answers: File exists
Nov 15 07:57:00 controller-0 network[898587]: RTNETLINK answers: File exists
Nov 15 07:57:00 controller-0 network[898588]: RTNETLINK answers: File exists
Nov 15 07:57:00 controller-0 network[898589]: RTNETLINK answers: File exists
Nov 15 07:57:00 controller-0 network[898590]: RTNETLINK answers: File exists
Nov 15 07:57:00 controller-0 systemd[1]: network.service: Control process exited, code=exited, status=1/FAILURE
░░ Subject: Unit process exited
░░ Defined-By: systemd
░░ Support: https://access.redhat.com/support
░░
░░ An ExecStart= process belonging to unit network.service has exited.
░░
░░ The process' exit code is 'exited' and its exit status is 1.
Nov 15 07:57:00 controller-0 systemd[1]: network.service: Failed with result 'exit-code'.
░░ Subject: Unit failed
░░ Defined-By: systemd
░░ Support: https://access.redhat.com/support
░░
░░ The unit network.service has entered the 'failed' state with result 'exit-code'.
Nov 15 07:57:00 controller-0 systemd[1]: network.service: Unit process 898399 (ifup-eth) remains running after unit stopped.
Nov 15 07:57:00 controller-0 systemd[1]: network.service: Unit process 898401 (sleep) remains running after unit stopped.
Nov 15 07:57:00 controller-0 systemd[1]: network.service: Unit process 898531 (ifup-eth) remains running after unit stopped.
Nov 15 07:57:00 controller-0 systemd[1]: network.service: Unit process 898533 (sleep) remains running after unit stopped.
Nov 15 07:57:00 controller-0 systemd[1]: Failed to start LSB: Bring up/down networking.
░░ Subject: A start job for unit network.service has failed
░░ Defined-By: systemd
░░ Support: https://access.redhat.com/support
░░
░░ A start job for unit network.service has finished with a failure.
░░
░░ The job identifier is 4779538 and the job result is failed.
Nov 15 07:57:00 controller-0 systemd[1]: network.service: Consumed 1.863s CPU time.
░░ Subject: Resources consumed by unit runtime
░░ Defined-By: systemd
░░ Support: https://access.redhat.com/support
░░
░░ The unit network.service completed and consumed the indicated resources.
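
In the journal output above, the interface name and its [FAILED] marker appear on different lines, separated by nmcli hints. A small filter can pair them up. This is only a sketch: failed_interfaces is a hypothetical helper name, and it assumes the "Bringing up interface NAME:" message format seen in this log.

```shell
#!/bin/sh
# failed_interfaces (hypothetical helper): read journal text on stdin and
# print each interface whose "Bringing up interface NAME:" line is followed
# by a [FAILED] marker before any [  OK  ] marker.
failed_interfaces() {
    awk '
        /Bringing up interface / {
            # Keep only the interface name between the message and the colon.
            name = $0
            sub(/.*Bringing up interface /, "", name)
            sub(/:.*/, "", name)
            pending = name
        }
        # A success marker clears the interface currently being tracked.
        /\[ +OK +\]/ { pending = "" }
        # A failure marker is attributed to the last interface seen.
        /\[FAILED\]/ && pending != "" { print pending; pending = "" }
    '
}
```

It can be fed directly from the journal, for example "journalctl -u network.service | failed_interfaces". For the second log in this report it would print ens4 and ens5, the two ports whose bridges (br-tenant and br-ex) could not be activated through nmcli.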

Comment 30 errata-xmlrpc 2023-08-16 00:17:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Release of components for Red Hat OpenStack Platform 17.1 (Wallaby)), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2023:4580