Bug 1891437

Summary:

Reboot of host with ovs interface leaves the interface in activating state

Product:

Red Hat Enterprise Linux 8

Reporter:

Ales Musil <amusil>

Component:

NetworkManager

Assignee:

NetworkManager Development Team <nm-team>

Status:

CLOSED DUPLICATE

QA Contact:

Vladimir Benes <vbenes>

Severity:

high

Docs Contact:

Priority:

unspecified

Version:

8.3

CC:

acardace, atragler, bgalvani, dholler, fge, lrintel, mburman, mtessun, peljasz, rkhan, sukulkar, thaller, till, vbenes, ymankad

Target Milestone:

Keywords:

Reopened, Triaged

Target Release:

8.0

Hardware:

Unspecified

OS:

Unspecified

Whiteboard:

Fixed In Version:

NetworkManager-1.30.0-2.el8

Doc Type:

If docs needed, set a value

Doc Text:

Story Points:

---

Clone Of:

Environment:

Last Closed:

2021-03-04 10:43:58 UTC

Type:

Bug

Regression:

---

Mount Type:

---

Documentation:

---

CRM:

Verified Versions:

Category:

---

oVirt Team:

---

RHEL 7.3 requirements from Atomic Host:

Cloudforms Team:

---

Target Upstream Version:

Embargoed:

Bug Depends On:

Bug Blocks:

1920330

Attachments:

Description	Flags
Trace log	none
nmcli commands	none

Description Ales Musil 2020-10-26 09:33:13 UTC

Created attachment 1724127
Trace log

Description of problem:

After configuring multiple ovs interfaces on a single host, a host reboot
will cause those interfaces to be stuck in activating state.

Version-Release number of selected component (if applicable):
NetworkManager-1.26

How reproducible:
100%

Steps to Reproduce:
1. Create multiple ovs interfaces that are connected by a bridge to a physical interface
2. Reboot the host
3.
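
For readers without access to the attachment, step 1 could look roughly like the sketch below. This is not the reporter's attached command list; it follows the examples in the nm-openvswitch(7) man page, and the names bridge0, eth0, port-uplink, portN and ifaceN, as well as the APPLY, BRIDGE and PHYS_IF variables, are made up for illustration. It only prints the commands unless APPLY=1 is set.

```shell
#!/bin/sh
# Hypothetical sketch: one OVS bridge, one physical uplink, two internal interfaces.
# Dry-run by default: commands are printed, not executed. Set APPLY=1 to run them.
APPLY="${APPLY:-0}"
BRIDGE="${BRIDGE:-bridge0}"   # assumed bridge name
PHYS_IF="${PHYS_IF:-eth0}"    # assumed physical interface name

# Execute the command when APPLY=1, otherwise just print it.
run() {
    if [ "$APPLY" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

create_ovs_topology() {
    run nmcli conn add type ovs-bridge conn.interface "$BRIDGE"
    # Attach the physical interface through its own OVS port.
    run nmcli conn add type ovs-port conn.interface port-uplink master "$BRIDGE"
    run nmcli conn add type ethernet conn.interface "$PHYS_IF" master port-uplink
    # Add two internal OVS interfaces, each behind its own port.
    for n in 0 1; do
        run nmcli conn add type ovs-port conn.interface "port$n" master "$BRIDGE"
        run nmcli conn add type ovs-interface slave-type ovs-port \
            conn.interface "iface$n" master "port$n"
    done
}

create_ovs_topology
```
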

Actual results:
The ovs interfaces are stuck in activating state

Expected results:
The ovs interfaces should be up and running

Additional info:

Comment 1 Ales Musil 2020-10-26 09:33:44 UTC

Created attachment 1724128
nmcli commands

Comment 2 lejeczek 2020-11-16 20:24:10 UTC

Hi,
I'm having a similar, if not identical, problem.
I have both NetworkManager and openvswitch2.11 (and probably other relevant components pulled in) from ovirt-4.4-copr, and I cannot confirm whether the problem is exclusive to ovirt-4.


I use NM to set up OVS bridges and have a pretty straightforward setup I believe.
Each time after a system reboot, one interface (always the same one) fails to start and stays in a weird state; it actually shows up twice.

ovs0-int-10.3.3            d9c9268f-408c-4035-9684-97e4eaf92e18  ovs-interface  ovs0-int33 (colored yellow)
ovs0-int-10.3.3            d9c9268f-408c-4035-9684-97e4eaf92e18  ovs-interface  ovs0-int33 (colored red)

whereas the rest of the bridge:

ovs0-int-10.1.1            be5e79e7-2e4e-47ed-97d0-270c76e43b9a  ovs-interface  ovs0-int11 
ovs0                       9cf72215-eebc-4f7c-9466-b697f09e1bda  ovs-bridge     ovsbr0     
ovs0-port-11               1577414e-7eeb-41bc-a66f-7b7b596285d0  ovs-port       ovs0-port11
ovs0-port-33               5465d3c5-9a0c-4f78-89c0-de94b0294d0e  ovs-port       ovs0-port33
ovs0-port9                 b9c2b7cd-897e-4726-8fc7-e2f7c7aab067  ovs-port       ovs0-port9 
ovs0-port9-physical        14fb0b0e-f7bb-4248-9295-34f561058f4d  ethernet       enp7s0f1np1
ovsbr0-libvirt0-int        f217cd6c-976b-4002-b191-9f15271602cd  ovs-interface  ovsbr0     
ovsbr0-libvirt0-port       7c2d30bd-0019-4cdb-af05-3df666005261  ovs-port       ovsbr0 

And then only another reboot helps, combined with either:
a) deleting the interface prior to the reboot and creating it anew afterwards
b) removing /etc/openvswitch/conf.db prior to the reboot.

When all is good and the whole bridge is up & running, the bridge looks like this:

004e89ac-4c45-4a33-9347-c1d343889ceb
    Bridge "ovsbr0"
        Port "ovs0-port11"
            tag: 11
            Interface "ovs0-int11"
                type: internal
        Port "ovsbr0"
            Interface "ovsbr0"
                type: internal
        Port "ovs0-port9"
            Interface "enp9s0f3"
                type: system
        Port "vnet0"
            tag: 33
            Interface "vnet0"
        Port "ovs0-port33"
            tag: 33
            Interface "ovs0-int33"
                type: internal
    ovs_version: "2.11.0"

The journal does not tell much, at least not by default
....
device (ovs0-int33): Activation: starting connection 'ovs0-int-10.3.3' (d9c9268f-408c-4035-9684-97e4eaf92e18)
....
When I try to 'c up' manually, nmcli just times out.

nmcli shows:
...
GENERAL.STATE:                          activating
...
and also:
...
GENERAL.STATE:                          deactivating
...
at the same time: as the interface is duplicated, so are the GENERAL entries.
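
A quick way to spot such a doubled entry might be the sketch below. The helper name find_duplicate_uuids is made up here; it expects terse `UUID:DEVICE` lines on standard input, such as `nmcli -t -f UUID,DEVICE connection show` should produce, and prints every UUID that is listed more than once.

```shell
#!/bin/sh
# Hypothetical helper: print UUIDs that occur more than once in
# colon-separated "UUID:DEVICE" lines read from standard input.
find_duplicate_uuids() {
    awk -F: '
        { count[$1]++ }                                   # count lines per UUID
        END { for (u in count) if (count[u] > 1) print u }  # keep repeated ones
    '
}

# Usage on a live system (not run here):
#   nmcli -t -f UUID,DEVICE connection show | find_duplicate_uuids
```
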

I'm on CentOS 8 (not Stream) with kernel-ml.
It feels (I have no lab where I could fiddle more) like this should be reproducible. I have a second system with slightly different hardware, and the same problem occurs there.

Feel free to ask for more info/logs.
many thanks, L.

Comment 3 lejeczek 2020-11-24 13:24:51 UTC

A workaround which is a bit less invasive than a reboot:

1) [root@dzien ~]$ rm -f /etc/openvswitch/conf.db 
2) [root@dzien ~]$ systemctl restart openvswitch.service 
3) [root@dzien ~]$ systemctl restart NetworkManager
4) [root@dzien ~]$ { _IF=ovs0-port9-physical ; nmcli c d $_IF; sleep 2; nmcli c u $_IF; } # seems that (at least in my case) the physical iface needs such a "reset" to make network able to access it again
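
The same four steps could be wrapped in a small script, sketched below. The APPLY and PHYS_CONN variables and the function names are made up for illustration; the connection name defaults to the one used above. The script only prints the commands unless APPLY=1 is set, because removing conf.db discards the whole OVS database.

```shell
#!/bin/sh
# Hypothetical wrapper around the workaround steps from this comment.
# Dry-run by default: commands are printed, not executed. Set APPLY=1 to run them.
APPLY="${APPLY:-0}"
PHYS_CONN="${PHYS_CONN:-ovs0-port9-physical}"  # connection profile of the physical port

# Execute the command when APPLY=1, otherwise just print it.
run() {
    if [ "$APPLY" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

reset_ovs_state() {
    run rm -f /etc/openvswitch/conf.db          # step 1: drop the OVS database
    run systemctl restart openvswitch.service   # step 2
    run systemctl restart NetworkManager        # step 3
    # Step 4: bounce the physical connection.
    run nmcli connection down "$PHYS_CONN"
    run sleep 2
    run nmcli connection up "$PHYS_CONN"
}

reset_ovs_state
```
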

Comment 4 Beniamino Galvani 2020-12-03 08:45:19 UTC

Hi,

this looks related to bug 1861296. During a reboot, NM is stopped and tries to delete the interfaces that were added to the ovsdb. However, it fails to do that, and therefore some ovs interfaces are still present in the ovsdb at the next boot, conflicting with the configuration that NM wants to apply.
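
One way to see what is currently recorded in the ovsdb is sketched below. The function name list_ovsdb_ports and the OVS_VSCTL override are made up for illustration; only `ovs-vsctl list-br` and `ovs-vsctl list-ports` are used. The output merely shows what the database contains; telling leftover entries from freshly created ones still requires comparing it with the NM profiles by hand.

```shell
#!/bin/sh
# Hypothetical helper: print "bridge port" pairs currently stored in the ovsdb.
# OVS_VSCTL can be overridden, e.g. to point at a wrapper or a test stub.
OVS_VSCTL="${OVS_VSCTL:-ovs-vsctl}"

list_ovsdb_ports() {
    for br in $($OVS_VSCTL list-br); do
        # list-ports does not include the bridge's own local port.
        for port in $($OVS_VSCTL list-ports "$br"); do
            printf '%s %s\n' "$br" "$port"
        done
    done
}

# Usage on a host with Open vSwitch installed (not run here):
#   list_ovsdb_ports
```
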

Comment 5 Beniamino Galvani 2021-01-28 09:13:13 UTC

Hi Ales, please retest with NetworkManager-1.30.0-0.8.el8 once it's available in the repositories.

Comment 6 Yash Mankad 2021-02-10 19:30:10 UTC

Ales,

could you try reproducing the issue with the build mentioned by Beniamino in the above comment?

Thanks!

Comment 7 Ales Musil 2021-02-11 06:22:31 UTC

Hi,

I tested this change and it seems like the issue is gone.

Thanks

Comment 8 Gris Ge 2021-03-03 05:19:28 UTC

Closing per above comment.

Comment 10 Vladimir Benes 2021-03-04 10:43:58 UTC

closing as duplicate

*** This bug has been marked as a duplicate of bug 1861296 ***