Bug 1730084 - systemctl restart network disables access to running VM's [NEEDINFO]
Summary: systemctl restart network disables access to running VM's
Keywords:
Status: NEW
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: libvirt
Version: 8.1
Hardware: All
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Virtualization Maintenance
QA Contact: Luyao Huang
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-07-15 19:16 UTC by Venkatesh Kavtikwar
Modified: 2020-02-16 16:13 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Type: Bug
Target Upstream Version:
kwalker: needinfo? (vkavtikw)



Description Venkatesh Kavtikwar 2019-07-15 19:16:49 UTC
Description of problem:

Running "systemctl restart network" disables access to running VMs. The network restart breaks the link between the bridge and the vnet interface, which causes the network issue.


Version-Release number of selected component (if applicable):

initscripts-9.49.46-1.el7.x86_64
kernel-3.10.0-957.10.1.el7.x86_64


How reproducible:

- Start any VM on KVM host & access it over network
- Restart the network service
- Try to access the VM


Steps to Reproduce:

1. Start a VM on the KVM host. This creates a vnet interface for the guest's network device and links it to the bridge the guest is configured to use.

# ip a
4: ens9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br0 state UP group default qlen 1000
    link/ether 52:54:00:21:43:f6 brd ff:ff:ff:ff:ff:ff
6: vnet0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master br0 state UNKNOWN group default qlen 1000
    link/ether fe:54:00:7c:56:1d brd ff:ff:ff:ff:ff:ff
    inet6 fe80::fc54:ff:fe7c:561d/64 scope link 
       valid_lft forever preferred_lft forever
7: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 52:54:00:21:43:f6 brd ff:ff:ff:ff:ff:ff
    inet 192.168.122.190/24 brd 192.168.122.255 scope global noprefixroute dynamic br0

# brctl show
bridge name	bridge id		STP enabled	interfaces
br0		8000.5254002143f6	no		ens9
							vnet0

2. Restart the network service.

# systemctl restart network

3. The vnet interface is not added back to the bridge.

# brctl show
bridge name	bridge id		STP enabled	interfaces
br0		8000.5254002143f6	no		ens9

6: vnet0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 1000
    link/ether fe:54:00:7c:56:1d brd ff:ff:ff:ff:ff:ff
    inet6 fe80::fc54:ff:fe7c:561d/64 scope link 
 


Expected results:

Upon network restart, the "vnet" interfaces of the running VMs should be added back to the bridge to avoid network connectivity issues.

The workaround is to add the vnet interface back to bridge using "brctl addif <bridge> <vnet>". 
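
On a host with several guests, the manual workaround can be scripted. The following is a minimal sketch, assuming that every tap device follows the vnetN naming shown above and that all of them belong to a single bridge (br0 here). The function names detached_vnets and reattach are illustrative only. It uses "ip link set ... master", the iproute2 equivalent of "brctl addif", and only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch: re-attach tap devices to a bridge after a network restart.
# BRIDGE and the vnetN naming pattern are assumptions taken from this report.
BRIDGE="${BRIDGE:-br0}"

# Read `ip -o link show` (or `ip a`) output on stdin and print the names of
# vnetN interfaces that currently have no bridge master.
detached_vnets() {
    awk -F': ' '$2 ~ /^vnet[0-9]+$/ && $0 !~ / master / { print $2 }'
}

# Read interface names on stdin. Print the re-attach command for each one,
# or run it when DRY_RUN=0.
reattach() {
    while read -r dev; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "ip link set dev $dev master $BRIDGE"
        else
            ip link set dev "$dev" master "$BRIDGE"
        fi
    done
}
```

To apply it, set DRY_RUN=0 and run "ip -o link show | detached_vnets | reattach" as root. On a host with more than one bridge this would attach every detached tap to the same bridge, so the single-bridge assumption matters.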



Additional info:

During the analysis we observed that "libvirtd" creates these "vnet" interfaces and links them to the bridge when the VM is started. As these vnet interfaces do not have any physical existence, the network service is not aware of them and so does not link them back to the bridge when it is restarted. Please correct us if this is not right.

So we wanted to know whether this is really a bug or an RFE, and whether there is any active work going on for this problem.

Comment 2 Venkatesh Kavtikwar 2019-07-15 19:19:12 UTC
Restarting "libvirtd" service does not resolve the vnet/bridge connectivity issue.

Comment 3 Jaroslav Suchanek 2019-08-01 12:29:46 UTC
Laine,

Any insight into what can be done about this from the libvirt perspective?

Thanks.

Comment 4 Lukáš Nykrýn 2019-08-01 13:09:30 UTC
Who creates that bridge? Does it have a regular ifcfg file?

Comment 5 Venkatesh Kavtikwar 2019-08-01 13:16:42 UTC
Yes, the bridge has an "ifcfg" file and is configured as part of their network setup.

Comment 6 Lukáš Nykrýn 2019-08-01 13:37:05 UTC
This is kind of hard to solve: either libvirt needs to react to such a change and add the devices to the bridge, or the customer can work around this by adding those brctl commands to ifup-local.
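
An ifup-local hook along these lines might look as follows. This is an untested sketch, assuming the hook is installed as /sbin/ifup-local, that initscripts calls it with the interface name as its first argument, and that libvirtd is reachable through virsh when the hook runs. The function names taps_for_bridge and reattach_guests are hypothetical. Instead of guessing from interface names, it asks libvirt which tap devices are configured for the bridge.

```shell
#!/bin/sh
# /sbin/ifup-local sketch: run after an interface is brought up, with the
# interface name as $1.

# Read `virsh domiflist` output on stdin and print the tap devices whose
# type is "bridge" and whose source is the bridge named in $1.
taps_for_bridge() {
    awk -v br="$1" '$2 == "bridge" && $3 == br { print $1 }'
}

# Re-attach the taps of every running guest that is configured for bridge $1.
reattach_guests() {
    bridge=$1
    virsh list --name | while read -r dom; do
        [ -n "$dom" ] || continue
        virsh domiflist "$dom" </dev/null | taps_for_bridge "$bridge" |
            while read -r tap; do
                ip link set dev "$tap" master "$bridge"
            done
    done
}

# Act only when installed under the hook's name and called for a bridge device.
if [ "${0##*/}" = "ifup-local" ] && [ -d "/sys/class/net/${1:-}/bridge" ]; then
    reattach_guests "$1"
fi
```

Only guests defined with <interface type='bridge'> are matched here, since those are the ones the report is about.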

Comment 7 Laine Stump 2019-08-02 20:41:42 UTC
(In reply to Venkatesh Kavtikwar from comment #2)
> Restarting "libvirtd" service does not resolve the vnet/bridge connectivity
> issue.


All the way back in libvirt-3.2.0 (April 2017), I added commit 85bcc022 to libvirt, which caused guests to be reconnected to their configured bridge, but unfortunately I was only thinking about the bridges created by libvirt's virtual networks. So if you restart libvirtd, it will check all the guest network connections that are configured as <interface type='network'> and reconnect them properly.

This behavior was fixed *properly* by Dan Berrange in libvirt-5.3.0 commit de938b92, which moved the check for proper connection of tap devices out of libvirt's virtual network driver and into the qemu driver. Unfortunately for RHEL7, this was done as a part of a fairly major refactoring of the network driver, so it won't be possible to simply backport that patch (or even just a few patches). Instead, fixing it in RHEL7 will require making a downstream-only patch (or rebasing, but I don't think we'll be doing that for RHEL7 anymore).

Alternately, as a workaround that doesn't require any changes to libvirt code, you could just define a libvirt virtual network that uses your existing bridge, e.g.:

  <network>
    <name>br0-net</name>
    <bridge name='br0'/>
    <forward mode='bridge'/>
  </network>

and then configure your guests to use that network, e.g. instead of the guest config containing:

   <interface type='bridge'>
     <source bridge='br0'/>
     ...


it would have:

   <interface type='network'>
     <source network='br0-net'/>
     ...


Of course whether you do this now, or wait until there is a patch to libvirt equivalent to Dan's de938b92, you will *still* need to restart libvirtd after restarting the network service.
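
For reference, the network definition above could be loaded with virsh roughly as follows. This is a sketch only: the helper names network_xml and define_bridge_network are made up for illustration, and the network and bridge names are the ones used in the example above.

```shell
#!/bin/sh
# Sketch: wrap an existing host bridge in a libvirt virtual network.

# Print libvirt network XML for network name $1 on existing bridge $2.
network_xml() {
    cat <<EOF
<network>
  <name>$1</name>
  <bridge name='$2'/>
  <forward mode='bridge'/>
</network>
EOF
}

# Define, start and autostart the network. Needs libvirt privileges.
define_bridge_network() {
    xml_file=$(mktemp) || return 1
    network_xml "$1" "$2" > "$xml_file"
    virsh net-define "$xml_file" &&
        virsh net-start "$1" &&
        virsh net-autostart "$1"
    rc=$?
    rm -f "$xml_file"
    return $rc
}
```

Calling "define_bridge_network br0-net br0" would create the network. Each guest's interface definition then has to be changed as shown above (for example with "virsh edit"), and "systemctl restart libvirtd" is still needed after each "systemctl restart network".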

(I'm curious - has the network service always torn down and recreated configured bridges when it's restarted? If so, I'm surprised that we've never encountered (or even heard of) this problem before...)

Comment 8 Laine Stump 2019-08-02 20:43:58 UTC
Sorry, I noticed after I hit save that I had left out a crucial part of the explanation - the reconnecting of all guest tap devices to their configured bridges happens only when libvirtd is restarted. (I guess that becomes obvious in the 2nd to last paragraph, but I hadn't explicitly stated it).

