Bug 857279

Summary: xen 3.0.3 network-bridge xenbr0 under bonded interface creates potential for a switching loop
Product: Red Hat Enterprise Linux 5 Reporter: Philip Booysen <zer0tilt>
Component: xen Assignee: Xen Maintainance List <xen-maint>
Status: CLOSED NOTABUG QA Contact: Virtualization Bugs <virt-bugs>
Severity: high Docs Contact:
Priority: unspecified    
Version: 5.8 CC: leiwang, moli, mrezanin, qguan, wshi, xen-maint
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2012-09-14 07:59:13 UTC Type: Bug

Description Philip Booysen 2012-09-14 01:56:45 UTC
Description of problem:

The script /etc/xen/scripts/network-bridge does not check whether a xenbr0 bridge already exists, whether it was configured through /etc/sysconfig/network-scripts or created manually. It should test for an existing xenbr0 bridge and either remove it first or fail without configuring anything. Because it performs no such test, it goes on to reconfigure the existing xenbr0. With bonding enabled and bond0 already attached to xenbr0, it also adds eth0 to the bridge, which creates a switching loop.

/etc/xen/xend-config.sxp ships with "(network-script network-bridge)" as its default. It should instead default to "(network-script /bin/true)", in line with best practice and Red Hat documentation; see references [1] and [2] under "Additional info". When xenbr0 is already in place, whether configured through the network scripts or created manually, the default xend-config.sxp still calls the network-bridge script described in the first paragraph, with the same result: a switching loop and a broadcast storm.
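
A minimal sketch of switching that default on an already installed system, assuming the stock line reads exactly "(network-script network-bridge)" with nothing else on the line. The function name disable_network_script and the ".orig" backup suffix are illustrative choices, not part of the xen package.

```shell
#!/bin/sh
# Hypothetical helper: replace the stock network-script line with /bin/true.
# Usage: disable_network_script /etc/xen/xend-config.sxp
disable_network_script() {
    conf="$1"
    # Refuse to continue if the file is missing.
    [ -f "$conf" ] || { echo "no such file: $conf" >&2; return 1; }
    # Keep a copy of the original next to it (the suffix is an arbitrary choice).
    cp -p "$conf" "$conf.orig" || return 1
    # Rewrite only the exact stock line; any other setting is left alone.
    sed 's|^(network-script network-bridge)$|(network-script /bin/true)|' \
        "$conf.orig" > "$conf"
}
```

xend would have to be restarted for the change to take effect, and a customised network-script line is deliberately left untouched.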

Thus it is possible, unintentionally or intentionally, to configure xen under RHEL 5.8 so that it causes a switching loop between xenbr0, bond0 and eth0. The broadcast storm radiating from this loop can potentially cause a DoS on the physical switch attached to eth0 and, in certain physical switch setups, a DDoS on the connected physical layer 2 network. In the worst case, this can black out such a layer 2 network within seconds, until the host is manually disconnected or automated loop prevention takes effect.
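
One way to put a number on such a storm is to sample an interface packet counter twice and divide by the interval. The sketch below assumes the kernel exposes /sys/class/net/<interface>/statistics/rx_packets; the function names and the SYSFS_NET override are hypothetical and exist only for illustration and testing.

```shell
#!/bin/sh
# Hypothetical helper: whole packets per second from two counter samples.
packets_per_second() {
    before="$1"; after="$2"; seconds="$3"
    [ "$seconds" -gt 0 ] || { echo "interval must be positive" >&2; return 1; }
    echo $(( ($after - $before) / $seconds ))
}

# Sample the RX packet counter of interface $1 twice, $2 seconds apart.
# SYSFS_NET defaults to /sys/class/net and can be overridden for testing.
rx_rate() {
    stat="${SYSFS_NET:-/sys/class/net}/$1/statistics/rx_packets"
    first=$(cat "$stat") || return 1
    sleep "$2"
    second=$(cat "$stat") || return 1
    packets_per_second "$first" "$second" "$2"
}
```

For example, "rx_rate xenbr0 5" prints the average number of packets received per second on xenbr0 over a five second window.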

This bug report is submitted so that fixing the network-bridge script can be considered: it should test for an already configured xenbr0 bridge and act accordingly. xend-config.sxp should also ship with sane and safe defaults, so that a configuration which could, in the worst case, cause a DDoS on a physical layer 2 network is not possible by default.

Version-Release number of selected component (if applicable):
xen-3.0.3-135.el5_8.5.x86_64
xen-libs-3.0.3-135.el5_8.5.x86_64
2.6.18-308.13.1.el5xen

How reproducible: Always. Happens every time.


Steps to Reproduce:
===================

1. Use RHEL 5.8 (2.6.18-308.13.1.el5xen)

2. Configure network as follows:

# /etc/sysconfig/network-scripts/ifcfg-xenbr0
DEVICE=xenbr0
TYPE=Bridge
BOOTPROTO=none
ONBOOT=yes
DELAY=0
BROADCAST=172.20.1.255
NETWORK=172.20.1.192
NETMASK=255.255.255.192
IPADDR=172.20.1.198

# /etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE=bond0
USERCTL=no
ONBOOT=yes
TYPE=Ethernet
BOOTPROTO=static
BRIDGE=xenbr0

#/etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
BOOTPROTO=none
MASTER=bond0
SLAVE=yes
USERCTL=no
ONBOOT=yes

#/etc/sysconfig/network-scripts/ifcfg-eth1
DEVICE=eth1
BOOTPROTO=none
MASTER=bond0
SLAVE=yes
USERCTL=no
ONBOOT=yes

3. Start the network

# service network start

# ifconfig bond0
bond0     Link encap:Ethernet  HWaddr E4:1F:14:61:DF:75  
          UP BROADCAST MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)

# ifconfig eth0
eth0      Link encap:Ethernet  HWaddr E4:1F:14:61:DF:75
          UP BROADCAST SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
          Interrupt:16 Memory:92000000-92012800

# ifconfig xenbr0
xenbr0    Link encap:Ethernet  HWaddr E4:1F:14:61:DF:75  
          inet addr:172.20.1.198  Bcast:172.20.1.255  Mask:255.255.255.192
          inet6 addr: fe80::e61f:13ff:fe60:de64/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:51 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:0 (0.0 b)  TX bytes:2358 (2.3 KiB)

4. Inspect the bridge

# brctl show
bridge name     bridge id               STP enabled     interfaces
virbr0          8000.000000000000       yes             
xenbr0          8000.e41f1360de64       no              vif0.0
                                                        bond0

5. Install Xen

# yum install xen-3.0.3-135.el5_8.5.x86_64

Confirm that network-bridge is the default configuration:

# grep network-script /etc/xen/xend-config.sxp
(network-script network-bridge)

6. Start xend and xendomains (no need to have a VM in place)

# service xendomains start
# service xend start

Alternatively, instead of starting xend, run the script directly:

# /etc/xen/scripts/network-bridge start
  
Actual results:
===============

1) eth0 gets added to the previously configured xenbr0 bridge, which already contains bond0 (eth0 appears as peth0 in the brctl output below). bond0 has eth0 as its active slave, so a switching loop is introduced between xenbr0, bond0 and eth0, and a network broadcast storm commences.

2) # brctl show
bridge name     bridge id               STP enabled     interfaces
virbr0          8000.000000000000       yes             
xenbr0          8000.e41f1360de64       no              vif0.0
                                                        peth0
                                                        bond0

3) The switching loop between xenbr0, peth0 and bond0 radiates a broadcast storm into the physical layer 2 network, creating the potential for a DDoS on the network.

4) Traffic counters on the bond0, eth0 and xenbr0 interfaces within seconds of starting xend:

# ifconfig bond0
bond0     Link encap:Ethernet  HWaddr E4:1F:14:61:DF:75 
          inet6 addr: fe80::e61f:13ff:fe60:de64/64 Scope:Link
          UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
          RX packets:2073589 errors:0 dropped:0 overruns:0 frame:0
          TX packets:6046837 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:87090738 (83.0 MiB)  TX bytes:407148862 (388.2 MiB)

# ifconfig eth0
eth0      Link encap:Ethernet  HWaddr E4:1F:14:61:DF:75  
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:2073643 errors:0 dropped:0 overruns:0 frame:0
          TX packets:6046947 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:87093006 (83.0 MiB)  TX bytes:407155566 (388.2 MiB)

# ifconfig xenbr0
xenbr0    Link encap:Ethernet  HWaddr E4:1F:14:61:DF:75  
          inet addr:172.20.1.198  Bcast:172.20.1.255  Mask:255.255.255.192
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:8085008 errors:0 dropped:0 overruns:0 frame:0
          TX packets:54 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:378559360 (361.0 MiB)  TX bytes:2484 (2.4 KiB)
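
The condition shown in the brctl output above, a bridge that holds both a bonding master and one of that master's slaves, could be detected with a check along these lines. This is a sketch only: it assumes the kernel exposes /sys/class/net/<bridge>/brif and /sys/class/net/<bond>/bonding/slaves, and that the slave appears under the same name in both places. The function name and the SYSFS_NET override are hypothetical.

```shell
#!/bin/sh
# Hypothetical check: report bridge ports that are also slaves of a bonding
# master attached to the same bridge. SYSFS_NET defaults to /sys/class/net
# and can be overridden for testing.
find_bond_slave_ports() {
    bridge="$1"
    net="${SYSFS_NET:-/sys/class/net}"
    found=1
    for port in "$net/$bridge/brif"/*; do
        # Skip the unexpanded pattern when the bridge has no ports.
        [ -e "$port" ] || continue
        master=$(basename "$port")
        slaves_file="$net/$master/bonding/slaves"
        # Only bonding masters expose a bonding/slaves file.
        [ -r "$slaves_file" ] || continue
        for slave in $(cat "$slaves_file"); do
            if [ -e "$net/$bridge/brif/$slave" ]; then
                echo "$bridge: $slave is a port and also a slave of $master"
                found=0
            fi
        done
    done
    # Exit status 0 means at least one conflicting port was found.
    return $found
}
```

Running "find_bond_slave_ports xenbr0" prints one line per conflicting port and returns a non-zero status when the bridge is clean.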

Expected results:
=================

1) xen uses the default Red Hat provided xend-config.sxp, configured with "(network-script network-bridge)", and calls /etc/xen/scripts/network-bridge. The script detects that a previously configured xenbr0 already exists and either removes the xenbr0 bridge and interface first and then recreates it, OR reports the existing xenbr0 to stdout, makes no changes and exits with status 1.

2) # brctl show
bridge name     bridge id               STP enabled     interfaces
virbr0          8000.000000000000       yes             
xenbr0          8000.e41f1360de64       no              vif0.0
                                                        bond0

3) No switching loop is created between xenbr0, peth0 and bond0. Either the network-bridge script runs successfully, removing the active xenbr0 first and then re-creating it, so that the expected functional bridge xenbr0 between bond0 and vifx.x exists; or xend fails to start from its SysV init script with a message that an active xenbr0 already exists and that the administrator should either remove it or change the network-script line in xend-config.sxp.
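
The "test and fail" variant of the expected behaviour could look roughly like the sketch below. It is not the shipped network-bridge script: the function names are hypothetical, and it assumes that an existing bridge can be recognised by a "bridge" directory under /sys/class/net/<name>.

```shell
#!/bin/sh
# Hypothetical guard, not the shipped script. SYSFS_NET defaults to
# /sys/class/net and can be overridden for testing.
bridge_exists() {
    # A network device is treated as a bridge if sysfs has a "bridge"
    # directory for it.
    [ -d "${SYSFS_NET:-/sys/class/net}/$1/bridge" ]
}

guard_bridge() {
    bridge="$1"
    if bridge_exists "$bridge"; then
        echo "bridge $bridge already exists; refusing to reconfigure it" >&2
        return 1
    fi
    return 0
}
```

A start routine could call "guard_bridge xenbr0 || exit 1" before it touches any interface, which leaves a preconfigured bridge exactly as the administrator set it up.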

Additional info:
================

1) Tested under RHEL 5.8 with xen-3.0.3-135.el5_8.5.x86_64 and xen-libs-3.0.3-135.el5_8.5.x86_64

2) Following Red Hat best practices [1][2] (setting up xenbr0 yourself) together with the default xend-config.sxp provided by Red Hat causes a switching loop.

3) The RPM runs a post-install script that enables xend and xendomains. If the system is rebooted with a preconfigured xenbr0 in place before the xen configuration is changed, xen starts after boot-up, uses the default configuration file /etc/xen/xend-config.sxp to run the network-bridge script, and wrongly reconfigures xenbr0, creating a switching loop.

Reference 1) : https://access.redhat.com/knowledge/docs/en-US/Red_Hat_Enterprise_Linux/5/html/Virtualization/sect-Virtualization-Network_Configuration-Bridged_networking_with_libvirt.html

Reference 2) : http://wiki.xen.org/wiki/Network_Configuration_Examples_%28Xen_4.1%2B%29#Red_Hat-style_bridge_configuration_.28e.g._RHEL.2C_Fedora.2C_CentOS.29

Comment 1 Miroslav Rezanina 2012-09-14 06:22:28 UTC
To use xen on a machine with bonding, use the setup described in [1]. The default configuration is not supposed to work with non-trivial network settings (like bonding or VLANs). In this case the network scripts have to be modified.

Can you retest whether the problem still occurs with the setup described in [1]?

[1]: https://access.redhat.com/knowledge/articles/22538

Comment 2 Philip Booysen 2012-09-14 07:28:59 UTC
(In reply to comment #1)

Agreed, the default configuration is not supposed to work with non-trivial network settings (including bonding and VLANs).

Red Hat proposes that in such a case, with non-trivial network settings, one should:

1) Disable the network-script using "(network-script /bin/true)" and configure non-trivial network settings outside libvirtd under /etc/sysconfig/network-scripts/ifcfg-* as proposed by [1] and [2] below

OR

2) Use "(network-script 'network-bridge-bonding bridge=bond0 netdev=0')" as proposed by [3] below.

Should a non-trivial networking configuration be set up by the system administrator, including /etc/sysconfig/network-scripts/ifcfg-*, as per [1] and/or [2], and the current default xend-config.sxp setting "(network-script network-bridge)" be deployed, intentionally or unintentionally, a denial of service or even a DDoS can occur under favorable conditions on an attached layer 2 network of any size.

I believe that guarding against accidentally creating the above scenario is in the best interest of Red Hat's customers using xen-3 under RHEL 5. This could be done by shipping a safer and more sane default, "(network-script /bin/true)", in xend-config.sxp in the xen RPM package. A test for an existing xenbr0 could also be added to the default network-bridge script. The system and this network-script setting can thereafter be configured to the needs of the actual environment, before xen unintentionally misconfigures a non-trivial network setup and causes a possible network outage.


[1]: https://access.redhat.com/knowledge/docs/en-US/Red_Hat_Enterprise_Linux/5/html/Virtualization/sect-Virtualization-Network_Configuration-Bridged_networking_with_libvirt.html

[2]: http://wiki.xen.org/wiki/Network_Configuration_Examples_%28Xen_4.1%2B%29#Red_Hat-style_bridge_configuration_.28e.g._RHEL.2C_Fedora.2C_CentOS.29

[3]: https://access.redhat.com/knowledge/articles/22538

Comment 3 Miroslav Rezanina 2012-09-14 07:59:13 UTC
(In reply to comment #2)

Changing the default configuration in the current phase of the product lifetime would carry a risk of breaking existing customers' systems that is higher than the benefit of this preventive measure.

The described scenario requires the user to modify the default network configuration. In that case we do not guarantee that the xen configuration works correctly. Any manual change to the network settings on a xen-based system has to be made by a person aware of the relations between all parts of the networking setup, and in cooperation with Red Hat Support.

Therefore none of the recommended changes is going to be implemented in RHEL 5. If you experience difficulties setting up the network, please contact the Support Team for help.