Bug 231227

Summary: xend causes a network disruption when started
Product: Red Hat Enterprise Linux 5 Reporter: Ryan McCabe <rmccabe>
Component: xen Assignee: Xen Maintainance List <xen-maint>
Status: CLOSED CANTFIX QA Contact: Virtualization Bugs <virt-bugs>
Severity: high Docs Contact:
Priority: medium    
Version: 5.0 CC: berrange, bstevens, clalance, k.georgiou, minovotn, pbonzini, sct, sdake, xen-maint
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2009-11-10 17:54:00 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Ryan McCabe 2007-03-06 21:09:19 UTC
Description of problem:

xend disrupts the network for around 4-5s when it starts up late in the boot
order, causing (at least) the Red Hat Cluster Suite to fail.

See Bug #230783 for more info specific to the problems it creates in RHCS.

How reproducible:
100%

Steps to Reproduce:
1. chkconfig xend on
2. chkconfig cman on
3. reboot
  
Actual results:

xend causes network disruption, which in turn causes cluster nodes to become
confused and cluster status to be inconsistent among nodes.

Expected results:

no network disruption

Comment 1 Stephen Tweedie 2007-03-07 17:19:02 UTC
Right, that's standard Xen behaviour --- it sets up a networking bridge
environment, and the reconfiguration of NICs during that setup is going to have
side-effects for things like clustering.

We're looking at a possible alternative networking mode for 5.1 which won't do this.

Comment 2 Red Hat Bugzilla 2007-07-25 00:40:05 UTC
change QA contact

Comment 6 Daniel Berrangé 2009-06-12 11:06:13 UTC
There is really no way to change Xen's network-bridge script that would avoid network disruption - disruption is inherent in the approach of modifying existing network interfaces. The only practical option is to get bridging enabled right from the moment the host's network is brought online, e.g. by using the regular network initscripts to configure bridging.

For example, in /etc/xen/xend-config.sxp, change the network script option to

 (network-script /bin/true)
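The edit above could be scripted roughly as follows. This is only a sketch: the function name disable_xend_network_script is made up for illustration, and it assumes the option sits on its own uncommented line of the form "(network-script ...)".

```shell
# Hypothetical helper (not part of Xen): point xend's network-script
# option at /bin/true so xend no longer reconfigures NICs on startup.
# Assumes the option appears on its own line as "(network-script ...)".
disable_xend_network_script() {
    conf="${1:-/etc/xen/xend-config.sxp}"
    # Keep a backup copy next to the original before editing.
    cp -p "$conf" "$conf.bak" || return 1
    # Rewrite only uncommented network-script lines; lines starting
    # with '#' and other options (e.g. vif-script) are left untouched.
    sed -e 's|^[[:space:]]*(network-script[[:space:]].*)[[:space:]]*$|(network-script /bin/true)|' \
        "$conf.bak" > "$conf"
}
```

Calling it with no argument would edit /etc/xen/xend-config.sxp in place and leave the previous version in xend-config.sxp.bak; xend would still need a restart (or the host a reboot) for the change to take effect.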

And then set up your network configs as follows:

# cat > ifcfg-eth0 <<EOF
DEVICE=eth0
HWADDR=00:16:76:D6:C9:45
ONBOOT=yes
BRIDGE=br0
EOF

# cat > ifcfg-br0 <<EOF
DEVICE=br0
TYPE=Bridge
BOOTPROTO=dhcp
ONBOOT=yes
DELAY=0
EOF
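Before rebooting, it may be worth checking that the two files agree with each other. The following is a minimal sketch, not an official tool: check_bridge_cfg is a hypothetical name, the default directory /etc/sysconfig/network-scripts is assumed, and values are assumed to be unquoted as in the files above.

```shell
# Hypothetical helper: check that a NIC config hands the interface to a
# bridge and that the matching bridge config declares TYPE=Bridge.
# Assumes unquoted KEY=value lines, as in the example files.
check_bridge_cfg() {
    dir="${1:-/etc/sysconfig/network-scripts}"
    nic="${2:-eth0}"
    # Read the BRIDGE= value from the NIC config (empty if absent).
    bridge=$(sed -n 's/^BRIDGE=//p' "$dir/ifcfg-$nic" 2>/dev/null | head -n 1)
    [ -n "$bridge" ] || { echo "ifcfg-$nic: no BRIDGE= line" >&2; return 1; }
    # The bridge's own config must exist and mark itself as a bridge.
    grep -q '^TYPE=Bridge$' "$dir/ifcfg-$bridge" 2>/dev/null || {
        echo "ifcfg-$bridge: missing or not TYPE=Bridge" >&2; return 1; }
    echo "$nic -> $bridge"
}
```

With the two files above in place, check_bridge_cfg /etc/sysconfig/network-scripts eth0 should report the eth0 to br0 pairing, and it returns non-zero if either side is missing.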

For further info consult

http://wiki.libvirt.org/page/Networking#Fedora.2FRHEL_Bridging


At best this is a documentation problem / kbase item.

Comment 12 Daniel Berrangé 2009-11-10 17:54:00 UTC
There is no way we can fix this issue in RHEL5.x, since fixing it would require stopping all use of XenD's networking scripts at boot time. This would cause a major regression for existing users of RHEL5 Xen. Thus we are closing this CANTFIX, and it will remain a manual task for people deploying cluster suite to disable Xen's networking scripts and configure bridging as per comment #6.

Comment 13 Steven Dake 2009-11-10 19:20:44 UTC
Daniel,

It would be helpful if the xen team documented this in a kb article and followed through on making sure customers are aware of this design problem with xen.

regards
-steve

Comment 14 Paolo Bonzini 2010-04-08 15:46:06 UTC
This bug was closed during 5.5 development and it's being removed from the internal tracking bugs (which are now for 5.6).