Description of problem:
xend disrupts the network for around 4-5s when it starts up late in the boot
order, causing (at least) the Red Hat Cluster Suite to fail.
See Bug #230783 for more info specific to the problems it creates in RHCS.
Steps to Reproduce:
1. chkconfig xend on
2. chkconfig cman on
Actual results:
xend causes network disruption, which in turn causes cluster nodes to become
confused and cluster status to be inconsistent among nodes.
Expected results:
No network disruption.
Right, that's standard Xen behaviour --- it sets up a networking bridge
environment, and the reconfiguration of NICs during that setup is going to have
side-effects for things like clustering.
We're looking at a possible alternative networking mode for 5.1 which won't do this.
There is really no way to change Xen's network-bridge script that would avoid network disruption; disruption is inherent in the approach of modifying existing network interfaces. The only practical option is to get bridging enabled right from the moment the host's network is brought online, e.g. by using the regular network initscripts to configure bridging.
So in /etc/xen/xend-config.sxp, change the network script option to
And then with your network configs set up
# cat > ifcfg-eth0 <<EOF
# cat > ifcfg-br0 <<EOF
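The contents of the config snippets above are not preserved in this report. As a rough sketch of the approach described (bridging configured by the regular network initscripts, with xend told not to touch the NICs), something like the following could be used. The device names eth0 and br0, the use of DHCP, the no-op /bin/true network script, and the OUT_DIR variable are all assumptions made for illustration, not the values from the original comment. The script only writes to a scratch directory so the files can be reviewed before anything is copied into place.

```shell
#!/bin/sh
# Sketch only: writes sample files to a scratch directory, not to /etc.
set -eu

# OUT_DIR is a hypothetical variable for this example; defaults to a temp dir.
OUT_DIR="${OUT_DIR:-$(mktemp -d)}"

# Assumed layout: the physical NIC eth0 is attached to bridge br0
# and carries no IP configuration of its own.
cat > "$OUT_DIR/ifcfg-eth0" <<EOF
DEVICE=eth0
ONBOOT=yes
BRIDGE=br0
EOF

# The bridge device holds the IP configuration (DHCP assumed here).
cat > "$OUT_DIR/ifcfg-br0" <<EOF
DEVICE=br0
TYPE=Bridge
BOOTPROTO=dhcp
ONBOOT=yes
DELAY=0
EOF

# Assumed xend setting: a no-op network script, so xend does not
# reconfigure the NICs when it starts.
echo '(network-script /bin/true)' > "$OUT_DIR/xend-config.sxp.fragment"

echo "Wrote sample files to $OUT_DIR"
```

After review, the two ifcfg files would go in /etc/sysconfig/network-scripts and the network-script line would replace the existing one in /etc/xen/xend-config.sxp. Guest configs would then need to refer to whatever bridge name was chosen.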
For further info consult
At best this is a documentation problem / kbase item.
There is no way we can fix this issue in RHEL5.x, since fixing it would require stopping all use of XenD's networking scripts at boot time. This would cause a major regression for existing users of RHEL5 Xen. Thus we are closing this CANTFIX, and it will remain a manual task for people deploying cluster suite to disable Xen's networking scripts and configure bridging as per comment #6.
It would be helpful if the xen team documented this in a kb article and followed through on making sure customers are aware of this design problem with xen.
This bug was closed during 5.5 development and it's being removed from the internal tracking bugs (which are now for 5.6).