Bug 230783 - openais doesn't receive multicast traffic during xend startup
Summary: openais doesn't receive multicast traffic during xend startup
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: cman
Version: 5.0
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: ---
Assignee: Ryan McCabe
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 240452
Reported: 2007-03-02 20:53 UTC by Ryan McCabe
Modified: 2009-04-16 22:43 UTC (History)

Fixed In Version: RHBA-2007-0575
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2007-11-07 16:59:08 UTC
Target Upstream Version:
Embargoed:


Attachments
logs from huey.lab.boston.redhat.com (18.76 KB, text/plain)
2007-03-02 20:56 UTC, Ryan McCabe
logs from dewey.lab.boston.redhat.com (5.97 KB, text/plain)
2007-03-02 20:58 UTC, Ryan McCabe
logs from louey.lab.boston.redhat.com (18.38 KB, text/plain)
2007-03-02 21:01 UTC, Ryan McCabe
cluster.conf file used for the cluster (593 bytes, text/plain)
2007-03-02 21:01 UTC, Ryan McCabe
patch to work around xend bridged networking brain damage (2.08 KB, patch)
2007-04-23 17:52 UTC, Ryan McCabe


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2007:0575 0 normal SHIPPED_LIVE cman bug fix update 2007-10-31 12:26:24 UTC

Description Ryan McCabe 2007-03-02 20:53:46 UTC
The xend init script hangs the network when starting up, causing multicast
traffic to be lost, resulting in confused cluster nodes.

Seems to be the same problem (or related to it) discussed here:
http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=431

When the xend service is enabled at boot, it starts after cman, and hangs the
network long enough to cause the cluster to become unstable. On the three-node
cluster we're testing with, each of the three nodes reports a different view of
cluster membership. The problem does not occur if xend is disabled at boot
time, or if xend is enabled but cman is disabled at boot and started only after
all nodes have booted.
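
One way to confirm the start order on a node is to read the start priorities
from the SysV rc symlinks. The following is a minimal sketch, assuming the
usual S<NN><name> symlink naming; the helper name start_priority is
hypothetical and the rc directory to inspect depends on the runlevel in use.

```shell
# start_priority RCDIR SERVICE
# Prints the two-digit start priority taken from the S<NN>SERVICE symlink
# in RCDIR, or returns non-zero if the service has no start link there.
start_priority() {
    for link in "$1"/S[0-9][0-9]"$2"; do
        # An unmatched glob stays literal, so skip anything that is not there.
        [ -e "$link" ] || [ -L "$link" ] || continue
        name=${link##*/}      # e.g. S21cman
        prio=${name#S}        # e.g. 21cman
        echo "${prio%"$2"}"   # e.g. 21
        return 0
    done
    return 1
}
```

Running it once for cman and once for xend against the same rc directory shows
which of the two starts first: the service with the lower number is started
earlier.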

'cman_tool nodes' output when xend and cman services are enabled at boot:

[root@huey ~]# cman_tool nodes
NOTE: There are 1 disallowed nodes,
      members list may seem inconsistent across the cluster
Node  Sts   Inc   Joined               Name
   1   X     12                        louey.lab.boston.redhat.com
   2   M      4   2007-03-02 15:49:44  huey.lab.boston.redhat.com
   3   d     12   2007-03-02 15:49:44  dewey.lab.boston.redhat.com

[root@dewey ~]# cman_tool nodes
Node  Sts   Inc   Joined               Name
   1   X      8                        louey.lab.boston.redhat.com
   2   M     12   2007-03-02 15:49:42  huey.lab.boston.redhat.com
   3   M      4   2007-03-02 15:49:39  dewey.lab.boston.redhat.com

[root@louey ~]# cman_tool nodes
Node  Sts   Inc   Joined               Name
   1   M      4   2001-12-31 19:00:53  louey.lab.boston.redhat.com
   2   M     12   2001-12-31 19:00:55  huey.lab.boston.redhat.com
   3   M      8   2001-12-31 19:00:53  dewey.lab.boston.redhat.com

I'll attach /var/log/messages output for each of the three nodes.

Comment 1 Ryan McCabe 2007-03-02 20:56:01 UTC
Created attachment 149150 [details]
logs from huey.lab.boston.redhat.com

Comment 2 Ryan McCabe 2007-03-02 20:58:44 UTC
Created attachment 149151 [details]
logs from dewey.lab.boston.redhat.com

Comment 3 Ryan McCabe 2007-03-02 21:01:12 UTC
Created attachment 149152 [details]
logs from louey.lab.boston.redhat.com

Comment 4 Ryan McCabe 2007-03-02 21:01:53 UTC
Created attachment 149153 [details]
cluster.conf file used for the cluster

Comment 5 Stephen Tweedie 2007-03-02 21:24:43 UTC
Can you please supply "ifconfig" and "ip route" output, both after a successful
(non-Xen) boot, and after xend has started?

Comment 6 Ryan McCabe 2007-03-02 21:50:09 UTC
Before starting xend:

[root@huey ~]# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:04:23:B5:3F:DA  
          inet addr:192.168.77.141  Bcast:192.168.79.255  Mask:255.255.252.0
          inet6 addr: fe80::204:23ff:feb5:3fda/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:1690 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1476 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100 
          RX bytes:393906 (384.6 KiB)  TX bytes:271689 (265.3 KiB)
          Base address:0xd880 Memory:fcfa0000-fcfc0000 

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:81 errors:0 dropped:0 overruns:0 frame:0
          TX packets:81 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:9762 (9.5 KiB)  TX bytes:9762 (9.5 KiB)

[root@huey ~]# ip route
192.168.76.0/22 dev eth0  proto kernel  scope link  src 192.168.77.141 
169.254.0.0/16 dev eth0  scope link 
default via 192.168.79.254 dev eth0 

-----

[root@dewey ~]# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:04:23:B5:41:1E  
          inet addr:192.168.77.142  Bcast:192.168.79.255  Mask:255.255.252.0
          inet6 addr: fe80::204:23ff:feb5:411e/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:2140 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1540 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100 
          RX bytes:455460 (444.7 KiB)  TX bytes:263000 (256.8 KiB)
          Base address:0xd880 Memory:fcfa0000-fcfc0000 

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:148 errors:0 dropped:0 overruns:0 frame:0
          TX packets:148 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:18740 (18.3 KiB)  TX bytes:18740 (18.3 KiB)

[root@dewey ~]# ip route
192.168.76.0/22 dev eth0  proto kernel  scope link  src 192.168.77.142 
169.254.0.0/16 dev eth0  scope link 
default via 192.168.79.254 dev eth0 


-----

[root@louey ~]# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:04:23:B5:47:70  
          inet addr:192.168.77.143  Bcast:192.168.79.255  Mask:255.255.252.0
          inet6 addr: fe80::204:23ff:feb5:4770/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:2267 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1610 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100 
          RX bytes:476602 (465.4 KiB)  TX bytes:275827 (269.3 KiB)
          Base address:0xd880 Memory:fcfa0000-fcfc0000 

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:81 errors:0 dropped:0 overruns:0 frame:0
          TX packets:81 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:9762 (9.5 KiB)  TX bytes:9762 (9.5 KiB)

[root@louey ~]# ip route
192.168.76.0/22 dev eth0  proto kernel  scope link  src 192.168.77.143 
169.254.0.0/16 dev eth0  scope link 
default via 192.168.79.254 dev eth0 



After starting xend:

[root@huey ~]# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:04:23:B5:3F:DA  
          inet addr:192.168.77.141  Bcast:192.168.79.255  Mask:255.255.252.0
          inet6 addr: fe80::204:23ff:feb5:3fda/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:449 errors:0 dropped:0 overruns:0 frame:0
          TX packets:117 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:57429 (56.0 KiB)  TX bytes:21966 (21.4 KiB)

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:81 errors:0 dropped:0 overruns:0 frame:0
          TX packets:81 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:9762 (9.5 KiB)  TX bytes:9762 (9.5 KiB)

peth0     Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF  
          inet6 addr: fe80::fcff:ffff:feff:ffff/64 Scope:Link
          UP BROADCAST RUNNING NOARP  MTU:1500  Metric:1
          RX packets:2965 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2297 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100 
          RX bytes:569858 (556.5 KiB)  TX bytes:422530 (412.6 KiB)
          Base address:0xd880 Memory:fcfa0000-fcfc0000 

vif0.0    Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF  
          inet6 addr: fe80::fcff:ffff:feff:ffff/64 Scope:Link
          UP BROADCAST RUNNING NOARP  MTU:1500  Metric:1
          RX packets:119 errors:0 dropped:0 overruns:0 frame:0
          TX packets:449 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:22370 (21.8 KiB)  TX bytes:57429 (56.0 KiB)

xenbr0    Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF  
          UP BROADCAST RUNNING NOARP  MTU:1500  Metric:1
          RX packets:310 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:34365 (33.5 KiB)  TX bytes:0 (0.0 b)

[root@huey ~]# ip route
192.168.76.0/22 dev eth0  proto kernel  scope link  src 192.168.77.141 
169.254.0.0/16 dev eth0  scope link 
default via 192.168.79.254 dev eth0 

-----

[root@dewey ~]# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:04:23:B5:41:1E  
          inet addr:192.168.77.142  Bcast:192.168.79.255  Mask:255.255.252.0
          inet6 addr: fe80::204:23ff:feb5:411e/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:219 errors:0 dropped:0 overruns:0 frame:0
          TX packets:64 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:30235 (29.5 KiB)  TX bytes:11832 (11.5 KiB)

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:148 errors:0 dropped:0 overruns:0 frame:0
          TX packets:148 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:18740 (18.3 KiB)  TX bytes:18740 (18.3 KiB)

peth0     Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF  
          inet6 addr: fe80::fcff:ffff:feff:ffff/64 Scope:Link
          UP BROADCAST RUNNING NOARP  MTU:1500  Metric:1
          RX packets:3605 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2193 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100 
          RX bytes:668610 (652.9 KiB)  TX bytes:385786 (376.7 KiB)
          Base address:0xd880 Memory:fcfa0000-fcfc0000 

vif0.0    Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF  
          inet6 addr: fe80::fcff:ffff:feff:ffff/64 Scope:Link
          UP BROADCAST RUNNING NOARP  MTU:1500  Metric:1
          RX packets:66 errors:0 dropped:0 overruns:0 frame:0
          TX packets:219 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:12236 (11.9 KiB)  TX bytes:30235 (29.5 KiB)

xenbr0    Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF  
          UP BROADCAST RUNNING NOARP  MTU:1500  Metric:1
          RX packets:172 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:19359 (18.9 KiB)  TX bytes:0 (0.0 b)

[root@dewey ~]# ip route
192.168.76.0/22 dev eth0  proto kernel  scope link  src 192.168.77.142 
169.254.0.0/16 dev eth0  scope link 
default via 192.168.79.254 dev eth0 

-----

[root@louey ~]# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:04:23:B5:47:70  
          inet addr:192.168.77.143  Bcast:192.168.79.255  Mask:255.255.252.0
          inet6 addr: fe80::204:23ff:feb5:4770/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:172 errors:0 dropped:0 overruns:0 frame:0
          TX packets:54 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:22707 (22.1 KiB)  TX bytes:8838 (8.6 KiB)

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:81 errors:0 dropped:0 overruns:0 frame:0
          TX packets:81 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:9762 (9.5 KiB)  TX bytes:9762 (9.5 KiB)

peth0     Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF  
          inet6 addr: fe80::fcff:ffff:feff:ffff/64 Scope:Link
          UP BROADCAST RUNNING NOARP  MTU:1500  Metric:1
          RX packets:4013 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2433 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100 
          RX bytes:730724 (713.5 KiB)  TX bytes:428606 (418.5 KiB)
          Base address:0xd880 Memory:fcfa0000-fcfc0000 

vif0.0    Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF  
          inet6 addr: fe80::fcff:ffff:feff:ffff/64 Scope:Link
          UP BROADCAST RUNNING NOARP  MTU:1500  Metric:1
          RX packets:56 errors:0 dropped:0 overruns:0 frame:0
          TX packets:172 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:9242 (9.0 KiB)  TX bytes:22707 (22.1 KiB)

xenbr0    Link encap:Ethernet  HWaddr FE:FF:FF:FF:FF:FF  
          UP BROADCAST RUNNING NOARP  MTU:1500  Metric:1
          RX packets:136 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:14485 (14.1 KiB)  TX bytes:0 (0.0 b)

[root@louey ~]# ip route
192.168.76.0/22 dev eth0  proto kernel  scope link  src 192.168.77.143 
169.254.0.0/16 dev eth0  scope link 
default via 192.168.79.254 dev eth0 

Comment 7 Kiersten (Kerri) Anderson 2007-04-23 17:48:29 UTC
Fixing Product Name.  Cluster Suite components were integrated into Enterprise
Linux version 5.0.

Comment 8 Ryan McCabe 2007-04-23 17:52:11 UTC
Created attachment 153296 [details]
patch to work around xend bridged networking brain damage

I've attached a patch to work around the xend network-bridge issues. I modified
the cman init script to check whether xend is going to start and whether it is
configured to use bridged networking. If both conditions are satisfied, the
cman init script runs '/etc/xen/scripts/network-bridge start' before doing
anything else. xend will do the same thing when it starts, but by then it will
essentially be a no-op. As far as I can tell, nothing needs to be done when
cman stops.
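
For readers without access to the attachment, the check described above could
look roughly like the sketch below. This is not the attached patch: the helper
names, the XEND_CONF default path, and the exact pattern matched in the xend
configuration file are all assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: helper names and default paths are assumptions, not the patch.

XEND_CONF=${XEND_CONF:-/etc/xen/xend-config.sxp}
BRIDGE_SCRIPT=${BRIDGE_SCRIPT:-/etc/xen/scripts/network-bridge}

# Succeeds if xend is configured to start in the current runlevel.
xend_enabled() {
    level=$(/sbin/runlevel 2>/dev/null | awk '{ print $2 }')
    [ -n "$level" ] || return 1
    /sbin/chkconfig --level "$level" xend 2>/dev/null
}

# Succeeds if the given xend config file has an uncommented line of the
# form "(network-script network-bridge)".
uses_bridged_networking() {
    [ -f "$1" ] || return 1
    grep -Eq '^[[:blank:]]*\([[:blank:]]*network-script[[:blank:]]+network-bridge([[:blank:]]|\))' "$1"
}

# Bring the bridge up early only when both conditions hold; otherwise do
# nothing and report success so the rest of the init script can continue.
start_bridge_if_needed() {
    xend_enabled || return 0
    uses_bridged_networking "$XEND_CONF" || return 0
    "$BRIDGE_SCRIPT" start
}
```

In this sketch, start_bridge_if_needed would be called at the top of the cman
start path, before anything that sends or expects multicast traffic, so the
network interruption happens before the node tries to join the cluster.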

Comment 9 Steven Dake 2007-04-24 23:18:31 UTC
Ryan,
Since you're working on this issue, I'm reassigning it to you. Thanks.
-steve

Comment 10 Ryan McCabe 2007-04-28 05:05:52 UTC
The workaround (posted above) has been committed to CVS now.

Comment 11 RHEL Program Management 2007-04-28 05:24:21 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 15 errata-xmlrpc 2007-11-07 16:59:08 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2007-0575.html


