Bug 459720 - [PATCH] fence_xvmd cannot start if default route is not set
Summary: [PATCH] fence_xvmd cannot start if default route is not set
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: cman
Version: 5.2
Hardware: All
OS: Linux
Priority: high
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Lon Hohberger
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2008-08-21 15:26 UTC by Satoru SATOH
Modified: 2009-04-16 22:23 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-01-20 21:50:43 UTC
Target Upstream Version:
Embargoed:


Attachments
A patch for fence_xvmd to add option to select correct network interface explicitly (4.59 KB, patch)
2008-08-21 15:29 UTC, Satoru SATOH


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2009:0189 0 normal SHIPPED_LIVE cman bug-fix and enhancement update 2009-01-20 16:05:55 UTC

Description Satoru SATOH 2008-08-21 15:26:46 UTC
Description of problem: fence_xvmd cannot start if the default route is not set.


Version-Release number of selected component (if applicable):
cman-2.0.84-2.el5


How reproducible: always


Steps to Reproduce:
1. Delete the default route, or do not set one.
2. Start fence_xvmd manually (with the -f option so that it does not become a daemon).

Here is the log:

[root@cluster-1 ~]# rpm -qf /sbin/fence_xvmd
cman-2.0.84-2.el5
[root@cluster-1 ~]# fence_xvmd -LX -fddd
Debugging threshold is now 3
-- args @ 0xbfd70fbc --
  args->addr = 225.0.0.12
  args->domain = (null)
  args->key_file = /etc/cluster/fence_xvm.key
  args->op = 2
  args->hash = 2
  args->auth = 2
  args->port = 1229
  args->family = 2
  args->timeout = 30
  args->retr_time = 20
  args->flags = 259
  args->debug = 3
-- end args --
Reading in key file /etc/cluster/fence_xvm.key into 0xbfd6ffbc (4096 max size)
Actual key length = 4096 bytesMy Node ID = 1
Domain                   UUID                                 Owner State
------                   ----                                 ----- -----
Domain-0                 00000000-0000-0000-0000-000000000000 00001 00001
[root@cluster-1 ~]# ip route show
192.168.12.0/24 dev virbr0  proto kernel  scope link  src 192.168.12.1
192.168.122.0/24 dev eth0  proto kernel  scope link  src 192.168.122.11
169.254.0.0/16 dev eth0  scope link
default via 192.168.122.1 dev eth0
[root@cluster-1 ~]# ip route del default via 192.168.122.1 dev eth0
[root@cluster-1 ~]# ip route show
192.168.12.0/24 dev virbr0  proto kernel  scope link  src 192.168.12.1
192.168.122.0/24 dev eth0  proto kernel  scope link  src 192.168.122.11
169.254.0.0/16 dev eth0  scope link
[root@cluster-1 ~]# fence_xvmd -LX -fddd
Debugging threshold is now 3
-- args @ 0xbf93662c --
  args->addr = 225.0.0.12
  args->domain = (null)
  args->key_file = /etc/cluster/fence_xvm.key
  args->op = 2
  args->hash = 2
  args->auth = 2
  args->port = 1229
  args->family = 2
  args->timeout = 30
  args->retr_time = 20
  args->flags = 259
  args->debug = 3
-- end args --
Reading in key file /etc/cluster/fence_xvm.key into 0xbf93562c (4096 max size)
Actual key length = 4096 bytesFailed to bind multicast receive socket to 225.0.0.12: No such device
Check network configuration.
Could not set up multicast listen socket
[root@cluster-1 ~]# ip route add default via 192.168.122.1 dev eth0
[root@cluster-1 ~]# fence_xvmd -LX -fddd
Debugging threshold is now 3
-- args @ 0xbfb76dbc --
  args->addr = 225.0.0.12
  args->domain = (null)
  args->key_file = /etc/cluster/fence_xvm.key
  args->op = 2
  args->hash = 2
  args->auth = 2
  args->port = 1229
  args->family = 2
  args->timeout = 30
  args->retr_time = 20
  args->flags = 259
  args->debug = 3
-- end args --
Reading in key file /etc/cluster/fence_xvm.key into 0xbfb75dbc (4096 max size)
Actual key length = 4096 bytesMy Node ID = 1
Domain                   UUID                                 Owner State
------                   ----                                 ----- -----
Domain-0                 00000000-0000-0000-0000-000000000000 00001 00001
[root@cluster-1 ~]#



Actual results: it cannot start


Expected results: it starts w/o any problems.


Additional info:
I think this problem occurs because the interface to listen on is not selected
explicitly in either the IPv4 or the IPv6 code.

It's my understanding that the Linux kernel selects the listening interface in
this case, and that it picks the one leading to the default gateway if no
suitable route to the multicast network being joined is set (see
do_ip_setsockopt, ip_mc_join_group and other related functions in
kernel/net/ipv4/ip_sockglue.c).

Also, this behavior matches that of the BSD variants, as noted in the
multicast section of the well-known book UNIX Network Programming.
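
The interface selection described above can be sketched in C as follows. This
is only an illustration under stated assumptions, not the attached patch and
not fence_xvmd's actual code: the helper names fill_mreqn and join_group are
hypothetical, and the sketch relies on the Linux-specific struct ip_mreqn. If
no interface name is given, imr_ifindex stays 0 and the kernel chooses the
interface from the routing table, which is the case that fails in this report.

```c
#define _GNU_SOURCE            /* exposes struct ip_mreqn on glibc */
#include <arpa/inet.h>
#include <net/if.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: fill an ip_mreqn for the multicast address `group`.
 * If ifname is NULL, imr_ifindex is left at 0 so the kernel picks the
 * interface via the routing table. Returns 0 on success, -1 on a bad
 * address or an unknown interface name.
 */
int fill_mreqn(struct ip_mreqn *mreq, const char *group, const char *ifname)
{
    memset(mreq, 0, sizeof(*mreq));
    if (inet_pton(AF_INET, group, &mreq->imr_multiaddr) != 1)
        return -1;
    mreq->imr_address.s_addr = htonl(INADDR_ANY);
    if (ifname != NULL) {
        unsigned int idx = if_nametoindex(ifname);
        if (idx == 0)           /* no such interface */
            return -1;
        mreq->imr_ifindex = (int)idx;
    }
    return 0;
}

/*
 * Hypothetical helper: join `group` on an existing UDP socket, optionally
 * pinned to the interface `ifname`. Returns the setsockopt() result.
 */
int join_group(int sock, const char *group, const char *ifname)
{
    struct ip_mreqn mreq;

    if (fill_mreqn(&mreq, group, ifname) < 0)
        return -1;
    return setsockopt(sock, IPPROTO_IP, IP_ADD_MEMBERSHIP,
                      &mreq, sizeof(mreq));
}
```

For example, join_group(sock, "225.0.0.12", "virbr0") would request the join on
virbr0 regardless of the default route, while join_group(sock, "225.0.0.12",
NULL) leaves the choice to the kernel.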


There is another problem behind this. 

There is no way to tell fence_xvmd which interface to use if the host has
multiple network interfaces. So fence_xvmd might listen on the wrong interface
and be unable to communicate with fence_xvm at all.


To fix these problems, fence_xvmd must have a way to select the correct
interface to listen on.

I'll post a patch that solves both problems at once.

Comment 1 Satoru SATOH 2008-08-21 15:29:01 UTC
Created attachment 314709 [details]
A patch for fence_xvmd to add option to select correct network interface explicitly

Comment 2 Satoru SATOH 2008-08-21 15:37:26 UTC
Here is a log of fence_xvmd with my patch attached:

[root@cluster-1 ~]# /var/tmp/cman-2.0.84-2-root-root/sbin/fence_xvmd -LX -fddddd -I virbr0
Debugging threshold is now 5
-- args @ 0xbfc095b8 --
  args->addr = 225.0.0.12
  args->domain = (null)
  args->key_file = /etc/cluster/fence_xvm.key
  args->op = 2
  args->hash = 2
  args->auth = 2
  args->port = 1229
  args->ifindex = 5
  args->family = 2
  args->timeout = 30
  args->retr_time = 20
  args->flags = 259
  args->debug = 5
-- end args --
Reading in key file /etc/cluster/fence_xvm.key into 0xbfc085b8 (4096 max size)
Actual key length = 4096 bytesSetting up ipv4 multicast receive (225.0.0.12:1229)
Joining multicast group
ipv4_recv_sk: success, fd = 3
My Node ID = 1
Domain                   UUID                                 Owner State
------                   ----                                 ----- -----
Domain-0                 00000000-0000-0000-0000-000000000000 00001 00001
[root@cluster-1 ~]# ip route del default via 192.168.122.1 dev eth0
[root@cluster-1 ~]# /var/tmp/cman-2.0.84-2-root-root/sbin/fence_xvmd -LX -fddddd -I virbr0
Debugging threshold is now 5
-- args @ 0xbf95b308 --
  args->addr = 225.0.0.12
  args->domain = (null)
  args->key_file = /etc/cluster/fence_xvm.key
  args->op = 2
  args->hash = 2
  args->auth = 2
  args->port = 1229
  args->ifindex = 5
  args->family = 2
  args->timeout = 30
  args->retr_time = 20
  args->flags = 259
  args->debug = 5
-- end args --
Reading in key file /etc/cluster/fence_xvm.key into 0xbf95a308 (4096 max size)
Actual key length = 4096 bytesSetting up ipv4 multicast receive (225.0.0.12:1229)
Joining multicast group
ipv4_recv_sk: success, fd = 3
My Node ID = 1
Domain                   UUID                                 Owner State
------                   ----                                 ----- -----
Domain-0                 00000000-0000-0000-0000-000000000000 00001 00001
[root@cluster-1 ~]#


If no interface is specified with the "-I" option, the default interface
index 0, which corresponds to INADDR_ANY in IPv4, is used. In the run below the
default route is not set, so fence_xvmd fails to start, as expected.

[root@cluster-1 ~]# /var/tmp/cman-2.0.84-2-root-root/sbin/fence_xvmd -LX -fddddd
Debugging threshold is now 5
-- args @ 0xbffea1a8 --
  args->addr = 225.0.0.12
  args->domain = (null)
  args->key_file = /etc/cluster/fence_xvm.key
  args->op = 2
  args->hash = 2
  args->auth = 2
  args->port = 1229
  args->ifindex = 0
  args->family = 2
  args->timeout = 30
  args->retr_time = 20
  args->flags = 259
  args->debug = 5
-- end args --
Reading in key file /etc/cluster/fence_xvm.key into 0xbffe91a8 (4096 max size)
Actual key length = 4096 bytesSetting up ipv4 multicast receive (225.0.0.12:1229)
Joining multicast group
Failed to bind multicast receive socket to 225.0.0.12: No such device
Check network configuration.
Could not set up multicast listen socket
[root@cluster-1 ~]# ip route add default via 192.168.122.1 dev eth0
[root@cluster-1 ~]#
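
The "index 0 means the kernel chooses" convention also exists on the IPv6 side,
where the interface index goes into the ipv6mr_interface field of struct
ipv6_mreq. The following C sketch shows that counterpart. It is an assumption
about how such a fallback could be written, not the code of the patch: the
helper name fill_mreq6 is hypothetical, and the group address is whatever the
caller passes in.

```c
#include <arpa/inet.h>
#include <net/if.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: fill an ipv6_mreq for the multicast address `group`.
 * With ifname == NULL, ipv6mr_interface stays 0 and the kernel chooses the
 * interface; otherwise the name is resolved to its index.
 * Returns 0 on success, -1 on a bad address or an unknown interface name.
 */
int fill_mreq6(struct ipv6_mreq *mreq, const char *group, const char *ifname)
{
    memset(mreq, 0, sizeof(*mreq));
    if (inet_pton(AF_INET6, group, &mreq->ipv6mr_multiaddr) != 1)
        return -1;
    if (ifname != NULL) {
        unsigned int idx = if_nametoindex(ifname);
        if (idx == 0)           /* no such interface */
            return -1;
        mreq->ipv6mr_interface = idx;
    }
    return 0;
}
```

The filled structure would then be passed to setsockopt() with IPPROTO_IPV6 and
IPV6_JOIN_GROUP on an AF_INET6 socket.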

Comment 3 Satoru SATOH 2008-08-22 07:47:56 UTC
(In reply to comment #2)
> Here is a log of fence_xvmd with my patch attached:

s/attached/applied/ :P

Comment 7 Jasper Capel 2008-12-08 16:25:27 UTC
Is this patch scheduled for inclusion in RHEL5.2? If so, do you have a release date? I'm willing to test this, if required.

Kind regards,
Jasper Capel

Comment 9 errata-xmlrpc 2009-01-20 21:50:43 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-0189.html

