Bug 822134

Summary: fencing for ipmilan fails
Product: Red Hat Enterprise Linux 5
Component: cman
Version: 5.8
Hardware: x86_64
OS: Linux
Reporter: Klaus Steinberger <klaus.steinberger>
Assignee: David Teigland <teigland>
QA Contact: Cluster QE <mspqa-list>
CC: cluster-maint, edamato, fdinitto, klaus.steinberger, shankarjha21
Status: CLOSED NOTABUG
Severity: high
Priority: medium
Target Milestone: rc
Doc Type: Bug Fix
Type: Bug
Last Closed: 2012-07-12 08:29:16 UTC

Attachments: our cluster.conf (passwords replaced by sometext)

Description Klaus Steinberger 2012-05-16 12:49:05 UTC
Created attachment 584955 [details]
our cluster.conf (passwords replaced by sometext)

Description of problem:
Fencing fails on our cluster. The problem appeared somewhere around 5.7 and did not go away with 5.8. We did, of course, test fencing when the cluster was originally set up, around 5.5.

Version-Release number of selected component (if applicable):
cman-2.0.115-96.el5_8.1.x86_64

How reproducible:
every time

It happens on two similarly configured clusters.


Steps to Reproduce:
1. Run fence_node nodename.
2. The command returns immediately.
  
Actual results:
Entries in the log:

May 14 12:15:42 cip-ha-xen03-hb fence_node[1190]: Fence of "cip-ha-xen02-hb" was unsuccessful 
May 16 09:57:08 cip-ha-xen03-hb fence_node[13071]: Fence of "cip-ha-xen02" was unsuccessful 
May 16 12:18:29 cip-ha-xen03-hb fence_node[2687]: Fence of "cip-ha-xen02" was unsuccessful 

Expected results:
The node is fenced.

Additional info:

Comment 1 RHEL Program Management 2012-05-21 21:18:58 UTC
This request was evaluated by Red Hat Product Management for inclusion
in a Red Hat Enterprise Linux release.  Product Management has
requested further review of this request by Red Hat Engineering, for
potential inclusion in a Red Hat Enterprise Linux release for currently
deployed products.  This request is not yet committed for inclusion in
a release.

Comment 2 Marek Grac 2012-05-24 12:58:18 UTC
@Klaus:

Thanks for the report. I cannot find a problem directly in the source code. Can I ask you to run the fence agent directly?

Obtain the machine status:

fence_ipmilan -a IPADDRESS -l fence -p PASSWORD -P -v -o status
fence_ipmilan -a IPADDRESS -l fence -p PASSWORD -P -v -A password -o status

If that works, then run the same commands with '-o status' changed to '-o reboot'. If it does not work, please try adding the option '-T 4'. What kind of hardware do you have?
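
(For reference, a minimal sketch of the follow-up commands described above; IPADDRESS and PASSWORD are placeholders, and the reboot test should only be run against a node that may safely be power-cycled.)

# reboot test, same credentials as the status check above
fence_ipmilan -a IPADDRESS -l fence -p PASSWORD -P -v -o reboot

# if the commands fail, retry with the '-T 4' option suggested above added
fence_ipmilan -a IPADDRESS -l fence -p PASSWORD -P -v -T 4 -o status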

Comment 3 Klaus Steinberger 2012-05-24 13:03:45 UTC
fence_ipmilan works; we already tested that (also with reboot).

Originally, under 5.5 (or so) when we installed the cluster, fencing worked well.
We can't tell exactly when the problem arose, but probably with the update to 5.7.

We have this problem on two clusters (both with the same hardware); we use Fujitsu RX300 S5 servers.

Sincerely,
Klaus

Comment 4 Marek Grac 2012-05-24 13:09:13 UTC
Hmm, that's really strange. If fence_ipmilan works, then the only thing I can think of is a bug in parsing standard input vs. command-line arguments. Can you try:

fence_ipmilan

and on STDIN:

auth=password
lanplus=1
ipaddr=10.153.24.1
login=fence
passwd=somepassword

fence_node should run it in the very same way.

One more thing: if you use SELinux, fence_node runs under different SELinux contexts when invoked from the cluster and from user space.
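
(For reference, a minimal sketch of feeding the same options on standard input non-interactively, which is closer to how the fence daemon invokes the agent; the address and credentials below are placeholders copied from this bug.)

fence_ipmilan <<'EOF'
auth=password
lanplus=1
ipaddr=10.153.24.1
login=fence
passwd=somepassword
action=status
EOF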

Comment 5 Klaus Steinberger 2012-05-24 16:38:53 UTC
That works:

[root@gar-ha-xen01 ~]# fence_ipmilan
auth=password
lanplus=1
ipaddr=10.153.180.25
login=fence
passwd=somepassword
action=status
Getting status of IPMI:10.153.180.25...Chassis power = On
Done
[root@gar-ha-xen01 ~]#

and on the other cluster too:

[root@cip-ha-xen01 ~]# fence_ipmilan 
auth=password
lanplus=1
ipaddr=10.153.24.1
login=fence
action=status
passwd=somepassword
Getting status of IPMI:10.153.24.1...Chassis power = On
Done
[root@cip-ha-xen01 ~]# 

SELinux is switched off on our 5.8 cluster systems.

Comment 6 Marek Grac 2012-05-31 13:36:21 UTC
@Klaus:

Thanks for the response. I'm now pretty sure that the bug is not in the fence agent itself but elsewhere. I'm moving this bug to a colleague who knows the infrastructure better than I do. Could you please add the relevant part of /var/log/messages?

Comment 7 David Teigland 2012-05-31 14:46:32 UTC
To test the config parsing, change:

<fencedevice agent="fence_ipmilan" ...

to

<fencedevice agent="/bin/true" ...

and then run fence_node cip-ha-xen02-hb
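
(For reference, a minimal sketch of that edit, modeled on the fencedevice entries shown in this bug; keep the existing attributes and change only the agent. The address, name, and password are placeholders.)

<!-- before -->
<fencedevice agent="fence_ipmilan" ipaddr="10.153.180.26" login="fence" name="ipmi-gar-ha-xen03" lanplus="1" passwd="somepw"/>
<!-- after: a no-op agent isolates cluster.conf parsing from the IPMI hardware -->
<fencedevice agent="/bin/true" ipaddr="10.153.180.26" login="fence" name="ipmi-gar-ha-xen03" lanplus="1" passwd="somepw"/>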

Comment 8 Klaus Steinberger 2012-06-15 07:43:20 UTC
I can probably run the test sometime in the week of 25 June.

Comment 9 shankar 2012-06-26 14:50:31 UTC
Hello Friends,

I am trying to configure a RHEL cluster on RHEL 6.1.


[root@web ~]# fence_ipmilan -a 172.16.18.x -l root -p root1234 -o reboot -M cycle -v
Rebooting machine @ IPMI:172.16.18.x...Spawning: '/usr/bin/ipmitool -I lan -H '172.16.18.122' -U 'root' -P 'root1234' -v chassis power status'...
Spawning: '/usr/bin/ipmitool -I lan -H '172.16.18.x' -U 'root' -P 'root1234' -v chassis power cycle'...
Failed
[root@web ~]#

[root@web ~]# /usr/sbin/fence_ipmilan -a 172.16.18.y -l root -p root1234 -o reboot
Rebooting machine @ IPMI:172.16.18.y...Failed
[root@web ~]#


[root@web ~]# ipmitool -H 172.16.18.x -I lanplus -U root -P root1234 chassis power status
Chassis Power is on
[root@web ~]#

[root@web ~]# cat /etc/cluster/cluster.conf
<?xml version="1.0"?>
<cluster config_version="52" name="cseb_cluster">
        <clusternodes>
                <clusternode name="mail.cseb.gov.in" nodeid="1">
                        <fence>
                                <method name="mailfailover">
                                        <device name="mailfailover"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="web.cseb.gov.in" nodeid="2">
                        <fence>
                                <method name="webfailover">
                                        <device name="webfailover"/>
                                </method>
                        </fence>
                </clusternode>
        </clusternodes>
        <cman expected_votes="1" two_node="1"/>
        <fence_daemon post_join_delay="6"/>
        <fencedevices>
                <fencedevice agent="fence_ipmilan" auth="password" ipaddr="172.16.18.123" lanplus="on" login="root" name="mailfailover" passwd="root1234"/>
                <fencedevice agent="fence_ipmilan" auth="password" ipaddr="172.16.18.122" lanplus="on" login="root" name="webfailover" passwd="root1234"/>
        </fencedevices>
        <rm>
                <failoverdomains>
                        <failoverdomain name="mailing" nofailback="0" ordered="1" restricted="0">
                                <failoverdomainnode name="mail.cseb.gov.in" priority="1"/>
                                <failoverdomainnode name="web.cseb.gov.in" priority="2"/>
                        </failoverdomain>
                        <failoverdomain name="web" nofailback="0" ordered="1" restricted="0">
                                <failoverdomainnode name="mail.cseb.gov.in" priority="2"/>
                                <failoverdomainnode name="web.cseb.gov.in" priority="1"/>
                        </failoverdomain>
                </failoverdomains>
        </rm>
</cluster>
[root@web ~]#


Expert advice is required.

Thanks,
Shankar

Comment 10 Fabio Massimo Di Nitto 2012-07-02 14:33:18 UTC
Klaus,

have you been able to reproduce the issue and/or perform the requested testing?

Comment 11 Klaus Steinberger 2012-07-03 06:49:05 UTC
Due to some system constraints (production use and a very important restore of large data currently running), I have not yet been able to run the test. I will probably try it on Monday.

Sincerely,
Klaus

Comment 12 Klaus Steinberger 2012-07-10 09:36:46 UTC
Hi, 

I have now run the test with /bin/true:

                <fencedevice agent="/bin/true" ipaddr="10.153.180.26" login="fence" name="ipmi-gar-ha-xen03" lanplus="1" passwd="somepw"/>


fence_node returned nothing:

[root@gar-ha-xen01 cluster]# fence_node gar-ha-xen03-hb
[root@gar-ha-xen01 cluster]# 


But the log shows:

Jul 10 11:30:59 gar-ha-xen01 fence_node[533]: Fence of "gar-ha-xen03-hb" was unsuccessful

Comment 13 David Teigland 2012-07-10 15:05:37 UTC
The fence_node command takes the node name as defined by the clusternode entry in cluster.conf.
Using the cluster.conf in comment 9, you'd need to run either
fence_node mail.cseb.gov.in or fence_node web.cseb.gov.in
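
(For reference, a minimal sketch of how to list the node names fence_node expects; cman_tool ships with cman, and the grep is just a fallback that assumes the usual /etc/cluster/cluster.conf layout shown in this bug.)

# names as known to the running cluster
cman_tool nodes

# or read them straight from the configuration
grep -o 'clusternode name="[^"]*"' /etc/cluster/cluster.conf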

Comment 14 Klaus Steinberger 2012-07-11 05:50:27 UTC
(In reply to comment #13)
> The fence_node command takes the node name in cluster.conf clusternode.
> Using the cluster.conf in comment 9, you'd need to run either
> fence_node mail.cseb.gov.in or fence_node web.cseb.gov.in

The cluster.conf in comment #9 has nothing to do with my problem; gar-ha-xen03-hb was the correct node name. I have two clusters, both with the very same fencing problem. One of the cluster.conf files is attached to this bug; the second one (from the cluster I used for the test yesterday) is here:

<?xml version="1.0"?>
<cluster config_version="76" name="gar-ha-cluster">
        <fence_daemon post_fail_delay="0" post_join_delay="3"/>
        <clusternodes>
                <clusternode name="gar-ha-xen01-hb" nodeid="1" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="impi-gar-ha-xen01"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="gar-ha-xen02-hb" nodeid="2" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="impi-gar-ha-xen02"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="gar-ha-xen03-hb" nodeid="3" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="impi-gar-ha-xen03"/>
                                </method>
                        </fence>
                </clusternode>
        </clusternodes>
        <cman/>
        <fencedevices>
                <fencedevice agent="fence_ipmilan" ipaddr="10.153.180.24" login="fence" name="ipmi-gar-ha-xen01" lanplus="1" passwd="somepw"/>
                <fencedevice agent="fence_ipmilan" ipaddr="10.153.180.25" login="fence" name="ipmi-gar-ha-xen02" lanplus="1" passwd="somepw"/>
                <fencedevice action="reboot" agent="fence_ipmilan" auth="password" ipaddr="10.153.180.26" login="fence" name="ipmi-gar-ha-xen03" lanplus="1" passwd="somepw"/>
        </fencedevices>
        <rm>
                <failoverdomains>
                        <failoverdomain name="fod_A" ordered="1" restricted="0">
                                <failoverdomainnode name="gar-ha-xen01-hb" priority="1"/>
                                <failoverdomainnode name="gar-ha-xen02-hb" priority="2"/>
                                <failoverdomainnode name="gar-ha-xen03-hb" priority="3"/>                                                                                                           
                        </failoverdomain>                                                                                                                                                           
                        <failoverdomain name="fod_B" ordered="1">                                                                                                                                   
                                <failoverdomainnode name="gar-ha-xen01-hb" priority="2"/>                                                                                                           
                                <failoverdomainnode name="gar-ha-xen02-hb" priority="1"/>                                                                                                           
                                <failoverdomainnode name="gar-ha-xen03-hb" priority="3"/>                                                                                                           
                        </failoverdomain>                                                                                                                                                           
                        <failoverdomain name="fod_C" ordered="1">                                                                                                                                   
                                <failoverdomainnode name="gar-ha-xen01-hb" priority="3"/>                                                                                                           
                                <failoverdomainnode name="gar-ha-xen02-hb" priority="2"/>                                                                                                           
                                <failoverdomainnode name="gar-ha-xen03-hb" priority="1"/>                                                                                                           
                        </failoverdomain>                                                                                                                                                           
                </failoverdomains>                                                                                                                                                                  
                <resources/>                                                                                                                                                                        
                <vm autostart="1" domain="fod_A" exclusive="0" name="kvm_gar-ts01" path="/etc/libvirt/qemu" recovery="restart" migrate="live"/>                                                     
                <vm autostart="1" domain="fod_A" exclusive="0" name="kvm_nfsdiv01" path="/etc/libvirt/qemu" recovery="restart" migrate="live"/>                                                     
                <vm autostart="1" domain="fod_A" exclusive="0" name="kvm_etprd04" path="/etc/libvirt/qemu" recovery="restart" migrate="live"/>                                                      
                <vm autostart="1" domain="fod_A" exclusive="0" name="kvm_gar-sv-login01" path="/etc/libvirt/qemu" recovery="restart" migrate="live"/>                                               
                <vm autostart="1" domain="fod_B" exclusive="0" name="kvm_gar_sv_login_1" path="/etc/libvirt/qemu" recovery="restart" migrate="live"/>                                               
                <vm autostart="1" domain="fod_B" exclusive="0" name="kvm_filer_map" path="/etc/libvirt/qemu" recovery="restart" migrate="live"/>                                                    
                <vm autostart="1" domain="fod_B" exclusive="0" name="kvm_gar-ex-etpgrid01" path="/etc/libvirt/qemu" recovery="restart" migrate="live"/>                                             
                <vm autostart="1" domain="fod_B" exclusive="0" name="kvm_gar-ex-gonio" path="/etc/libvirt/qemu" recovery="restart" migrate="live"/>                                                 
                <vm autostart="1" domain="fod_B" exclusive="0" name="kvm_gar-sv-hanfs" path="/etc/libvirt/qemu" recovery="restart" migrate="live"/>
                <vm autostart="1" domain="fod_B" exclusive="0" name="kvm_gar-sv-login02" path="/etc/libvirt/qemu" recovery="restart" migrate="live"/>
                <vm autostart="1" domain="fod_C" exclusive="0" name="kvm_ha-mws" path="/etc/libvirt/qemu" recovery="restart" migrate="live"/>
                <vm autostart="1" domain="fod_C" exclusive="0" name="kvm_gar-sv-mllnfs01" path="/etc/libvirt/qemu" recovery="restart" migrate="live"/>
                <vm autostart="1" domain="fod_C" exclusive="0" name="kvm_gar-sv-snanfs" path="/etc/libvirt/qemu" recovery="restart" migrate="live"/>
                <vm autostart="1" domain="fod_C" exclusive="0" name="kvm_gar-sv-sstrahl" path="/etc/libvirt/qemu" recovery="restart" migrate="live"/>
                <vm autostart="1" domain="fod_C" exclusive="0" name="kvm_gar-sv-mllnfs02" path="/etc/libvirt/qemu" recovery="restart" migrate="live"/>

        </rm>
</cluster>

Comment 15 David Teigland 2012-07-11 17:02:11 UTC
This looks like a ccs problem.

in fenced/agent.c dispatch_fence_agent() -> use_device() -> ccs_get()

The ccs_get function is often returning -ENODATA. I've tried endless variations of the cluster.conf parameters; sometimes it works, but often it doesn't. I haven't seen any pattern; it seems random. The behavior seems to be consistent while ccs is running, but it will sometimes change after the cluster is restarted or the machines are rebooted.


<?xml version="1.0"?>
<cluster config_version="77" name="gar-ha-cluster">
        <fence_daemon post_fail_delay="0" post_join_delay="3"/>
        <clusternodes>
        <clusternode name="dct-xen-01" nodeid="1" votes="1">
                <fence>
                <method name="1">
                <device name="xen01"/>
                </method>
                </fence>
        </clusternode>
        <clusternode name="dct-xen-02" nodeid="2" votes="1">
                <fence>
                <method name="1">
                <device name="impi-gar-ha-xen02"/>
                </method>
                </fence>
        </clusternode>
        <clusternode name="dct-xen-03" nodeid="3" votes="1">
                <fence>
                <method name="1">
                <device name="impi-gar-ha-xen03"/>
                </method>
                </fence>
        </clusternode>
        </clusternodes>
        <cman/>
        <fencedevices>
        <fencedevice agent="/bin/true" ipaddr="10.153.180.24" login="fence" name="xen01" lanplus="1" passwd="somepw"/>
        <fencedevice name="ipmi-gar-ha-xen02" agent="/bin/true" ipaddr="10.153.180.25" login="fence" lanplus="1" passwd="somepw"/>
        <fencedevice action="reboot" agent="/bin/true" auth="password" ipaddr="10.153.180.26" login="fence" name="ipmi-gar-ha-xen03" lanplus="1" passwd="somepw"/>
        </fencedevices>
        <rm>
        </rm>
</cluster>

[root@dct-xen-01 fence_node]# ccs_test get 2790 /cluster/fencedevices/fencedevice[@name=\"impi-gar-ha-xen03\"]/@agent
ccs_get failed: No data available
[root@dct-xen-01 fence_node]# ccs_test get 2790 /cluster/fencedevices/fencedevice[@name=\"impi-gar-ha-xen02\"]/@agent
ccs_get failed: No data available
[root@dct-xen-01 fence_node]# ccs_test get 2790 /cluster/fencedevices/fencedevice[@name=\"xen01\"]/@agent
Get successful.
 Value = </bin/true>


restart cluster with new cluster.conf


<?xml version="1.0"?>
<cluster config_version="77" name="gar-ha-cluster">
        <fence_daemon post_fail_delay="0" post_join_delay="3"/>
        <clusternodes>
        <clusternode name="dct-xen-01" nodeid="1" votes="1">
                <fence>
                <method name="1">
                <device name="xen-01"/>
                </method>
                </fence>
        </clusternode>
        <clusternode name="dct-xen-02" nodeid="2" votes="1">
                <fence>
                <method name="1">
                <device name="impi_gar_ha_xen02"/>
                </method>
                </fence>
        </clusternode>
        <clusternode name="dct-xen-03" nodeid="3" votes="1">
                <fence>
                <method name="1">
                <device name="impigarhaxen03"/>
                </method>
                </fence>
        </clusternode>
        </clusternodes>
        <cman/>
        <fencedevices>
        <fencedevice agent="/bin/true" ipaddr="10.153.180.24" login="fence" name="xen-01" lanplus="1" passwd="somepw"/>
        <fencedevice name="impi_gar_ha_xen02" agent="/bin/true" ipaddr="10.153.180.25" login="fence" lanplus="1" passwd="somepw"/>
        <fencedevice action="reboot" agent="/bin/true" auth="password" ipaddr="10.153.180.26" login="fence" name="impigarhaxen03" lanplus="1" passwd="somepw"/>
        </fencedevices>
        <rm>
        </rm>
</cluster>

[root@dct-xen-01 fence_node]# ccs_test get 271 /cluster/clusternodes/clusternode[@name=\"dct-xen-01\"]/@nodeid
Get successful.
 Value = <1>
[root@dct-xen-01 fence_node]# ccs_test get 271 /cluster/fencedevices/fencedevice[@name=\"xen-01\"]/@agent
Get successful.
 Value = </bin/true>
[root@dct-xen-01 fence_node]# ccs_test get 271 /cluster/fencedevices/fencedevice[@name=\"impi_gar_ha_xen02\"]/@agent
Get successful.
 Value = </bin/true>
[root@dct-xen-01 fence_node]# ccs_test get 271 /cluster/fencedevices/fencedevice[@name=\"impigarhaxen03\"]/@agent
Get successful.


restart cluster with new cluster.conf


<?xml version="1.0"?>
<cluster config_version="77" name="gar-ha-cluster">
        <fence_daemon post_fail_delay="0" post_join_delay="3"/>
        <clusternodes>
        <clusternode name="dct-xen-01" nodeid="1" votes="1">
                <fence>
                <method name="1">
                <device name="impi-gar-ha-xen01"/>
                </method>
                </fence>
        </clusternode>
        <clusternode name="dct-xen-02" nodeid="2" votes="1">
                <fence>
                <method name="1">
                <device name="impi-gar-ha-xen02"/>
                </method>
                </fence>
        </clusternode>
        <clusternode name="dct-xen-03" nodeid="3" votes="1">
                <fence>
                <method name="1">
                <device name="impi-gar-ha-xen03"/>
                </method>
                </fence>
        </clusternode>
        </clusternodes>
        <cman/>
        <fencedevices>
        <fencedevice agent="/bin/true" ipaddr="10.153.180.24" login="fence" name="impi-gar-ha-xen01" lanplus="1" passwd="somepw"/>
        <fencedevice name="impi-gar-ha-xen02" agent="/bin/true" ipaddr="10.153.180.25" login="fence" lanplus="1" passwd="somepw"/>
        <fencedevice action="reboot" agent="/bin/true" auth="password" ipaddr="10.153.180.26" login="fence" name="impi-gar-ha-xen03" lanplus="1" passwd="somepw"/>
        </fencedevices>
        <rm>
        </rm>
</cluster>

[root@dct-xen-01 fence_node]# ccs_test get 241 /cluster/fencedevices/fencedevice[@name=\"impi-gar-ha-xen01\"]/@agent
Get successful.
 Value = </bin/true>
[root@dct-xen-01 fence_node]# ccs_test get 241 /cluster/fencedevices/fencedevice[@name=\"impi-gar-ha-xen02\"]/@agent
Get successful.
 Value = </bin/true>
[root@dct-xen-01 fence_node]# ccs_test get 241 /cluster/fencedevices/fencedevice[@name=\"impi-gar-ha-xen03\"]/@agent
Get successful.
 Value = </bin/true>

Comment 16 Lon Hohberger 2012-07-11 17:36:43 UTC
(In reply to comment #15)
> This looks like a ccs problem.

Unfortunately, it looks like the tests were run with names that did not match what was in cluster.conf...


>         </clusternodes>
>         <cman/>
>         <fencedevices>
>         <fencedevice agent="/bin/true" ipaddr="10.153.180.24" login="fence"
> name="xen01" lanplus="1" passwd="somepw"/>

This one worked - because it matched what was in cluster.conf.

>         <fencedevice name="ipmi-gar-ha-xen02" agent="/bin/true"
                             ^^^^ ipmi

> ipaddr="10.153.180.25" login="fence" lanplus="1" passwd="somepw"/>
>         <fencedevice action="reboot" agent="/bin/true" auth="password"
> ipaddr="10.153.180.26" login="fence" name="ipmi-gar-ha-xen03" lanplus="1"
> passwd="somepw"/>
                                             ^^^^ ipmi


> 
> [root@dct-xen-01 fence_node]# ccs_test get 2790
> /cluster/fencedevices/fencedevice[@name=\"impi-gar-ha-xen03\"]/@agent
> ccs_get failed: No data available

impi-gar-ha-xen03 != ipmi-gar-ha-xen03
^^^^                 ^^^^

> [root@dct-xen-01 fence_node]# ccs_test get 2790
> /cluster/fencedevices/fencedevice[@name=\"impi-gar-ha-xen02\"]/@agent
> ccs_get failed: No data available

impi-gar-ha-xen02 != ipmi-gar-ha-xen02
^^^^                 ^^^^

Maybe a bad paste or something?


> [root@dct-xen-01 fence_node]# ccs_test get 2790
> /cluster/fencedevices/fencedevice[@name=\"xen01\"]/@agent
> Get successful.
>  Value = </bin/true>

No typo.


> restart cluster with new cluster.conf

>         <fencedevice agent="/bin/true" ipaddr="10.153.180.24" login="fence"
> name="xen-01" lanplus="1" passwd="somepw"/>
>         <fencedevice name="impi_gar_ha_xen02" agent="/bin/true"
> ipaddr="10.153.180.25" login="fence" lanplus="1" passwd="somepw"/>
>         <fencedevice action="reboot" agent="/bin/true" auth="password"
> ipaddr="10.153.180.26" login="fence" name="impigarhaxen03" lanplus="1"
> passwd="somepw"/>
>         </fencedevices>
>         <rm>
>         </rm>
> </cluster>
> 
> [root@dct-xen-01 fence_node]# ccs_test get 271
> /cluster/clusternodes/clusternode[@name=\"dct-xen-01\"]/@nodeid
> Get successful.
>  Value = <1>
> [root@dct-xen-01 fence_node]# ccs_test get 271
> /cluster/fencedevices/fencedevice[@name=\"xen-01\"]/@agent
> Get successful.
>  Value = </bin/true>
> [root@dct-xen-01 fence_node]# ccs_test get 271
> /cluster/fencedevices/fencedevice[@name=\"impi_gar_ha_xen02\"]/@agent
> Get successful.
>  Value = </bin/true>
> [root@dct-xen-01 fence_node]# ccs_test get 271
> /cluster/fencedevices/fencedevice[@name=\"impigarhaxen03\"]/@agent
> Get successful.

These all matched what was in cluster.conf. Note that "ipmi" in the names was changed to "impi" in this iteration of cluster.conf.


> restart cluster with new cluster.conf

>         <fencedevices>
>         <fencedevice agent="/bin/true" ipaddr="10.153.180.24" login="fence"
> name="impi-gar-ha-xen01" lanplus="1" passwd="somepw"/>
>         <fencedevice name="impi-gar-ha-xen02" agent="/bin/true"
> ipaddr="10.153.180.25" login="fence" lanplus="1" passwd="somepw"/>
>         <fencedevice action="reboot" agent="/bin/true" auth="password"
> ipaddr="10.153.180.26" login="fence" name="impi-gar-ha-xen03" lanplus="1"
                                             ^^^^  Now it's impi.


> passwd="somepw"/>
>         </fencedevices>
>         <rm>
>         </rm>
> </cluster>
> 
> [root@dct-xen-01 fence_node]# ccs_test get 241
> /cluster/fencedevices/fencedevice[@name=\"impi-gar-ha-xen01\"]/@agent
> Get successful.
>  Value = </bin/true>

This iteration of cluster.conf uses "impi" instead of "ipmi", so these tests started working.


> [root@dct-xen-01 fence_node]# ccs_test get 241
> /cluster/fencedevices/fencedevice[@name=\"impi-gar-ha-xen02\"]/@agent
> Get successful.
>  Value = </bin/true>
> [root@dct-xen-01 fence_node]# ccs_test get 241
> /cluster/fencedevices/fencedevice[@name=\"impi-gar-ha-xen03\"]/@agent
> Get successful.
>  Value = </bin/true>

Using what's in cluster.conf, I can't get this to fail.

Comment 18 Lon Hohberger 2012-07-11 17:50:37 UTC
Klaus, that's what's wrong with your config.

                <fencedevice agent="fence_ipmilan" ipaddr="10.153.180.24" login="fence" name="ipmi-gar-ha-xen01" lanplus="1" passwd="somepw"/>
                <fencedevice agent="fence_ipmilan" ipaddr="10.153.180.25" login="fence" name="ipmi-gar-ha-xen02" lanplus="1" passwd="somepw"/>
                <fencedevice action="reboot" agent="fence_ipmilan" auth="password" ipaddr="10.153.180.26" login="fence" name="ipmi-gar-ha-xen03" lanplus="1" passwd="somepw"/>

All of the fencedevice names above use "ipmi-..."

... but the device lines are wrong:

                <device name="impi-gar-ha-xen01"/>
                <device name="impi-gar-ha-xen02"/>
                <device name="impi-gar-ha-xen03"/>

ipmi and impi are different.
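
(For reference, a minimal sketch of a sanity check for this kind of mismatch; it lists the device names referenced under each clusternode and the names defined in the fencedevices section so the two sets can be compared by eye. Plain sed/sort on the standard /etc/cluster/cluster.conf path; any name that appears in the first list but not the second is a typo like impi vs. ipmi above.)

# names referenced by <device .../> entries
sed -n 's/.*<device name="\([^"]*\)".*/\1/p' /etc/cluster/cluster.conf | sort -u

# names defined by <fencedevice .../> entries
sed -n 's/.*<fencedevice.*name="\([^"]*\)".*/\1/p' /etc/cluster/cluster.conf | sort -u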

Comment 19 Lon Hohberger 2012-07-11 18:11:07 UTC
(In reply to comment #18)
> Klaus, that's what's wrong with your config.

Well, at least with the one in comment #14...

Comment 20 Klaus Steinberger 2012-07-12 07:25:43 UTC
(In reply to comment #18)
> Klaus, that's what's wrong with your config.


Aiiiiiii!

The biggest blind spot is always right in front of your own nose!

Many thanks!

It works now.

Comment 21 Ryan McCabe 2012-07-20 14:12:17 UTC
*** Bug 822104 has been marked as a duplicate of this bug. ***