Bug 804805

Summary: fence_node -U fences instead of unfencing
Product: Red Hat Enterprise Linux 6
Reporter: Kapetanakis Giannis <bilias>
Component: fence-agents
Assignee: Marek Grac <mgrac>
Status: CLOSED ERRATA
QA Contact: Cluster QE <mspqa-list>
Severity: high
Priority: medium
Version: 6.2
CC: ccaulfie, cluster-maint, djansa, fdinitto, lhh, mjuricek, rpeterso, teigland
Target Milestone: rc
Hardware: powerpc
OS: Unspecified
Fixed In Version: 3.2.5-17
Doc Type: Bug Fix
Last Closed: 2012-06-20 14:40:31 UTC

Description Kapetanakis Giannis 2012-03-19 19:54:02 UTC
Description of problem:
When /etc/init.d/cman starts, it calls unfence_self(),
which runs fence_node -U.
With my fence_brocade device, at least, this fences the host instead of unfencing it.

s2# fence_node -U
unfence s2.example.com success

Mar 19 21:29:30 s2 kernel: qla2xxx 0000:05:00.0: LOOP DOWN detected (2 3 0 0).
Mar 19 21:29:40 s2 fence_node[28391]: unfence s2.example.com success

It actually disables both ports (6,7) instead of enabling them...
Same happens when I do /etc/init.d/cman start or when the machine boots.

This is my configuration:
2.6.32-220.7.1.el6.x86_64
cman-3.0.12.1-23.el6.x86_64

<clusternode name="s2.example.com" nodeid="2">
    <fence>
       <method name="s2_san">
            <device name="san" port="6"/>
            <device name="san" port="7"/>
       </method>
       <method name="s2_drac">
            <device name="fence_drac_s2"/>
       </method>
    </fence>
    <unfence>
       <device action="enable" name="san" port="6"/>
       <device action="enable" name="san" port="7"/>
    </unfence>
</clusternode>

My fence device (Brocade 300):
    <fencedevice agent="fence_brocade" ipaddr="10.0.0.1" login="username" name="san" passwd="pass"/>

From the command line, the fence agent works fine:
# fence_brocade -l username -p password -a 10.0.0.1 -o enable/disable -n 6/7

Furthermore, I believe the call to fence_node -U in /etc/init.d/cman should happen before qdisk is started.
The qdisk device is on the SAN, so it is not available until the ports are enabled.

regards,

Giannis

Comment 2 Kapetanakis Giannis 2012-03-20 09:57:59 UTC
I've added debug actions in /usr/sbin/fence_brocade today
and when I'm doing
/etc/init.d/cman start
I get:

Tue Mar 20 11:19:11 2012
success: portdisable 6
Tue Mar 20 11:19:18 2012
success: portdisable 7

So it's actually doing portdisable instead of portenable.

Giannis

Comment 3 David Teigland 2012-03-20 14:09:18 UTC
We "standardized" on "action=" long ago, but it appears fence_brocade was never updated.  It still only uses "option=".  If you try option="enable" it may work.
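
To illustrate the mismatch described above, here is a minimal Python sketch of how an agent that reads key=value options on stdin could accept both attribute names. This is not the actual fence_brocade code: the function names are hypothetical, and the "disable" default is an assumption, chosen only because it would be consistent with the portdisable output reported in comment 2.

```python
import io


def parse_stdin_options(stream):
    """Parse key=value lines into a dict, skipping blanks and comments."""
    params = {}
    for raw in stream:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # ignore lines that are not key=value
        params[key.strip()] = value.strip()
    return params


def legacy_resolve_action(params, default="disable"):
    """Hypothetical old behavior: only 'option' is consulted."""
    return params.get("option", default)


def resolve_action(params, default="disable"):
    """Hypothetical fixed behavior: prefer 'action', fall back to 'option'."""
    if "action" in params:
        return params["action"]
    if "option" in params:
        return params["option"]
    return default


if __name__ == "__main__":
    # Options roughly as they appear in the <unfence> section of this report.
    sample = io.StringIO("ipaddr=10.0.0.1\nport=6\naction=enable\n")
    params = parse_stdin_options(sample)
    print("legacy:", legacy_resolve_action(params))
    print("fixed: ", resolve_action(params))
```

With only action="enable" set, the legacy lookup never sees the requested action and falls through to its default, which is the kind of behavior the reporter observed.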

WRT qdisk, I'd first check whether you really need qdisk at all; it's best not to use it.

Comment 4 Kapetanakis Giannis 2012-03-20 15:28:28 UTC
I've added both (just in case it gets updated) and now it works fine :)

<unfence>
   <device action="enable" option="enable" name="san" port="6"/>
   <device action="enable" option="enable" name="san" port="7"/>
</unfence>
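
For anyone applying the same workaround to a larger configuration, the following is a small Python sketch that lists <device> entries under <unfence> which set action= but not option=. The function name and the sample XML are made up for illustration, and the check only reflects the workaround in this comment, not any official validation rule.

```python
import xml.etree.ElementTree as ET

# Sample fragment modeled on the configuration in this report (illustrative only).
SAMPLE = """
<clusternode name="s2.example.com" nodeid="2">
    <unfence>
       <device action="enable" name="san" port="6"/>
       <device action="enable" option="enable" name="san" port="7"/>
    </unfence>
</clusternode>
"""


def devices_missing_option(xml_text):
    """Return (name, port) for unfence devices with 'action' but no 'option'."""
    root = ET.fromstring(xml_text.strip())
    missing = []
    for dev in root.findall(".//unfence/device"):
        if "action" in dev.attrib and "option" not in dev.attrib:
            missing.append((dev.get("name"), dev.get("port")))
    return missing


if __name__ == "__main__":
    for name, port in devices_missing_option(SAMPLE):
        print(f"device {name} port {port}: action= set without option=")
```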

I'm using a two-node cluster, so I thought I should try qdisk. I've read that it helps...

Anyway, shouldn't unfencing be performed prior to qdisk initialization?

Thanks

Giannis

Comment 5 Fabio Massimo Di Nitto 2012-03-26 12:57:46 UTC
(In reply to comment #4)

> I'm using two-node-cluster so I thought I should try qdisk. I've read that it
> helps...
> 
> Anyway shouldn't unfencing be performed prior to qdisk initialization?

It is best to avoid qdisk in this scenario and use cman two_node + fence delay, but it also depends on many other configuration bits. Please submit a GSS ticket for an architecture review and we will be able to provide more information.

Comment 6 Fabio Massimo Di Nitto 2012-03-26 12:59:36 UTC
Marek, can we fix fence_brocade to behave consistently?

Comment 14 errata-xmlrpc 2012-06-20 14:40:31 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2012-0943.html