Bug 489809 - Broken device detection for DRAC3 ERA/O in fence_drac
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: cman
Version: 5.4
Hardware: All Linux
Priority: low
Severity: medium
Assigned To: Marek Grac
QA Contact: Cluster QE
Keywords: OtherQA

Reported: 2009-03-11 19:12 EDT by Gordan Bobic
Modified: 2016-04-26 09:45 EDT
Fixed In Version: cman-2.0.115-18.el5
Doc Type: Bug Fix
Last Closed: 2010-03-30 04:41:13 EDT

Attachments
Patch to fix operation of fence_drac on Dell embedded DRAC3 ERA/O cards (449 bytes, patch)
2009-03-11 19:12 EDT, Gordan Bobic

Description Gordan Bobic 2009-03-11 19:12:39 EDT
Created attachment 334870
Patch to fix operation of fence_drac on Dell embedded DRAC3 ERA/O cards

Description of problem:

The fencing agent for the Dell Remote Access Controller (DRAC) ERA/O (a DRAC3 variant) has broken device detection. To identify the device, the agent matches its output against the following regular expression:

/Dell Embedded Remote Access Controller \(ERA\)\nFirmware Version/

The actual device string (with the latest firmware) is:

Dell Embedded Remote Access Controller (ERA/O)
Firmware Version 3.37 (Build 08.13)

Thus, the regular expression match should be:
/Dell Embedded Remote Access Controller \(ERA\/O\)\nFirmware Version/
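
The mismatch can be reproduced outside the agent. The following is a minimal sketch in Python (the expressions quoted above are in Perl syntax; the Python transcription and the helper name `detects` are assumptions made for illustration, not code from fence_drac). It checks both patterns against the banner quoted in this report:

```python
import re

# Banner text as quoted in this report for a DRAC3 ERA/O card.
ERA_O_BANNER = (
    "Dell Embedded Remote Access Controller (ERA/O)\n"
    "Firmware Version 3.37 (Build 08.13)\n"
)

# Pattern the agent currently looks for, transcribed from the report.
OLD_PATTERN = re.compile(
    r"Dell Embedded Remote Access Controller \(ERA\)\nFirmware Version"
)

# Pattern proposed in the attached patch, transcribed from the report.
PROPOSED_PATTERN = re.compile(
    r"Dell Embedded Remote Access Controller \(ERA/O\)\nFirmware Version"
)


def detects(pattern, banner):
    """Hypothetical helper: True if the pattern occurs anywhere in the banner."""
    return pattern.search(banner) is not None


old_result = detects(OLD_PATTERN, ERA_O_BANNER)
new_result = detects(PROPOSED_PATTERN, ERA_O_BANNER)

print("old pattern matches ERA/O banner:     ", old_result)
print("proposed pattern matches ERA/O banner:", new_result)
```

The old pattern requires a closing parenthesis immediately after `ERA`, so the `/O` suffix in the banner prevents any match.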

Version-Release number of selected component (if applicable):
All versions of cman up to and including 2.0.98 in RHEL 5.3

How reproducible:

100%

Steps to Reproduce:
1. Set up a cluster of nodes, one of which has a DRAC3 ERA/O in it (e.g. Dell PowerEdge 1650)
2. Pull the plug on the DRAC3 node.
3. Cluster will hang. The surviving nodes will try to fence using DRAC but won't be able to identify the DRAC card, and will keep failing.

Actual results:
Cluster hangs indefinitely waiting for the node to get fenced.

Expected results:
Node gets fenced and cluster resumes operation.

Additional info:
Patch to fix this is attached.
Comment 1 Marek Grac 2009-03-20 07:50:29 EDT
Thanks for the patch.

However, could you help us write a new fence agent (fence_drac5.py) that also supports your device? I will try to write it (based on the old agent), but I don't have a device to test it on. I believe we can do it in 2-3 iterations (I will only need the verbose output).
Comment 2 Gordan Bobic 2009-03-20 08:06:27 EDT
Sure, I'll be happy to test it for you and forward any output back to you. Please email me the instructions.
Comment 6 Gordan Bobic 2009-11-08 20:06:04 EST
I see this patch hasn't made it into RHEL5.4 (cman-2.0.115-1.el5_4.3). Is it likely to get pushed out any time soon? The current fence_drac agent completely fails to work on the DRAC 3 ERA/O management modules without the provided patch.
Comment 7 Marek Grac 2009-11-09 09:47:54 EST
The patch has been changed so that it does not break backward compatibility.

--
if (/Dell Embedded Remote Access Controller \(ERA(\/O)?\)\nFirmware Version/m)

--
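
As a rough illustration of why the optional group keeps older cards working, here is a minimal Python sketch of the same expression (the agent's expression above is Perl; the function name `is_era_banner`, the plain ERA banner with its placeholder firmware version, and the negative sample are all hypothetical and used only for this example):

```python
import re

# Python transcription of the revised expression: "(/O)?" makes the
# "/O" suffix optional, so both "(ERA)" and "(ERA/O)" are accepted.
COMPAT_PATTERN = re.compile(
    r"Dell Embedded Remote Access Controller \(ERA(/O)?\)\nFirmware Version",
    re.MULTILINE,
)


def is_era_banner(text):
    """Hypothetical helper: True if the text contains an ERA or ERA/O banner."""
    return COMPAT_PATTERN.search(text) is not None


# Banner quoted in this report.
era_o_banner = (
    "Dell Embedded Remote Access Controller (ERA/O)\n"
    "Firmware Version 3.37 (Build 08.13)\n"
)

# Assumed banner for an older ERA card, built from the original pattern;
# the firmware version is a placeholder, not real device output.
era_banner = (
    "Dell Embedded Remote Access Controller (ERA)\n"
    "Firmware Version x.y\n"
)

# Made-up string that should not be recognised.
other_banner = "Some Other Controller (XYZ)\nFirmware Version x.y\n"

for name, banner in [("ERA/O", era_o_banner), ("ERA", era_banner), ("other", other_banner)]:
    print(name, "->", is_era_banner(banner))
```

A pattern that required `/O` unconditionally would fix ERA/O cards but stop matching the plain `(ERA)` banner that the original expression was written for.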

If possible, please try the test build: cman-2.0.115-18.el5
Comment 8 Gordan Bobic 2009-11-09 12:20:22 EST
Gladly, where can I get the new package?
Comment 9 Marek Grac 2009-11-11 06:44:26 EST
Sure, 

http://marx.fedorapeople.org/cman-2.0.115-18.el5.src.rpm
Comment 13 Marek Grac 2010-02-24 10:18:18 EST
@Gordan: Can you please test the new package and send the results?
Comment 14 Jaroslav Kortus 2010-03-15 13:13:17 EDT
Any feedback on this yet?
Comment 15 Gordan Bobic 2010-03-15 17:01:47 EDT
Sorry, forgot to get back to you about this. The updated version linked above has been working absolutely fine for me.
Comment 16 Jaroslav Kortus 2010-03-16 05:54:51 EDT
Thank you.
Comment 18 errata-xmlrpc 2010-03-30 04:41:13 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2010-0266.html
