Bug 608500

Summary: fence_ilo intermittently leaves server offline.
Product: [Retired] Red Hat Cluster Suite
Reporter: Jason Nelson <jason.nelson>
Component: fence
Assignee: Marek Grac <mgrac>
Status: CLOSED DUPLICATE
QA Contact: Cluster QE <mspqa-list>
Severity: medium
Docs Contact:
Priority: low
Version: 4
CC: cluster-maint, edamato
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2010-07-14 12:54:33 UTC
Type: ---

Description Jason Nelson 2010-06-27 21:18:07 UTC
Description of problem:

When fence_ilo fires, the server powers off but then stays powered off instead of powering back on.

Version-Release number of selected component (if applicable):

This has been seen as recently as the version of fenced deployed with 5.4.


How reproducible:

We're seeing this on DL385 and DL385G2 servers. We have not yet tested on the DL580s we have.


Steps to Reproduce:
1.  Set up a working cluster.
2.  Trigger a kernel panic on the active node.
3.  Fencing powers off the panicked node but does not power it back on.
  
Actual results:

The system is left powered off. This can be verified by connecting to the iLO via telnet/ssh and viewing the power status.

Expected results:

Server should power back on after power is turned off.

Additional info:

We've been resolving this on our end by editing the /sbin/fence_ilo script to do:

poweroff
poweron
poweron

Running poweron twice has fixed it. I have not tested putting a delay between off and on.
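
For illustration only, the off/on/on sequence above could be sketched in Python roughly as follows. This is not the actual fence_ilo code: power_off, power_on, and get_status are hypothetical callables standing in for whatever the agent sends to the iLO, and the retry count and delay are assumptions. Unlike the edit described above, this variant checks the power status after each power-on and waits briefly before each attempt, which has not been tested here.

```python
import time
from typing import Callable


def power_cycle(
    power_off: Callable[[], None],
    power_on: Callable[[], None],
    get_status: Callable[[], str],
    on_attempts: int = 2,          # assumption: mirrors the "poweron twice" edit
    delay_seconds: float = 1.0,    # assumption: untested delay between off and on
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Power off, then issue power-on until the status reads "on".

    Returns True if the node reports "on" within on_attempts tries,
    False if it is still off after the last attempt.
    """
    power_off()
    for _ in range(on_attempts):
        sleep(delay_seconds)       # give the management processor time to settle
        power_on()
        if get_status() == "on":
            return True
    return False
```

A caller would treat a False return as a failed fence operation rather than silently leaving the node powered off.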

BTW, nice meeting the team at Summit.  Like I said, we'll try to keep you in the loop on the bugs we see and work around in RHCS.

-Jason Nelson
Lead Linux Engineer
Rackspace Hosting

Comment 1 Marek Grac 2010-06-28 12:44:53 UTC
Can you take a look at bug #545682? I believe it is the same problem.

Comment 2 Jason Nelson 2010-06-28 20:03:00 UTC
Ah, it does look like a duplicate.  Is this going to be backported to earlier versions of RHEL 4/5?


Comment 3 Marek Grac 2010-07-14 12:54:33 UTC
@Jason:

Do you mean:
* the 4.9 update? (yes)
* z-stream? (yes, already in 4.8.z)
* RHEL 5/6? (yes; not cloned from the same bug, because timing options were added for all fence agents)

*** This bug has been marked as a duplicate of bug 545682 ***