Bug 1577530 - stonith:fence_ipmilan
Summary: stonith:fence_ipmilan
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: pacemaker
Version: 7.4
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: rc
Target Release: ---
Assignee: Ken Gaillot
QA Contact: cluster-qe@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-05-12 19:13 UTC by Taoufik07
Modified: 2018-05-14 14:43 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-05-14 14:43:15 UTC
Target Upstream Version:



Description Taoufik07 2018-05-12 19:13:31 UTC
Description of problem:

fence_cms1     (stonith:fence_ipmilan):        Started cms2
 fence_cms2     (stonith:fence_ipmilan):        Started cms1

Failed Actions:
* fence_cms2_start_0 on cms2 'unknown error' (1): call=98, status=Timed Out, exitreason='none',
    last-rc-change='Sat May 12 21:29:15 2018', queued=0ms, exec=20005ms


Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

I get this warning, but my resource started correctly.

Version-Release number of selected component (if applicable):


pacemaker-1.1.16-12.el7.x86_64
pacemaker-cli-1.1.16-12.el7.x86_64
pacemaker-cluster-libs-1.1.16-12.el7.x86_64
pacemaker-libs-1.1.16-12.el7.x86_64


corosynclib-2.4.0-9.el7.x86_64
corosync-2.4.0-9.el7.x86_64
corosync-qnetd-2.4.0-9.el7.x86_64
corosynclib-devel-2.4.0-9.el7.x86_64
corosync-qdevice-2.4.0-9.el7.x86_64


Red Hat Enterprise Linux Server release 7.4 (Maipo)


Steps to Reproduce:
1. I create a fence_ipmilan resource for the first node; the result is success.
2. I create a fence_ipmilan resource for the second node; the result is success.

But I get a warning:

Failed Actions:
* fence_cms2_start_0 on cms2 'unknown error' (1): call=98, status=Timed Out, exitreason='none',
    last-rc-change='Sat May 12 21:29:15 2018', queued=0ms, exec=20005ms

I then ran: pcs stonith update fence_cms2 power_timeout=60

Actual results:
Failed Actions:
* fence_cms2_start_0 on cms2 'unknown error' (1): call=98, status=Timed Out, exitreason='none',
    last-rc-change='Sat May 12 21:29:15 2018', queued=0ms, exec=20005ms

Expected results:


Additional info:

I have an HP Gen10 server with iLO 5.
I enabled IPMI over LAN in the BIOS.

Comment 2 Ken Gaillot 2018-05-14 14:28:00 UTC
The "Failed Actions" section of the status display shows all past failures. This particular message indicates that the cluster timed out trying to contact this fence device. As a high availability platform, the cluster will automatically recover from errors when possible, so it was able to successfully start the device on another try.
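To make the fields of that message easier to read, here is a minimal Python sketch that splits the failed-action line from this report into named parts. The pattern and the `parse_failed_action` helper are hypothetical, written only for this one line format, and are not part of pacemaker or pcs. The `exec=20005ms` value is just over 20 seconds, which would be consistent with a 20-second operation timeout; that is an assumption, since the configured timeout is not shown in this report.

```python
import re

# Failed-action line copied from the status output in this report
# (the two physical lines are joined with a single space).
LINE = ("* fence_cms2_start_0 on cms2 'unknown error' (1): call=98, "
        "status=Timed Out, exitreason='none', "
        "last-rc-change='Sat May 12 21:29:15 2018', queued=0ms, exec=20005ms")

# Hypothetical pattern for this specific line layout only.
PATTERN = re.compile(
    r"\* (?P<resource>\S+)_(?P<action>[a-z]+)_(?P<interval>\d+) on (?P<node>\S+) "
    r"'(?P<rc_text>[^']*)' \((?P<rc>\d+)\): call=(?P<call>\d+), "
    r"status=(?P<status>[^,]+), exitreason='(?P<exitreason>[^']*)',\s+"
    r"last-rc-change='(?P<last_rc_change>[^']*)', "
    r"queued=(?P<queued_ms>\d+)ms, exec=(?P<exec_ms>\d+)ms"
)


def parse_failed_action(line):
    """Return the fields of one failed-action line as a dict, or None."""
    match = PATTERN.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the numeric fields from strings to integers.
    for key in ("interval", "rc", "call", "queued_ms", "exec_ms"):
        fields[key] = int(fields[key])
    return fields


if __name__ == "__main__":
    info = parse_failed_action(LINE)
    # Show which operation failed, where, how, and how long it ran.
    print(info["resource"], info["action"], info["node"],
          info["status"], info["exec_ms"])
```

Read this way, the line says that the start action of fence_cms2 on node cms2 ended with status Timed Out after running for 20005 ms.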

You can clear the message with "pcs resource cleanup fence_cms2".

The root cause of the timeout itself is unlikely to be related to the pacemaker component, so resolving it will require a support case, which can look at the wider environment and how components work together, rather than here in bugzilla, which focuses on bugs in a single software package.

You can initiate a case with Red Hat's Global Support Services group through one of the methods listed at the following link:

  https://access.redhat.com/start/how-to-engage-red-hat-support

From there, we'll collect some additional information from you and take a
closer look at the specifics of this incident to help you resolve the
underlying problem.

Comment 3 Taoufik07 2018-05-14 14:36:36 UTC
I connected to the Red Hat server and increased the timeout,
but it still did not work after cleaning up the resource.
Then I connected to my iLO and changed the timeout from 30s to 120s;
after cleaning up my resource, it is working.

Many thanks

Comment 4 Ken Gaillot 2018-05-14 14:43:15 UTC
Great, that's good to hear :-)

