Bug 450694

Summary: failover loop with IBM Power Raid HA-RAID and preferred primary
Product: Red Hat Enterprise Linux 5
Component: iprutils
Version: 5.2
Hardware: All
OS: Linux
Status: CLOSED ERRATA
Severity: low
Priority: low
Target Milestone: rc
Target Release: ---
Keywords: OtherQA
Reporter: Bryn M. Reeves <bmr>
Assignee: Roman Rakus <rrakus>
CC: aparanja, borgan, cward, jlaska, tao, tsmetana
Doc Type: Bug Fix
Last Closed: 2009-01-20 20:33:55 UTC
Attachments:
  Diff of changes between iprutils 2.2.8 & 2.2.9 (flags: none)
  Backported fix (flags: none)

Description Bryn M. Reeves 2008-06-10 14:08:07 UTC
Description of problem:
The ipr_init command, an IPR user-space daemon, saves the adapter configuration
set by the ipr_config utility so that the adapter state can be restored
if a bad adapter needs to be replaced.

Unlike other adapter attributes, the preferred primary attribute can be changed
autonomously by the adapter firmware. When ipr_init finds that the preferred
primary attribute does not match the value set by ipr_config, it restores the
attribute value.

In an HA configuration with two systems, when ipr_init on host#1 restores the
preferred primary attribute, it causes a failover and changes the
preferred primary attribute on the adapter in host#2. When ipr_init on
host#2 detects that the preferred primary attribute has changed and restores
it, a failover happens again. The preferred primary attribute then
ping-pongs between the two hosts, causing an infinite failover loop.
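
The loop described above can be illustrated with a small model. This is only a sketch, not iprutils code: the `Adapter` class, `restore_saved_config`, and `simulate` are hypothetical names, and the firmware is assumed to clear the attribute on the peer whenever one adapter claims it.

```python
from dataclasses import dataclass


@dataclass
class Adapter:
    name: str
    preferred_primary: bool = False
    peer: "Adapter | None" = None

    def set_preferred_primary(self) -> None:
        # Assumed firmware behavior: claiming the attribute on one adapter
        # triggers a failover and clears the attribute on the peer.
        self.preferred_primary = True
        if self.peer is not None:
            self.peer.preferred_primary = False


def restore_saved_config(adapter: Adapter, saved_preferred: bool) -> bool:
    """Model of the reported behavior: force the saved value back.

    Returns True if the restore triggered a failover.
    """
    if saved_preferred and not adapter.preferred_primary:
        adapter.set_preferred_primary()
        return True
    return False


def simulate(rounds: int) -> int:
    """Run the daemon on both hosts for `rounds` passes; count failovers."""
    a = Adapter("host1")
    b = Adapter("host2")
    a.peer, b.peer = b, a
    a.set_preferred_primary()  # host1 starts out as preferred primary

    # Both hosts have "preferred primary" recorded in their saved config.
    saved = {"host1": True, "host2": True}

    failovers = 0
    for _ in range(rounds):
        for adapter in (a, b):
            if restore_saved_config(adapter, saved[adapter.name]):
                failovers += 1
    return failovers
```

In this model the failover count keeps growing with the number of passes and never settles, which matches the looping behavior reported here.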

Since the preferred primary attribute is not solely controlled by ipr_config,
ipr_init should not restore its value. Instead, it should always re-read it
from the adapter. The ipr_init code in iprutils will be changed to behave
this way.
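
A minimal sketch of the proposed behavior, assuming the saved configuration and the adapter state can each be viewed as a simple name-to-value mapping. The function `sync_config` and the attribute names `preferred_primary` and `queue_depth` are hypothetical and are not taken from the iprutils source.

```python
# Attributes that the firmware may change on its own. These are refreshed
# from the adapter rather than written back to it. (Hypothetical name.)
REREAD_FROM_ADAPTER = {"preferred_primary"}


def sync_config(saved: dict, adapter: dict) -> list:
    """Reconcile the saved config with the adapter state.

    Returns the names of the attributes that were written to the adapter.
    """
    written = []
    for name, saved_value in saved.items():
        if name in REREAD_FROM_ADAPTER:
            # Never restore: adopt whatever the adapter currently reports.
            saved[name] = adapter.get(name, saved_value)
            continue
        if adapter.get(name) != saved_value:
            # Ordinary attribute: restore the saved value as before.
            adapter[name] = saved_value
            written.append(name)
    return written
```

With this policy, a daemon that finds the preferred primary attribute changed updates its own record instead of writing to the adapter, so it has no reason to trigger a failover.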

Version-Release number of selected component (if applicable):
2.2.8

How reproducible:
100%

Steps to Reproduce:
1. Requires two separate partitions, each equipped with a SAS IOA adapter
(572A cadet or 572B squib), configured in a Dual Controller HA environment.
2. On partition A, use iprconfig option 7 (work with adapters) to set the SAS
adapter as the preferred primary. Wait a few minutes to allow any failover to
take place.
3. On partition B, repeat the step to set its SAS IOA adapter to be
the preferred primary. Both are now set as the preferred primary.

Actual results:
The systems will continually fail over to each other, as each IOA tries to assert
its preferred primary status.

Expected results:
No failover loop.

Additional info:
This can be triggered in practice, for example when a failed card (that had the
preferred primary attribute set) is replaced. Since the firmware will have moved
the attribute to the other host's adapter, replacing the adapter and having
ipr_init restore the attribute on the replaced adapter will trigger the same
looping failover behavior.

This is fixed upstream in iprutils 2.2.9.

Comment 1 Bryn M. Reeves 2008-06-10 14:13:47 UTC
Created attachment 308816 [details]
Diff of changes between iprutils 2.2.8 & 2.2.9

All changes between iprutils 2.2.8 & 2.2.9 (the fix for this bug is the only
functional change included)

Comment 2 Bryn M. Reeves 2008-06-10 14:19:10 UTC
Created attachment 308818 [details]
Backported fix

This patch only includes the functional changes required for this bug. Applies
cleanly to RHEL5's iprutils.

Comment 10 Brock Organ 2008-11-14 19:36:53 UTC
It looks like this issue may be resolved in iprutils-2.2.8-2.el5.ppc.rpm, available in the RHEL 5.3 public beta or later test trees. Can you please verify that this issue is successfully resolved?

Comment 14 errata-xmlrpc 2009-01-20 20:33:55 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-0065.html