Bug 1323547 - [RFE] resource agent request to monitor FC HBA or multipathd
Summary: [RFE] resource agent request to monitor FC HBA or multipathd
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: resource-agents
Version: 7.2
Hardware: All
OS: All
Priority: low
Severity: low
Target Milestone: rc
Assignee: Oyvind Albrigtsen
QA Contact: cluster-qe@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2016-04-04 02:46 UTC by jajeon
Modified: 2016-05-25 15:00 UTC
CC: 6 users

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-05-25 15:00:48 UTC
Target Upstream Version:


Attachments

Description jajeon 2016-04-04 02:46:33 UTC
Currently there seems to be no resource agent available that can monitor FC HBA status (such as link failure, HBA card failure, etc.).
Similar to "ethmonitor", a resource agent that can monitor the FC HBA would be useful on large systems.

There are already LVM2 or Filesystem resources that provide similar functionality, but they do not monitor the FC HBA itself.

Hence, requesting a resource agent that can monitor the FC HBA.
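(For illustration only, a hypothetical sketch of what such an agent's monitor action could check, not code from any shipped agent: the kernel exposes FC HBA link state under /sys/class/fc_host/hostN/port_state, and a monitor analogous to "ethmonitor" could treat any state other than "Online" as a failure. The function and path names here are assumptions for the sketch.)

```python
import glob

def read_fc_states(pattern="/sys/class/fc_host/host*/port_state"):
    """Read each FC host's port_state from sysfs, e.g. {'host3': 'Online'}."""
    states = {}
    for path in glob.glob(pattern):
        host = path.split("/")[-2]            # e.g. "host3"
        with open(path) as f:
            states[host] = f.read().strip()   # e.g. "Online", "Linkdown"
    return states

def hba_healthy(states):
    """Monitor decision: healthy only if at least one HBA exists and
    every HBA reports 'Online'."""
    return bool(states) and all(s == "Online" for s in states.values())
```

A real agent would map the False case to a monitor failure so Pacemaker could react, the same way ethmonitor updates a node attribute on NIC link loss.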

Comment 3 John Ruemker 2016-04-05 15:23:06 UTC
>> There are already LVM2 or Filesystem resources that provide similar functionality, but they do not monitor the FC HBA itself.

Those resources would be indirectly monitoring the status of the FC links.  If your link to the storage device is disrupted in some way, then any I/O issued to the device should return an error -- possibly after some amount of waiting for a timeout, or possibly immediately if there is a clear error condition -- but in either case you should get an error.  If you have multiple links aggregated under some sort of multipath device, then multipathd will be monitoring the success/failure of I/O issued over the individual links and will take action to reroute failed I/Os when needed, thereby lessening the chance that any such I/O will fail entirely.  If you get to a point where all paths to a device have failed, then I/O to that device should either receive an error in response, or be queued (block) until a path returns and the I/O succeeds.

The point is: if something goes wrong with your storage links, the LVM and Filesystem resource agents, as well as other application agents, should all be able to detect it through their regular monitoring that issues I/O to these devices, and this usually makes any direct FC-HBA monitoring unnecessary.  If you have an application running on top of the LV or filesystem and that app is managed by the cluster, then I/O errors that make their way back up to that application and cause it to fail may also produce an error when the resource agent for that app performs a monitor.
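To make the indirect-monitoring point concrete, here is a minimal sketch (an illustration, not the actual implementation of any agent) of the kind of I/O probe a monitor operation amounts to: attempt a small read against the device and treat any OSError (EIO, ENXIO, a vanished node) as a failed monitor.

```python
import os

def probe_device(path, length=512):
    """Issue a small read against a block device or file.
    Returns True if the I/O completes, False on any OS-level error,
    which is how storage-path failure surfaces to a monitor."""
    try:
        fd = os.open(path, os.O_RDONLY)
        try:
            os.read(fd, length)
        finally:
            os.close(fd)
        return True
    except OSError:
        return False
```

Note that with multipath in queue_if_no_path mode the read would block rather than fail, which is why agents wrap such probes in a timeout.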

So, while there may be a use case out there where direct FC-HBA monitoring is useful, it's not immediately obvious what that use case is.  Citing "LVM or Filesystem can do similar" as justification for needing another agent to "fail over resources" on an FC-link failure ignores the fact that LVM and Filesystem already achieve this through their own monitoring.  As such, it would be great if we could get more detail about what you or the customer feel is not covered by the current offerings, or what they would like to achieve that they cannot already.

Also, it would be great if we could discuss this in a support case with the customer directly, so we are clearly understanding their needs and can communicate recommendations or status back to them.

Thanks,
John

Comment 4 John Ruemker 2016-04-14 14:10:32 UTC
From email, the target use case is for environments directly using a LUN as a raw/direct-access device without any file system or LVM volumes on it.

~~~
Simply put, the use case is to monitor the FC HBA itself, not to fail over a Filesystem or LVM resource that lives on top of the SAN environment.

Refer to "Use case 4" at the URL below, which describes the customer's exact scenario:
"
http://www.novell.com/docrep/2012/01/sap_on_sle_simple_stack.PDF
2.5 Use Case 4 “Enqueue Replication High Availability External Database”
"
For this case, using Filesystem or LVM requires extra resources, such as configuring GFS, or requires an additional LUN.
~~~

