Bug 1854675 - [RFE] a pacemaker agent that detects if a volume is lost or becomes inaccessible
Summary: [RFE] a pacemaker agent that detects if a volume is lost or becomes inaccessible
Keywords:
Status: CLOSED DUPLICATE of bug 2055299
Alias: None
Product: Red Hat Enterprise Linux 8
Classification: Red Hat
Component: resource-agents
Version: 8.2
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: 8.4
Assignee: Oyvind Albrigtsen
QA Contact: cluster-qe@redhat.com
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-07-08 00:38 UTC by Seunghwan Jung
Modified: 2023-08-02 13:55 UTC (History)
CC: 7 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-08-02 13:54:43 UTC
Type: Feature Request
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 3495091 0 None None None 2020-07-29 17:39:36 UTC

Description Seunghwan Jung 2020-07-08 00:38:35 UTC
Description of problem:

A customer requests a new feature that lets a Pacemaker cluster detect when a mounted volume, such as "/", is lost or becomes inaccessible. It could be implemented as a resource agent like 'ethmonitor' or in some other way.

Version-Release number of selected component (if applicable):

The customer is running RHEL 8.1 E4S


How reproducible:
n/a

Steps to Reproduce:
n/a

Actual results:
When a node loses its root volume, the cluster does not detect the loss; the node keeps running.

Expected results:

An agent or cluster configuration exists with which the cluster can be configured to detect when a volume is lost or becomes inaccessible.

Additional info:

Comment 2 Reid Wahl 2020-07-08 05:02:05 UTC
A couple of considerations:
  1) It sounds like we want to monitor a filesystem rather than monitoring a disk. This distinction is important because a disk could be used without a filesystem (e.g., a raw disk for an Oracle database).
  2) If we want to monitor filesystems (e.g., the root filesystem), then do we even need a new resource agent? It's worth considering whether we could add a new "monitor_only" attribute to the Filesystem resource agent. Then a user can set:

        # pcs resource create <rsc_name> ocf:heartbeat:Filesystem monitor_only=true op monitor interval=<interval> timeout=<timeout> OCF_CHECK_LEVEL=20 on-fail=fence

     With `monitor_only=true`, the resource agent would not mount the filesystem during the start operation but would perform the monitor operation.
     With `op monitor ... OCF_CHECK_LEVEL=20 on-fail=fence` (existing options), the monitor operation would perform a read/write test. The node would self-fence if the filesystem is not mounted or if I/O fails.


kgaillot said in parent BZ 1854340:

> It is expected behavior that loss of the root volume (or any other disk volume) 
> will not be detected automatically. If that is desired, a cluster resource must 
> be configured to monitor it. There is no agent currently available for that 
> purpose, so it would have to be written (you could create an RFE BZ for that if 
> desired). Such an agent would work like the ocf:heartbeat:ethmonitor agent -- 
> it would not mount and unmount the root volume (as ocf:heartbeat:Filesystem 
> would), but would only run a recurring monitor and set a node attribute 
> accordingly. That node attribute could then be used in location constraints if 
> you want to move resources away from a node that loses the monitored disk 
> volume.
> 
> Without such a resource, Pacemaker will detect a lost disk volume only if it 
> needs to write to that volume for its own purposes, or if a resource monitor 
> that depends on the volume fails. This should eventually happen if 
> /var/lib/pacemaker is on the volume. The DC node will eventually attempt to 
> write the current CIB to disk, which will fail, then Pacemaker on the DC should 
> immediately exit without restarting, and the other nodes should take over and 
> fence the former DC.

There is a difference between kgaillot's suggestion and the approach that I suggested above.
  - My suggested approach would cause a node to self-fence if the monitor fails (when `on-fail=fence` is set), which would initiate recovery.
  - kgaillot's approach would change the value of a Pacemaker node attribute if the monitor fails. Then the cluster would initiate recovery in whatever way the user configured it to do so.

The attribute approach seems more flexible, but if we can solve the problem with an existing resource agent, then maybe that's desirable.
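
For reference, the attribute-based pattern kgaillot describes would look roughly like this in pcs terms. The agent name `diskmonitor` and attribute name `diskmon-vg0` are invented for illustration; the constraint rule mirrors the documented ethmonitor usage (ethmonitor sets an `ethmonitor-<iface>` node attribute that constraints can test):

```shell
# Hypothetical monitor-only agent, cloned so it runs on every node
# (agent and attribute names are invented for this sketch):
pcs resource create disk-mon ocf:heartbeat:diskmonitor volume=/dev/vg0/lv0 \
    op monitor interval=10s clone

# Move resources off any node where the attribute is not 1, the same
# pattern used with ethmonitor's node attribute:
pcs constraint location my-app rule score=-INFINITY diskmon-vg0 ne 1
```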

Comment 3 Seunghwan Jung 2020-07-08 05:24:12 UTC
Hi Reid,

Yes, it is the filesystem that needs to be monitored. Thank you for clarifying that.
I think it is a good idea to use an existing resource with a small modification.

Comment 4 Reid Wahl 2020-07-08 06:37:10 UTC
Discussed with Hwanii in IRC. He pointed out that when the root LV is suspended, resource monitors won't work. That makes sense. The resource agents reside on the root filesystem, and Pacemaker presumably needs to write to the root filesystem when updating the CIB.

I tested it out with a new monitor_only attribute and confirmed. The resource failure is not detected until the LV is resumed (un-suspended) on the DC.

I think a resource agent-based approach isn't going to work for monitoring the root filesystem. Even if we made a resource agent that updates an attribute with a date and we try to check "how recently was the attribute updated?", I don't think we can make the OTHER node initiate fencing based on the lack of an attribute update. The local node can't decide to fence itself without access to the root FS.

Comment 5 Ken Gaillot 2020-07-08 14:43:15 UTC
Hi all,

Those are all good points.

It occurs to me that for Reid's approach on a non-root filesystem, the current Dummy agent is sufficient -- just set the "state" parameter to a writeable location on the target filesystem, and the dummy resource will be unable to perform any action successfully if the filesystem is unavailable (or unwriteable).
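
In pcs terms, that Dummy-based approach for a non-root filesystem might look like the following (resource name, mount point, and operation values are examples, not from this BZ):

```shell
# Dummy keeps its state file on the monitored filesystem; if that
# filesystem becomes unavailable or unwriteable, every Dummy operation
# fails, and on-fail=fence recovers the node:
pcs resource create data-fs-check ocf:heartbeat:Dummy \
    state=/data/.cluster-fs-check \
    op monitor interval=30s on-fail=fence
```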

For the root filesystem, it does seem that the most practical solution is to rely on pacemaker's internal CIB write failure handling, so Bug 1854340 should be the focus.

However if there is an independent filesystem available, I do think it would be possible to do this via an agent. It would have to be an LSB-style (not OCF) agent, written in a compiled language (such as C, so it doesn't need a script interpreter from the root filesystem) and stored on the separate filesystem. Pacemaker could launch the agent from the separate filesystem even if the root filesystem were unavailable -- unless of course both filesystems had a single point of failure, such as a shared I/O controller. (It would have to be LSB because that is the only standard that Pacemaker allows you to specify the full path for. A compiled OCF agent would be fine if Pacemaker supported specifying a path to it, or if /usr/lib/ocf were kept on an independent filesystem.)
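
To illustrate the check such an agent would perform, here is the LSB "status" logic sketched in shell. This is for readability only: per the comment above, a real agent would have to be a compiled binary stored on the independent filesystem, since a shell script needs an interpreter loaded from the root filesystem. The directory variable is a placeholder.

```shell
# Illustrative only: the logic of an LSB "status" action that probes a
# filesystem by writing to it. A real agent would be compiled C.
MONITORED_DIR=${MONITORED_DIR:-/}   # filesystem to probe (example)

lsb_status() {
    probe="$MONITORED_DIR/.lsb_fs_probe.$$"
    if echo ok > "$probe" 2>/dev/null && rm -f "$probe"; then
        return 0    # LSB status: running (filesystem healthy)
    else
        return 7    # LSB status: not running (filesystem failed)
    fi
}
```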

Comment 13 Chris Feist 2023-08-02 13:54:43 UTC
Closing this bz as it sounds like storagemon (currently in Tech Preview) is what the customer is interested in - https://bugzilla.redhat.com/show_bug.cgi?id=2055299

*** This bug has been marked as a duplicate of bug 2055299 ***

