Bug 2207567

Summary: Filesystem: Improve stopping for large filesystems (RHEL9)
Product: Red Hat Enterprise Linux 9
Reporter: Oyvind Albrigtsen <oalbrigt>
Component: resource-agents
Assignee: Oyvind Albrigtsen <oalbrigt>
Status: CLOSED ERRATA
QA Contact: cluster-qe <cluster-qe>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 9.2
CC: agk, cluster-maint, fdinitto, lucas.blenkhorn, mjuricek, oalbrigt, phagara, pzimek, sbradley
Target Milestone: rc
Keywords: Triaged
Target Release: 9.3
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: resource-agents-4.10.0-43.el9
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 2189243
Environment:
Last Closed: 2023-11-07 08:23:10 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 2189242, 2189243
Bug Blocks:

Description Oyvind Albrigtsen 2023-05-16 10:06:41 UTC
+++ This bug was initially created as a clone of Bug #2189243 +++

+++ This bug was initially created as a clone of Bug #2189242 +++

Description of problem:

On high-end production systems with a large amount of RAM (used as write cache) and large XFS file systems (>= 8 TiB), the unmount operation itself may take longer than 10 minutes per attempt, even when it fails because processes are still using the file system. If users' login shells sit on the Filesystem resource, they do not respond to SIGTERM, only to SIGHUP, so when the resource is stopping the unmount reliably fails and causes the stop operation to fail or time out.
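
To illustrate the signal behavior described above, here is a minimal Python sketch of escalating through a list of signals until a process exits. It is not the Filesystem agent's code (the agent is a shell script), and the names `pid_alive` and `stop_with_escalation`, the signal order, and the `grace` parameter are all hypothetical choices made for this example.

```python
import os
import signal
import time


def pid_alive(pid):
    """Return True if a process with this PID still exists."""
    try:
        os.kill(pid, 0)  # signal 0 performs the existence check only
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def stop_with_escalation(pid,
                         signals=(signal.SIGTERM, signal.SIGHUP, signal.SIGKILL),
                         grace=2.0,
                         alive=pid_alive):
    """Send each signal in turn, waiting up to `grace` seconds after each.

    Returns the signal after which the process was found gone, or None if
    the process had already disappeared when a signal was about to be sent.
    Raises TimeoutError if the process survives every signal.
    """
    for sig in signals:
        try:
            os.kill(pid, sig)
        except ProcessLookupError:
            return None
        deadline = time.monotonic() + grace
        while True:
            if not alive(pid):
                return sig
            if time.monotonic() >= deadline:
                break  # still running: escalate to the next signal
            time.sleep(0.05)
    raise TimeoutError(f"process {pid} survived signals {list(signals)}")
```

A process that ignores SIGTERM, as the login shells in this report do, would only exit at the SIGHUP step of this sequence. Note that `pid_alive` treats an unreaped child (zombie) as still alive, so a caller that owns the process should pass its own `alive` check.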


Version-Release number of selected component (if applicable):

resource-agents-4.1.1-61.el7_9.15.x86_64

How reproducible:
repeatedly


Steps to Reproduce:
1. Create a large Filesystem resource with potentially long dirty unmount cycles (roughly 30 minutes) and with login shells on it
2. Log in again during the long stop operation (login shells on any file system managed by the HA Filesystem RA do not fail every standard Filesystem RA stop operation)


Actual results:
unmount fails, causing the stop operation to fail

Expected results:
unmount succeeds

Additional info:
https://github.com/ClusterLabs/resource-agents/pull/1868

Comment 2 Oyvind Albrigtsen 2023-07-12 10:43:33 UTC
New PR based on suggestions by QE: https://github.com/ClusterLabs/resource-agents/pull/1878

Comment 11 errata-xmlrpc 2023-11-07 08:23:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (resource-agents bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:6312