Bug 572248

Summary:	fs.sh can kill processes that are not on the mount point which is being unmounted
Product:	[Retired] Red Hat Cluster Suite	Reporter:	Benjamin Kahn <bkahn>
Component:	rgmanager	Assignee:	Lon Hohberger <lhh>
Status:	CLOSED ERRATA	QA Contact:	Cluster QE <mspqa-list>
Severity:	urgent	Docs Contact:
Priority:	urgent
Version:	4	CC:	bmr, cluster-maint, djansa, fnadge, iannis, jkortus, jwest, lhh, pm-eus, rbinkhor, sbradley, tao
Target Milestone:	---	Keywords:	ZStream
Target Release:	---
Hardware:	All
OS:	Linux
Whiteboard:
Fixed In Version:	rgmanager-1.9.87-1.el4_8.5	Doc Type:	Bug Fix
Doc Text:	If an application was using a mount point whose path was similar to a mount point managed by rgmanager and force_unmount was enabled, the file system agent could kill that application's processes. This issue has been resolved, and the file system agent no longer kills such processes.	Story Points:	---
Clone Of:		Environment:
Last Closed:	2010-07-21 15:14:00 UTC	Type:	---
Bug Depends On:	555901
Bug Blocks:

Description Benjamin Kahn 2010-03-10 16:17:52 UTC

This bug has been copied from bug #555901 and has been proposed
to be backported to 4.8 z-stream (EUS).

Comment 3 Lon Hohberger 2010-04-05 14:29:30 UTC

http://git.fedorahosted.org/git/?p=cluster.git;a=commit;h=04d4b4acc681f8af1a467ffac077e5898ad19592

Comment 5 Jaroslav Kortus 2010-04-14 09:59:28 UTC

This patch cannot kill processes running directly on the mountpoint. For more info please see the parent bug 555901.

Comment 9 Lon Hohberger 2010-06-15 13:17:42 UTC

http://git.fedorahosted.org/git?p=cluster.git;a=commit;h=26a1b3ec5c6831fda8e46f5e0ff133c5605f91eb

Comment 10 Jaroslav Kortus 2010-06-28 17:22:13 UTC

The patch still has problems handling applications that ignore SIGTERM on clusterfs with force_unmount="1".

<rm>
  <resources>
    <clusterfs device="/dev/vedder/vedder0" force_unmount="1" self_fence="0" fstype="gfs" mountpoint="/mnt/vedder0" name="vedderfs" options=""/>
  </resources>
  <service autostart="1" name="jkservice">
    <clusterfs ref="vedderfs"/>
  </service>
</rm>

Run a simple script from the mountpoint:
#!/bin/bash
trap "" SIGTERM;
sleep 10000;

Relocating the service will fail:
Jun 28 12:05:28 z2 clurgmgrd[12176]: <notice> Stopping service jkservice
Jun 28 12:05:32 z2 clurgmgrd: [12176]: <err> 'umount /mnt/vedder0' failed, error=0
Jun 28 12:05:32 z2 clurgmgrd[12176]: <notice> stop on clusterfs "vedderfs" returned 2 (invalid argument(s))
Jun 28 12:05:32 z2 clurgmgrd[12176]: <crit> #12: service:jkservice failed to stop; intervention required
Jun 28 12:05:32 z2 clurgmgrd[12176]: <notice> Service jkservice is failed
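
The failure above is triggered by a process that ignores SIGTERM, so the file system stays busy and umount fails. A common way to handle such processes is to send SIGTERM first and fall back to SIGKILL after a timeout. The sketch below illustrates that idea only; stop_pid is a hypothetical helper name and is not taken from fs.sh or from the referenced commits.

```bash
# Hypothetical helper, NOT the code from fs.sh.
# Usage: stop_pid PID [TIMEOUT_SECONDS]
# Prints which step ended the process: GONE, TERM, or KILL.
stop_pid() {
    local pid=$1 timeout=${2:-5} waited=0

    # If the signal cannot be delivered, the process is already gone.
    if ! kill -TERM "$pid" 2>/dev/null; then
        echo "GONE"
        return 0
    fi

    # Give the process up to $timeout seconds to exit on its own.
    while [ "$waited" -lt "$timeout" ]; do
        if ! kill -0 "$pid" 2>/dev/null; then
            echo "TERM"
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done

    # Still present: it ignored or blocked SIGTERM, so use SIGKILL,
    # which cannot be trapped or ignored.
    kill -KILL "$pid" 2>/dev/null || true
    echo "KILL"
}
```

With the reproducer script above, stop_pid would print KILL because the script traps SIGTERM. Note that kill -0 also succeeds for an unreaped child of the calling shell, so the helper is meant for unrelated processes such as those holding a mount point open.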

Comment 11 Lon Hohberger 2010-07-13 14:40:57 UTC

http://git.fedorahosted.org/git/?p=cluster.git;a=commit;h=26a1b3ec5c6831fda8e46f5e0ff133c5605f91eb

Replaced previous patch with the above patch.

Comment 12 Jaroslav Kortus 2010-07-16 12:56:29 UTC

Tested on clusterfs and fs. All processes accessing the mountpoints were
killed, and none of the others (including those running in the example given
by the reporter) were killed.

Comment 13 Florian Nadge 2010-07-20 14:45:30 UTC

Technical note added. If any revisions are required, please edit the "Technical Notes" field
accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
If an application was using a mount point whose path was similar to a mount point managed by rgmanager and force_unmount was enabled, the file system agent could kill that application's processes. This issue has been resolved, and the file system agent no longer kills such processes.
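
This report does not spell out how the agent matched processes to mount points. As an illustration only, one way that "similar" mount points can be confused is a plain prefix match on paths, where /mnt/vedder0 also matches a hypothetical /mnt/vedder01. The sketch below shows a boundary-aware check; path_on_mount is a hypothetical helper name, and the comparison is an assumption, not the logic from the actual patch.

```bash
# Hypothetical helper, NOT the code from fs.sh.
# Usage: path_on_mount PATH MOUNTPOINT
# Succeeds only if PATH is the mount point itself or lies beneath it.
path_on_mount() {
    # Strip one trailing slash so "/mnt/a/" and "/mnt/a" compare equal.
    local path=${1%/} mount=${2%/}

    # Appending "/" to the path forces the match to end on a directory
    # boundary, so "/mnt/vedder01" does not match "/mnt/vedder0".
    case "$path/" in
        "$mount"/*) return 0 ;;
        *)          return 1 ;;
    esac
}
```

For example, path_on_mount /mnt/vedder0/data /mnt/vedder0 succeeds, while path_on_mount /mnt/vedder01/data /mnt/vedder0 fails even though both paths share the same string prefix.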

Comment 15 errata-xmlrpc 2010-07-21 15:14:00 UTC

An advisory has been issued which should resolve the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2010-0550.html