Bug 591851

Summary:

clusterfs force umount tries only 2 times

Product:

Red Hat Enterprise Linux 5

Reporter:

John Lau <jlau>

Component:

rgmanager

Assignee:

Lon Hohberger <lhh>

Status:

CLOSED DUPLICATE

QA Contact:

Cluster QE <mspqa-list>

Severity:

medium

Priority:

low

Version:

5.5

CC:

cluster-maint, edamato

Hardware:

All

OS:

Linux

Doc Type:

Bug Fix

Last Closed:

2010-09-30 20:16:46 UTC

Attachments:

Description	Flags
Proposed patch	none

Description John Lau 2010-05-13 11:06:34 UTC

Description of problem:

In rgmanager-2.0.52-6.el5 on RHEL5 AP, there is a logic error in /usr/share/cluster/clusterfs.sh:

866     #
867     # Unmount the device.  
868     #
869     while [ ! "$done" ]; do
870         isMounted $dev $mp
871         case $? in
872         $NO)
873             ocf_log info "$dev is not mounted"
874             umount_failed=
875             done=$YES
876             ;;
877         $FAIL)
878             return $FAIL
879             ;;
880         $YES)
881             sync; sync; sync
882             ocf_log info "unmounting $dev ($mp)"
883 
884             umount $mp
885             if  [ $? -eq 0 ]; then
886                 umount_failed=
887                 done=$YES
888                 continue
889             fi
890 
891             umount_failed=yes
892 
893             if [ "$force_umount" ]; then
894                 killMountProcesses $mp
895             fi
896 
897             if [ $try -ge $max_tries ]; then
898                 done=$YES
899             else
900                 sleep $sleep_time
901                 let try=try+1   ##### "try" increase by 1 #####
902             fi
903             ;;
904         *)
905             return $FAIL
906             ;;
907         esac
908 
909         if [ $try -ge $max_tries ]; then
910             done=$YES
911         else
912             sleep $sleep_time
913             let try=try+1     ##### "try" increase by 1 AGAIN in the same loop #####
914         fi
915     done # while 


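This logic error causes the script to attempt the unmount only 2 times even when $max_tries=3 (the default value): after each failed attempt, "try" is incremented once inside the "$YES" branch and once more after "esac", so the limit is reached after the second attempt. I have seen this in a real customer case.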
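
The double increment can be reproduced without touching a real mount. The following is only a rough simulation, not code from clusterfs.sh: the function name count_attempts_buggy is hypothetical, the failing "umount $mp" is replaced by a counter, the sleeps are omitted, and "try" is assumed to start at 1.

```shell
#!/bin/bash
# Simulation only: no file system is mounted or unmounted.
# count_attempts_buggy is a hypothetical helper, not part of clusterfs.sh.

# Print how many unmount attempts the loop makes when every umount fails.
count_attempts_buggy() {
    local max_tries=$1
    local try=1            # assumed starting value
    local attempts=0
    local finished=""

    while [ -z "$finished" ]; do
        attempts=$((attempts + 1))    # stands in for a failing "umount $mp"

        # Counterpart of lines 897-902 (inside the $YES branch)
        if [ "$try" -ge "$max_tries" ]; then
            finished=yes
        else
            try=$((try + 1))
        fi

        # Counterpart of lines 909-914 (after "esac")
        if [ "$try" -ge "$max_tries" ]; then
            finished=yes
        else
            try=$((try + 1))
        fi
    done

    echo "$attempts"
}

count_attempts_buggy 3    # prints 2
```

With max_tries=3 the counter goes from 1 to 3 during the first pass, so the second pass is the last one.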

Version-Release number of selected component (if applicable):
rgmanager-2.0.52-6.el5

How reproducible:
Always.

Steps to Reproduce:
1. Configure a GFS file system as a resource of a service, and enable "force_unmount".
2. Start a process that keeps the GFS mount point busy.
3. After the script kills the process for the first time and enters "sleep $sleep_time", start the process again.
4. The script kills the process a second time, but the unmount is then considered to have failed.
  
Actual results:
The unmount is attempted 2 times instead of $max_tries times.

Expected results:
The unmount should be attempted $max_tries times.

Additional info:
Lines 909-914 can be removed, because within the "case" statement only the "$YES" branch needs to be retried; the other branches end the loop. Alternatively, removing lines 897-902 instead may make the code read more logically.
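
As a rough check of the first option (keeping only one retry check per pass), the loop can be simulated as below. This is a sketch under assumptions, not the attached patch: count_attempts_fixed is a hypothetical name, every umount is assumed to fail, the sleep is omitted, and "try" is assumed to start at 1.

```shell
#!/bin/bash
# Simulation only: no file system is mounted or unmounted.
# count_attempts_fixed is a hypothetical helper, not part of clusterfs.sh.

# Print how many unmount attempts are made when the retry counter
# is incremented only once per failed attempt.
count_attempts_fixed() {
    local max_tries=$1
    local try=1            # assumed starting value
    local attempts=0
    local finished=""

    while [ -z "$finished" ]; do
        attempts=$((attempts + 1))    # stands in for a failing "umount $mp"

        # Single retry check, as in lines 897-902
        if [ "$try" -ge "$max_tries" ]; then
            finished=yes
        else
            try=$((try + 1))
        fi
    done

    echo "$attempts"
}

count_attempts_fixed 3    # prints 3
```

With a single increment per pass, the number of attempts matches $max_tries under the stated assumptions.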

Comment 1 John Lau 2010-05-13 11:12:33 UTC

Created attachment 413716 [details]
Proposed patch

Comment 2 Lon Hohberger 2010-09-30 20:16:46 UTC


*** This bug has been marked as a duplicate of bug 573705 ***