Bug 591851

Summary: clusterfs force umount is tried only 2 times
Product: Red Hat Enterprise Linux 5 Reporter: John Lau <jlau>
Component: rgmanager Assignee: Lon Hohberger <lhh>
Status: CLOSED DUPLICATE QA Contact: Cluster QE <mspqa-list>
Severity: medium Docs Contact:
Priority: low    
Version: 5.5 CC: cluster-maint, edamato
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2010-09-30 20:16:46 UTC Type: ---
Attachments:
Description Flags
Proposed patch none

Description John Lau 2010-05-13 11:06:34 UTC
Description of problem:

In rgmanager-2.0.52-6.el5 on RHEL5 AP, there is a logic error in /usr/share/cluster/clusterfs.sh:

866     #
867     # Unmount the device.  
868     #
869     while [ ! "$done" ]; do
870         isMounted $dev $mp
871         case $? in
872         $NO)
873             ocf_log info "$dev is not mounted"
874             umount_failed=
875             done=$YES
876             ;;
877         $FAIL)
878             return $FAIL
879             ;;
880         $YES)
881             sync; sync; sync
882             ocf_log info "unmounting $dev ($mp)"
883 
884             umount $mp
885             if  [ $? -eq 0 ]; then
886                 umount_failed=
887                 done=$YES
888                 continue
889             fi
890 
891             umount_failed=yes
892 
893             if [ "$force_umount" ]; then
894                 killMountProcesses $mp
895             fi
896 
897             if [ $try -ge $max_tries ]; then
898                 done=$YES
899             else
900                 sleep $sleep_time
901                 let try=try+1   ##### "try" is incremented by 1 #####
902             fi
903             ;;
904         *)
905             return $FAIL
906             ;;
907         esac
908 
909         if [ $try -ge $max_tries ]; then
910             done=$YES
911         else
912             sleep $sleep_time
913             let try=try+1     ##### "try" is incremented by 1 AGAIN in the same loop iteration #####
914         fi
915     done # while 


Because "try" is incremented twice in each failed iteration, the script attempts the unmount only 2 times even when $max_tries=3 (the default value). I have seen this happen in a real customer case.
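
The retry accounting can be checked in isolation. The following is a minimal sketch, not the code from clusterfs.sh: it assumes that "try" starts at 1 (the initialization is not shown in the excerpt above), replaces the unmount with a counter that always "fails", and omits the sleeps. The function name count_attempts and the variable "finished" (standing in for "done") are made up for illustration.

```shell
#!/bin/bash
# Sketch only: mirrors the two "try" increments of the loop above.
# Assumption: try starts at 1; the real initialization is not in the excerpt.
count_attempts() {
    local max_tries=$1
    local try=1 finished="" attempts=0

    while [ ! "$finished" ]; do
        # Stands in for a "umount $mp" that fails every time.
        attempts=$((attempts + 1))

        # First check and increment (lines 897-902, inside the $YES branch).
        if [ "$try" -ge "$max_tries" ]; then
            finished=yes
        else
            try=$((try + 1))
        fi

        # Second check and increment (lines 909-914, after "esac").
        if [ "$try" -ge "$max_tries" ]; then
            finished=yes
        else
            try=$((try + 1))
        fi
    done

    echo "$attempts"
}

count_attempts 3    # prints 2
```

With max_tries=3 the simulated unmount runs twice: the first iteration moves "try" from 1 to 3, so the second iteration is already the last one.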

Version-Release number of selected component (if applicable):
rgmanager-2.0.52-6.el5

How reproducible:
Always.

Steps to Reproduce:
1. Configure GFS as a resource of a service and enable "force_unmount".
2. Create a process that keeps the GFS mount busy.
3. After the script kills the process for the first time and enters "sleep $sleep_time", start the process again.
4. The script kills the process a second time, but the unmount is then considered to have failed.
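
For steps 2 and 3, one way to keep a mount busy is a background process whose working directory is inside the mount point. This is only a sketch: the helper name hold_mount, the variable HOLDER_PID and the path /mnt/gfs in the usage comment are hypothetical and do not come from the original setup.

```shell
#!/bin/bash
# Hypothetical helper: start a background "sleep" whose current directory is
# inside the given mount point, so that a plain umount finds the mount busy.
hold_mount() {
    local mp=$1 seconds=$2
    # "exec" makes the subshell become the sleep process, so $! is its PID.
    ( cd "$mp" && exec sleep "$seconds" ) >/dev/null 2>&1 &
    HOLDER_PID=$!
}

# Usage (the mount point is an assumption):
#   hold_mount /mnt/gfs 600
#   ... stop the service; once the script has killed the holder and is
#   sleeping, run hold_mount /mnt/gfs 600 again for step 3.
```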
  
Actual results:
The unmount is attempted 2 times instead of $max_tries times.

Expected results:
The unmount should be attempted $max_tries times.

Additional info:
Lines 909-914 can actually be removed, because within the "case" statement only the "$YES" branch needs to be retried; the other branches end the loop immediately. Alternatively, removing lines 897-902 instead may make the code look more logical.
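
The intended behaviour, with a single check and increment per failed attempt, could look roughly like the sketch below. It is a standalone illustration, not the attached patch and not a drop-in change to clusterfs.sh: the function name retry_with_limit is hypothetical, "try" is assumed to start at 1, and the command to retry is passed in as arguments instead of calling umount directly.

```shell
#!/bin/bash
# Hypothetical helper: run the given command up to max_tries times,
# sleeping sleep_time seconds after each failure except the last one.
retry_with_limit() {
    local max_tries=$1 sleep_time=$2
    shift 2
    local try=1

    while :; do
        # Success ends the loop immediately.
        if "$@"; then
            return 0
        fi

        # Only one place decides whether another attempt is allowed.
        if [ "$try" -ge "$max_tries" ]; then
            return 1
        fi

        sleep "$sleep_time"
        try=$((try + 1))
    done
}

# Usage (the mount point is an assumption):
#   retry_with_limit 3 2 umount /mnt/gfs
```

In the real script, the call to killMountProcesses would sit between the failed attempt and the retry check when force unmount is enabled.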

Comment 1 John Lau 2010-05-13 11:12:33 UTC
Created attachment 413716
Proposed patch

Comment 2 Lon Hohberger 2010-09-30 20:16:46 UTC

*** This bug has been marked as a duplicate of bug 573705 ***