Hide Forgot
Description of problem: At first I was annoyed by this, but later realized that this is probably a regression. When stopping clvmd for the first time, even without an clustered volumes to activate/deactivate, it can take multiple minutes on one of the nodes in the cluster. I'll attempt to better debug this and find out what's taking the most time. GRANT CLUSTER: [root@grant-01 ~]# time service clvmd stop Signaling clvmd to exit [ OK ] clvmd terminated [ OK ] real 2m1.086s [root@grant-02 ~]# time service clvmd stop Signaling clvmd to exit [ OK ] clvmd terminated [ OK ] real 0m0.683s [root@grant-03 ~]# time service clvmd stop Signaling clvmd to exit [ OK ] clvmd terminated [ OK ] real 0m1.005s HAYES CLUSTER: [root@hayes-01 ~]# time service clvmd stop Signaling clvmd to exit [ OK ] clvmd terminated [ OK ] real 1m0.994s [root@hayes-02 ~]# time service clvmd stop Signaling clvmd to exit [ OK ] clvmd terminated [ OK ] real 0m0.699s [root@hayes-03 ~]# time service clvmd stop Signaling clvmd to exit [ OK ] clvmd terminated [ OK ] real 0m0.911s TAFT CLUSTER: [root@taft-01 ~]# time service clvmd stop Signaling clvmd to exit [ OK ] clvmd terminated [ OK ] real 2m0.845s [root@taft-02 ~]# time service clvmd stop Signaling clvmd to exit [ OK ] clvmd terminated [ OK ] real 1m0.780s [root@taft-03 ~]# time service clvmd stop Signaling clvmd to exit [ OK ] clvmd terminated [ OK ] real 0m0.695s [root@taft-04 ~]# time service clvmd stop Signaling clvmd to exit [ OK ] clvmd terminated [ OK ] real 0m0.721s Version-Release number of selected component (if applicable): 2.6.32-203.el6.x86_64 lvm2-2.02.87-3.el6 BUILT: Wed Sep 21 09:54:55 CDT 2011 lvm2-libs-2.02.87-3.el6 BUILT: Wed Sep 21 09:54:55 CDT 2011 lvm2-cluster-2.02.87-3.el6 BUILT: Wed Sep 21 09:54:55 CDT 2011 udev-147-2.40.el6 BUILT: Fri Sep 23 07:51:13 CDT 2011 device-mapper-1.02.66-3.el6 BUILT: Wed Sep 21 09:54:55 CDT 2011 device-mapper-libs-1.02.66-3.el6 BUILT: Wed Sep 21 09:54:55 CDT 2011 device-mapper-event-1.02.66-3.el6 BUILT: Wed Sep 21 09:54:55 CDT 2011 device-mapper-event-libs-1.02.66-3.el6 BUILT: Wed Sep 21 09:54:55 CDT 2011 cmirror-2.02.87-3.el6 BUILT: Wed Sep 21 09:54:55 CDT 2011 How reproducible: Everytime
[root@grant-01 ~]# time service clvmd stop + DAEMON=clvmd + exec_prefix= + sbindir=/sbin + lvm_vgchange=/sbin/vgchange + lvm_vgdisplay=/sbin/vgdisplay + lvm_vgscan=/sbin/vgscan + lvm_lvdisplay=/sbin/lvdisplay + CLVMDOPTS=-T30 + '[' -f /etc/sysconfig/cluster ']' + '[' -f /etc/sysconfig/clvmd ']' + '[' -n '' ']' + '[' -z ']' + CLVMD_STOP_TIMEOUT=10 + LOCK_FILE=/var/lock/subsys/clvmd + '[' 0 '!=' 0 ']' + case "$1" in + stop + rh_status_q + rh_status + '[' -z '' ']' ++ clustered_vgs ++ /sbin/vgdisplay ++ awk 'BEGIN {RS="VG Name"} {if (/Clustered/) print $1;}' # HERE'S WERE IT HANGS FOR THE MINUTE + LVM_VGS= + '[' -n '' ']' ++ pidofproc clvmd ++ local RC pid pid_file= ++ '[' 1 = 0 ']' ++ '[' clvmd = -p ']' ++ fail_code=3 ++ __pids_var_run clvmd '' ++ local base=clvmd ++ local pid_file=/var/run/clvmd.pid ++ pid= ++ '[' -f /var/run/clvmd.pid ']' ++ local line p ++ '[' '!' -r /var/run/clvmd.pid ']' ++ : ++ read line ++ '[' -z 2200 ']' ++ for p in '$line' ++ '[' -z '' -a -d /proc/2200 ']' ++ pid=' 2200' ++ : ++ read line ++ '[' -z '' ']' ++ break ++ '[' -n ' 2200' ']' ++ return 0 ++ RC=0 ++ '[' -n ' 2200' ']' ++ echo 2200 ++ return 0 + action 'Signaling clvmd to exit' kill -TERM 2200 + local STRING rc + STRING='Signaling clvmd to exit' + echo -n 'Signaling clvmd to exit ' Signaling clvmd to exit + shift + kill -TERM 2200 + success 'Signaling clvmd to exit' + '[' color '!=' verbose -a -z '' ']' + echo_success + '[' color = color ']' + echo -en '\033[60G' + echo -n '[' [+ '[' color = color ']' + echo -en '\033[0;32m' + echo -n ' OK ' OK + '[' color = color ']' + echo -en '\033[0;39m' + echo -n ']' ]+ echo -ne '\r' + return 0 + return 0 + rc=0 + echo + return 0 + usleep 500000 + rh_status_q + rh_status + rh_status_q + rh_status + echo -n 'clvmd terminated' clvmd terminated+ success + '[' color '!=' verbose -a -z '' ']' + echo_success + '[' color = color ']' + echo -en '\033[60G' + echo -n '[' [+ '[' color = color ']' + echo -en '\033[0;32m' + echo -n ' OK ' OK + '[' color = color ']' + echo -en '\033[0;39m' + echo -n ']' ]+ echo -ne '\r' + return 0 + return 0 + echo + rm -f /var/lock/subsys/clvmd + return 0 + rtrn=0 + exit 0 real 1m3.164s user 0m0.048s sys 0m0.165s
(In reply to comment #1) > ++ /sbin/vgdisplay > ++ awk 'BEGIN {RS="VG Name"} {if (/Clustered/) print $1;}' > > # HERE'S WERE IT HANGS FOR THE MINUTE Some locking issue, I guess(?) So for starters, can you add -vvvv log of the vgdisplay?
Corey, can you give a try with a vgdsiplay -vvvv so we have a log?
Just a note that I have been able to reproduce this issue with the latest rpms. I'm still attempting to gather more verbose information on this however. 2.6.32-236.el6.x86_64 lvm2-2.02.94-0.61.el6 BUILT: Thu Mar 1 07:03:29 CST 2012 lvm2-libs-2.02.94-0.61.el6 BUILT: Thu Mar 1 07:03:29 CST 2012 lvm2-cluster-2.02.94-0.61.el6 BUILT: Thu Mar 1 07:03:29 CST 2012 udev-147-2.40.el6 BUILT: Fri Sep 23 07:51:13 CDT 2011 device-mapper-1.02.73-0.61.el6 BUILT: Thu Mar 1 07:03:29 CST 2012 device-mapper-libs-1.02.73-0.61.el6 BUILT: Thu Mar 1 07:03:29 CST 2012 device-mapper-event-1.02.73-0.61.el6 BUILT: Thu Mar 1 07:03:29 CST 2012 device-mapper-event-libs-1.02.73-0.61.el6 BUILT: Thu Mar 1 07:03:29 CST 2012 cmirror-2.02.94-0.61.el6 BUILT: Thu Mar 1 07:03:29 CST 2012
Created attachment 567779 [details] vgdisplay log from taft-03 The log from a node taking a really long time looks just like one that takes less then a second.
(In reply to comment #8) > The log from a node taking a really long time looks just like one that takes > less then a second. And on which line in log it waits for so long? (there should be really timestamps...)
So maybe strace or lvm debug logging to syslog would be of more help then...
(In reply to comment #10) > So maybe strace or lvm debug logging to syslog would be of more help then... (strace with timestamps!)
Is this issue still reproducible with latest 6.4 build - the lvm2-2.02.98-3.el6? If yes, would it be possible to grab some debug output, possibly with timestamps? (comment #10 and comment #11, the strace with timestamps for the vgdisplay from comment #1).
Closing, please, reopen if you hit this problem again.