Bug 744344 - 'service clvmd stop' can take multple minutes when first attempted
Summary: 'service clvmd stop' can take multple minutes when first attempted
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: lvm2
Version: 6.2
Hardware: x86_64
OS: Linux
high
high
Target Milestone: rc
: ---
Assignee: Peter Rajnoha
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks: 756082 840699
TreeView+ depends on / blocked
 
Reported: 2011-10-07 22:25 UTC by Corey Marthaler
Modified: 2012-12-11 08:08 UTC (History)
12 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-12-11 08:08:36 UTC
Target Upstream Version:
cmarthal: needinfo+


Attachments (Terms of Use)
vgdisplay log from taft-03 (54.96 KB, text/plain)
2012-03-05 23:04 UTC, Corey Marthaler
no flags Details

Description Corey Marthaler 2011-10-07 22:25:38 UTC
Description of problem:
At first I was annoyed by this, but later realized that this is probably a regression. When stopping clvmd for the first time, even without an clustered volumes to activate/deactivate, it can take multiple minutes on one of the nodes in the cluster.

I'll attempt to better debug this and find out what's taking the most time.

GRANT CLUSTER:
[root@grant-01 ~]# time service clvmd stop
Signaling clvmd to exit                                    [  OK  ]
clvmd terminated                                           [  OK  ]

real    2m1.086s
[root@grant-02 ~]# time service clvmd stop
Signaling clvmd to exit                                    [  OK  ]
clvmd terminated                                           [  OK  ]

real    0m0.683s
[root@grant-03 ~]# time service clvmd stop
Signaling clvmd to exit                                    [  OK  ]
clvmd terminated                                           [  OK  ]

real    0m1.005s


HAYES CLUSTER:
[root@hayes-01 ~]# time service clvmd stop
Signaling clvmd to exit                                    [  OK  ]
clvmd terminated                                           [  OK  ]

real    1m0.994s
[root@hayes-02 ~]# time service clvmd stop
Signaling clvmd to exit                                    [  OK  ]
clvmd terminated                                           [  OK  ]

real    0m0.699s
[root@hayes-03 ~]# time service clvmd stop
Signaling clvmd to exit                                    [  OK  ]
clvmd terminated                                           [  OK  ]

real    0m0.911s


TAFT CLUSTER:
[root@taft-01 ~]# time service clvmd stop
Signaling clvmd to exit                                    [  OK  ]
clvmd terminated                                           [  OK  ]

real    2m0.845s
[root@taft-02 ~]# time service clvmd stop
Signaling clvmd to exit                                    [  OK  ]
clvmd terminated                                           [  OK  ]

real    1m0.780s
[root@taft-03 ~]# time service clvmd stop
Signaling clvmd to exit                                    [  OK  ]
clvmd terminated                                           [  OK  ]

real    0m0.695s
[root@taft-04 ~]# time service clvmd stop
Signaling clvmd to exit                                    [  OK  ]
clvmd terminated                                           [  OK  ]

real    0m0.721s


Version-Release number of selected component (if applicable):
2.6.32-203.el6.x86_64

lvm2-2.02.87-3.el6    BUILT: Wed Sep 21 09:54:55 CDT 2011
lvm2-libs-2.02.87-3.el6    BUILT: Wed Sep 21 09:54:55 CDT 2011
lvm2-cluster-2.02.87-3.el6    BUILT: Wed Sep 21 09:54:55 CDT 2011
udev-147-2.40.el6    BUILT: Fri Sep 23 07:51:13 CDT 2011
device-mapper-1.02.66-3.el6    BUILT: Wed Sep 21 09:54:55 CDT 2011
device-mapper-libs-1.02.66-3.el6    BUILT: Wed Sep 21 09:54:55 CDT 2011
device-mapper-event-1.02.66-3.el6    BUILT: Wed Sep 21 09:54:55 CDT 2011
device-mapper-event-libs-1.02.66-3.el6    BUILT: Wed Sep 21 09:54:55 CDT 2011
cmirror-2.02.87-3.el6    BUILT: Wed Sep 21 09:54:55 CDT 2011


How reproducible:
Everytime

Comment 1 Corey Marthaler 2011-10-07 22:44:08 UTC
[root@grant-01 ~]# time service clvmd stop
+ DAEMON=clvmd
+ exec_prefix=
+ sbindir=/sbin
+ lvm_vgchange=/sbin/vgchange
+ lvm_vgdisplay=/sbin/vgdisplay
+ lvm_vgscan=/sbin/vgscan
+ lvm_lvdisplay=/sbin/lvdisplay
+ CLVMDOPTS=-T30
+ '[' -f /etc/sysconfig/cluster ']'
+ '[' -f /etc/sysconfig/clvmd ']'
+ '[' -n '' ']'
+ '[' -z ']'
+ CLVMD_STOP_TIMEOUT=10
+ LOCK_FILE=/var/lock/subsys/clvmd
+ '[' 0 '!=' 0 ']'
+ case "$1" in
+ stop
+ rh_status_q
+ rh_status
+ '[' -z '' ']'
++ clustered_vgs
++ /sbin/vgdisplay
++ awk 'BEGIN {RS="VG Name"} {if (/Clustered/) print $1;}'

# HERE'S WERE IT HANGS FOR THE MINUTE



+ LVM_VGS=
+ '[' -n '' ']'
++ pidofproc clvmd
++ local RC pid pid_file=
++ '[' 1 = 0 ']'
++ '[' clvmd = -p ']'
++ fail_code=3
++ __pids_var_run clvmd ''
++ local base=clvmd
++ local pid_file=/var/run/clvmd.pid
++ pid=
++ '[' -f /var/run/clvmd.pid ']'
++ local line p
++ '[' '!' -r /var/run/clvmd.pid ']'
++ :
++ read line
++ '[' -z 2200 ']'
++ for p in '$line'
++ '[' -z '' -a -d /proc/2200 ']'
++ pid=' 2200'
++ :
++ read line
++ '[' -z '' ']'
++ break
++ '[' -n ' 2200' ']'
++ return 0
++ RC=0
++ '[' -n ' 2200' ']'
++ echo 2200
++ return 0
+ action 'Signaling clvmd to exit' kill -TERM 2200
+ local STRING rc
+ STRING='Signaling clvmd to exit'
+ echo -n 'Signaling clvmd to exit '
Signaling clvmd to exit + shift
+ kill -TERM 2200
+ success 'Signaling clvmd to exit'
+ '[' color '!=' verbose -a -z '' ']'
+ echo_success
+ '[' color = color ']'
+ echo -en '\033[60G'
                                                           + echo -n '['
[+ '[' color = color ']'
+ echo -en '\033[0;32m'
+ echo -n '  OK  '
  OK  + '[' color = color ']'
+ echo -en '\033[0;39m'
+ echo -n ']'
]+ echo -ne '\r'
+ return 0
+ return 0
+ rc=0
+ echo

+ return 0
+ usleep 500000
+ rh_status_q
+ rh_status
+ rh_status_q
+ rh_status
+ echo -n 'clvmd terminated'
clvmd terminated+ success
+ '[' color '!=' verbose -a -z '' ']'
+ echo_success
+ '[' color = color ']'
+ echo -en '\033[60G'
                                                           + echo -n '['
[+ '[' color = color ']'
+ echo -en '\033[0;32m'
+ echo -n '  OK  '
  OK  + '[' color = color ']'
+ echo -en '\033[0;39m'
+ echo -n ']'
]+ echo -ne '\r'
+ return 0
+ return 0
+ echo

+ rm -f /var/lock/subsys/clvmd
+ return 0
+ rtrn=0
+ exit 0

real    1m3.164s
user    0m0.048s
sys     0m0.165s

Comment 2 Peter Rajnoha 2011-11-01 12:41:55 UTC
(In reply to comment #1)
> ++ /sbin/vgdisplay
> ++ awk 'BEGIN {RS="VG Name"} {if (/Clustered/) print $1;}'
> 
> # HERE'S WERE IT HANGS FOR THE MINUTE

Some locking issue, I guess(?) So for starters, can you add -vvvv log of the vgdisplay?

Comment 6 Peter Rajnoha 2012-01-04 09:46:41 UTC
Corey, can you give a try with a vgdsiplay -vvvv so we have a log?

Comment 7 Corey Marthaler 2012-03-05 19:23:27 UTC
Just a note that I have been able to reproduce this issue with the latest rpms. I'm still attempting to gather more verbose information on this however.

2.6.32-236.el6.x86_64
lvm2-2.02.94-0.61.el6    BUILT: Thu Mar  1 07:03:29 CST 2012
lvm2-libs-2.02.94-0.61.el6    BUILT: Thu Mar  1 07:03:29 CST 2012
lvm2-cluster-2.02.94-0.61.el6    BUILT: Thu Mar  1 07:03:29 CST 2012
udev-147-2.40.el6    BUILT: Fri Sep 23 07:51:13 CDT 2011
device-mapper-1.02.73-0.61.el6    BUILT: Thu Mar  1 07:03:29 CST 2012
device-mapper-libs-1.02.73-0.61.el6    BUILT: Thu Mar  1 07:03:29 CST 2012
device-mapper-event-1.02.73-0.61.el6    BUILT: Thu Mar  1 07:03:29 CST 2012
device-mapper-event-libs-1.02.73-0.61.el6    BUILT: Thu Mar  1 07:03:29 CST 2012
cmirror-2.02.94-0.61.el6    BUILT: Thu Mar  1 07:03:29 CST 2012

Comment 8 Corey Marthaler 2012-03-05 23:04:34 UTC
Created attachment 567779 [details]
vgdisplay log from taft-03

The log from a node taking a really long time looks just like one that takes less then a second.

Comment 9 Milan Broz 2012-03-05 23:11:13 UTC
(In reply to comment #8)
> The log from a node taking a really long time looks just like one that takes
> less then a second.

And on which line in log it waits for so long? (there should be really timestamps...)

Comment 10 Peter Rajnoha 2012-03-06 13:47:28 UTC
So maybe strace or lvm debug logging to syslog would be of more help then...

Comment 11 Peter Rajnoha 2012-04-11 09:36:17 UTC
(In reply to comment #10)
> So maybe strace or lvm debug logging to syslog would be of more help then...

(strace with timestamps!)

Comment 14 Peter Rajnoha 2012-11-28 08:34:55 UTC
Is this issue still reproducible with latest 6.4 build - the lvm2-2.02.98-3.el6? If yes, would it be possible to grab some debug output, possibly with timestamps? (comment #10 and comment #11, the strace with timestamps for the vgdisplay from comment #1).

Comment 17 Peter Rajnoha 2012-12-11 08:08:36 UTC
Closing, please, reopen if you hit this problem again.


Note You need to log in before you can comment on or make changes to this bug.