Description of problem:
In an attempt to verify the fix for bz 217753, I noticed that at times clvmd would time out and "fail" to start. This only ever seemed to happen on link-07 though, so I'm not sure what that means, but I figured I'd document this issue anyway just in case. I have 40 volumes and was looping 'service clvmd start' and 'service clvmd stop' on the 4 node x86_64 link cluster (link-02,4,7,8).

Here's what was on the console:

Jan 22 17:37:45 link-07 clvmd: clvmd startup timed out
Jan 22 17:37:45 link-07 clvmd: clvmd startup failed
Jan 22 17:37:46 link-07 clvmd: Cluster LVM daemon started - connected to CMAN

Here's the actual command:

[root@link-07 ~]# service clvmd start
+ . /etc/init.d/functions
++ TEXTDOMAIN=initscripts
++ umask 022
++ PATH=/sbin:/usr/sbin:/bin:/usr/bin:/usr/X11R6/bin
++ export PATH
++ '[' -z '' ']'
++ COLUMNS=80
++ '[' -z '' ']'
+++ /sbin/consoletype
++ CONSOLETYPE=pty
++ '[' -f /etc/sysconfig/i18n -a -z '' ']'
++ . /etc/sysconfig/i18n
+++ LANG=en_US.UTF-8
+++ SUPPORTED=en_US.UTF-8:en_US:en
+++ SYSFONT=latarcyrheb-sun16
++ '[' pty '!=' pty ']'
++ '[' -n '' ']'
++ export LANG
++ '[' -z '' ']'
++ '[' -f /etc/sysconfig/init ']'
++ . /etc/sysconfig/init
+++ BOOTUP=color
+++ GRAPHICAL=yes
+++ RES_COL=60
+++ MOVE_TO_COL='echo -en \033[60G'
+++ SETCOLOR_SUCCESS='echo -en \033[0;32m'
+++ SETCOLOR_FAILURE='echo -en \033[0;31m'
+++ SETCOLOR_WARNING='echo -en \033[0;33m'
+++ SETCOLOR_NORMAL='echo -en \033[0;39m'
+++ LOGLEVEL=3
+++ PROMPT=yes
++ '[' pty = serial ']'
++ '[' color '!=' verbose ']'
++ INITLOG_ARGS=-q
+ LVDISPLAY=/usr/sbin/lvdisplay
+ VGCHANGE=/usr/sbin/vgchange
+ VGSCAN=/usr/sbin/vgscan
+ VGDISPLAY=/usr/sbin/vgdisplay
+ VGS=/usr/sbin/vgs
+ CLVMDOPTS=-T20
+ '[' -f /etc/sysconfig/cluster ']'
+ LOCK_FILE=/var/lock/subsys/clvmd
+ rtrn=1
+ case "$1" in
+ start
+ for rtrn in 0
+ pidof clvmd
+ echo -n 'Starting clvmd: '
Starting clvmd: + daemon clvmd -T20
+ local gotbase= force=
+ local base= user= nice= bg= pid=
+ nicelevel=0
+ '[' clvmd '!=' clvmd ']'
+ '[' -z '' ']'
+ base=clvmd
+ '[' -f /var/run/clvmd.pid ']'
+ '[' -n '' -a -z '' ']'
+ ulimit -S -c 0
+ '[' -n '' ']'
+ '[' color = verbose -a -z '' ']'
+ '[' -z '' ']'
+ initlog -q -c 'clvmd -T20'
clvmd startup timed out
+ '[' 5 -eq 0 ']'
+ failure 'clvmd startup'
+ rc=1
+ '[' -z '' ']'
+ initlog -q -n /etc/init.d/clvmd -s 'clvmd startup' -e 2
+ '[' color '!=' verbose -a -z '' ']'
+ echo_failure
+ '[' color = color ']'
+ echo -en '\033[60G'
+ echo -n '['
[+ '[' color = color ']'
+ echo -en '\033[0;31m'
+ echo -n FAILED
FAILED+ '[' color = color ']'
+ echo -en '\033[0;39m'
+ echo -n ']'
]+ echo -ne '\r'
+ return 1
+ '[' -x /usr/bin/rhgb-client ']'
+ return 1
+ rtrn=1
+ echo
+ '[' 1 -ne 0 ']'
+ break
+ return 1
+ rtrn=1
+ '[' 1 = 0 ']'
+ exit 1

I'll try and reproduce this issue and gather more info... hopefully on more machines than just link-07.

Version-Release number of selected component (if applicable):
2.6.9-43.ELsmp
lvm2-2.02.18-1.el4
lvm2-cluster-2.02.18-1.el4

How reproducible:
quite often on link-07
FWIW, here is how the volumes are laid out:

[root@link-07 ~]# lvs -a -o +devices
  LV      VG         Attr   LSize   Origin Snap%  Move Log Copy%  Devices
  linear1 activator1 -wi-a- 200.00M                               /dev/sdb1(0)
  linear2 activator1 -wi-a- 200.00M                               /dev/sdb2(0)
  linear3 activator1 -wi-a- 200.00M                               /dev/sdf1(50)
  linear4 activator1 -wi-a- 200.00M                               /dev/sdb1(100)
  linear5 activator1 -wi-a- 200.00M                               /dev/sdb2(100)
  stripe1 activator1 -wi-a- 200.00M                               /dev/sdf1(0)
  stripe2 activator1 -wi-a- 200.00M                               /dev/sdb1(50)
  stripe3 activator1 -wi-a- 200.00M                               /dev/sdb2(50)
  stripe4 activator1 -wi-a- 200.00M                               /dev/sdf1(100)
  stripe5 activator1 -wi-a- 200.00M                               /dev/sdb1(150)
  linear1 activator2 -wi-a- 200.00M                               /dev/sdh1(0)
  linear2 activator2 -wi-a- 200.00M                               /dev/sdh1(100)
  linear3 activator2 -wi-a- 200.00M                               /dev/sdh1(200)
  linear4 activator2 -wi-a- 200.00M                               /dev/sdh1(300)
  linear5 activator2 -wi-a- 200.00M                               /dev/sdh1(400)
  stripe1 activator2 -wi-a- 200.00M                               /dev/sdh1(50)
  stripe2 activator2 -wi-a- 200.00M                               /dev/sdh1(150)
  stripe3 activator2 -wi-a- 200.00M                               /dev/sdh1(250)
  stripe4 activator2 -wi-a- 200.00M                               /dev/sdh1(350)
  stripe5 activator2 -wi-a- 200.00M                               /dev/sdh1(450)
  linear1 activator3 -wi-a- 200.00M                               /dev/sdd1(0)
  linear2 activator3 -wi-a- 200.00M                               /dev/sdg1(0)
  linear3 activator3 -wi-a- 200.00M                               /dev/sda2(50)
  linear4 activator3 -wi-a- 200.00M                               /dev/sdd1(100)
  linear5 activator3 -wi-a- 200.00M                               /dev/sdg1(100)
  stripe1 activator3 -wi-a- 200.00M                               /dev/sda2(0)
  stripe2 activator3 -wi-a- 200.00M                               /dev/sdd1(50)
  stripe3 activator3 -wi-a- 200.00M                               /dev/sdg1(50)
  stripe4 activator3 -wi-a- 200.00M                               /dev/sda2(100)
  stripe5 activator3 -wi-a- 200.00M                               /dev/sdd1(150)
  linear1 activator4 -wi-a- 200.00M                               /dev/sdd2(0)
  linear2 activator4 -wi-a- 200.00M                               /dev/sdg2(0)
  linear3 activator4 -wi-a- 200.00M                               /dev/sdc2(50)
  linear4 activator4 -wi-a- 200.00M                               /dev/sdd2(100)
  linear5 activator4 -wi-a- 200.00M                               /dev/sdg2(100)
  stripe1 activator4 -wi-a- 200.00M                               /dev/sdc2(0)
  stripe2 activator4 -wi-a- 200.00M                               /dev/sdd2(50)
  stripe3 activator4 -wi-a- 200.00M                               /dev/sdg2(50)
  stripe4 activator4 -wi-a- 200.00M                               /dev/sdc2(100)
  stripe5 activator4 -wi-a- 200.00M                               /dev/sdd2(150)
The only thing that can significantly delay clvmd startup is lockspace creation; pretty much everything else (including getting the locks themselves) happens after that has completed, and in a background thread. So I suspect that the number of LVs here might be a red herring - it could just be the up/down action of the lockspace itself. A log of dlm_controld might show whether this is the case, or perhaps the DLM debug logging - if it still does that!
After playing with this on my cluster, it might be worth investigating udev. I noticed that one of my machines consistently takes 10 seconds to create a lockspace. It does this because udev doesn't create the DLM device in /dev/misc, and so leaves it for libdlm to create the device itself... which it only does after waiting 10 seconds for udev to do it! If this is the case it might explain why only one of your nodes is affected. Try timing or stracing the lstest program in cluster/dlm/tests/lstest.c and see if it does something similar.
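For illustration, the wait-then-fall-back behaviour described here can be sketched in shell. This is a hand-written illustration, not libdlm's actual code; the function name and messages are invented:

```shell
# Illustration (not libdlm source) of the behaviour described above:
# poll for the udev-created device node, and fall back to creating the
# device ourselves only after the timeout expires.
wait_for_dlm_device() {
    devpath=$1
    timeout=$2          # libdlm reportedly waits ~10 seconds here
    waited=0
    while [ "$waited" -lt "$timeout" ]; do
        if [ -e "$devpath" ]; then
            echo "udev created $devpath after ${waited}s"
            return 0
        fi
        sleep 1
        waited=$((waited + 1))
    done
    echo "gave up after ${timeout}s; creating $devpath ourselves"
    return 1
}

# Example: an existing node is found immediately; a missing one costs
# the full timeout before the mknod-style fallback would kick in.
wait_for_dlm_device /dev/null 2
wait_for_dlm_device /dev/misc/no_such_lockspace 2 || true
```

The point of the sketch is the cost model: a correctly firing udev rule makes lockspace creation near-instant, while a missing rule silently adds the whole timeout to every clvmd start.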
Chatting with nstraz on IRC about an afflicted cluster it seems likely that udev is not the problem, so I guess it's likely to be lockspace creation. An strace of clvmd starting up would confirm/deny this.
Created attachment 149477 [details]
strace output for all clvmd processes started on each node

I used a script to strace all clvmd processes on each node during the init scripts. It doesn't capture the beginning of the process, but everything after about the first second.
Those straces DO finger udev. I can see clvmd (actually in libdlm) looping for eleven seconds, waiting for the DLM lockspace device to be created. At the end of that it gives up and creates the device itself. I appreciate that this adds only 11 seconds to the startup time and that the timeout is 20 seconds, but it is a spurious 11 seconds, and I can't tell where the rest comes from without timestamps on the strace (yes, I should have asked for those, sorry). Since this doesn't happen if clvmd is started by hand when the system is quiescent, I suspect it's the load on udev, and possibly the rest of the processes at boot time, that's aggravating this. There are two options I see: one is to reduce the timeout in libdlm to (say) 4 or 5 seconds, and the other is to increase the timeout for clvmd in the init script - maybe both, to be safe. It might be worth investigating udev, but I suspect we'll get the answer back, "yes, it's busy".
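On the second option: the trace in the description shows CLVMDOPTS defaulting to -T20 and the script testing for /etc/sysconfig/cluster, which suggests a sysconfig override is the intended knob for the init-script timeout. A sketch of that mechanism (the exact behaviour of the shipped script is an assumption), using a temporary file in place of the real /etc/sysconfig/cluster:

```shell
# Sketch of the override mechanism implied by the trace: CLVMDOPTS
# defaults to -T20 (a 20 second startup timeout), then a sysconfig
# file, if present, is sourced and may override it.
CLVMDOPTS="-T20"                        # init script default, per the trace

override=$(mktemp)                      # stand-in for /etc/sysconfig/cluster
echo 'CLVMDOPTS="-T45"' > "$override"   # site override raising the timeout

[ -f "$override" ] && . "$override"     # the init script sources this file
echo "clvmd would be started as: clvmd $CLVMDOPTS"
rm -f "$override"
```

With such an override in place, the spurious 11 seconds of udev waiting would fit comfortably inside the startup timeout.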
I turned on udev logging and found that udev is creating a different device.

Mar 8 08:44:31 morph-04 fenced: startup succeeded
Mar 8 08:44:46 morph-04 udev[2836]: creating device node '/dev/dlm_clvmd'
Mar 8 08:44:46 morph-04 udev[2839]: creating device node '/dev/dlm_clvmd'
Mar 8 08:44:51 morph-04 clvmd: clvmd startup timed out
Mar 8 08:44:51 morph-04 clvmd: clvmd startup failed

strace shows:

2807 stat64("/dev/misc/dlm_clvmd", 0xbff8a02c) = -1 ENOENT (No such file or directory)
2807 rt_sigprocmask(SIG_BLOCK, [CHLD], [USR2], 8) = 0
2807 rt_sigaction(SIGCHLD, NULL, {SIG_DFL}, 8) = 0
2807 rt_sigprocmask(SIG_SETMASK, [USR2], NULL, 8) = 0
2807 nanosleep({1, 0}, {1, 0}) = 0
2807 stat64("/dev/misc/dlm_clvmd", 0xbff8a02c) = -1 ENOENT (No such file or directory)
Well, that would certainly explain why libdlm can't find the device, but not why it doesn't wait for 10 seconds when run afterwards. So who is responsible for the udev rule that creates these, and has it changed recently?
I am hitting this almost every time I run revolver with a uni-processor kernel. This will prevent me from getting through the release criteria for the i386 UP kernel.
So nag someone to fix udev?
Ahh, now it starts to make some sort of sense. For RHEL4 the syntax is KERNEL=; for RHEL5 the syntax is KERNEL==. I'll put the fix in the RHEL4 CVS; can someone set the tags on this to get it into U5?
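For reference, the difference looks like this in a rules file (the rule body shown is illustrative; the actual DLM rule may differ):

```
# RHEL4 udev: a single '=' is the match operator
KERNEL="dlm_*", NAME="misc/%k"

# RHEL5 udev: '==' matches and '=' assigns, so the RHEL4 form
# silently stops matching there
KERNEL=="dlm_*", NAME="misc/%k"
```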
Well, it seems that the DLM udev files are not included at all!
The latest dlm package still does not include the udev rules file.

[root@morph-05 rules.d]# pwd
/etc/udev/rules.d
[root@morph-05 rules.d]# ls
50-udev.rules  51-by-id.rules  90-ib.rules
[root@morph-05 rules.d]# rpm -q dlm
dlm-1.0.2-0
[root@morph-05 rules.d]# rpm -ql dlm
/usr/lib/libdlm.so.1
/usr/lib/libdlm.so.1.0.0
/usr/lib/libdlm_lt.so.1
/usr/lib/libdlm_lt.so.1.0.0
It looks like that file isn't installed by the Makefiles. Patrick can you make the necessary change so the file is installed in the proper location with the correct permissions? I'll get the rpm .spec file to recognize the file.
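The needed Makefile change amounts to an install step along these lines. The rules file name "51-dlm.rules" and the destination layout are assumptions; this sketch runs against a scratch DESTDIR so it can be tried anywhere:

```shell
# Hypothetical install step for the missing udev rules file (names are
# assumptions, not the actual Makefile change), demonstrated against a
# scratch DESTDIR.
DESTDIR=$(mktemp -d)
echo 'KERNEL="dlm_*", NAME="misc/%k"' > 51-dlm.rules    # stand-in rules file
install -d "$DESTDIR/etc/udev/rules.d"
install -m 0644 51-dlm.rules "$DESTDIR/etc/udev/rules.d/51-dlm.rules"
ls "$DESTDIR/etc/udev/rules.d"
rm -f 51-dlm.rules
```

Installing with mode 0644 matters: the rules file only needs to be readable by udev, and the package should own it so `rpm -ql dlm` shows it.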
Checking in lib/Makefile;
/cvs/cluster/cluster/dlm/lib/Makefile,v  <--  Makefile
new revision: 1.7.2.1.14.1; previous revision: 1.7.2.1
done
The addition/change of the placement of dlm-control breaks magma-plugins, requiring an ancillary fix.
Created attachment 153300 [details]
Fix magma-plugins
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2007-0136.html