Bug 165787 - Not mount local lvm2 partition with lvm2-cluster
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: lvm2
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Assigned To: Alasdair Kergon
Duplicates: 166036
Depends On: 165832
Blocks: 189751
Reported: 2005-08-11 23:17 EDT by Michael Musikhin
Modified: 2018-10-19 16:50 EDT
CC: 8 users

Fixed In Version: RHBA-2007-0287
Doc Type: Bug Fix
Last Closed: 2007-05-07 19:57:59 EDT

Attachments
patch to include log.o when building cluster locking library (289 bytes, patch)
2006-04-24 10:20 EDT, Jeff Layton
patch to add --skipclustered option to vgchange (1.60 KB, patch)
2006-04-24 15:45 EDT, Jeff Layton

External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2007:0287 normal SHIPPED_LIVE lvm2 bug fix and enhancement update 2007-04-28 15:01:34 EDT

Description Michael Musikhin 2005-08-11 23:17:09 EDT
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.10) Gecko/20050720 Fedora/1.0.6-1.1.fc3 Firefox/1.0.6

Description of problem:
After installing lvm2-cluster-2.01.09-5.0.RHEL4,
the server does not mount non-root LVM2 logical volumes.

Version-Release number of selected component (if applicable):
lvm2-2.01.08-1.0.RHEL4 lvm2-cluster-2.01.09-5.0.RHEL4 initscripts-7.93.13.EL-2

How reproducible:

Steps to Reproduce:
Server configuration:
cat /etc/fstab
/dev/Volume00/ROOT      /                       ext3    defaults        1 1
/dev/VolMSA0003/STAT    /data                   ext3    defaults        1 2

After installing lvm2-cluster-2.01.09-5.0.RHEL4,
the server does not mount /data on startup.

Actual Results:  rc.sysinit does not execute the
"Setting up Logical Volume Management:" step,
volume group VolMSA0003 is not initialized,
and the startup procedure stops with
"type Ctrl-D or enter root password"

Expected Results:  Normal startup, with the /data filesystem mounted

Additional info:

The following block in the rc.sysinit script is not executed:
    if [ -c /dev/mapper/control -a -x /sbin/lvm.static ]; then
        if /sbin/lvm.static vgscan --mknodes --ignorelockingfailure > /dev/null 2>&1 ; then
            action $"Setting up Logical Volume Management:" /sbin/lvm.static vgchange -a y --ignorelockingfailure
lvm.static crashes with the error:
liblvm2clusterlock.so: undefined symbol: print_log

We found a workaround for this problem:
we changed lvm.static to lvm in the rc.sysinit script.
Comment 1 Michael Musikhin 2005-08-15 23:29:07 EDT
*** Bug 166036 has been marked as a duplicate of this bug. ***
Comment 2 Alasdair Kergon 2005-08-16 17:42:33 EDT
Is this the same as bug 165255, fixed by the new mkinitrd?
Comment 3 Michael Musikhin 2005-08-16 23:40:16 EDT
Bug 165255 fixed the problem with the root FS, but not with the other LVM
filesystems in fstab. We mount the root FS without problems.
I think the problem is with lvm.static: it can't load the liblvm2clusterlock.so
configured in lvm.conf. When we use lvm (a symlink to lvm.static) instead of
lvm.static in rc.sysinit, there is no problem.
Comment 4 Alasdair Kergon 2005-08-18 10:07:37 EDT
These are local LVs and they should not be using clustered locking.
LVM2 CVS has some recent fixes for this, but they don't completely solve the
problem yet.
Comment 5 Alasdair Kergon 2005-08-18 10:09:18 EDT
We can't change rc.sysinit to use 'lvm' because /usr might not be mounted yet at
that point.
Comment 6 Alasdair Kergon 2005-08-18 10:58:58 EDT
See also bug 165832
Comment 7 Michael Musikhin 2005-08-19 01:04:23 EDT
OK, can vgchange -cn solve my problem?
But what can I do with my server configuration?
/dev/VolMSA0003/STAT - logical volume for local use
/dev/VolMSA0003/ORACLE - logical volume with GFS for Oracle RAC,
used by two nodes
Does vgchange -cn /dev/VolMSA0003 make it impossible to use
/dev/VolMSA0003/ORACLE for GFS in the cluster?

I have already changed rc.sysinit, and it works.
Comment 8 Alasdair Kergon 2005-08-30 08:48:25 EDT
You shouldn't mix single-node and clustered volumes within a single volume group.

This is because lvm2 does its locking at *volume group* level.

You can however activate a logical volume in a clustered volume group only on
the local node using vgchange -aly.  (Or you can set up lvm.conf files with
'tags' so this always happens.)
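The tags approach Alasdair mentions can be sketched roughly as follows. This is a hedged illustration based on lvm.conf(5) conventions, not a drop-in config from this bug; the tag name, host name, and VG names are placeholders:

```
# /etc/lvm/lvm.conf (sketch; names are hypothetical)
tags {
    node1 { host_list = [ "node1" ] }
}
activation {
    # Activate only local VGs plus anything tagged for this host,
    # so clustered volumes are never auto-activated at boot.
    volume_list = [ "VolMSA0001", "@node1" ]
}
```

For a one-off, `vgchange -aly <vgname>` activates a clustered VG's volumes on the local node only, as the comment above describes.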

Comment 14 Michael Musikhin 2005-11-24 03:40:08 EST
I finally managed to reboot the server and test your solution.
I reconfigure my server and server not have mix custered and not clustered
logical volumes on one volume group
my server configuration:
/dev/VolMSA0003/DATA    /data                   gfs     defaults        0 0
/dev/VolMSA0001/ORAREDO /opt/oracle/redo01  ext3    defaults        1 2

VolMSA0003 is clustered;
for VolMSA0001 I executed vgchange -cn /dev/VolMSA0001.
After reboot, rc.sysinit can't mount /opt/oracle/redo01;
volume group /dev/VolMSA0001 was not detected.

If I comment out the /dev/VolMSA0001... line in /etc/fstab, the system starts,
and I can mount /dev/VolMSA0001... manually without problems.
Comment 22 Jeff Layton 2006-04-24 07:56:16 EDT
Spoke with Alasdair over the weekend, and he has some thoughts as well about
approaches, but those are likely too big a change at this point for U4. They may
be able to make U5, but even that might be too dangerous.

What we *can* and should do, however, is fix initscripts to be more selective
about the volumes it starts. They should not simply be doing a 'vgchange -ay',
but should be doing something similar to the alternate approach in my last comment.

The change should be something like:

1) do a vgs
2) pick out any LV's that are not in clustered VG's and that aren't started yet
3) lvchange -ay --ignorelockingfailure those volumes
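A rough sketch of that filtering step: in `vgs` output, the sixth character of the vg_attr column is the clustered flag (`c`). The sample output below is simulated (a real script would capture it from `vgs --noheadings -o vg_name,vg_attr`), and the VG names are taken from this report:

```shell
# Step 2 sketched: select VGs whose vg_attr lacks the clustered 'c' flag.
# Simulated `vgs --noheadings -o vg_name,vg_attr` output:
vgs_output='  VolMSA0001 wz--n-
  VolMSA0003 wz--nc'

# Keep only VGs whose attr string does not end in 'c' (not clustered).
local_vgs=$(printf '%s\n' "$vgs_output" | awk '$2 !~ /c$/ { print $1 }')
echo "$local_vgs"    # prints: VolMSA0001

# Step 3 would then be (not run here):
#   lvm.static vgchange -ay --ignorelockingfailure $local_vgs
```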

I've started work on this, but there does seem to be a problem:

# lvm.static vgs
lvm.static: symbol lookup error: /usr/lib/liblvm2clusterlock.so: undefined
symbol: print_log

So even if we make the initscripts change, lvm.static will still fail with this
undefined symbol error.

We also need to consider another issue -- what if /usr is on a different LVM
volume? I think we also need to consider moving /usr/lib/liblvm2clusterlock.so
to /lib as well.

Moving this BZ to lvm2-cluster since this is more of an issue with the library.
Comment 23 Jeff Layton 2006-04-24 10:20:46 EDT
Created attachment 128148 [details]
patch to include log.o when building cluster locking library

This patch seemed to resolve the problem of the cluster locking library not
being able to resolve print_log. Here we're just going ahead and including
log.o in the library. This patch applied to lvm2-cluster-2.02.01-1.2.RH worked
for me, but I had trouble with later versions segfaulting (both with and
without this patch).
Comment 24 Jeff Layton 2006-04-24 15:45:31 EDT
Created attachment 128163 [details]
patch to add --skipclustered option to vgchange

This patch adds a new option to vgchange (--skipclustered) to make it skip any
clustered VG's when iterating over them. I did some cursory testing and it
seems to work correctly.

With this, we should be able to call the following from the initscripts:

# lvm.static vgchange -ay --ignorelockingfailure --skipclustered

and that should do the right thing here (note that we'll still need the symbol
resolution patch I posted earlier as well unless that problem is already fixed
in a more recent version).

This patch should probably go into yet another cloned BZ for lvm2, but I'd like
to have Alasdair comment before I open another BZ for this problem.
Comment 25 Michael Musikhin 2006-05-05 07:56:17 EDT
[root@blade05 sbin]# lvm.static vgchange --skipclustered -ay
File descriptor 3 left open
File descriptor 5 left open
File descriptor 7 left open
vgchange: unrecognized option `--skipclustered'
  Error during parsing of command line.
[root@blade05 sbin]# rpm -qa "lvm*cluster*"

You can create a simple test lab for testing:
install RHEL4 + GFS (all needed packages),
create a logical volume with ext3 (GFS not used),
optionally run vgchange -cn /dev/VolXXX/LVXXX
(with or without the cluster flag, the result is the same),
insert into /etc/fstab:
/dev/VolXXX/LVXXX   /mnt                    ext3    defaults        1 2
and try rebooting the system
(my system does not boot).
Comment 26 Jeff Layton 2006-05-10 11:36:43 EDT
Yes, that patch has not been reviewed or included in any packages as of yet.
This BZ is on the RHEL4.5 proposed list, so it's not going to happen until then
at the earliest.

I should also mention here that the impetus for this patch was that the
initscripts maintainer did not approve of the ugliness of my initscripts patch
(see BZ 189751) to change how we loop over the LVs and start them. He suggested
adding something along these lines to vgchange to keep vg activation a one-liner.
Comment 29 Alasdair Kergon 2006-09-03 19:45:38 EDT
Made some changes upstream (in reverse order) which I hope will help:

  Don't attempt automatic recovery without proper locking.
  When using local file locking, skip clustered VGs.
  Add fallback_to_clustered_locking and fallback_to_local_locking parameters.
  lvm.static uses built-in cluster locking instead of external locking.
  Don't attempt to load shared libraries if built statically.

So the hope is that, with the new default behaviour, provided the command *only*
references a specific non-clustered VG it will now be able to run successfully.

Then something like skipclustered becomes just a change to the error code,
i.e. don't report an error if the only problem was an inability to lock a
clustered logical volume.
Comment 30 Kiersten (Kerri) Anderson 2006-09-05 14:47:05 EDT
Changing the product and component to rhel and lvm2 since the changes are needed
in the base set of lvm2 packages.
Comment 31 RHEL Product and Program Management 2006-09-05 15:02:06 EDT
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update release.
Comment 33 Alasdair Kergon 2006-12-22 10:38:34 EST
Yes, please test the latest builds to check whether or not the problem has gone away.
Comment 39 Red Hat Bugzilla 2007-05-07 19:58:00 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

