Bug 165787 - Not mount local lvm2 partition with lvm2-cluster
Summary: Not mount local lvm2 partition with lvm2-cluster
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: lvm2
Version: 4.0
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Alasdair Kergon
QA Contact:
URL:
Whiteboard:
Duplicates: 166036
Depends On: 165832
Blocks: 189751
Reported: 2005-08-12 03:17 UTC by Michael Musikhin
Modified: 2018-10-19 20:50 UTC (History)

Fixed In Version: RHBA-2007-0287
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2007-05-07 23:57:59 UTC
Target Upstream Version:
Embargoed:


Attachments
patch to include log.o when building cluster locking library (289 bytes, patch)
2006-04-24 14:20 UTC, Jeff Layton
patch to add --skipclustered option to vgchange (1.60 KB, patch)
2006-04-24 19:45 UTC, Jeff Layton


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2007:0287 0 normal SHIPPED_LIVE lvm2 bug fix and enhancement update 2007-04-28 19:01:34 UTC

Description Michael Musikhin 2005-08-12 03:17:09 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.10) Gecko/20050720 Fedora/1.0.6-1.1.fc3 Firefox/1.0.6

Description of problem:
After installing lvm2-cluster-2.01.09-5.0.RHEL4, the server does not mount
LVM2 logical volumes other than the root filesystem.

Version-Release number of selected component (if applicable):
lvm2-2.01.08-1.0.RHEL4 lvm2-cluster-2.01.09-5.0.RHEL4 initscripts-7.93.13.EL-2

How reproducible:
Always

Steps to Reproduce:
Server configuration:
cat /etc/fstab
...
/dev/Volume00/ROOT      /                       ext3    defaults        1 1
/dev/VolMSA0003/STAT    /data                   ext3    defaults        1 2
...

After installing lvm2-cluster-2.01.09-5.0.RHEL4,
the server does not mount /data at startup.

Actual Results:  The rc.sysinit script does not run the
"Setting up Logical Volume Management:" step,
and volume group VolMSA0003 is not initialized.
The startup procedure stops with
"type Ctrl-D or enter root password"

Expected Results:  Normal startup with the /data filesystem mounted.

Additional info:

This block in the rc.sysinit script is not executed:
    if [ -c /dev/mapper/control -a -x /sbin/lvm.static ]; then
        if /sbin/lvm.static vgscan --mknodes --ignorelockingfailure > /dev/null 2>&1 ; then
            action $"Setting up Logical Volume Management:" /sbin/lvm.static vgchange -a y --ignorelockingfailure
        fi
    fi
lvm.static crashes with the error
liblvm2clusterlock.so: undefined symbol: print_log

We found a workaround for this problem:
we changed lvm.static to lvm in the rc.sysinit script.

Comment 1 Michael Musikhin 2005-08-16 03:29:07 UTC
*** Bug 166036 has been marked as a duplicate of this bug. ***

Comment 2 Alasdair Kergon 2005-08-16 21:42:33 UTC
Is this the same as bug 165255, fixed by the new mkinitrd?

Comment 3 Michael Musikhin 2005-08-17 03:40:16 UTC
The fix for bug 165255 solved the problem with the root filesystem, but not with
the other LVM filesystems in fstab.
We can mount the root filesystem without problems.
I think the problem is with lvm.static: it can't load the liblvm2clusterlock.so
named in lvm.conf. When rc.sysinit uses lvm (a symlink to lvm.static) instead of
lvm.static, there is no problem.


Comment 4 Alasdair Kergon 2005-08-18 14:07:37 UTC
These are local LVs and they should not be using clustered locking.
LVM2 CVS has some recent fixes for this, but they don't completely solve the
problem yet.

Comment 5 Alasdair Kergon 2005-08-18 14:09:18 UTC
We can't change rc.sysinit to use 'lvm' because /usr might not be mounted yet at
that point.

Comment 6 Alasdair Kergon 2005-08-18 14:58:58 UTC
See also bug 165832

Comment 7 Michael Musikhin 2005-08-19 05:04:23 UTC
OK, so vgchange -cn can solve my problem?
But what can I do with my server configuration?
/dev/VolMSA0003/STAT - logical volume for local usage
/dev/VolMSA0003/ORACLE - logical volume with GFS for Oracle RAC,
used by two nodes
Would vgchange -cn /dev/VolMSA0003 make it impossible to use
/dev/VolMSA0003/ORACLE for GFS in the cluster?

I have already changed rc.sysinit and this works.

Comment 8 Alasdair Kergon 2005-08-30 12:48:25 UTC
You shouldn't mix single-node and clustered volumes within a single volume group.

This is because lvm2 does its locking at *volume group* level.

You can however activate a logical volume in a clustered volume group only on
the local node using vgchange -aly.  (Or you can set up lvm.conf files with
'tags' so this always happens.)



Comment 14 Michael Musikhin 2005-11-24 08:40:08 UTC
I finally managed to reboot the server and test your solution.
I reconfigured my server so that it no longer mixes clustered and non-clustered
logical volumes in one volume group.
My server configuration:
....
/dev/VolMSA0003/DATA    /data                   gfs     defaults        0 0
...
/dev/VolMSA0001/ORAREDO /opt/oracle/redo01  ext3    defaults        1 2
...

VolMSA0003 is clustered.
For VolMSA0001 I executed vgchange -cn /dev/VolMSA0001.
After reboot, rc.sysinit can't mount /opt/oracle/redo01;
volume group /dev/VolMSA0001 was not detected.

If I comment out /dev/VolMSA0001... in /etc/fstab, the system starts and
I can mount /dev/VolMSA0001... manually without problems.


Comment 22 Jeff Layton 2006-04-24 11:56:16 UTC
Spoke with Alasdair over the weekend, and he has some thoughts as well about
approaches, but those are likely too big a change at this point for U4. They may
be able to make U5, but even that might be too dangerous.

What we *can* and should do, however, is fix initscripts to be more selective
about the volumes it starts. They should not simply be doing a 'vgchange -ay',
but should be doing something similar to the alternate approach in my last comment.

The change should be something like:

1) do a vgs
2) pick out any LV's that are not in clustered VG's and that aren't started yet
3) lvchange -ay --ignorelockingfailure those volumes
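
The three steps above might look roughly like the following. This is a minimal sketch, not the initscripts patch from BZ 189751. The function names are hypothetical, it uses lvs rather than vgs so that one command reports both VG and LV attributes, and it assumes that the sixth vg_attr character is 'c' for a clustered VG and the fifth lv_attr character is 'a' for an active LV. It can only help once lvm.static itself runs without the symbol lookup error.

```shell
#!/bin/sh
# Read "vg lv vg_attr lv_attr" lines on stdin and print VG/LV for every LV
# that is in a non-clustered VG and is not yet active.
select_inactive_local_lvs() {
    while read -r vg lv vg_attr lv_attr; do
        # Skip blank or incomplete lines.
        [ -n "$lv_attr" ] || continue
        # Assumption: 6th vg_attr character is 'c' for a clustered VG.
        case "$vg_attr" in
            ?????c*) continue ;;
        esac
        # Assumption: 5th lv_attr character is 'a' for an active LV.
        case "$lv_attr" in
            ????a*) continue ;;
        esac
        printf '%s/%s\n' "$vg" "$lv"
    done
}

# Hypothetical wrapper for rc.sysinit: list the LVs, filter them, and
# activate each remaining one individually.
activate_local_lvs() {
    /sbin/lvm.static lvs --noheadings -o vg_name,lv_name,vg_attr,lv_attr 2>/dev/null |
        select_inactive_local_lvs |
        while read -r path; do
            /sbin/lvm.static lvchange -ay --ignorelockingfailure "$path"
        done
}
```

Keeping the filter separate from the lvm calls means it can be checked against sample lvs output without touching any volumes.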

I've started work on this, but there does seem to be a problem:

# lvm.static vgs
lvm.static: symbol lookup error: /usr/lib/liblvm2clusterlock.so: undefined
symbol: print_log

So even if we make the initscripts change, lvm.static will still fail with this
undefined symbol error.

We also need to consider another issue -- what if /usr is on a different LVM
volume? I think we also need to consider moving /usr/lib/liblvm2clusterlock.so
to /lib as well.

Moving this BZ to lvm2-cluster since this is more of an issue with the library.

Comment 23 Jeff Layton 2006-04-24 14:20:46 UTC
Created attachment 128148
patch to include log.o when building cluster locking library

This patch seemed to resolve the problem of the cluster locking library not
being able to resolve print_log. Here we're just going ahead and including
log.o in the library. This patch applied to lvm2-cluster-2.02.01-1.2.RH worked
for me, but I had trouble with later versions segfaulting (both with and
without this patch).

Comment 24 Jeff Layton 2006-04-24 19:45:31 UTC
Created attachment 128163
patch to add --skipclustered option to vgchange

This patch adds a new option to vgchange (--skipclustered) to make it skip any
clustered VG's when iterating over them. I did some cursory testing and it
seems to work correctly.

With this, we should be able to call the following from the initscripts:

# lvm.static vgchange -ay --ignorelockingfailure --skipclustered

and that should do the right thing here (note that we'll still need the symbol
resolution patch I posted earlier as well unless that problem is already fixed
in a more recent version).

This patch should probably go into yet another cloned BZ for lvm2, but I'd like
to have Alasdair comment before I open another BZ for this problem.

Comment 25 Michael Musikhin 2006-05-05 11:56:17 UTC
[root@blade05 sbin]# lvm.static vgchange --skipclustered -ay
File descriptor 3 left open
File descriptor 5 left open
File descriptor 7 left open
vgchange: unrecognized option `--skipclustered'
  Error during parsing of command line.
[root@blade05 sbin]# rpm -qa "lvm*cluster*"
lvm2-cluster-2.02.01-1.2.RHEL4

You can create a simple test lab for testing:
install RHEL4 + GFS (all needed packages),
create a logical volume with ext3 (GFS not used),
optionally run vgchange -cn /dev/VolXXX/LVXXX
(with or without the cluster flag, the result is the same),
insert into /etc/fstab
/dev/VolXXX/LVXXX   /mnt                    ext3    defaults        1 2
Now you can try to reboot the system
(my system does not boot).

Comment 26 Jeff Layton 2006-05-10 15:36:43 UTC
Yes, that patch has not been reviewed or included in any packages as of yet.
This BZ is on the RHEL4.5 proposed list, so it's not going to happen until then
at the earliest.

I should also mention here that the impetus for this patch was that the
initscripts maintainer did not approve of the ugliness of my initscripts patch
(see BZ 189751) to change how we loop over the LVs and start them. He suggested
adding something along these lines to vgchange to keep vg activation a one-liner.


Comment 29 Alasdair Kergon 2006-09-03 23:45:38 UTC
Made some changes upstream (in reverse order) which I hope will help:

  Don't attempt automatic recovery without proper locking.
  When using local file locking, skip clustered VGs.
  Add fallback_to_clustered_locking and fallback_to_local_locking parameters.
  lvm.static uses built-in cluster locking instead of external locking.
  Don't attempt to load shared libraries if built statically.

So the hope is that, with the new default behaviour, provided the command *only*
references a specific non-clustered VG it will now be able to run successfully.


Then something like skipclustered actually becomes just a change to the error
code.  i.e. don't report an error if the only problem was an inability to lock a
clustered logical volume
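
To see which locking behaviour a given node is configured for, it can help to list the locking-related settings in lvm.conf. The following is a minimal sketch with a hypothetical helper name. The default path /etc/lvm/lvm.conf and the setting names locking_type and locking_library are assumptions not stated in this report; the two fallback parameters are the ones named above.

```shell
#!/bin/sh
# Print the uncommented locking-related settings from an lvm.conf-style file.
# The default path is an assumption; pass another file as the first argument.
show_locking_settings() {
    conf=${1:-/etc/lvm/lvm.conf}
    # Match only active (non-commented) assignments of the listed settings,
    # then strip leading indentation and any trailing comment.
    grep -E '^[[:space:]]*(locking_type|locking_library|fallback_to_clustered_locking|fallback_to_local_locking)[[:space:]]*=' "$conf" |
        sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*#.*$//'
}
```

Running show_locking_settings with no argument reads the assumed default file. Settings that are absent from the output are simply not set in that file, so the built-in defaults of the installed lvm2 version apply.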

Comment 30 Kiersten (Kerri) Anderson 2006-09-05 18:47:05 UTC
Changing the product and component to rhel and lvm2 since the changes are needed
in the base set of lvm2 packages.

Comment 31 RHEL Program Management 2006-09-05 19:02:06 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 33 Alasdair Kergon 2006-12-22 15:38:34 UTC
Yes, please test the latest builds to check whether or not the problem has gone
away.

Comment 39 Red Hat Bugzilla 2007-05-07 23:58:00 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2007-0287.html

