Bug 153215

Summary: non root LVM not working
Product: [Fedora] Fedora Reporter: Ronny Buchmann <ronny-rhbugzilla>
Component: initscripts Assignee: Bill Nottingham <notting>
Status: CLOSED RAWHIDE QA Contact: Brock Organ <borgan>
Severity: high Docs Contact:
Priority: medium    
Version: rawhide CC: agk, oliva, rvokal
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2005-09-30 20:55:50 UTC Type: ---
Bug Depends On:    
Bug Blocks: 136451, 153237, 153238    

Description Ronny Buchmann 2005-04-03 08:12:17 UTC
Description of problem:
LVM device nodes are not created,
so the filesystems can be neither mounted nor fscked.

In the / on LVM case, device nodes are created by the initrd, so things are fine there.

Version-Release number of selected component (if applicable):
initscripts >= 7.95 (rc.sysinit rev 1.459)

old behaviour (fc3):
 - fsck /
 - umount initrd
 - quotacheck /
 - remount / rw
 - lvm init
 - raid init
 - lvm init
 - fsck remaining
 - mount remaining
 - quotacheck remaining

current behaviour:
 - raid init (can't work, b/c / is ro)
 - lvm init (can't work b/c / is ro)
 - fsck all
 - umount initrd
 - quotacheck all (can't work I assume)
 - remount / rw
 - mount all

What was wrong with the old way? The current thing seems to be completely broken.
I don't see how fsck "/" and fsck "remaining" could be merged.

RAID on top of LVM may be really rare, so doing only one LVM init could be fine.

Comment 1 Ronny Buchmann 2005-04-03 10:44:06 UTC
I was a bit misguided here. Peter Jones noted that /dev is on tmpfs.

So I investigated a bit further and found that vgscan is trying to create
/etc/lvm/archive and /etc/lvm/backup, which of course fails.

When I create both directories it's working fine.

Please include them in the rpm.
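The manual workaround described in this comment can be sketched roughly as below. This is only an illustration: the function name and its optional argument are hypothetical (not part of lvm2 or initscripts), and the argument exists only so the function can be tried against a scratch directory before touching /etc/lvm.

```shell
#!/bin/sh
# Create the two directories that vgscan tries (and fails) to create when
# / is still read-only. The base directory defaults to /etc/lvm; passing
# another path is only meant for trying this out safely.
ensure_lvm_dirs() {
    base="${1:-/etc/lvm}"
    for d in archive backup; do
        # -p: no error if the directory already exists
        mkdir -p "$base/$d" || return 1
    done
}
```

Calling `ensure_lvm_dirs` with no argument (as root, with / writable) creates /etc/lvm/archive and /etc/lvm/backup; calling it again is harmless.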

Comment 2 Alasdair Kergon 2005-04-03 16:10:09 UTC
Firstly, anaconda needs to create backups of the metadata there after it's
finished doing the lvm2 configuration, so this situation shouldn't arise.

Secondly, if lvm2 can't write to /etc, what's the point in running 'vgscan'?
[The only reason for running it is to update the '/etc/lvm/.cache' file.]
If you're seeing that error, it suggests something is wrong with the
command ordering.

Thirdly, if lvm2 can't write to /etc, but you're running a *read-only* lvm2
command, it shouldn't fail.  It does - and -An has no effect.  The code that
creates the backup and archive directories needs moving and refining.

Finally, there's already a bugzilla asking for 'ghost' to be added to the rpm for the
/etc/lvm directories.  (But I don't think that has any impact here.)


Maybe this bug needs cloning into 3: anaconda, mkinitrd and lvm2?


Comment 3 Alasdair Kergon 2005-04-03 16:12:22 UTC
Or s/mkinitrd/initscripts/ probably

Comment 4 Alasdair Kergon 2005-04-03 16:18:25 UTC
If 'old behaviour' has really been reduced to 'current behaviour' in the
initscripts, you almost certainly need to change much of it back unless you've
decided to reduce the number of supported configurations.


Comment 5 Ronny Buchmann 2005-04-03 21:32:26 UTC
to comment#4:
yes it is that way

to comment#2:
I added clones for anaconda (#153237) and initscripts (#153238)


Comment 6 Alasdair Kergon 2005-04-04 16:37:39 UTC
swapped them over: cloning lost the data and 'initscripts' needs the history.

[Assuming /etc isn't on its own partition]
rc.sysinit needs:

  remount / rw

  vgscan --mknodes --ignorelockingfailure
  vgchange -ay --ignorelockingfailure

  raid init if present

  if any raid volumes were found:
    vgscan --mknodes --ignorelockingfailure
    vgchange -ay --ignorelockingfailure

  fsck & mount all non-/

...
clustered lvm initscript:
  initialise locking
  vgchange -ay

The issue here is that everyone with raid suffers from 'vgscan' running
twice in rc.sysinit.  People have either lvm-over-raid (preferred) or
raid-over-lvm but rarely both: we could do with a way to detect or tell
the initscripts which is present.
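The rc.sysinit ordering listed in this comment can be sketched as shell roughly as follows. This is not the actual rc.sysinit: the function names, the DRY_RUN switch and the raid_found argument are all hypothetical, and how RAID arrays are detected is left open, as it is in the comment. Only the vgscan/vgchange invocations and their order are taken from the list above.

```shell
#!/bin/sh
# With DRY_RUN=1 (hypothetical switch for this sketch) commands are printed
# instead of executed, so the ordering can be inspected safely.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# One LVM pass: rescan, recreate device nodes, activate all VGs.
lvm_init() {
    run vgscan --mknodes --ignorelockingfailure
    run vgchange -ay --ignorelockingfailure
}

# $1 is "yes" if RAID init brought up any arrays (detection not shown).
storage_init() {
    raid_found="$1"
    run mount -n -o remount,rw /
    lvm_init
    # raid init would run here
    if [ "$raid_found" = yes ]; then
        lvm_init    # second pass for VGs that sit on top of RAID
    fi
    # fsck and mount of the non-/ filesystems would follow here
}
```

Running `DRY_RUN=1; storage_init yes` prints the remount followed by two vgscan/vgchange pairs, which is exactly the double scan the comment complains about for RAID users.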


Comment 7 Bill Nottingham 2005-04-05 18:22:03 UTC
Wait, why is / needed read-write for further LVM init?

Comment 8 Alasdair Kergon 2005-04-05 18:29:14 UTC
So it can update /etc/lvm/.cache which the other commands load in.


Comment 9 Bill Nottingham 2005-04-05 18:32:14 UTC
... and without which, those commands still work, *right*?

Comment 10 Alasdair Kergon 2005-04-05 18:38:12 UTC
Also, as of lvm2 2.01.09, the tools are supposed to be able to work out for
themselves when they need to do a vgscan so you should be able to do:

  remount / rw

  vgmknodes --ignorelockingfailure
  vgchange -ay --ignorelockingfailure

  raid init

  if any raid found
     vgchange -ay --ignorelockingfailure

  fsck & mount all non-/

...

[Not fully tested yet]

Comment 11 Alasdair Kergon 2005-04-05 18:40:18 UTC
Comment #9 - for older versions of lvm2 (before 2.01.09), nope, the tools
wouldn't always work because they'd rely on the out-of-date contents of .cache,
messing things up if e.g. any scsi devices have come up in a different order or
if disks have been swapped around.

Comment 12 Alasdair Kergon 2005-04-05 18:44:41 UTC
Comment #9 - from 2.01.09 onwards, it would just slow down booting: in the
circumstances where a disk changed *every* lvm2 command would trigger a full
vgscan because it can't cache the results between one command and the next.

And /dev must be writable before you run vgmknodes followed by vgchange -ay,
otherwise some volumes may not come up if /dev was left with stray content (e.g.
after a crash rather than a clean shutdown).


Comment 13 Alasdair Kergon 2005-04-05 18:49:30 UTC
Re-thinking comment #10:  it will only work if the VGs are listed after vgchange
i.e. vgchange -ay vg1 vg2

So I'm wrong, both vgscans *are* still necessary.

The internal vgscan is only triggered when there's a request for a named VG that
is not found.  In the case of vgchange -ay without VG args, that doesn't arise.
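The difference between the two forms of vgchange discussed here can be illustrated with a small helper that only prints the command line. The helper name and the VG names vg1/vg2 are hypothetical; the behaviour described in the comments is as stated in this thread, not something the sketch verifies.

```shell
#!/bin/sh
# Print the activation command for the given VG names.
# No names: plain "vgchange -ay", which per this thread never triggers
#           lvm2's internal rescan.
# Names:    a named VG that is not found triggers the internal rescan.
activation_cmd() {
    if [ "$#" -eq 0 ]; then
        echo "vgchange -ay --ignorelockingfailure"
    else
        echo "vgchange -ay --ignorelockingfailure $*"
    fi
}
```

For example, `activation_cmd vg1 vg2` prints the named form, which rc.sysinit could only use if it knew the VG names in advance.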


Comment 14 Alasdair Kergon 2005-04-05 18:52:49 UTC
So that makes the first part of comment #12 wrong too.  You cannot avoid first
doing a vgscan with writable /etc/lvm/.cache.  I'd have to add a --force-scan
argument to vgchange -ay to get around that.


Comment 15 Bill Nottingham 2005-04-06 03:33:45 UTC
Alternatively, the cache could just be removed on halt. >:)

Realistically, having to lvm/fsck/mount twice is a pain in the ass, and
contributes to duplicated code, longer boot time, and other annoyances. So, if
this can work with r/o root (but writable /dev), it's greatly preferred.
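The "remove the cache on halt" idea could look something like the sketch below. The function name and its optional argument are hypothetical; the default path is the one named earlier in this thread. It would have to run while / is still writable, i.e. before the halt script remounts root read-only.

```shell
#!/bin/sh
# Remove the lvm2 device cache so the next boot cannot rely on stale
# contents. The argument is only there so the function can be tried on a
# scratch file.
drop_lvm_cache() {
    cache="${1:-/etc/lvm/.cache}"
    # -f: succeed even if no cache file was ever written
    rm -f "$cache"
}
```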

Comment 16 Bill Nottingham 2005-04-27 15:56:39 UTC
OK, so, I just attempted to reproduce this and failed. What exactly is the
failure scenario here?

Comment 17 Alasdair Kergon 2005-04-27 16:34:24 UTC
For original scenario, something like:

Install / on a raw partition
Create logical volumes.
Combine some of them with software raid.
Place other filesystems like /usr on top of the software raid which is on top of
LVM.

For the .cache problems, e.g. use SCSI disks that can come up in a different
order each reboot - or swap the drives around.  e.g. If sdb and sdc are the LVM
PVs in a VG and sdd and sde are not,  reboot swapping sdb with sdd [cache
partially invalid],  or sdb+sdc with sdd+sde [cache completely invalid], or try
same with sdd/sde as PVs that aren't in any VGs.


Comment 18 Bill Nottingham 2005-04-27 16:52:48 UTC
RAID-on-LVM has explicitly *never* been supported.

Comment 19 Ronny Buchmann 2005-04-27 17:51:15 UTC
#16:
in anaconda:
create / on a raw partition
create PV on a raw partition
create VG
create /usr as LV (or any of /tmp, /var, /home ...)
reboot

*this* scenario could be fixed in anaconda (by creating the cache in /etc/lvm),
but it's not the whole problem, #17 (2nd half) describes the cache problem very
well.

#18:
I personally think RAID-on-LVM should *never* be supported.

#14:
Alasdair, wouldn't it make sense to have a non-caching "vgchange -ay --force-scan"
(needing only a writable /dev)?

Comment 20 Alexandre Oliva 2005-04-27 18:02:00 UTC
FWIW, the problem I had was that a separate volume group wasn't made active,
presumably because upon the first (and now only) vgscan&vgchange -ay in
rc.sysinit, the root device was still read only.  As a result, the attempt to
mount logical volumes from this separate volume group failed.

Comment 21 Bill Nottingham 2005-04-27 18:15:53 UTC
Ronny: will try that. In testing non-root LVM created outside of anaconda, it
works fine.

Alexandre: I'm not sure the read-only-ness is a problem; it works for other
activations for me.

Comment 22 Ronny Buchmann 2005-04-27 18:24:11 UTC
#21:
Yes, the problem is triggered by a nonexistent cache (the anaconda case) or a wrong
one (changing devices).
If you create the LVM in a running system, the cache is correct until devices
change (as far as I understand it).

I'm not sure if the missing backup and archive (anaconda case) is a problem too
(if so, I think this should be fixed in anaconda).

Comment 23 Bill Nottingham 2005-04-27 20:00:06 UTC
OK, so, your case is simply solved if /etc/lvm/{archive,backup} are present.

The easiest thing to do is just package those directories in the lvm2 package.
(and might as well package /var/lock/lvm as well.)

Assigning there.

Comment 24 Alasdair Kergon 2005-04-27 20:03:48 UTC
lvm2-2_01_08-2_1 added those directories etc. to the package


Comment 25 Alasdair Kergon 2005-04-27 20:10:57 UTC
but I think there's still a cache problem.

If you don't care about md over lvm working any more - it used to work, but I
always advised people against doing it - just remove that step from the list (as
per comment #19).  That only dodges one of the minor issues, viz. the loss of:
 - lvm init
 - raid init
in the original report.


Comment 26 Bill Nottingham 2005-04-27 20:24:25 UTC
The init works fine in the general case once those dirs are added. It's only in
the case of an out-of-date cache that things will go wrong.

As cache files aren't written by default without manual intervention, I'm moving
this from the blocker list, as the big issue (doesn't boot at all) is solved
with the new LVM2 package.

Comment 27 Bill Nottingham 2005-04-27 20:31:42 UTC
Idea of the day - why not change the default lvm cache location to:

  cache = "/dev/.lvm_cache"

That should be writable most all the time that LVM would need it; if you're
calling LVM before udev is running, you've already got problems.
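Whether a given cache location is usable at a particular point in boot comes down to whether its directory is writable at that moment. A minimal sketch of such a check, with a hypothetical function name (nothing like it exists in lvm2 or initscripts as far as this thread shows):

```shell
#!/bin/sh
# Return 0 if the directory that would hold the given cache file exists and
# is writable by the caller, non-zero otherwise.
cache_location_writable() {
    dir=$(dirname "$1")
    [ -d "$dir" ] && [ -w "$dir" ]
}
```

With / still read-only, `cache_location_writable /etc/lvm/.cache` would be expected to fail, while `cache_location_writable /dev/.lvm_cache` should succeed once /dev is a writable tmpfs.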



Comment 28 Jeremy Katz 2005-04-28 01:19:41 UTC
What about SELinux implications of writing to /dev?

Comment 29 Bill Nottingham 2005-04-28 02:14:52 UTC
They can be fixed. :)

Comment 30 Alexandre Oliva 2005-05-22 05:11:54 UTC
I'm not entirely sure what it was that fixed the problem for me, but VGs other
than the one containing the root LV are now properly brought up in the first
boot, after an Everything install of 20050521's rawhide.

Comment 31 Bill Nottingham 2005-09-30 20:55:50 UTC
Closing, then.