Bug 97843 - LVM snapshots don't mount
Summary: LVM snapshots don't mount
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Linux
Classification: Retired
Component: lvm
Version: 7.3
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Dave Jones
QA Contact: Brian Brock
URL:
Whiteboard:
Duplicates: 84278, 88115, 111337
Depends On:
Blocks:
 
Reported: 2003-06-23 06:44 UTC by Deon George
Modified: 2015-01-04 22:02 UTC
CC List: 11 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2004-01-05 03:43:40 UTC
Embargoed:


Attachments
Minimal fix for LVM snapshotting (894 bytes, patch)
2003-10-23 17:23 UTC, Stephen Tweedie


Links
Red Hat Product Errata RHBA-2003:394 (normal, SHIPPED_LIVE): Updated 2.4 kernel fixes various bugs, last updated 2003-12-23 05:00:00 UTC

Description Deon George 2003-06-23 06:44:20 UTC
Description of problem:
LVM snapshots of mounted filesystems don't mount.

-- EXAMPLE --
[root@merlin-1 root]# mount
...
/dev/vgroot/lvexchange on /vmware/exchange type reiserfs (rw)

[root@merlin-1 root]# lvcreate -n snap -s -L2G /dev/vgroot/lvexchange
lvcreate -- WARNING: the snapshot will be automatically disabled once it gets full
lvcreate -- INFO: using default snapshot chunk size of 64 KB for "/dev/vgroot/snap"
lvcreate -- doing automatic backup of "vgroot"
lvcreate -- logical volume "/dev/vgroot/snap" successfully created

[root@merlin-1 root]# mount -t reiserfs /dev/vgroot/snap /mnt/tmp -oro
mount: wrong fs type, bad option, bad superblock on /dev/vgroot/snap,
       or too many mounted file systems

-- EXAMPLE --

Now, if the file system is not mounted when the snapshot is taken (or is
mounted but has had no I/O), the snapshot will mount successfully.

I did see somewhere (sorry, I haven't been able to find it again) something about
the snapshot being created faster than the reiserfs journal could be synced/closed,
which would cause this problem. It also mentioned that this would be rare, but since
I'm running a busy SMP system I'm wondering if I fall into that category.

Version-Release number of selected component (if applicable):
lvm-1.0.3-4

How reproducible:
Always

Steps to Reproduce:
1. mount file system
2. do some I/O on it
3. create a snapshot
4. snapshot fails to mount.

If you unmount the file system first and then create the snapshot, the snapshot will mount.
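
A rough sketch of that working sequence, reusing the device and mount point names from the example above (adjust the volume group, size and paths for your setup):

umount /vmware/exchange                                    # origin filesystem must be unmounted/clean
lvcreate -n snap -s -L2G /dev/vgroot/lvexchange            # take the snapshot of the idle origin
mount -t reiserfs /dev/vgroot/lvexchange /vmware/exchange  # put the origin back into service
mount -t reiserfs -o ro /dev/vgroot/snap /mnt/tmp          # this mount now succeeds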

Expected Results:  The snapshot should mount - otherwise why bother, hey?

Comment 1 Alain Spineux 2003-08-22 09:37:24 UTC
I used Red Hat 8.0 kernel 2.4.20-18.8smp and had the same problem.

> Now, if the file system is not mounted when the snapshot is taken (or is
> mounted, but there has been NO I/O, it will mount successfully).

I have not tested this!

It looks like this is a bug in kernel 2.4.20.

Search the web for these keywords: 2.4.20 lvm snapshot



Comment 2 Eric Bollengier 2003-10-22 14:56:09 UTC
I used Red Hat 7.3 kernel 2.4.20-20.7smp and had the same problem.
 
The lvm-VFS-lock patch is OK, but I have to do something like

fuser -km /path/to/app
umount /path/to/app

to make my snapshot...

Comment 3 Stephen Tweedie 2003-10-23 17:23:11 UTC
Created attachment 95434 [details]
Minimal fix for LVM snapshotting

Comment 4 Stephen Tweedie 2003-10-23 17:27:33 UTC
There are two parts to the problem.  One is the absence of the
LVM_VFS_ENHANCEMENT #define, which managed to get lost for reasons unknown
during an update of our trees.  The other is that even with that fix, you get
benign but annoying messages like

lvm - lvm_map: ll_rw_blk write for readonly LV /dev/spock/snap1
Can't write to read-only device 3a:06

in the logs.  Those are harmless but fixable; however, the fix is moderately risky,
so I'll send it upstream in the first instance.

The minimal fix, for LVM_VFS_ENHANCEMENT (plus an additional special-case fix
for data=writeback mode) is in the previous attachment; it's also in our errata
trees now.  But somehow that fix only got picked up into the 2.4.20-20.8 errata
for 8.0; the RHL 7.x and 9 versions of the kernel managed to miss it.

Anyway, it's fixed internally with the patch above, which you can use in the
meantime.  Reassigning to our errata maintainer.
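
If you want to check whether a particular kernel tree was built with the enhancement, a quick check (only a sketch; the source path is wherever you unpacked the tree) is to grep for the define:

cd /usr/src/linux-2.4
grep -rn LVM_VFS_ENHANCEMENT drivers/md include/linux

If nothing in the LVM driver is guarded by that define, the kernel will not quiesce the origin filesystem while the snapshot is set up, which is what produces the unmountable reiserfs snapshots described above.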

Comment 5 Stephen Tweedie 2003-10-23 17:32:03 UTC
*** Bug 88115 has been marked as a duplicate of this bug. ***

Comment 6 Stephen Tweedie 2003-10-23 17:33:37 UTC
*** Bug 84278 has been marked as a duplicate of this bug. ***

Comment 7 Damian Menscher 2003-10-23 18:12:49 UTC
Do you have an ETA for the errata release?  While we wait, others might be 
interested in my workaround for doing dumps: rsync to another partition, then dump the 
copy.  Of course, this requires double the disk space....
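
The sequence looks roughly like this (device names and paths are made up, and the spare partition holding the copy should be ext2/ext3 since dump only handles those):

mount /dev/sdb1 /backup                        # spare partition, needs as much space as the original
rsync -a --delete /vmware/exchange/ /backup/   # copy the live data to the spare partition
dump -0 -f /dev/nst0 /backup                   # dump the quiesced copy instead of the live filesystem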

Comment 8 Philip Spencer 2003-12-02 05:11:44 UTC
Well, a new errata release of the kernel (2.4.20-24.9) for RHL9 is out now.

And it's still broken! That sort of contradicts the earlier posting about the fix being "in the errata tree" (implying it would be released with the next errata).

Attention Red Hat: support and timely release of bugfixes have seriously deteriorated over the past six months. Take this one -- how many months of a patch sitting "in the errata tree" is it reasonable to let pass before it is reflected in the errata that are actually released? The same has been true of the bug reports that I've filed recently.

Judging by the past track record, it now seems likely that this bug will never be fixed, and we will have to build our own kernel RPMs.

Once the "support" period for RHL9 expires in April, customers will have to choose between paying more for Enterprise Linux or switching to a free version (Fedora or another distribution).

Six months ago, Enterprise seemed an attractive option; RedHat's support was good. However, now that support has deteriorated so much, I think RedHat will have a hard time persuading anybody to pay more money for it. Which is a shame, because it used to be a very well supported and put together distribution, and it's sad to see it so much in decline with serious bugs like this languishing unfixed for so many months.


Comment 9 Damian Menscher 2003-12-02 05:34:11 UTC
Anyone know if this has been fixed in Fedora?  What about in RHEL?

BTW, I agree that support is severely lacking here.  The bug was 
reported in February with fixes suggested as early as May.  Now even 
when fixes are in the errata tree, they just get "missed" time after 
time.  Considering the inability of "dump" to create a consistent 
backup of an active filesystem, we're forced to resort to rsyncing to 
extra disks to get our backups.

We're considering RHEL as a future option.  Maybe I can beat RH into 
action when I have a support contract and a judge.  :P

Comment 10 Damian Menscher 2003-12-02 06:05:11 UTC
I just checked the kernel sources on a Fedora box, and the patch made 
it in there.  I also downloaded the .src.rpm for RHEL3 and the patch 
was there.  I checked a RH8.0 machine with 2.4.20-20.8 and it does 
NOT appear to have the patch, contrary to the comment above.
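
In case anyone wants to repeat it, this is roughly the kind of check I mean (the .src.rpm path is a placeholder for whichever kernel package you download):

mkdir /tmp/ksrc && cd /tmp/ksrc
rpm2cpio /path/to/kernel-2.4.20-24.9.src.rpm | cpio -idm
grep -l LVM_VFS_ENHANCEMENT *.patch *.spec

If no patch or spec file mentions LVM_VFS_ENHANCEMENT, the fix is most likely not in that kernel.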

Comment 11 Michael Paesold 2003-12-02 11:22:38 UTC
We already have RHEL v.3. Nevertheless, some core services currently 
run on 9.0. We cannot update those servers; we have to install the 
services on a new machine and then switch. But what about the 500 GB 
of LVM-managed data? We can't make consistent backups because of 
this bug! The bug's history -- it's a shame, Red Hat!

Comment 12 Stephen Tweedie 2003-12-04 19:10:23 UTC
The fixes _are_ in the bugfix errata tree, but the last kernel was not
built from that branch --- it was a security errata with minimal
change against the old 2.4.20-20.9 kernel, released at short notice
due to the do_brk exploit.

We expect the proper bugfix errata to be released shortly.

Comment 13 Chris Adams 2003-12-09 15:28:58 UTC
I'm running kernel 2.4.20-24.8 rebuilt with this patch and
LVM_VFS_ENHANCEMENT defined, and I had a crash last night that I think
may have been when a snapshot was being created, so there may still be
a problem here somewhere.  See bug 111735 for more information.

Comment 14 Stephen Tweedie 2003-12-09 19:56:24 UTC
There's no sign of an LVM footprint in that oops, and the crash is
accessing a data structure which is (a) often associated with bad
memory, and (b) not touched by anything in LVM.  Is it reproducible?
I'd be inclined to suspect something else at this stage, but obviously
if you see it again that will give more info.

Comment 15 Dave Jones 2003-12-14 00:10:20 UTC
*** Bug 111337 has been marked as a duplicate of this bug. ***

Comment 16 Dave Jones 2003-12-14 00:12:13 UTC
There is another bugfix errata currently in QA that should be out
'real soon'. I apologise for this fix not making it into the recent
update, but that was a quick release in order to fix the recent do_brk
security problem. To 'rush' that kernel through QA, a kernel with
minimal change versus the previous errata kernel was deemed necessary.


