Description of problem:
LVM snapshots of mounted filesystems don't mount.
-- EXAMPLE --
[root@merlin-1 root]# mount
/dev/vgroot/lvexchange on /vmware/exchange type reiserfs (rw)
[root@merlin-1 root]# lvcreate -n snap -s -L2G /dev/vgroot/lvexchange
lvcreate -- WARNING: the snapshot will be automatically disabled once it gets full
lvcreate -- INFO: using default snapshot chunk size of 64 KB for
lvcreate -- doing automatic backup of "vgroot"
lvcreate -- logical volume "/dev/vgroot/snap" successfully created
[root@merlin-1 root]# mount -t reiserfs /dev/vgroot/snap /mnt/tmp -oro
mount: wrong fs type, bad option, bad superblock on /dev/vgroot/snap,
or too many mounted file systems
-- EXAMPLE --
Now, if the file system is not mounted when the snapshot is taken (or is
mounted, but there has been NO I/O), the snapshot will mount successfully.
I did see somewhere (but sorry, I haven't been able to find it again) something
saying that a snapshot created quicker than the reiserfs journal could be
synced/closed would cause the problem. It also mentioned that it would be rare -
but since I'm running a busy SMP system, I'm wondering if I fall into this category.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. mount file system
2. do some I/O on it
3. create a snapshot
4. snapshot fails to mount.
If you unmount it, create a snapshot, then the snap will mount.
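The steps above could be scripted roughly as follows. This is only a sketch: the device, mount points and snapshot size come from the example earlier in this report, while the `run` helper, the `DRY_RUN` switch and the `io-test` file name are assumptions added for illustration. By default it only prints the commands.

```shell
#!/bin/sh
# Reproduction sketch. DRY_RUN=1 (the default) prints commands instead of
# running them; DRY_RUN is a hypothetical safety switch, not an LVM feature.
DRY_RUN=${DRY_RUN:-1}
ORIGIN=${ORIGIN:-/dev/vgroot/lvexchange}   # origin LV from the example
MNT=${MNT:-/vmware/exchange}               # where the origin is mounted
SNAP_NAME=${SNAP_NAME:-snap}
SNAP_MNT=${SNAP_MNT:-/mnt/tmp}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reproduce() {
    vg_dir=$(dirname "$ORIGIN")
    # 1. mount file system
    run mount -t reiserfs "$ORIGIN" "$MNT"
    # 2. do some I/O on it (file name is an arbitrary choice)
    run dd if=/dev/zero of="$MNT/io-test" bs=1024k count=64
    # 3. create a snapshot
    run lvcreate -n "$SNAP_NAME" -s -L2G "$ORIGIN"
    # 4. try to mount the snapshot read-only; this is the step that fails
    run mount -t reiserfs -o ro "$vg_dir/$SNAP_NAME" "$SNAP_MNT"
}

reproduce
```

Running it with `DRY_RUN=0` as root would execute the commands for real, so only do that on a test volume.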
Expected Results: Snapshot should mount - otherwise why bother hey?
I used Red Hat 8.0 with kernel 2.4.20-18.8smp and had the same problem.
> Now, if the file system is not mounted when the snapshot is taken (or is
> mounted, but there has been NO I/O), the snapshot will mount successfully.
I have not tested this!
It looks like this is a bug in kernel 2.4.20.
Search the web for the keywords: 2.4.20 lvm snapshot
I used Red Hat 7.3 with kernel 2.4.20-20.7smp and had the same problem.
The lvm-VFS-lock patch is OK, but I have to do something like
fuser -km /path/to/app
to make my snapshot...
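For anyone wanting to script that workaround, something like the following might do. It is a sketch only: the function name, the `DRY_RUN` switch and the arguments in the usage comment are illustrative assumptions, not part of the lvm-VFS-lock patch. Be aware that `fuser -km` sends SIGKILL to every process using the filesystem.

```shell
#!/bin/sh
# Kill users of a filesystem, flush, then snapshot. DRY_RUN=1 (default)
# only prints the commands; DRY_RUN is a hypothetical switch for this sketch.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

snapshot_after_kill() {
    app_path=$1    # path on the filesystem the application is using
    origin=$2      # origin logical volume
    snap_name=$3   # name for the new snapshot LV
    snap_size=$4   # snapshot size, e.g. 2G
    # fuser exits non-zero when nothing was using the path; not an error here
    run fuser -km "$app_path" || true
    run sync
    run lvcreate -s -L "$snap_size" -n "$snap_name" "$origin"
}

# Example (hypothetical values):
#   snapshot_after_kill /path/to/app /dev/vgroot/lvexchange snap 2G
```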
Created attachment 95434
Minimal fix for LVM snapshotting
There are two parts to the problem. One is the absence of the
LVM_VFS_ENHANCEMENT #define, which managed to get lost for reasons unknown
during an update of our trees. The other is that even with that fix, you get
benign but annoying messages like
lvm - lvm_map: ll_rw_blk write for readonly LV /dev/spock/snap1
Can't write to read-only device 3a:06
in the logs. Those are harmless, but fixable; but the fix is moderately risky
so I'll send it upstream in the first instance.
The minimal fix, for LVM_VFS_ENHANCEMENT (plus an additional special-case fix
for data=writeback mode) is in the previous attachment; it's also in our errata
trees now. But somehow that fix only got picked up into the 2.4.20-20.8 errata
for 8.0; the RHL 7.x and 9 versions of the kernel managed to miss it.
Anyway, it's fixed internally with the patch above, which you can use in the
meantime. Reassigning to our errata maintainer.
*** Bug 88115 has been marked as a duplicate of this bug. ***
*** Bug 84278 has been marked as a duplicate of this bug. ***
Do you have an ETA for the errata release? While we wait, others might be
interested in my fix for doing dumps: rsync to another partition, then dump the
copy. Of course, this requires double the disk space....
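A rough sketch of that rsync-then-dump workaround, for illustration only. All paths and device names in the usage comment are hypothetical, the `DRY_RUN` switch is an assumption of this sketch, and it assumes the spare partition holds a filesystem that dump can read (ext2/ext3).

```shell
#!/bin/sh
# Copy the live filesystem to a spare partition, then dump the quiet copy.
# DRY_RUN=1 (default) only prints the commands.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then echo "+ $*"; else "$@"; fi
}

backup_via_copy() {
    src=$1        # mount point of the live filesystem
    copy_mnt=$2   # mount point of the spare partition
    copy_dev=$3   # block device behind copy_mnt
    dumpfile=$4   # where to write the level 0 dump
    # -a preserves ownership/permissions/times, -x stays on one filesystem,
    # --delete removes files from the copy that no longer exist in the source
    run rsync -ax --delete "$src/" "$copy_mnt/" || return 1
    run sync
    run dump -0 -f "$dumpfile" "$copy_dev"
}

# Example (hypothetical values):
#   backup_via_copy /home /mnt/homecopy /dev/sdb1 /backup/home.dump
```

Nothing writes to the copy while dump runs, which is the point of the extra step; the cost is the doubled disk space already mentioned.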
Well, a new errata release of the kernel (2.4.20-24.9) for RHL9 is out now.
And it's still broken! Which sort of contradicts the earlier posting about the fix being "in the errata tree" (implying it would be released with the next errata).
Attention RedHat: support and timely release of bugfixes has seriously deteriorated over the past six months. Like this one -- how many months of a patch being "in the errata tree" is it reasonable to have pass before it starts being reflected in the errata which are actually released? The same has been true of the bug reports that I've filed recently.
Judging by the past track record, it now seems likely that this bug will never be fixed, and we will now have to build our own kernel RPMS.
Once the "support" period for RHL9 expires in April, customers will have to choose between paying more for Enterprise Linux or switching to a free version (Fedora or another distribution).
Six months ago, Enterprise seemed an attractive option; RedHat's support was good. However, now that support has deteriorated so much, I think RedHat will have a hard time persuading anybody to pay more money for it. Which is a shame, because it used to be a very well supported and put together distribution, and it's sad to see it so much in decline with serious bugs like this languishing unfixed for so many months.
Anyone know if this has been fixed in Fedora? What about in RHEL?
BTW, I agree that support is severely lacking here. The bug was
reported in February with fixes suggested as early as May. Now even
when fixes are in the errata tree, they just get "missed" time after
time. Considering the inability of "dump" to create a consistent
backup of an active filesystem, we're forced to resort to rsync to
extra disks to get our backups.
We're considering RHEL as a future option. Maybe I can beat RH into
action when I have a support contract and a judge. :P
I just checked the kernel sources on a Fedora box, and the patch made
it in there. I also downloaded the .src.rpm for RHEL3 and the patch
was there. I checked a RH8.0 machine with 2.4.20-20.8 and it does
NOT appear to have the patch, contrary to the comment above.
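One way to repeat that check on an unpacked kernel source tree is to search for the define mentioned earlier in this report. This is a sketch under assumptions: the function name is made up, which file carries the define is not stated here, so it searches whatever directory it is given, and it only tells you the `#define` is present, not that the rest of the patch was applied.

```shell
#!/bin/sh
# Succeeds (exit 0) if any file under the given directory contains a
# "#define LVM_VFS_ENHANCEMENT" line. Relies on grep -r (GNU grep).
has_vfs_enhancement() {
    tree=$1   # path to an unpacked kernel source tree (or a subdirectory)
    grep -rqE '^[[:space:]]*#[[:space:]]*define[[:space:]]+LVM_VFS_ENHANCEMENT' "$tree"
}

# Example (the path is hypothetical):
#   if has_vfs_enhancement /usr/src/linux-2.4; then
#       echo "define present"
#   else
#       echo "define missing"
#   fi
```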
We already have RHEL v.3. Nevertheless some core services currently
run on 9.0. We cannot update those servers. We have to install the
services on a new machine and then switch. But what about the 500 GB of
lvm managed data? And we can't make consistent backups because of
this bug! The bug history -- it's a shame -- RedHat!
The fixes _are_ in the bugfix errata tree, but the last kernel was not
built from that branch --- it was a security errata with minimal
change against the old 2.4.20-20.9 kernel, released at short notice
due to the do_brk exploit.
We expect the proper bugfix errata to be released shortly.
I'm running kernel 2.4.20-24.8 rebuilt with this patch and
LVM_VFS_ENHANCEMENT defined, and I had a crash last night that I think
may have been when a snapshot was being created, so there may still be
a problem here somewhere. See bug 111735 for more information.
There's no sign of an LVM footprint in that oops, and the crash is
accessing a data structure which is (a) often associated with bad
memory, and (b) not touched by anything on LVM. Is it reproducible?
I'd be inclined to suspect something else at this stage, but obviously
if you see it again that will give more info.
*** Bug 111337 has been marked as a duplicate of this bug. ***
There is another bugfix errata currently in QA, that should be out
'real soon'. I apologise for this fix not making it into the recent
update, but that was a quick release in order to fix the recent do_brk
security problem. To 'rush' that kernel through QA, a kernel with
minimal change vs the previous errata kernel was deemed necessary.