68818 – Triggered 'kernel bug at journal.c' using "lvreduce"

Summary: Triggered 'kernel bug at journal.c' using "lvreduce"

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	Red Hat Linux
Classification:	Retired
Component:	kernel
Sub Component:
Version:	9
Hardware:	i386
OS:	Linux
Priority:	medium
Severity:	low
Target Milestone:	---
Assignee:	Dave Jones
QA Contact:	Brian Brock
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2002-07-14 19:54 UTC by Peter van Egdom
Modified:	2015-01-04 22:01 UTC
CC List:	2 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2004-09-30 15:39:45 UTC
Embargoed:

Attachments
Fix for over-zealous clearing of buffer metadata (554 bytes, patch) 2002-11-12 15:51 UTC, Stephen Tweedie

Description Peter van Egdom 2002-07-14 19:54:27 UTC

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.1) Gecko/20020625

Description of problem:
I decided to play a bit with the Linux Logical Volume Manager. The new
installer makes it so easy to set up that it invites a little more
experimentation than before (uh-oh).

First of all, here is the output of fdisk on my computer:

  Disk /dev/hda: 255 heads, 63 sectors, 1229 cylinders
  Units = cylinders of 16065 * 512 bytes

     Device Boot    Start       End    Blocks   Id  System
  /dev/hda1   *         1       956   7679038+   7  HPFS/NTFS
  /dev/hda2           957       966     80325   83  Linux
  /dev/hda3           967      1157   1534207+  83  Linux
  /dev/hda4          1158      1229    578340    f  Win95 Ext'd (LBA)
  /dev/hda5          1158      1176    152586   8e  Linux LVM
  /dev/hda6          1177      1192    128488+  82  Linux swap
  /dev/hda7          1193      1229    297171   8e  Linux LVM


 and here is the output of "lvscan":

  lvscan -- ACTIVE            "/dev/Volume00/LogVol00" [152 MB]
  lvscan -- ACTIVE            "/dev/Volume01/LogVol00" [64 MB]
  lvscan -- 2 logical volumes with 216 MB total in 2 volume groups
  lvscan -- 2 active logical volumes

 
 and here is the output of "df -h":

   Filesystem            Size  Used Avail Use% Mounted on
   /dev/hda3             1.4G  1.2G  173M  88% /
   /dev/hda2              76M  9.0M   63M  13% /boot
   /dev/Volume00/LogVol00
                         147M   18M  122M  13% /home
   none                   61M     0   61M   0% /dev/shm
   /dev/Volume01/LogVol00
                          62M  4.1M   54M   7% /tmp


I decided to reduce one of the logical volumes to 100 MB with the following
command:


 "lvreduce -L 100 /dev/Volume00/LogVol00 -v"


This seemed to work out all right. However, I was surprised that the output
of "df -h" showed that the filesystem in "/dev/Volume00/LogVol00" still
had 120 MB available after I had reduced it to 100 MB.

When I decided to shrink this filesystem a bit more for fun with the command
"lvreduce -L 90 /dev/Volume00/LogVol00 -v", the system froze. The filesystem
in question held the "/home" directory, so the X session I was in froze too
(GNOME probably couldn't write its temporary files).


So I went to "/dev/tty1" to see what had happened.

This was the relevant output of "dmesg":

attempt to access beyond end of device
3a:00: rw=1, want=98310, limit=94208
Assertion failure in __journal_remove_journal_head() at journal.c:1783:
"buffer_jbd(bh)"
------------[ cut here ]------------
kernel BUG at journal.c:1783!
invalid operand: 0000
sg scsi_mod i810_audio ac97_codec soundcore agpgart autofs ide-tape ide-cd cdr

CPU:    0
EIP:    0010:[<c8843905>]    Not tainted
EFLAGS: 00013286

EIP is at __journal_remove_journal_head [jbd] 0xe5 (2.4.18-5.58)
eax: 0000005c   ebx: c7d82128   ecx: 00000001   edx: 00003086
esi: c24ba3e4   edi: c584d924   ebp: c68a9e14   esp: c68a9df8
ds: 0018   es: 0018   ss: 0018
Process kjournald (pid: 209, stackpage=c68a9000)
Stack: c8845780 c884434d c8844166 000006f7 c884437b c7d82128 c24ba79c c68a9e24
       c88404d7 c7d82128 c24ba3e4 c68a9e54 c8840add c24ba3e4 c68a8000 c24ba4fc
       00000003 00000001 c1588cfc c1588cfc c68a8000 c14f0ab4 c14f0ab4 c68a9fac
Call Trace: [<c8845780>] .rodata.str1.32 [jbd] 0x13a0
[<c884434d>] .rodata.str1.1 [jbd] 0x6c5
[<c8844166>] .rodata.str1.1 [jbd] 0x4de
[<c884437b>] .rodata.str1.1 [jbd] 0x6f3
[<c88404d7>] __try_to_free_cp_buf [jbd] 0x37
[<c8840add>] __journal_clean_checkpoint_list [jbd] 0x6d
[<c883e74d>] journal_commit_transaction [jbd] 0xfd
[<c01174be>] schedule [kernel] 0x17e
[<c8841bee>] kjournald [jbd] 0x15e
[<c8841a70>] commit_timeout [jbd] 0x0
[<c010765e>] kernel_thread [kernel] 0x2e
[<c8841a90>] kjournald [jbd] 0x0


Code: 0f 0b f7 06 66 41 84 c8 e9 60 ff ff ff 8b 5d f8 8b 75 fc 89
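
The two numbers in the "attempt to access beyond end of device" line can be
compared directly. The following is only a rough sketch: it assumes that this
kernel reports "want" and "limit" in 1 KiB units, and the helper name
kib_to_mib is made up for illustration.

```python
# Values copied from the dmesg output above.
WANT_KIB = 98310    # assumed: end of the requested write, in 1 KiB units
LIMIT_KIB = 94208   # assumed: current size of the device, in 1 KiB units


def kib_to_mib(kib: int) -> float:
    """Convert a count of 1 KiB units to MiB (hypothetical helper)."""
    return kib / 1024


if __name__ == "__main__":
    print(f"device limit : {kib_to_mib(LIMIT_KIB):.1f} MiB")
    print(f"write wanted : {kib_to_mib(WANT_KIB):.1f} MiB")
    print(f"overshoot    : {WANT_KIB - LIMIT_KIB} KiB past the end of the device")
```

Under that assumption the device now ends at 92 MiB, while the filesystem is
still issuing writes beyond that point.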


Given the state of the system, I decided to reboot.


After rebooting, the following happened:


Limbo tried to mount the LVM filesystems, but couldn't and gave the following
error:

 The filesystem size, according to the superblock is 155648 blocks.
 The physical size of the partition is 40960 blocks.


Then it hinted about doing a fsck of the filesystem, which I did, but that
did not help. (fsck kept complaining about inodes).


I probably should not have used fsck to fix this problem, so I decided to do
a "lvextend -L 150 /dev/Volume00/LogVol00" to bring it back to the 150 MB
filesystem it was, and rebooted the system.

This time Limbo was happy again and started booting just as if nothing had happened.


Well, the main question is, did I do something very stupid or did I trigger
a bug? :-)

I'll go to the Sistina site to read some information on Linux LVM.
After that I'll try to reproduce the bug and submit a new comment.

(unless this call is closed with status "NOTA" with some witty comment by a
Red Hat employee that I should do my homework before playing with LVM..) :-)

I will give this bug the initial state of "low" severity and increase it later
if I'm able to reproduce the bug.

Version-Release number of selected component (if applicable):
Source RPM: lvm-1.0.3-6.src.rpm

How reproducible:
Didn't try (yet).


Additional info:

Comment 1 Stephen Tweedie 2002-11-11 21:24:42 UTC

Well, Don't Do That(TM)!

Reducing the size of a physical device does *not* automatically reduce the size
of a filesystem sitting on that device.  If you try this on a mounted device,
you will get really bad things happening!

However, thanks for the report, because you've just supplied me with a beautiful
recipe for reproducing the ext3 oops you mentioned, which I've been chasing for
a while.

The LVM problem is not a bug.  You just can't do that.  The ext3 assert failure
needs to be fixed, and I'll look into that.
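
For readers wondering what a safe ordering might look like: the usual approach
is to shrink the filesystem first, offline, and only then shrink the volume
underneath it. The sketch below is not taken from this report. It only builds
the command list without running anything, the function name shrink_plan is
hypothetical, and it assumes an ext2/ext3 filesystem plus e2fsprogs and LVM
versions that accept these arguments (check the man pages for your release).

```python
from typing import List


def shrink_plan(device: str, mountpoint: str, fs_mb: int, lv_mb: int) -> List[List[str]]:
    """Return the commands, in order, for shrinking an ext2/ext3 volume offline.

    Nothing is executed here; the list only illustrates the ordering.
    """
    if fs_mb > lv_mb:
        raise ValueError("filesystem must end up no larger than the volume")
    return [
        ["umount", mountpoint],                  # do not shrink while mounted
        ["e2fsck", "-f", device],                # check the filesystem before resizing
        ["resize2fs", device, f"{fs_mb}M"],      # shrink the filesystem first
        ["lvreduce", "-L", str(lv_mb), device],  # then shrink the volume under it
        ["mount", device, mountpoint],
    ]


if __name__ == "__main__":
    # Device and mount point are the ones from the report; sizes are examples.
    for cmd in shrink_plan("/dev/Volume00/LogVol00", "/home", fs_mb=100, lv_mb=100):
        print(" ".join(cmd))
```

The size check at the top encodes the point made above: the volume must never
become smaller than the filesystem that lives on it.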

Comment 2 Stephen Tweedie 2002-11-12 15:51:29 UTC

Created attachment 84664 [details]
Fix for over-zealous clearing of buffer metadata

Comment 3 Stephen Tweedie 2002-11-12 15:52:27 UTC

The attached patch fixes it for me.  Will be in future kernels, pending review.

Comment 4 Dave Jones 2004-01-05 19:28:32 UTC

RHL 8 got EOL'd, but I'm keeping this open as it affects RHL9 too, and
will need fixing there. (This patch got dropped on the floor for some
reason).

Comment 5 Bugzilla owner 2004-09-30 15:39:45 UTC

Thanks for the bug report. However, Red Hat no longer maintains this version of
the product. Please upgrade to the latest version and open a new bug if the problem
persists.

The Fedora Legacy project (http://fedoralegacy.org/) maintains some older releases, 
and if you believe this bug is interesting to them, please report the problem in
the bug tracker at: http://bugzilla.fedora.us/
