Description Corey Marthaler 2006-10-19 16:25:29 EDT
Description of problem:
I was removing a bunch of snapshots and hit this kernel bug.

----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at mm/mempool.c:121
invalid opcode: 0000 [1] SMP
last sysfs file: /block/ram0/dev
Modules linked in: md5 sctp lock_nolock gfs(U) lock_dlm gfs2 dlm configfs
autofs4 hidp rfcomm l2cap bluetooth sunrpd
Pid: 6156, comm: lvremove Not tainted 2.6.18-1.2726.el5 #1
RIP: 0010:[<ffffffff800c9980>]  [<ffffffff800c9980>] mempool_resize+0x1e/0x17d
RSP: 0018:ffff81020aa17ba8  EFLAGS: 00010282
RAX: 0000000000000400 RBX: 00000000ffffffa8 RCX: 0000000000000001
RDX: 00000000000000d0 RSI: 00000000ffffffa8 RDI: ffff8102163294d0
RBP: ffff81020aa17bd8 R08: ffff81020aa17bd8 R09: 0000000000000001
R10: ffffffff80032467 R11: ffffffff8811cdf8 R12: ffff8102163294d0
R13: ffffc20010217080 R14: 00000000ffffffa8 R15: 00000000000000d0
FS:  00002aaaaaabb7e0(0000) GS:ffff8101fff44398(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000ff5000 CR3: 000000020f2d2000 CR4: 00000000000006e0
Process lvremove (pid: 6156, threadinfo ffff81020aa16000, task ffff81020f1a0100)
Stack:  ffff81020aa17bd8 00000000ffffffa8 ffff81021182a2f0 ffffc20010217080
 ffffc20000026000 ffff81020aa17d38 ffff81020aa17bf8 ffffffff880df462
 ffffffff8811cdc0 ffff81020aa6b4d0 ffff81020aa17c08 ffffffff880df4f9
Call Trace:
 [<ffffffff880df462>] :dm_mod:resize_pool+0x45/0xc4
 [<ffffffff880df4f9>] :dm_mod:dm_io_put+0x18/0x1a
 [<ffffffff880df720>] :dm_mod:kcopyd_client_destroy+0x85/0xc5
 [<ffffffff88110ab8>] :dm_snapshot:snapshot_dtr+0x7e/0xe0
 [<ffffffff880dbdf2>] :dm_mod:dm_table_put+0x62/0xd0
 [<ffffffff880dab8d>] :dm_mod:dm_put+0x9f/0x17d
 [<ffffffff880de761>] :dm_mod:dev_remove+0xa3/0xb7
 [<ffffffff880decbe>] :dm_mod:ctl_ioctl+0x23f/0x28c
 [<ffffffff80043a4e>] do_ioctl+0x5e/0x77
 [<ffffffff80032422>] vfs_ioctl+0x25a/0x277
 [<ffffffff8004e9e2>] sys_ioctl+0x5f/0x82
 [<ffffffff8006079a>] tracesys+0xd1/0xdb
DWARF2 unwinder stuck at tracesys+0xd1/0xdb
Leftover inexact backtrace:

Code: 0f 0b 68 63 9c 2a 80 c2 79 00 4c 89 e7 e8 d5 f2 f9 ff 45 3b
RIP  [<ffffffff800c9980>] mempool_resize+0x1e/0x17d
 RSP <ffff81020aa17ba8>

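A note on reading the trace above, offered as an interpretation rather than something the log states outright: on x86_64 the second integer argument of a call is passed in RSI, and here RSI is 00000000ffffffa8 at the point where mempool_resize hits the BUG. Assuming that argument is a signed 32-bit count, the value is negative. The i386 trace in comment 4 shows the same pattern in edx (ffffff70). The helper name `to_s32` below is hypothetical and only illustrates the arithmetic.

```shell
# Reinterpret a 32-bit register value as a signed int.
# to_s32 is a hypothetical helper name, not part of any tool.
to_s32() {
    v=$(( $1 & 0xffffffff ))
    # Values with the top bit set wrap around to negative numbers.
    if [ "$v" -ge 2147483648 ]; then
        v=$(( v - 4294967296 ))
    fi
    echo "$v"
}

to_s32 0xffffffa8   # low 32 bits of RSI in the x86_64 trace, prints -88
to_s32 0xffffff70   # edx in the i386 trace from comment 4, prints -144
```

If that reading is right, resize_pool asked for a non-positive pool size while the snapshot's kcopyd client was being torn down, which would match a sanity check at the top of mempool_resize.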
[root@taft-04 ~]# dmsetup ls
vg-snap_18      (253, 39)
vg-snap_20      (253, 43)
vg-snap_17      (253, 37)
vg-snap_21-cow  (253, 44)
vg-origin       (253, 2)
vg-snap_27-cow  (253, 56)
vg-snap_18-cow  (253, 38)
vg-snap_16      (253, 35)
vg-origin-real  (253, 3)
vg-snap_22-cow  (253, 46)
vg-snap_19-cow  (253, 40)
vg-snap_23-cow  (253, 48)
vg-snap_27      (253, 57)
vg-snap_26      (253, 55)
vg-snap_24-cow  (253, 50)
vg-snap_25      (253, 53)
vg-snap_15-cow  (253, 32)
vg-snap_24      (253, 51)
VolGroup00-LogVol01     (253, 1)
vg-snap_25-cow  (253, 52)
vg-snap_23      (253, 49)
vg-snap_16-cow  (253, 34)
VolGroup00-LogVol00     (253, 0)
vg-snap_22      (253, 47)
vg-snap_20-cow  (253, 42)
vg-snap_19      (253, 41)
vg-snap_21      (253, 45)
vg-snap_26-cow  (253, 54)
vg-snap_17-cow  (253, 36)

[root@taft-04 ~]# lvscan
  inactive Original '/dev/vg/origin' [12.00 MB] inherit
  ACTIVE            '/dev/vg/snap_16' [4.00 MB] inherit
  ACTIVE            '/dev/vg/snap_17' [4.00 MB] inherit
  ACTIVE            '/dev/vg/snap_18' [4.00 MB] inherit
  ACTIVE            '/dev/vg/snap_19' [4.00 MB] inherit
  ACTIVE            '/dev/vg/snap_20' [4.00 MB] inherit
  ACTIVE            '/dev/vg/snap_21' [4.00 MB] inherit
  ACTIVE            '/dev/vg/snap_22' [4.00 MB] inherit
  ACTIVE            '/dev/vg/snap_23' [4.00 MB] inherit
  ACTIVE            '/dev/vg/snap_24' [4.00 MB] inherit
  ACTIVE            '/dev/vg/snap_25' [4.00 MB] inherit
  ACTIVE            '/dev/vg/snap_26' [4.00 MB] inherit
  ACTIVE            '/dev/vg/snap_27' [4.00 MB] inherit
  inactive Snapshot '/dev/vg/snap_15' [4.00 MB] inherit
[root@taft-04 ~]# lvs -a -o+devices
  LV      VG   Attr   LSize  Origin Snap%  Move Log Copy%  Devices
  origin  vg   owi-a- 12.00M                               /dev/sdb1(0)
  snap_15 vg   swi---  4.00M origin                        /dev/sdc1(8)
  snap_16 vg   -wi-d-  4.00M                               /dev/sdb1(9)
  snap_17 vg   -wi-d-  4.00M                               /dev/sdc1(9)
  snap_18 vg   -wi-d-  4.00M                               /dev/sdb1(10)
  snap_19 vg   -wi-d-  4.00M                               /dev/sdc1(10)
  snap_20 vg   -wi-d-  4.00M                               /dev/sdb1(11)
  snap_21 vg   -wi-d-  4.00M                               /dev/sdc1(11)
  snap_22 vg   -wi-d-  4.00M                               /dev/sdb1(12)
  snap_23 vg   -wi-d-  4.00M                               /dev/sdc1(12)
  snap_24 vg   -wi-d-  4.00M                               /dev/sdb1(13)
  snap_25 vg   -wi-d-  4.00M                               /dev/sdc1(13)
  snap_26 vg   -wi-d-  4.00M                               /dev/sdb1(14)
  snap_27 vg   -wi-d-  4.00M                               /dev/sdc1(14)

Version-Release number of selected component (if applicable):
[root@taft-04 ~]# rpm -q lvm2
[root@taft-04 ~]# rpm -q device-mapper
Comment 2 RHEL Product and Program Management 2007-03-21 19:46:35 EDT
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update release.
Comment 4 Milan Broz 2007-04-12 08:01:22 EDT
Reproducible on RHEL5, kernel 2.6.18-8.1.1.el5, new DM-IO patches solve it.

kernel BUG at mm/mempool.c:121!
invalid opcode: 0000 [#1]
last sysfs file: /block/ram0/dev
Modules linked in: autofs4 hidp nfs lockd fscache nfs_acl rfcomm l2cap bluetooth
sunrpc ipv6 video sbs i2c_ec button battery asus_acpi ac lp sg floppy pcspkr
i2c_piix4 pcnet32 i2c_core mii ide_cd cdrom parport_pc parport serio_raw
dm_snapshot dm_zero dm_mirror dm_mod mptspi mptscsih mptbase scsi_transport_spi
sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd
CPU:    0
EIP:    0060:[<c045272c>]    Not tainted VLI
EFLAGS: 00010282   (2.6.18-8.1.1.el5 #1)
EIP is at mempool_resize+0x14/0x158
eax: cf9f6cc0   ebx: ffffff70   ecx: 000000d0   edx: ffffff70
esi: c7136ec0   edi: d0c31080   ebp: cf9f6cc0   esp: c8dd4d90
ds: 007b   es: 007b   ss: 0068
Process lvremove (pid: 3315, ti=c8dd4000 task=c9e2f550 task.ti=c8dd4000)
Stack: 000000d0 ffffff70 c040492e ffffff70 c7136ec0 d0c31080 00000000 d08faa45
       c7023940 d08fac9b c0432297 00000000 c9e2f550 c0434e65 00000286 c9bd0600
       c7136ec0 d0c31080 d08e4998 c93c7180 d0c31080 00000000 d08f7aca c93c7c80
Call Trace:
 [<c040492e>] common_interrupt+0x1a/0x20
 [<d08faa45>] resize_pool+0x37/0xa5 [dm_mod]
 [<d08fac9b>] kcopyd_client_destroy+0x6a/0x9f [dm_mod]
 [<c0432297>] flush_cpu_workqueue+0x7c/0x87
 [<c0434e65>] autoremove_wake_function+0x0/0x2d
 [<d08e4998>] snapshot_dtr+0x5a/0xa0 [dm_snapshot]
 [<d08f7aca>] dm_table_put+0x4a/0xa7 [dm_mod]
 [<d08f6a80>] dm_put+0x7f/0x130 [dm_mod]
 [<d08f9e41>] dev_remove+0x82/0x90 [dm_mod]
 [<d08fa374>] ctl_ioctl+0x1f3/0x238 [dm_mod]
 [<d08f9dbf>] dev_remove+0x0/0x90 [dm_mod]
 [<c0479cf3>] do_ioctl+0x47/0x5d
 [<c0479f53>] vfs_ioctl+0x24a/0x25c
 [<c0479fad>] sys_ioctl+0x48/0x5f
 [<c0403eff>] syscall_call+0x7/0xb
Code: c4 0c 89 d8 5b 5e 5f 5d c3 6a ff ff 74 24 08 e8 38 ff ff ff 5a 59 c3 55 89
c5 57 56 53 83 ec 0c 85 d2 89 54 24 04 89 0c 24 7f 08 <0f> 0b 79 00 0d ff 61 c0
89 e8 e8 c5 9c 1a 00 89 44 24 08 8b 44
EIP: [<c045272c>] mempool_resize+0x14/0x158 SS:ESP 0068:c8dd4d90
 <0>Kernel panic - not syncing: Fatal exception
Comment 5 Milan Broz 2007-04-12 08:05:06 EDT
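Simple core of test script to reproduce this:

# $DEV, $DEV1, $VG, $LV and $NUM must be set by the caller.

pvcreate -ff $DEV $DEV1
vgcreate $VG $DEV $DEV1
lvcreate -L 100M -n $LV $VG

# Create $NUM snapshots of the origin volume.
i=1   # starting index assumed; not shown in the original excerpt
while [ $i -le $NUM ] ; do
    lvcreate -s -L 4M -n $LV$i /dev/"$VG"/"$LV"
    let i=i+1
done

echo "--ENTER to continue--"; read;

# Remove the snapshots again; lvremove is the process that hits the BUG.
i=1
while [ $i -le $NUM ] ; do
    lvremove -f /dev/"$VG"/"$LV"$i
    let i=i+1
done

# Clean up the origin, volume group and physical volumes.
lvremove -f /dev/"$VG"/"$LV"
vgchange -a n $VG
vgremove $VG
pvremove $DEV $DEV1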
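
The script core above leaves its variables undefined. A possible preamble, as a sketch only: the default values (`vg_test`, `origin`, 12) and the helper name `check_devs` are assumptions made for illustration and do not come from the original test script. The snapshot count of 12 merely matches the number of active snapshots in the lvscan output of the description.

```shell
# Hypothetical preamble for the reproducer; all defaults are assumptions.
VG=${VG:-vg_test}     # volume group name
LV=${LV:-origin}      # origin volume name, also the snapshot name prefix
NUM=${NUM:-12}        # number of snapshots to create and remove
DEV=${DEV:-}          # first scratch block device, no safe default
DEV1=${DEV1:-}        # second scratch block device, no safe default

# Return 1 if any argument is not a block device.
check_devs() {
    for d in "$@"; do
        if [ ! -b "$d" ]; then
            echo "not a block device: $d" >&2
            return 1
        fi
    done
    return 0
}
```

Calling `check_devs "$DEV" "$DEV1" || exit 1` before pvcreate keeps the destructive `pvcreate -ff` from running against a mistyped or empty device path.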
Comment 7 Don Zickus 2007-05-09 14:17:18 EDT
in 2.6.18-18.el5
Comment 8 Corey Marthaler 2007-06-06 14:58:34 EDT
Fix verified in 2.6.18-18.el5.
Comment 9 Don Zickus 2007-07-23 15:56:01 EDT
moving to MODIFIED to satisfy errata tool
Comment 11 Mike Gahagan 2007-08-08 16:02:21 EDT
Created attachment 160932
reproducer script

I reproduced the bug and verified the fix with the attached script. It is a
modified version of the original testcase which uses loopback files and does
not require extra disks.
Comment 12 Milan Broz 2007-09-07 09:43:36 EDT
*** Bug 258561 has been marked as a duplicate of this bug. ***
Comment 14 errata-xmlrpc 2007-11-07 14:14:00 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.


