Bug 211525

Summary: kernel dm: mempool_resize BUG() during multiple snapshot removals
Product: Red Hat Enterprise Linux 5
Component: kernel
Version: 5.0
Hardware: All
OS: Linux
Status: CLOSED ERRATA
Severity: high
Priority: medium
Reporter: Corey Marthaler <cmarthal>
Assignee: Milan Broz <mbroz>
CC: agk, chris, dwysocha, pvrabec
Fixed In Version: RHBA-2007-0959
Doc Type: Bug Fix
Last Closed: 2007-11-07 19:14:00 UTC
Attachments: reproducer script (flags: none)

Description Corey Marthaler 2006-10-19 20:25:29 UTC
Description of problem:
I was removing a bunch of snapshots and hit this kernel bug.

----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at mm/mempool.c:121
invalid opcode: 0000 [1] SMP
last sysfs file: /block/ram0/dev
CPU 2
Modules linked in: md5 sctp lock_nolock gfs(U) lock_dlm gfs2 dlm configfs
autofs4 hidp rfcomm l2cap bluetooth sunrpd
Pid: 6156, comm: lvremove Not tainted 2.6.18-1.2726.el5 #1
RIP: 0010:[<ffffffff800c9980>]  [<ffffffff800c9980>] mempool_resize+0x1e/0x17d
RSP: 0018:ffff81020aa17ba8  EFLAGS: 00010282
RAX: 0000000000000400 RBX: 00000000ffffffa8 RCX: 0000000000000001
RDX: 00000000000000d0 RSI: 00000000ffffffa8 RDI: ffff8102163294d0
RBP: ffff81020aa17bd8 R08: ffff81020aa17bd8 R09: 0000000000000001
R10: ffffffff80032467 R11: ffffffff8811cdf8 R12: ffff8102163294d0
R13: ffffc20010217080 R14: 00000000ffffffa8 R15: 00000000000000d0
FS:  00002aaaaaabb7e0(0000) GS:ffff8101fff44398(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000ff5000 CR3: 000000020f2d2000 CR4: 00000000000006e0
Process lvremove (pid: 6156, threadinfo ffff81020aa16000, task ffff81020f1a0100)
Stack:  ffff81020aa17bd8 00000000ffffffa8 ffff81021182a2f0 ffffc20010217080
 ffffc20000026000 ffff81020aa17d38 ffff81020aa17bf8 ffffffff880df462
 ffffffff8811cdc0 ffff81020aa6b4d0 ffff81020aa17c08 ffffffff880df4f9
Call Trace:
 [<ffffffff880df462>] :dm_mod:resize_pool+0x45/0xc4
 [<ffffffff880df4f9>] :dm_mod:dm_io_put+0x18/0x1a
 [<ffffffff880df720>] :dm_mod:kcopyd_client_destroy+0x85/0xc5
 [<ffffffff88110ab8>] :dm_snapshot:snapshot_dtr+0x7e/0xe0
 [<ffffffff880dbdf2>] :dm_mod:dm_table_put+0x62/0xd0
 [<ffffffff880dab8d>] :dm_mod:dm_put+0x9f/0x17d
 [<ffffffff880de761>] :dm_mod:dev_remove+0xa3/0xb7
 [<ffffffff880decbe>] :dm_mod:ctl_ioctl+0x23f/0x28c
 [<ffffffff80043a4e>] do_ioctl+0x5e/0x77
 [<ffffffff80032422>] vfs_ioctl+0x25a/0x277
 [<ffffffff8004e9e2>] sys_ioctl+0x5f/0x82
 [<ffffffff8006079a>] tracesys+0xd1/0xdb
DWARF2 unwinder stuck at tracesys+0xd1/0xdb
Leftover inexact backtrace:


Code: 0f 0b 68 63 9c 2a 80 c2 79 00 4c 89 e7 e8 d5 f2 f9 ff 45 3b
RIP  [<ffffffff800c9980>] mempool_resize+0x1e/0x17d
 RSP <ffff81020aa17ba8>


[root@taft-04 ~]# dmsetup ls
vg-snap_18      (253, 39)
vg-snap_20      (253, 43)
vg-snap_17      (253, 37)
vg-snap_21-cow  (253, 44)
vg-origin       (253, 2)
vg-snap_27-cow  (253, 56)
vg-snap_18-cow  (253, 38)
vg-snap_16      (253, 35)
vg-origin-real  (253, 3)
vg-snap_22-cow  (253, 46)
vg-snap_19-cow  (253, 40)
vg-snap_23-cow  (253, 48)
vg-snap_27      (253, 57)
vg-snap_26      (253, 55)
vg-snap_24-cow  (253, 50)
vg-snap_25      (253, 53)
vg-snap_15-cow  (253, 32)
vg-snap_24      (253, 51)
VolGroup00-LogVol01     (253, 1)
vg-snap_25-cow  (253, 52)
vg-snap_23      (253, 49)
vg-snap_16-cow  (253, 34)
VolGroup00-LogVol00     (253, 0)
vg-snap_22      (253, 47)
vg-snap_20-cow  (253, 42)
vg-snap_19      (253, 41)
vg-snap_21      (253, 45)
vg-snap_26-cow  (253, 54)
vg-snap_17-cow  (253, 36)

[root@taft-04 ~]# lvscan
  inactive Original '/dev/vg/origin' [12.00 MB] inherit
  ACTIVE            '/dev/vg/snap_16' [4.00 MB] inherit
  ACTIVE            '/dev/vg/snap_17' [4.00 MB] inherit
  ACTIVE            '/dev/vg/snap_18' [4.00 MB] inherit
  ACTIVE            '/dev/vg/snap_19' [4.00 MB] inherit
  ACTIVE            '/dev/vg/snap_20' [4.00 MB] inherit
  ACTIVE            '/dev/vg/snap_21' [4.00 MB] inherit
  ACTIVE            '/dev/vg/snap_22' [4.00 MB] inherit
  ACTIVE            '/dev/vg/snap_23' [4.00 MB] inherit
  ACTIVE            '/dev/vg/snap_24' [4.00 MB] inherit
  ACTIVE            '/dev/vg/snap_25' [4.00 MB] inherit
  ACTIVE            '/dev/vg/snap_26' [4.00 MB] inherit
  ACTIVE            '/dev/vg/snap_27' [4.00 MB] inherit
  inactive Snapshot '/dev/vg/snap_15' [4.00 MB] inherit
[root@taft-04 ~]# lvs -a -o+devices
  LV      VG   Attr   LSize  Origin Snap%  Move Log Copy%  Devices
  origin  vg   owi-a- 12.00M                               /dev/sdb1(0)
  snap_15 vg   swi---  4.00M origin                        /dev/sdc1(8)
  snap_16 vg   -wi-d-  4.00M                               /dev/sdb1(9)
  snap_17 vg   -wi-d-  4.00M                               /dev/sdc1(9)
  snap_18 vg   -wi-d-  4.00M                               /dev/sdb1(10)
  snap_19 vg   -wi-d-  4.00M                               /dev/sdc1(10)
  snap_20 vg   -wi-d-  4.00M                               /dev/sdb1(11)
  snap_21 vg   -wi-d-  4.00M                               /dev/sdc1(11)
  snap_22 vg   -wi-d-  4.00M                               /dev/sdb1(12)
  snap_23 vg   -wi-d-  4.00M                               /dev/sdc1(12)
  snap_24 vg   -wi-d-  4.00M                               /dev/sdb1(13)
  snap_25 vg   -wi-d-  4.00M                               /dev/sdc1(13)
  snap_26 vg   -wi-d-  4.00M                               /dev/sdb1(14)
  snap_27 vg   -wi-d-  4.00M                               /dev/sdc1(14)

Version-Release number of selected component (if applicable):
[root@taft-04 ~]# rpm -q lvm2
lvm2-2.02.12-2.el5
[root@taft-04 ~]# rpm -q device-mapper
device-mapper-1.02.12-2.el5

Comment 2 RHEL Program Management 2007-03-21 23:46:35 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 4 Milan Broz 2007-04-12 12:01:22 UTC
Reproducible on RHEL5 with kernel 2.6.18-8.1.1.el5; the new DM-IO patches solve it.

kernel BUG at mm/mempool.c:121!
invalid opcode: 0000 [#1]
SMP
last sysfs file: /block/ram0/dev
Modules linked in: autofs4 hidp nfs lockd fscache nfs_acl rfcomm l2cap bluetooth
sunrpc ipv6 video sbs i2c_ec button battery asus_acpi ac lp sg floppy pcspkr
i2c_piix4 pcnet32 i2c_core mii ide_cd cdrom parport_pc parport serio_raw
dm_snapshot dm_zero dm_mirror dm_mod mptspi mptscsih mptbase scsi_transport_spi
sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd
CPU:    0
EIP:    0060:[<c045272c>]    Not tainted VLI
EFLAGS: 00010282   (2.6.18-8.1.1.el5 #1)
EIP is at mempool_resize+0x14/0x158
eax: cf9f6cc0   ebx: ffffff70   ecx: 000000d0   edx: ffffff70
esi: c7136ec0   edi: d0c31080   ebp: cf9f6cc0   esp: c8dd4d90
ds: 007b   es: 007b   ss: 0068
Process lvremove (pid: 3315, ti=c8dd4000 task=c9e2f550 task.ti=c8dd4000)
Stack: 000000d0 ffffff70 c040492e ffffff70 c7136ec0 d0c31080 00000000 d08faa45
       c7023940 d08fac9b c0432297 00000000 c9e2f550 c0434e65 00000286 c9bd0600
       c7136ec0 d0c31080 d08e4998 c93c7180 d0c31080 00000000 d08f7aca c93c7c80
Call Trace:
 [<c040492e>] common_interrupt+0x1a/0x20
 [<d08faa45>] resize_pool+0x37/0xa5 [dm_mod]
 [<d08fac9b>] kcopyd_client_destroy+0x6a/0x9f [dm_mod]
 [<c0432297>] flush_cpu_workqueue+0x7c/0x87
 [<c0434e65>] autoremove_wake_function+0x0/0x2d
 [<d08e4998>] snapshot_dtr+0x5a/0xa0 [dm_snapshot]
 [<d08f7aca>] dm_table_put+0x4a/0xa7 [dm_mod]
 [<d08f6a80>] dm_put+0x7f/0x130 [dm_mod]
 [<d08f9e41>] dev_remove+0x82/0x90 [dm_mod]
 [<d08fa374>] ctl_ioctl+0x1f3/0x238 [dm_mod]
 [<d08f9dbf>] dev_remove+0x0/0x90 [dm_mod]
 [<c0479cf3>] do_ioctl+0x47/0x5d
 [<c0479f53>] vfs_ioctl+0x24a/0x25c
 [<c0479fad>] sys_ioctl+0x48/0x5f
 [<c0403eff>] syscall_call+0x7/0xb
 =======================
Code: c4 0c 89 d8 5b 5e 5f 5d c3 6a ff ff 74 24 08 e8 38 ff ff ff 5a 59 c3 55 89
c5 57 56 53 83 ec 0c 85 d2 89 54 24 04 89 0c 24 7f 08 <0f> 0b 79 00 0d ff 61 c0
89 e8 e8 c5 9c 1a 00 89 44 24 08 8b 44
EIP: [<c045272c>] mempool_resize+0x14/0x158 SS:ESP 0068:c8dd4d90
 <0>Kernel panic - not syncing: Fatal exception
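
A note on reading the two oopses, offered as a sketch rather than a confirmed diagnosis. It assumes that the second argument of mempool_resize (the new minimum pool size) arrives in RSI on x86_64 and in edx on i386, and that the check at mm/mempool.c:121 in these kernels rejects a non-positive size. Under those assumptions, reinterpreting the register values from the traces as signed 32-bit integers shows the size that resize_pool passed down. The helper name to_s32 is made up for this example.

```shell
#!/bin/bash
# Reinterpret a register value as a signed 32-bit C int.
# to_s32 is a hypothetical helper name, not part of any tool.
to_s32() {
    # Keep only the low 32 bits of the value.
    local v=$(( $1 & 0xffffffff ))
    # Values with the top bit set are negative in two's complement.
    if (( v >= 0x80000000 )); then
        v=$(( v - 0x100000000 ))
    fi
    echo "$v"
}

to_s32 0xffffffa8   # RSI in the x86_64 trace: prints -88
to_s32 0xffffff70   # edx in the i386 trace:   prints -144
```

If the assumptions hold, both traces show a negative pool size being requested while the kcopyd client is destroyed, which is consistent with the BUG firing in mempool_resize.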

Comment 5 Milan Broz 2007-04-12 12:05:06 UTC
Simple core of test script to reproduce this:

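# Physical volumes, volume group, origin LV name, and number of snapshots
DEV=/dev/sdb1
DEV1=/dev/sdc1
VG=vg_test
LV=lv_test
NUM=50

# Create the volume group and the origin volume
pvcreate -ff "$DEV" "$DEV1"
vgcreate "$VG" "$DEV" "$DEV1"
lvcreate -L 100M -n "$LV" "$VG"

# Create NUM snapshots of the origin
i=1
while [ "$i" -le "$NUM" ] ; do
    lvcreate -s -L 4M -n "$LV$i" "/dev/$VG/$LV"
    i=$((i + 1))
done

echo "--ENTER to continue--"; read -r _

# Remove the snapshots one by one
i=1
while [ "$i" -le "$NUM" ] ; do
    lvremove -f "/dev/$VG/$LV$i"
    i=$((i + 1))
done

# Clean up the origin, volume group, and physical volumes
lvremove -f "/dev/$VG/$LV"
vgchange -a n "$VG"
vgremove "$VG"
pvremove "$DEV" "$DEV1"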


Comment 7 Don Zickus 2007-05-09 18:17:18 UTC
in 2.6.18-18.el5

Comment 8 Corey Marthaler 2007-06-06 18:58:34 UTC
Fix verified in 2.6.18-18.el5.

Comment 9 Don Zickus 2007-07-23 19:56:01 UTC
moving to MODIFIED to satisfy errata tool

Comment 11 Mike Gahagan 2007-08-08 20:02:21 UTC
Created attachment 160932
reproducer script

I reproduced the bug and verified the fix with the attached script. It is a
modified version of the original testcase which uses loopback files and does
not require extra disks.
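
The attachment is not reproduced in this report. As a rough sketch of how loopback files can stand in for the two disks of the original testcase, helpers along the following lines could be used. Everything here is an assumption for illustration: the function names, file paths, and sizes are hypothetical and not taken from attachment 160932, and the commands assume GNU dd and a losetup that supports -f.

```shell
#!/bin/bash
# Hypothetical helpers for backing LVM physical volumes with loopback files.
# Names, paths, and sizes are illustrative assumptions only.

# Create a sparse backing file of size_mb megabytes (no root needed).
make_backing_file() {
    local path=$1 size_mb=$2
    # count=0 with seek=N extends the file to N blocks without writing data.
    dd if=/dev/zero of="$path" bs=1M count=0 seek="$size_mb" 2>/dev/null
}

# Attach a file to the first free loop device and print the device (needs root).
# Note: another process could claim the device between the two losetup calls.
attach_loop() {
    local path=$1 dev
    dev=$(losetup -f) || return 1
    losetup "$dev" "$path" || return 1
    echo "$dev"
}

# Detach a loop device and delete its backing file (needs root).
detach_loop() {
    local dev=$1 path=$2
    losetup -d "$dev" && rm -f "$path"
}

# Example use as root, replacing the fixed /dev/sdb1 and /dev/sdc1:
#   make_backing_file /tmp/pv0.img 256 && DEV=$(attach_loop /tmp/pv0.img)
#   make_backing_file /tmp/pv1.img 256 && DEV1=$(attach_loop /tmp/pv1.img)
#   ... run the snapshot create/remove loops from comment 5 ...
#   detach_loop "$DEV" /tmp/pv0.img; detach_loop "$DEV1" /tmp/pv1.img
```

The block only defines functions, so sourcing it changes nothing on the system until the helpers are called.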

Comment 12 Milan Broz 2007-09-07 13:43:36 UTC
*** Bug 258561 has been marked as a duplicate of this bug. ***

Comment 14 errata-xmlrpc 2007-11-07 19:14:00 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2007-0959.html