Bug 204791

Summary: full snapshot removal attempt causes a panic
Product: Red Hat Enterprise Linux 4
Reporter: Corey Marthaler <cmarthal>
Component: kernel
Assignee: Milan Broz <mbroz>
Status: CLOSED ERRATA
Severity: medium
Priority: medium
Version: 4.4
CC: agk, bmr, christophe.varoqui, dwysocha, egoggin, lmb, martial.paupe, mbroz, pvrabec, tranlan
Hardware: All
OS: Linux
Fixed In Version: RHBA-2007-0304
Doc Type: Bug Fix
Last Closed: 2007-05-08 03:29:48 UTC

Description Corey Marthaler 2006-08-31 16:18:55 UTC
Description of problem:
[root@taft-04 lvm]# lvs -a -o +devices
  LV     VG      Attr   LSize   Origin Snap%  Move Log Copy%  Devices
  origin snapper owi-ao 100.00M                               /dev/sdb1(0)
  snap1  snapper Swi-I-  20.00M origin 100.00                 /dev/sdb2(0)
[root@taft-04 lvm]# lvscan
  inactive Original '/dev/snapper/origin' [100.00 MB] inherit
  inactive Snapshot '/dev/snapper/snap1' [20.00 MB] inherit
[root@taft-04 lvm]# lvremove /dev/snapper/snap1
Do you really want to remove active logical volume "snap1"? [y/n]: y
                                                                               
              [PANIC]


Unable to handle kernel NULL pointer dereference at 0000000000000000 RIP:
<ffffffffa00f80d0>{:dm_snapshot:exit_exception_table+49}
PML4 1d1037067 PGD 1ddcb1067 PMD 0
Oops: 0000 [1] SMP
CPU 3
Modules linked in: lock_nolock(U) gfs(U) lock_harness(U) qla2300 qla2xxx
scsi_transport_fc cman(U)d
Pid: 29693, comm: lvremove Not tainted 2.6.9-42.0.1.ELlargesmp
RIP: 0010:[<ffffffffa00f80d0>]
<ffffffffa00f80d0>{:dm_snapshot:exit_exception_table+49}
RSP: 0018:00000101cf741cf8  EFLAGS: 00010207
RAX: 0000000000000000 RBX: ffffff0000038000 RCX: 0000010000012000
RDX: 0000000000000206 RSI: 0000000000000000 RDI: 0000000000000203
RBP: 0000010214b52af0 R08: 00000101cf740000 R09: 0000000000000040
R10: 0000000000000000 R11: 0000000000000005 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000100
FS:  0000002a95576540(0000) GS:ffffffff804f7f80(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 00000001fff14000 CR4: 00000000000006e0
Process lvremove (pid: 29693, threadinfo 00000101cf740000, task 00000101ec5dd7f0)
Stack: 0000010217b25880 0000010214b52a80 ffffff0000036080 0000000000000000
       00000000c138fd04 00000000006b9070 ffffffffa009f8c9 ffffffffa00f86d5
       0000000000000000 ffffff0000036080
Call Trace:<ffffffffa009f8c9>{:dm_mod:dev_remove+0}
<ffffffffa00f86d5>{:dm_snapshot:snapshot_dtr+1
       <ffffffffa009d90d>{:dm_mod:table_destroy+72}
<ffffffffa009d020>{:dm_mod:dm_put+149}
       <ffffffffa009f97d>{:dm_mod:dev_remove+180}
<ffffffffa00a0c34>{:dm_mod:ctl_ioctl+600}
       <ffffffff8018a3dd>{sys_ioctl+853} <ffffffff8011026a>{system_call+126}


Code: 4c 8b 26 74 0e 48 8b 3c 24 e8 35 9b 06 e0 4c 89 e6 eb ea 41
RIP <ffffffffa00f80d0>{:dm_snapshot:exit_exception_table+49} RSP <00000101cf741cf8>
CR2: 0000000000000000
 <0>Kernel panic - not syncing: Oops


Version-Release number of selected component (if applicable):
2.6.9-42.0.1.ELlargesmp
lvm2-2.02.06-6.0.RHEL4
device-mapper-1.02.07-4.0.RHEL4
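
Note on reading the lvs output above: the Attr value Swi-I- on snap1 marks it as an invalid snapshot (first character S, fifth character I), which is consistent with Snap% at 100.00, i.e. the snapshot has filled up. A minimal Python sketch for spotting such volumes before running lvremove on an affected kernel is below. The function name find_invalid_snapshots is hypothetical and not part of lvm2; it only parses text in the default lvs column layout shown in this report.

```python
def find_invalid_snapshots(lvs_output):
    """Return (lv, vg) pairs whose lvs Attr field marks an invalid snapshot.

    Hypothetical helper: assumes the default column order LV, VG, Attr.
    """
    hits = []
    for line in lvs_output.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, and shell prompt lines.
        if len(fields) < 3 or fields[0] == "LV" or fields[0].startswith("["):
            continue
        lv, vg, attr = fields[0], fields[1], fields[2]
        # Attr char 1 'S' = invalid snapshot (volume type),
        # Attr char 5 'I' = invalid snapshot (state).
        if len(attr) >= 5 and (attr[0] == "S" or attr[4] == "I"):
            hits.append((lv, vg))
    return hits


# Sample copied from the description in this report.
SAMPLE = """\
  LV     VG      Attr   LSize   Origin Snap%  Move Log Copy%  Devices
  origin snapper owi-ao 100.00M                               /dev/sdb1(0)
  snap1  snapper Swi-I-  20.00M origin 100.00                 /dev/sdb2(0)
"""

if __name__ == "__main__":
    for lv, vg in find_invalid_snapshots(SAMPLE):
        print("invalid snapshot: /dev/%s/%s" % (vg, lv))
```

On a live system the same text could be taken from the output of lvs -a; this sketch does not run any LVM command itself.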

Comment 1 Corey Marthaler 2006-08-31 16:48:37 UTC
Here is what happens on the RHEL5 Beta...


[root@link-08 ~]# uname -ar
Linux link-08 2.6.17-1.2519.4.18.el5 #1 SMP Tue Aug 29 16:29:07 EDT 2006 x86_64
x86_64 x86_64 GNU/Linux
[root@link-08 ~]# lvs -a -o +devices
  LV     VG      Attr   LSize   Origin Snap%  Move Log Copy%  Devices
  origin snapper owi-a- 100.00M                               /dev/sda1(0)
  snap   snapper Swi-I-  28.00M origin 100.00                 /dev/sdb1(0)
[root@link-08 lvm]# lvremove -f /dev/snapper/snap
Segmentation fault


----------- [cut here ] --------- [please bite here ] ---------
Kernel BUG at mm/slab.c:595
invalid opcode: 0000 [1] SMP
last sysfs file: /block/ram0/dev
CPU 0
Modules linked in: autofs4 hidp rfcomm l2cap bluetooth sunrpc ipv6 acpi_cpufreq
video sbs i2c_ec button battery asus_acpi ac parport_pc lp parport sg amd_rng
i2c_amd756 i2c_core ide_cd shpchp k8_edac cdrom floppy tg3 ohci_hcd serio_raw
edac_mc pcspkr dm_snapshot dm_zero dm_mirror dm_mod ext3 jbd qla2xxx
scsi_transport_fc mptspi mptscsih scsi_transport_spi sd_mod scsi_mod mptbase
Pid: 2572, comm: lvremove Not tainted 2.6.17-1.2519.4.18.el5 #1
RIP: 0010:[<ffffffff800076e2>]  [<ffffffff800076e2>] kmem_cache_free+0x70/0x240
RSP: 0018:ffff810013c1bbd8  EFLAGS: 00010246
RAX: 0000000000000000 RBX: 6b6b6b6b6b6b6b6b RCX: 0000000000000007
RDX: ffff8100208d1020 RSI: ffff810001000038 RDI: ffff81003f486200
RBP: ffff810013c1bc18 R08: ffffffff881b9068 R09: 0000000000000001
R10: ffff810001cdc080 R11: ffffffff8000b3eb R12: ffff81003f486200
R13: ffff81003988cb00 R14: 0000000000000000 R15: 0000000000000000
FS:  00002aaaaaaba1a0(0000) GS:ffffffff8075a000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007fff384a76c0 CR3: 0000000039fb0000 CR4: 00000000000006e0
Process lvremove (pid: 2572, threadinfo ffff810013c1a000, task ffff810013d810c0)
Stack:  ffffc200000ba080 ffffc20000033000 ffff810013c1bbf8 6b6b6b6b6b6b6b6b
 ffff81003ff987d0 ffffc200000bc000 0000000000000000 0000000000000000
 ffff810013c1bc68 ffffffff881dea42 ffff81003ff98720 ffff81003f486200
Call Trace:
 [<ffffffff881dea42>] :dm_snapshot:exit_exception_table+0x41/0x72
 [<ffffffff881deaf6>] :dm_snapshot:snapshot_dtr+0x83/0xd1
 [<ffffffff881aac4a>] :dm_mod:dm_table_put+0x62/0xd0
 [<ffffffff881a9b27>] :dm_mod:dm_put+0x9f/0x171
 [<ffffffff881ad4e0>] :dm_mod:dev_remove+0xa3/0xb7
 [<ffffffff881ada3d>] :dm_mod:ctl_ioctl+0x23f/0x28c
 [<ffffffff8004380d>] do_ioctl+0x5e/0x77
 [<ffffffff80032385>] vfs_ioctl+0x25a/0x277
 [<ffffffff8004e748>] sys_ioctl+0x5f/0x82
 [<ffffffff800604da>] tracesys+0xd1/0xdb
DWARF2 unwinder stuck at tracesys+0xd1/0xdb
Leftover inexact backtrace:


Code: 0f 0b 68 d1 6e 2a 80 c2 53 02 4c 39 62 48 74 0a 0f 0b 68 d1
RIP  [<ffffffff800076e2>] kmem_cache_free+0x70/0x240
 RSP <ffff810013c1bbd8>
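
The RHEL4 oops and the RHEL5 BUG above use different trace formats, but both pass through exit_exception_table called from snapshot_dtr in dm_snapshot. A rough Python sketch for pulling symbol names out of either format and listing the frames the two traces share follows. The helper names and the regular expression are illustrative assumptions, not an existing tool; the sample lines are copied from the traces in this report, including the truncated snapshot_dtr frame.

```python
import re

# Matches "{:module:func+..." / "{func+..." (RHEL4 style) and
# "] :module:func+0x..." / "] func+0x..." (RHEL5 style).
FRAME_RE = re.compile(r"(?:\{|\]\s+)(?::(\w+):)?(\w+)\+")


def extract_frames(trace_text):
    """Return (module, function) pairs; frames without a module get 'kernel'."""
    return [(m.group(1) or "kernel", m.group(2))
            for m in FRAME_RE.finditer(trace_text)]


def common_module_frames(trace_a, trace_b, module):
    """Functions from `module` present in both traces, in trace_a order."""
    in_b = {f for mod, f in extract_frames(trace_b) if mod == module}
    return [f for mod, f in extract_frames(trace_a)
            if mod == module and f in in_b]


RHEL4_TRACE = """\
<ffffffffa00f80d0>{:dm_snapshot:exit_exception_table+49}
Call Trace:<ffffffffa009f8c9>{:dm_mod:dev_remove+0}
<ffffffffa00f86d5>{:dm_snapshot:snapshot_dtr+1
       <ffffffff8018a3dd>{sys_ioctl+853} <ffffffff8011026a>{system_call+126}
"""

RHEL5_TRACE = """\
 [<ffffffff881dea42>] :dm_snapshot:exit_exception_table+0x41/0x72
 [<ffffffff881deaf6>] :dm_snapshot:snapshot_dtr+0x83/0xd1
 [<ffffffff8004e748>] sys_ioctl+0x5f/0x82
"""

if __name__ == "__main__":
    print(common_module_frames(RHEL4_TRACE, RHEL5_TRACE, "dm_snapshot"))
```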






Comment 2 Corey Marthaler 2006-08-31 16:52:05 UTC
Filing a separate RHEL5 bug for the failure in comment #1.

Comment 4 RHEL Program Management 2006-10-02 14:48:55 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 7 Jason Baron 2006-10-25 17:30:37 UTC
Committed in stream U5, build 42.21. A test kernel with this patch is available
from http://people.redhat.com/~jbaron/rhel4/


Comment 8 Bryn M. Reeves 2006-11-16 16:25:09 UTC
*** Bug 215959 has been marked as a duplicate of this bug. ***

Comment 9 Corey Marthaler 2006-11-21 23:43:41 UTC
Just a note that this is still not fixed in the latest version that I have:
device-mapper-1.02.12-3
lvm2-2.02.15-3

Do I need a newer version of device-mapper built?

Comment 10 Milan Broz 2006-11-22 09:44:14 UTC
Corey, this is a kernel-only bug; it is not related to any specific lvm2 or
device-mapper version. You can use the test kernel build mentioned in comment #7
or wait until the new RHEL4 kernel is released.

Changing component to kernel to reflect this.

Comment 11 Milan Broz 2007-02-05 17:09:03 UTC
*** Bug 227361 has been marked as a duplicate of this bug. ***

Comment 12 Milan Broz 2007-02-05 17:13:05 UTC
*** Bug 227360 has been marked as a duplicate of this bug. ***

Comment 15 Mike Gahagan 2007-04-05 18:50:07 UTC
Fix verified with the test case.


Comment 17 Red Hat Bugzilla 2007-05-08 03:29:51 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2007-0304.html

Comment 19 Issue Tracker 2007-06-26 18:45:53 UTC
Internal Status set to 'Resolved'
Status set to: Closed by Client
Resolution set to: 'Security Errata'

This event sent from IssueTracker by sbradley 
 issue 111294