Bug 159877 - x86_64 kernel panic after force removal of active lv
Summary: x86_64 kernel panic after force removal of active lv
Keywords:
Status: CLOSED DUPLICATE of bug 158956
Alias: None
Product: Red Hat Cluster Suite
Classification: Retired
Component: lvm2-cluster
Version: 4
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Christine Caulfield
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2005-06-08 19:26 UTC by Corey Marthaler
Modified: 2010-01-12 04:03 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2005-08-10 15:01:14 UTC
Embargoed:



Description Corey Marthaler 2005-06-08 19:26:58 UTC
Description of problem:
I had just finished up the LVM I/O on the x86_64 cluster (link-01, link-02,
link-08) and was tearing down LVM volumes in order to make new ones for file
system testing. An lvremove attempt caused all my nodes to panic:

Unable to handle kernel paging request at 0000000030345f4e RIP:
<ffffffff801dced5>{rb_first+10}
PML4 1d829067 PGD 1f6e1067 PMD 0
Oops: 0000 [1] SMP
CPU 0
Modules linked in: gnbd(U) lock_nolock(U) gfs(U) lock_dlm(U) dlm(U) cman(U)
lock_harness(U) md5 ipv6 parport_pc lp parport autofs4 i2c_dev i2c_core sunrpc
ds yenta_socket pcmcia_core button battery ac ohci_hcd hw_random tg3 floppy
dm_snapshot dm_zero dm_mirror ext3 jbd dm_mod qla2300 qla2xxx scsi_transport_fc
mptscsih mptbase sd_mod scsi_mod
Pid: 14792, comm: clvmd Not tainted 2.6.9-11.ELsmp
RIP: 0010:[<ffffffff801dced5>] <ffffffff801dced5>{rb_first+10}
RSP: 0018:000001001e743ea0  EFLAGS: 00010206
RAX: 0000000030345f36 RBX: 000001001fdbb6a8 RCX: 0000010037e49c00
RDX: 0000000000000000 RSI: 000000000000006c RDI: 000001001fdbb6a0
RBP: 000001003d64c000 R08: 0000000000000025 R09: 0000000000000000
R10: 0000000000000000 R11: ffffffff80170638 R12: 000001001fdbb6a0
R13: 000000000069b4f7 R14: 000001001fdbb760 R15: 00000000006782b0
FS:  0000000041401960(005b) GS:ffffffff804c1700(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 0000000030345f4e CR3: 0000000000101000 CR4: 00000000000006e0
Process clvmd (pid: 14792, threadinfo 000001001e742000, task 0000010037d5c7f0)
Stack: ffffffff8016da67 000001001fdbb678 000001003d64c000 000001003a608408
        ffffffff80170649 0000000000000000 ffffffff80181672 000001003a00d6d8
        000001003ffec200 00000010010889cc
Call Trace:<ffffffff8016da67>{mpol_free_shared_policy+53}
<ffffffff80170649>{shmem_destroy_inode+17}
        <ffffffff80181672>{sys_unlink+261} <ffffffff8011003e>{system_call+126}


Code: 48 83 78 18 00 74 06 48 8b 40 18 eb f3 48 89 c2 48 89 d0 c3
RIP <ffffffff801dced5>{rb_first+10} RSP <000001001e743ea0>
CR2: 0000000030345f4e
 <0>Kernel panic - not syncing: Oops
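
A quick sanity check on the register dump above, offered as a sketch and not
as a diagnosis. The first Code bytes (48 83 78 18 00) decode to a compare of
the quadword at [rax+0x18] against zero, so the faulting address in CR2 should
be RAX plus 0x18. The snippet below checks that arithmetic and also reads the
low four bytes of RAX as little-endian ASCII. The variable names are
illustrative; the values are copied from the oops. The result is 0x18 and the
text "6_40", which happens to match the tail of the LV name
stripe_8_4096_40 given in comment 2. That may point to a pointer overwritten
by name string data, but it could also be coincidence.

```python
# Register values copied verbatim from the oops text above.
RAX = 0x0000000030345F36
CR2 = 0x0000000030345F4E

# Distance between the faulting address and the base register.
offset = CR2 - RAX

# Low four bytes of RAX in x86_64 (little-endian) memory order, read as ASCII.
rax_text = RAX.to_bytes(8, "little")[:4].decode("ascii")

# LV name taken from the lvscan/lvremove lines in comment 2.
lv_name = "stripe_8_4096_40"

print(hex(offset))
print(rax_text)
print(lv_name.endswith(rax_text))
```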


Version-Release number of selected component (if applicable):
[root@link-01 ~]# rpm -qa | grep lvm2
lvm2-2.01.08-1.0.RHEL4
lvm2-cluster-2.01.09-3.1.RHEL4


How reproducible:
Still trying

Comment 1 Corey Marthaler 2005-06-08 20:25:19 UTC
Reproduced again with the exact same scenario as above.

Comment 2 Corey Marthaler 2005-06-08 20:57:02 UTC
This is caused by a forced removal of an active LV.

[root@link-02 ~]# lvscan
  ACTIVE            '/dev/stripe_8_4096_4/stripe_8_4096_40' [924.00 GB] anywhere


lvremove -f /dev/stripe_8_4096_4/stripe_8_4096_40

strace:

[...]
stat("/dev/sdf1", {st_mode=S_IFBLK|0660, st_rdev=makedev(8, 81), ...}) = 0
stat("/dev/sdf1", {st_mode=S_IFBLK|0660, st_rdev=makedev(8, 81), ...}) = 0
open("/dev/sdf1", O_RDWR|O_DIRECT|0x40000) = 5
fstat(5, {st_mode=S_IFBLK|0660, st_rdev=makedev(8, 81), ...}) = 0
ioctl(5, BLKBSZGET, 0x67f9a0)           = 0
lseek(5, 2048, SEEK_SET)                = 2048
read(5, "_\332\24\f LVM2 x[5A%r0N*>\1\0\0\0\0\10\0\0\0\0\0\0"..., 512) = 512
lseek(5, 4096, SEEK_SET)                = 4096
read(5, "stripe_8_4096_4 {\nid = \"ADcb5J-K"..., 512) = 512
close(5)                                = 0
lseek(4, 0, SEEK_SET)                   = 0
read(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 2048) = 2048
lseek(4, 2048, SEEK_SET)                = 2048
read(4, "_\332\24\f LVM2 x[5A%r0N*>\1\0\0\0\0\10\0\0\0\0\0\0"..., 512) = 512
lseek(4, 4096, SEEK_SET)                = 4096
read(4, "stripe_8_4096_4 {\nid = \"ADcb5J-K"..., 512) = 512
close(4)                                = 0
brk(0x6db000)                           = 0x6db000
open("/proc/devices", O_RDONLY)         = 4
fstat(4, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) =
0x2a97c3b000
read(4, "Character devices:\n  1 mem\n  4 /"..., 1024) = 445
close(4)                                = 0
munmap(0x2a97c3b000, 4096)              = 0
open("/proc/misc", O_RDONLY)            = 4
fstat(4, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) =
0x2a97c3b000
read(4, " 60 dlm_clvmd\n 61 gnbd_ctl\n 62 d"..., 1024) = 94
close(4)                                = 0
munmap(0x2a97c3b000, 4096)   
stat("/dev/mapper/control", {st_mode=S_IFCHR|0600, st_rdev=makedev(10, 63),
...}) = 0
open("/dev/mapper/control", O_RDWR)     = 4
ioctl(4, DM_VERSION, 0x6ba260)          = 0
ioctl(4, DM_DEV_STATUS, 0x6a41c0)       = 0
brk(0x6d3000)                           = 0x6d3000
uname({sys="Linux", node="link-01", ...}) = 0
open("/etc/lvm/archive/.lvm_link-01_9269_145392622",
O_WRONLY|O_APPEND|O_CREAT|O_EXCL, 0666) = 5
fcntl(5, F_SETLK, {type=F_WRLCK, whence=SEEK_SET, start=0, len=0}) = 0
fcntl(5, F_GETFL)                       = 0x8401 (flags
O_WRONLY|O_APPEND|O_LARGEFILE|0x8000)
fstat(5, {st_mode=S_IFREG|0600, st_size=0, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) =
0x2a97c3b000
lseek(5, 0, SEEK_CUR)                   = 0
uname({sys="Linux", node="link-01", ...}) = 0
write(5, "# Generated by LVM2: Wed Jun  8 "..., 2292) = 2292
close(5)                                = 0
munmap(0x2a97c3b000, 4096)              = 0
open("/etc/lvm/archive", O_RDONLY|O_NONBLOCK|O_DIRECTORY) = 5
fstat(5, {st_mode=S_IFDIR|0700, st_size=4096, ...}) = 0
fcntl(5, F_SETFD, FD_CLOEXEC)           = 0
getdents64(5, /* 88 entries */, 4096)   = 3472
getdents64(5, /* 0 entries */, 4096)    = 0
close(5)                                = 0
link("/etc/lvm/archive/.lvm_link-01_9269_145392622",
"/etc/lvm/archive/stripe_8_4096_4_00008.vg") = 0
stat("/etc/lvm/archive/.lvm_link-01_9269_145392622", {st_mode=S_IFREG|0600,
st_size=2292, ...}) = 0
unlink("/etc/lvm/archive/.lvm_link-01_9269_145392622") = 0
write(3, "2\0\377\277\0\0\0\0\0\0\0\0C\0\0\0\0\30\0ADcb5JKgkAFga"..., 85) = 85
read(3,



Comment 3 Christine Caulfield 2005-06-09 07:06:58 UTC
Don't we get useful tracebacks on x86_64? Oh dear.
If it is caused by removing a volume then it could be a device-mapper bug.
Does it happen on a non-clustered system?

Comment 4 Alasdair Kergon 2005-08-10 15:01:14 UTC

*** This bug has been marked as a duplicate of 158956 ***

