Bug 156854 - panic in lvm on start up of x86_64 machine
Summary: panic in lvm on start up of x86_64 machine
Keywords:
Status: CLOSED DUPLICATE of bug 158956
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Version: 4.0
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Corey Marthaler
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2005-05-04 18:13 UTC by Corey Marthaler
Modified: 2007-11-30 22:07 UTC
CC List: 4 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2005-08-10 15:05:38 UTC
Target Upstream Version:
Embargoed:



Description Corey Marthaler 2005-05-04 18:13:37 UTC
Description of problem:
Link-08 (x86_64) panicked during the startup process. This machine is connected to
storage which is being used by another cluster running RHEL3 I/O on pool devs.

[...]
File descriptor 3 left open
  Reading all physical volumes.  This may take a while...
  Found volume group "VolGroup00" using metadata type lvm2
  Found volume group "single_nominor" using metadata type pool
File descriptor 3 left open
  2 logical volume(s) in volume group "VolGroup00" now active
  1 logical volume(s) in volume group "single_nominor" now active
File descriptor 3 left open
Unable to handle kernel NULL pointer dereference at 000000000000008a RIP:
<ffffffff801dcf05>{rb_first+10}
PML4 1fb35067 PGD 1fa10067 PMD 0
Oops: 0000 [1] SMP
CPU 1
Modules linked in: dm_snapshot dm_zero dm_mirror ext3 jbd dm_mod qla2300 qla2xxx
 scsi_transport_fc mptscsih mptbase sd_mod scsi_mod
Pid: 260, comm: lvm Not tainted 2.6.9-6.37.ELsmp
RIP: 0010:[<ffffffff801dcf05>] <ffffffff801dcf05>{rb_first+10}
RSP: 0018:0000010037c8bea0  EFLAGS: 00010206
RAX: 0000000000000072 RBX: 000001003fac4388 RCX: 0000010020000000
RDX: 0000000000000000 RSI: 000000000000006c RDI: 000001003fac4380
RBP: 000001001f4bc000 R08: 0000010037c8bdb8 R09: 0000000000000000
R10: 000001001fafdb90 R11: ffffffff801705cc R12: 000001003fac4380
R13: 0000007fbfff4880 R14: 000001003fac4440 R15: 0000007fbfff6880
FS:  0000000000000000(0000) GS:ffffffff804c1700(0000) knlGS:00000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 000000000000008a CR3: 000000001ffb2000 CR4: 00000000000006e0
Process lvm (pid: 260, threadinfo 0000010037c8a000, task 0000010037ce40
Stack: ffffffff8016d9fb 000001003fac4358 000001001f4bc000 0000010020a6b
       ffffffff801705dd 0000000000000000 ffffffff80181606 0000010037c64
       000001003ffecd40 0000000eed0c114b
Call Trace:<ffffffff8016d9fb>{mpol_free_shared_policy+53}
<ffffffff8017em_destroy_inode+17}
       <ffffffff80181606>{sys_unlink+261} <ffffffff8011003e>{system_cal


Code: 48 83 78 18 00 74 06 48 8b 40 18 eb f3 48 89 c2 48 89 d0 c3
RIP <ffffffff801dcf05>{rb_first+10} RSP <0000010037c8bea0>
CR2: 000000000000008a
 <0>Kernel panic - not syncing: Oops


Once in this state, it would panic again on every power cycle
until I unplugged the storage HBA.
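
One consistency check on the oops above, offered as a sketch rather than a diagnosis. Assuming the Code: bytes start at the faulting RIP, the first instruction (48 83 78 18 00) decodes as cmpq $0x0,0x18(%rax). RAX (0x72) plus the 0x18 displacement gives 0x8a, which matches CR2. That would suggest rb_first was dereferencing a node pointer of 0x72 rather than a true NULL. The Python below only redoes that arithmetic; the variable names are illustrative.

```python
# Values copied from the oops above.
RAX = 0x72
CR2 = 0x8A
CODE = bytes.fromhex("48 83 78 18 00")  # first five bytes of the Code: line

# Assumption: CODE[0] is the first byte of the instruction at the faulting RIP.
rex, opcode, modrm, disp8, imm8 = CODE

# ModRM fields: mod (2 bits), reg (3 bits), rm (3 bits).
mod = modrm >> 6
reg = (modrm >> 3) & 0b111
rm = modrm & 0b111

# 0x48 is REX.W; opcode 0x83 with reg field 7 is CMP r/m64, imm8;
# mod 1 with rm 0 is a memory operand at [RAX + disp8].
is_cmp_rax_disp8 = (
    rex == 0x48 and opcode == 0x83 and mod == 1 and reg == 7 and rm == 0
)

# The address the instruction reads is RAX plus the 8-bit displacement.
fault_address = RAX + disp8
print(is_cmp_rax_disp8, hex(fault_address), fault_address == CR2)
```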

Comment 1 Alasdair Kergon 2005-05-04 18:36:00 UTC
What's Q3 errata beta?

Comment 2 Corey Marthaler 2005-05-04 19:13:21 UTC
It was the default option for RHEL in bugzilla. In all reality, I wasn't sure
what to file it against so if you do know...

Comment 3 Ernie Petrides 2005-05-04 19:50:26 UTC
This is a RHEL4 bug.

Comment 4 Alasdair Kergon 2005-05-05 20:58:20 UTC
I don't think the oops is in device-mapper code.

mgalgoci reported a strange oops on irc a few weeks back, where a /proc file had
got corrupted.


Comment 5 Alasdair Kergon 2005-05-05 21:00:24 UTC
Corey, can you work out where this is happening in the boot sequence?

It looks like:
  vgscan
  vgchange -ay
  some other lvm command

(eg add -x to shell scripts, -vvvv to commands, or strace)

so we can try to work out what is being 'unlink'ed before passing back to
kernel-maint.
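
The tracing suggested above could be sketched roughly as follows. This is an illustration, not the procedure that was actually run. It only builds the command lines, wrapping the two LVM commands named above in strace with -f, -o and -e trace=unlink so that only unlink calls are logged. The helper name and log paths are hypothetical.

```python
import shlex


def trace_unlink(cmd, log_path):
    """Return an argv that runs cmd under strace, logging only unlink calls."""
    # -f follows child processes, -o writes the trace to log_path,
    # -e trace=unlink restricts the trace to the unlink syscall.
    return ["strace", "-f", "-o", log_path, "-e", "trace=unlink"] + list(cmd)


# The two commands identified in the boot sequence, with the extra
# verbosity suggested above. The log paths are hypothetical.
commands = [
    trace_unlink(["vgscan", "-vvvv"], "/tmp/vgscan.strace"),
    trace_unlink(["vgchange", "-ay", "-vvvv"], "/tmp/vgchange.strace"),
]

# Print the command lines instead of running them.
for argv in commands:
    print(shlex.join(argv))
```

The same wrapper would apply to whichever third lvm command turns out to run after vgchange.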


Comment 6 Mike Malone 2005-07-14 20:37:08 UTC
This bug is similar to Bugzilla Bug 156873 - File descriptor 3
left open

Any progress/ideas on this yet?

Comment 7 Alasdair Kergon 2005-08-10 15:05:38 UTC

*** This bug has been marked as a duplicate of 158956 ***

Comment 8 Alasdair Kergon 2005-08-10 15:07:55 UTC
Should be fixed in 2.6.9-6.38

