Bug 213362 - Assertion failed in dlm/plock.c with LTP test
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: Red Hat Cluster Suite
Classification: Retired
Component: dlm
Version: 4
Hardware: All
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: David Teigland
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2006-11-01 02:47 UTC by Tadashi Iwashita
Modified: 2009-04-16 20:31 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2007-01-03 16:18:48 UTC
Embargoed:


Attachments
Output of 'log' command on crash utility (30.38 KB, application/octet-stream)
2006-11-07 14:17 UTC, Tadashi Iwashita
LTP runlog taken after the panic happened (56.80 KB, application/octet-stream)
2006-11-07 14:19 UTC, Tadashi Iwashita

Description Tadashi Iwashita 2006-11-01 02:47:00 UTC
Description of problem:
The system hung after I ran the latest LTP (http://ltp.sourceforge.net/) as
part of durability testing in our GFS environment. The setup was two DELL
PE1950 servers running CentOS 4.3 (IA32), with a DELL EMC AX150 configured as
GFS shared storage and connected to both servers. We ran "./runltp -d /gfs3"
on one server while the other server stayed idle. Below is an extract of
/var/log/messages from the time the system stopped:

Oct  4 13:15:47 centos1 kernel: lock_dlm:  Assertion failed on line 500 of file /home/buildcentos/rpmbuild/BUILD/gfs-kernel-2.6.9-49/smp/src/dlm/plock.c
Oct  4 13:15:47 centos1 kernel: lock_dlm:  assertion:  "!error"
Oct  4 13:15:47 centos1 kernel: lock_dlm:  time = 71704458
Oct  4 13:15:47 centos1 kernel: error=-11
Oct  4 13:15:47 centos1 kernel:
Oct  4 13:15:47 centos1 kernel: ------------[ cut here ]------------
Oct  4 13:15:47 centos1 kernel: kernel BUG at /home/buildcentos/rpmbuild/BUILD/fs-kernel-2.6.9-49/smp/src/dlm/plock.c:500!
Oct  4 13:15:47 centos1 kernel: invalid operand: 0000 [#1]
Oct  4 13:15:47 centos1 kernel: SMP
Oct  4 13:15:47 centos1 kernel: Modules linked in: parport_pc lp parport autofs i2c_dev i2c_core lock_dlm(U) gfs(U) lock_harness(U) dlm(U) cman(U) sunrpc dm_mirror dm_multipath dm_mod button battery ac md5 ipv6 joydev uhci_hcd ehci_hcd hw random shpchp bnx2 ext3 jbd qla6312 qla2xxx scsi_transport_fc megaraid_sas sd_md scsi_mod
Oct  4 13:15:47 centos1 kernel: CPU:    0 
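
For reference when reading the log above: "error=-11" is a negated errno
value, and on Linux errno 11 is EAGAIN ("Resource temporarily unavailable").
The following is only a minimal sketch for decoding such values. The helper
name describe_kernel_error is hypothetical, and the lookup uses the errno
table of the machine running the script, so it matches the log only when run
on Linux.

```python
import errno
import os


def describe_kernel_error(ret):
    """Map a kernel-style negative return value (e.g. -11) to (name, message).

    Hypothetical helper for illustration; it relies on the local errno table.
    """
    code = -ret
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


if __name__ == "__main__":
    # The value reported next to the failed "!error" assertion.
    print(describe_kernel_error(-11))
```

This says nothing about why the lock request returned that value; it only
names the error code that the assertion tripped on.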

Version-Release number of selected component (if applicable):
CentOS 4.3 (i386): kernel 2.6.9-34.ELsmp
dlm-1.0.0-5.i686.rpm, dlm-kernel-smp-2.6.9-41.7.i686.rpm

How reproducible:
It happens every time.

Steps to Reproduce:
See the description.
  
Actual results:
See the description.

Expected results:
LTP runs to completion without a system hang.

Additional info:
We may be able to capture a diskdump of this problem if you need one.

Comment 1 Nate Straz 2006-11-01 03:33:10 UTC
Can you tell us which test case caused the assertion?  There should be more
output after the kernel BUG message that states the name of the process and a
backtrace.

Comment 2 Tadashi Iwashita 2006-11-07 14:17:59 UTC
Created attachment 140560
Output of 'log' command on crash utility

Comment 3 Tadashi Iwashita 2006-11-07 14:19:44 UTC
Created attachment 140561
LTP runlog taken after the panic happened

Comment 4 Tadashi Iwashita 2006-11-07 14:23:54 UTC
Sorry for the delayed response. We were able to reproduce this problem in the
same environment with the same tool. Attached are the output of the 'log'
command from the crash utility and the LTP run log taken after the panic.

Comment 5 Nate Straz 2006-11-07 15:46:37 UTC
The logs clearly show that the test case running was fcntl11.

http://ltp.cvs.sourceforge.net/ltp/ltp/testcases/kernel/syscalls/fcntl/fcntl11.c?view=log
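
fcntl11 is one of LTP's fcntl() record-locking tests. The sketch below is not
the LTP test and is not known to trigger the assertion; it only illustrates
the kind of byte-range POSIX lock request involved, assuming a Linux host.
The names try_lock_in_child and demo are hypothetical, and the optional
directory argument (for example the /gfs3 mount from the report) is an
assumption about where one might want to run it.

```python
import errno
import fcntl
import os
import tempfile


def try_lock_in_child(path, start, length):
    """Fork a child that tries a non-blocking write lock on a byte range.

    Returns 0 if the child obtained the lock, otherwise the errno it saw.
    """
    pid = os.fork()
    if pid == 0:  # child process
        status = 255  # stays 255 if something unexpected goes wrong
        try:
            with open(path, "r+b") as f:
                # LOCK_NB requests a non-blocking lock, so a conflict is
                # reported as an error instead of waiting for the holder.
                fcntl.lockf(f, fcntl.LOCK_EX | fcntl.LOCK_NB,
                            length, start, os.SEEK_SET)
            status = 0
        except OSError as exc:
            status = exc.errno or 255
        finally:
            os._exit(status)  # never return into the parent's code path
    _, wait_status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(wait_status)


def demo(directory=None):
    """Hold a lock on bytes 0-9, then probe two ranges from a child."""
    with tempfile.NamedTemporaryFile(dir=directory) as f:
        f.write(b"x" * 100)
        f.flush()
        # The parent takes a write lock on bytes 0..9 and keeps it.
        fcntl.lockf(f, fcntl.LOCK_EX, 10, 0, os.SEEK_SET)
        overlapping = try_lock_in_child(f.name, 5, 10)   # overlaps 5..9
        disjoint = try_lock_in_child(f.name, 50, 10)     # no overlap
        return overlapping, disjoint


if __name__ == "__main__":
    print(demo())
```

On a local filesystem the overlapping attempt is expected to fail with EAGAIN
(POSIX also permits EACCES) and the disjoint attempt to succeed. Calling
demo("/gfs3") would point the same probe at a GFS mount.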


Comment 6 David Teigland 2006-11-27 16:43:58 UTC
This shouldn't block a release since it's not been an issue outside
of this specific test.


Comment 7 David Teigland 2007-01-03 16:18:48 UTC
I doubt we'll want to fiddle much with plocks on rhel4 at this late stage
unless it's a really crucial issue people are facing.
It should work in the new rhel5 code, though.


