Bug 213362 - Assertion failed in dlm/plock.c with LTP test
Status: CLOSED WONTFIX
Product: Red Hat Cluster Suite
Classification: Red Hat
Component: dlm
Version: 4
Hardware: All Linux
Priority: medium
Severity: medium
Assigned To: David Teigland
QA Contact: Cluster QE
Reported: 2006-10-31 21:47 EST by Tadashi Iwashita
Modified: 2009-04-16 16:31 EDT
Doc Type: Bug Fix
Last Closed: 2007-01-03 11:18:48 EST

Attachments
Output of 'log' command on crash utility (30.38 KB, application/octet-stream)
2006-11-07 09:17 EST, Tadashi Iwashita
LTP runlog taken after the panic happened (56.80 KB, application/octet-stream)
2006-11-07 09:19 EST, Tadashi Iwashita

Description Tadashi Iwashita 2006-10-31 21:47:00 EST
Description of problem:
I experienced a system hang after running the latest LTP
(http://ltp.sourceforge.net/) tool as part of durability testing on our GFS
environment. I was using two Dell PE1950 servers running CentOS 4.3 (IA32),
with a Dell EMC AX150 set up as GFS shared storage connected to both servers.
We ran "./runltp -d /gfs3" on one server while the other server remained idle.
Here is the extract from /var/log/messages taken when the system stopped:

Oct  4 13:15:47 centos1 kernel: lock_dlm:  Assertion failed on line 500 of file /home/buildcentos/rpmbuild/BUILD/gfs-kernel-2.6.9-49/smp/src/dlm/plock.c
Oct  4 13:15:47 centos1 kernel: lock_dlm:  assertion:  "!error"
Oct  4 13:15:47 centos1 kernel: lock_dlm:  time = 71704458
Oct  4 13:15:47 centos1 kernel: error=-11
Oct  4 13:15:47 centos1 kernel:
Oct  4 13:15:47 centos1 kernel: ------------[ cut here ]------------
Oct  4 13:15:47 centos1 kernel: kernel BUG at /home/buildcentos/rpmbuild/BUILD/fs-kernel-2.6.9-49/smp/src/dlm/plock.c:500!
Oct  4 13:15:47 centos1 kernel: invalid operand: 0000 [#1]
Oct  4 13:15:47 centos1 kernel: SMP
Oct  4 13:15:47 centos1 kernel: Modules linked in: parport_pc lp parport autofs i2c_dev i2c_core lock_dlm(U) gfs(U) lock_harness(U) dlm(U) cman(U) sunrpc dm_mirror dm_multipath dm_mod button battery ac md5 ipv6 joydev uhci_hcd ehci_hcd hw random shpchp bnx2 ext3 jbd qla6312 qla2xxx scsi_transport_fc megaraid_sas sd_md scsi_mod
Oct  4 13:15:47 centos1 kernel: CPU:    0 
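
For readers decoding the excerpt above: kernel code conventionally reports failures as negated errno values, so error=-11 most likely corresponds to errno 11, which is EAGAIN on Linux. The sketch below is only an illustration in Python. The helper names (decode_kernel_error, summarize) and the regular expressions are assumptions written for this excerpt, not part of any cluster tooling.

```python
import errno
import os
import re

# Patterns written for the excerpt in this report (an assumption, not a
# general syslog parser); \s+ tolerates lines that were wrapped when pasted.
ASSERT_RE = re.compile(r"Assertion failed on line (\d+)\s+of\s+file\s+(\S+)")
ERROR_RE = re.compile(r"error=(-?\d+)")


def decode_kernel_error(value):
    """Map a negated-errno style value to (errno name, message)."""
    code = abs(value)
    return errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)


def summarize(log_text):
    """Return (source file, line number, error value) from a log excerpt."""
    where = ASSERT_RE.search(log_text)
    err = ERROR_RE.search(log_text)
    if not where or not err:
        return None
    return where.group(2), int(where.group(1)), int(err.group(1))


# Two lines taken from the excerpt in this report.
sample = (
    "lock_dlm:  Assertion failed on line 500 of file "
    "/home/buildcentos/rpmbuild/BUILD/gfs-kernel-2.6.9-49/smp/src/dlm/plock.c\n"
    "error=-11\n"
)

if __name__ == "__main__":
    path, line, err = summarize(sample)
    print(os.path.basename(path), line, err, decode_kernel_error(err))
```

The numeric value of EAGAIN differs between operating systems, so the decoding is only meaningful when run on Linux, matching the system that produced the log.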

Version-Release number of selected component (if applicable):
CentOS 4.3 (i386): kernel 2.6.9-34.ELsmp
dlm-1.0.0-5.i686.rpm, dlm-kernel-smp-2.6.9-41.7.i686.rpm

How reproducible:
It happens every time.

Steps to Reproduce:
See the description.
  
Actual results:
See the description.

Expected results:
LTP runs to completion without a system hang.

Additional info:
We may be able to take a diskdump of this problem if you need one.
Comment 1 Nate Straz 2006-10-31 22:33:10 EST
Can you tell us which test case caused the assertion?  There should be more
output after the kernel BUG message that states the name of the process and a
backtrace.
Comment 2 Tadashi Iwashita 2006-11-07 09:17:59 EST
Created attachment 140560
Output of 'log' command on crash utility
Comment 3 Tadashi Iwashita 2006-11-07 09:19:44 EST
Created attachment 140561
LTP runlog taken after the panic happened
Comment 4 Tadashi Iwashita 2006-11-07 09:23:54 EST
Sorry for the delayed response. We were able to reproduce this problem in the
same environment and with the same tool. Attached are the output of the 'log'
command in the crash utility and the LTP run log taken after the panic happened.
Comment 5 Nate Straz 2006-11-07 10:46:37 EST
The logs clearly show that the test case running was fcntl11.

http://ltp.cvs.sourceforge.net/ltp/ltp/testcases/kernel/syscalls/fcntl/fcntl11.c?view=log
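
fcntl11 exercises POSIX record locks (fcntl byte-range locks), which on GFS are presumably served by the plock code where the assertion fired. As a rough illustration of that kind of workload, and not a port of fcntl11, here is a minimal Python sketch: a parent locks a byte range and a forked child probes overlapping and disjoint ranges. The function names, byte ranges, and the optional directory argument are assumptions made for illustration.

```python
import errno
import fcntl
import os
import tempfile


def child_sees_conflict(path, start, length):
    """Fork a child that tries a non-blocking exclusive lock on a byte range.

    Returns True if the child's attempt was refused because of a conflict.
    """
    pid = os.fork()
    if pid == 0:
        # Child: POSIX locks are owned per process and are not inherited
        # across fork, so this attempt can conflict with the parent's lock.
        status = 2  # unexpected failure unless set otherwise below
        try:
            fd = os.open(path, os.O_RDWR)
            try:
                fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB, length, start)
                status = 0  # lock granted: no conflict
            except OSError as exc:
                if exc.errno in (errno.EAGAIN, errno.EACCES):
                    status = 1  # refused because another process holds it
        finally:
            os._exit(status)
    _, wait_status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(wait_status) == 1


def run_demo(directory=None):
    """Lock bytes 10..19 in a scratch file and probe it from a child."""
    with tempfile.NamedTemporaryFile(dir=directory) as tmp:
        tmp.write(b"x" * 100)
        tmp.flush()
        fcntl.lockf(tmp.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB, 10, 10)
        overlapping = child_sees_conflict(tmp.name, 15, 10)
        disjoint = child_sees_conflict(tmp.name, 50, 10)
        fcntl.lockf(tmp.fileno(), fcntl.LOCK_UN, 10, 10)
        after_unlock = child_sees_conflict(tmp.name, 15, 10)
    return overlapping, disjoint, after_unlock


if __name__ == "__main__":
    print(run_demo())
```

Calling run_demo("/gfs3") would place the scratch file on the mount used in the report. Whether that alone triggers the assertion has not been verified here.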
Comment 6 David Teigland 2006-11-27 11:43:58 EST
This shouldn't block a release since it's not been an issue outside
of this specific test.
Comment 7 David Teigland 2007-01-03 11:18:48 EST
I doubt we'll want to fiddle much with plocks on rhel4 at this late stage
unless it's a really crucial issue people are facing.
It should work in the new rhel5 code, though.
