Bug 201325 - Kernel Oops when passing LKF_CANCEL to dlm_ls_unlock_wait
Status: CLOSED ERRATA
Product: Red Hat Cluster Suite
Classification: Red Hat
Component: dlm
Version: 4
Hardware: i386 Linux
Priority: medium Severity: medium
Assigned To: David Teigland
QA Contact: Cluster QE
Reported: 2006-08-04 08:25 EDT by Carsten Clasohm
Modified: 2010-10-22 01:36 EDT
Fixed In Version: RHBA-2007-0137
Doc Type: Bug Fix
Last Closed: 2007-05-10 17:26:47 EDT


Attachments
Test program to reproduce the kernel oops. (2.21 KB, text/x-csrc)
2006-08-04 08:25 EDT, Carsten Clasohm
netdump log of the kernel oops (45.40 KB, text/plain)
2006-08-04 08:27 EDT, Carsten Clasohm
fixes the list_del call in ast_routine (514 bytes, patch)
2006-08-04 08:30 EDT, Carsten Clasohm

Description Carsten Clasohm 2006-08-04 08:25:27 EDT
Description of problem:

When dlm_ls_unlock_wait is called with the LKF_CANCEL flag for a lock that the
process is still waiting for, the kernel oopses. This seems to be because the
list_del call in ast_routine operates on an empty kernel list.

Version-Release number of selected component (if applicable):

dlm-1.0.0-5
kernel-2.6.9-34.0.2.EL
dlm-kernel-2.6.9-41.7.2

How reproducible:

always


Steps to Reproduce:

1. Set up a Cluster Suite 4 cluster
2. Optional: set up netdump
3. Compile and run the attached test program
  
Actual results:

Kernel oops

Expected results:

no kernel oops

Additional info:

I have also attached the netdump log, and a patch for dlm-kernel. The patch
works for this test case and looks correct to me, but I am no dlm expert.
Comment 1 Carsten Clasohm 2006-08-04 08:25:29 EDT
Created attachment 133625
Test program to reproduce the kernel oops.
Comment 2 Carsten Clasohm 2006-08-04 08:27:51 EDT
Created attachment 133626
netdump log of the kernel oops

ast_routine+0x149/0x204 is the list_del call on line 336 of src/device.c in
package dlm-kernel.
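
For readers unfamiliar with kernel lists, the following is a minimal userspace
sketch of the general pattern at issue: removing an entry only when it is
actually linked into a list. It is not the attached patch and not the real
dlm-kernel code. The struct lock_info type and the remove_if_queued and
cancel_demo helpers are hypothetical names made up for illustration, and the
list helpers are simplified models of the kernel's list.h API.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace model of the kernel's circular doubly linked list. */
struct list_head {
    struct list_head *next, *prev;
};

/* An initialised head or entry points at itself, which means "empty". */
void INIT_LIST_HEAD(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

int list_empty(const struct list_head *h)
{
    return h->next == h;
}

void list_add_tail(struct list_head *entry, struct list_head *head)
{
    entry->prev = head->prev;
    entry->next = head;
    head->prev->next = entry;
    head->prev = entry;
}

/* Unlink the entry and re-initialise it so it reads as "not queued". */
void list_del_init(struct list_head *entry)
{
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
    INIT_LIST_HEAD(entry);
}

/* Hypothetical stand-in for per-lock state; not the real dlm structure. */
struct lock_info {
    int id;
    struct list_head link;
};

/* Hypothetical guarded removal: returns 1 if unlinked, 0 if not queued. */
int remove_if_queued(struct lock_info *li)
{
    if (list_empty(&li->link))
        return 0;
    list_del_init(&li->link);
    return 1;
}

/* Walks through a cancel arriving before and after the lock is queued. */
void cancel_demo(void)
{
    struct list_head queue;
    struct lock_info li = { .id = 1 };

    INIT_LIST_HEAD(&queue);
    INIT_LIST_HEAD(&li.link);

    assert(remove_if_queued(&li) == 0); /* never queued: nothing to unlink */
    list_add_tail(&li.link, &queue);
    assert(remove_if_queued(&li) == 1); /* queued: unlinked once */
    assert(remove_if_queued(&li) == 0); /* second removal is a no-op */
    assert(list_empty(&queue));
}
```

Calling cancel_demo() exercises all three cases. The sketch assumes every
entry is initialised with INIT_LIST_HEAD before use; an entry with
uninitialised link pointers would still be unsafe to test or remove.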
Comment 3 Carsten Clasohm 2006-08-04 08:30:03 EDT
Created attachment 133627
fixes the list_del call in ast_routine

tested with package dlm-kernel-2.6.9-41.7.2
Comment 6 David Teigland 2006-08-14 17:04:53 EDT
fixed in RHEL4 branch:
/cvs/cluster/cluster/dlm-kernel/src/Attic/device.c,v  <--  device.c
new revision: 1.24.2.9; previous revision: 1.24.2.8

and STABLE branch:
/cvs/cluster/cluster/dlm-kernel/src/Attic/device.c,v  <--  device.c
new revision: 1.24.2.1.4.1.2.9; previous revision: 1.24.2.1.4.1.2.8
Comment 10 Red Hat Bugzilla 2007-05-10 17:26:47 EDT
An advisory has been issued which should resolve the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2007-0137.html
