Description of problem:
When dlm_ls_unlock_wait is called with the flag LKF_CANCEL for a lock that the process is still waiting for, the kernel panics. This seems to be because ast_routine tries to delete an entry from an empty kernel list.

Version-Release number of selected component (if applicable):
dlm-1.0.0-5
kernel-2.6.9-34.0.2.EL
dlm-kernel-2.6.9-41.7.2

How reproducible:
always

Steps to Reproduce:
1. Set up a Cluster Suite 4 cluster
2. Optional: set up netdump
3. Compile and run the attached program

Actual results:
Kernel oops

Expected results:
No kernel oops

Additional info:
I have also attached the netdump log and a patch for dlm-kernel. The patch works for this test case and looks correct to me, but I am no dlm expert.
Created attachment 133625 [details] Test program to reproduce the kernel oops.
Created attachment 133626 [details]
netdump log of the kernel oops

ast_routine+0x149/0x204 is the list_del call on line 336 of src/device.c in package dlm-kernel.
Created attachment 133627 [details]
fixes the list_del call in ast_routine

Tested with package dlm-kernel-2.6.9-41.7.2.
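
For readers without access to the attachment, here is a minimal userspace sketch of the general idea of guarding a list removal. It is not the attached patch and not the code in src/device.c. The list helpers are stand-ins modelled on the kernel's list_head API, and struct lock_info, safe_unlink and cancel_demo are hypothetical names made up for this illustration. It also assumes the entry's link was initialised to point at itself, so that list_empty can tell whether the entry is queued.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct list_head. */
struct list_head {
    struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

/* A head (or a self-linked entry) is empty when it points at itself. */
static int list_empty(const struct list_head *h)
{
    return h->next == h;
}

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
    entry->prev = head->prev;
    entry->next = head;
    head->prev->next = entry;
    head->prev = entry;
}

/* Unlink the entry and re-initialise it so a second removal is harmless. */
static void list_del_init(struct list_head *entry)
{
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
    init_list_head(entry);
}

/* Hypothetical per-lock structure, for illustration only. */
struct lock_info {
    struct list_head link;
    int id;
};

/* Remove the entry only if it is actually queued; return 1 if it was. */
static int safe_unlink(struct lock_info *li)
{
    if (list_empty(&li->link))
        return 0;               /* not on any list: nothing to delete */
    list_del_init(&li->link);
    return 1;
}

/* Walk through the cancel-before-queue and cancel-after-queue cases. */
int cancel_demo(void)
{
    struct list_head pending;
    struct lock_info li = { .id = 1 };
    int unlinked = 0;

    init_list_head(&pending);
    init_list_head(&li.link);

    unlinked += safe_unlink(&li);       /* never queued: skipped */
    list_add_tail(&li.link, &pending);
    unlinked += safe_unlink(&li);       /* queued: removed */
    unlinked += safe_unlink(&li);       /* already removed: skipped */

    assert(list_empty(&pending));
    return unlinked;
}
```

Calling cancel_demo() from a main function exercises all three cases. Only the middle call actually unlinks anything, and the other two return without touching the list pointers.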
Fixed in RHEL4 branch:
/cvs/cluster/cluster/dlm-kernel/src/Attic/device.c,v  <--  device.c
new revision: 1.24.2.9; previous revision: 1.24.2.8

and STABLE branch:
/cvs/cluster/cluster/dlm-kernel/src/Attic/device.c,v  <--  device.c
new revision: 1.24.2.1.4.1.2.9; previous revision: 1.24.2.1.4.1.2.8
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2007-0137.html