Bug 152557

Summary: 20050117 Oopsable NFS locking
Product: Red Hat Enterprise Linux 4
Reporter: Mark J. Cox <mjc>
Component: kernel
Assignee: Steve Dickson <steved>
Status: CLOSED ERRATA
QA Contact: Brian Brock <bbrock>
Severity: high
Docs Contact:
Priority: medium
Version: 4.0
CC: aleksey, crt, davej, davem, jbaron, jerome, kanderso, nutello, riel, security-response-team, steved
Target Milestone: ---
Keywords: Security
Target Release: ---
Hardware: All
OS: Linux
Whiteboard: impact=important,public=20050116,reported=20050116,source=bk
Fixed In Version:
Doc Type: Bug Fix
Last Closed: 2005-06-08 15:14:01 UTC
Type: ---
Bug Blocks: 147461    

Description Mark J. Cox 2005-03-30 12:12:22 UTC
The following issue was committed to 2.6-bk on Jan 16:

Fix an Oopsable condition in the NFS locking
http://linux.bkbits.net:8080/linux-2.6/cset@41eb36cbltMxDLx0TssnriNdQzAjkw

The impact of this issue isn't known.
(Split from bug 149693)

Comment 1 Rudi Chiarito 2005-04-19 15:48:08 UTC
This looks like a fix for the NFS "Attempting to free lock with active block
list" panic. I know this is a RHEL4 bug, but for what it's worth, I can
reproduce it on FC3 with kernel-2.6.10-1.737_FC3 and kernel-2.6.10-1.770_FC3. I
could help test any errata.

Comment 2 Mark J. Cox 2005-04-19 15:52:03 UTC
Bug duplicated for FC3, see bug #155367

Comment 3 Aleksey Nogin 2005-04-20 06:00:31 UTC
kernel-smp-2.6.9-5.0.5.EL (recompiled to include root-over-NFS), running in
a root-over-NFS setup (RHEL WS 4 client, RHEL AS 3 server), keeps giving me
"Kernel panic - not syncing: attempting to free lock with active block list"
right after the machine is booted. Out of 10 or so reboots, the machine stayed
up only once, and even then for only about 2 minutes after the boot process
finished.

I will be very happy to help debug this and/or test a potential fix.

P.S. I also found a related thread at http://lkml.org/lkml/2005/1/5/210

Comment 5 Aleksey Nogin 2005-04-22 04:39:31 UTC
We've applied the above patch together with the 3 surrounding it, 4 in total:

- http://linux.bkbits.net:8080/linux-2.6/cset@1.1966.3.21
- http://linux.bkbits.net:8080/linux-2.6/cset@1.1966.78.16
- http://linux.bkbits.net:8080/linux-2.6/cset@1.1966.78.17
- http://linux.bkbits.net:8080/linux-2.6/cset@1.1982.115.2

After that, the "Kernel panic - not syncing" no longer happens, but under similar
conditions a weird "temporary freeze" still happens (at the same time the
server spews "RPC: error 5 connecting to server <_client's_ IP
address>" every 10 seconds or so). See bug 140319 comment 12 for details.

Comment 7 Tim Powers 2005-06-08 15:14:01 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2005-420.html