Bug 166345
Summary: | HA NFS Cluster Problem
---|---
Product: | Red Hat Enterprise Linux 3
Component: | kernel
Version: | 3.0
Hardware: | All
OS: | Linux
Status: | CLOSED ERRATA
Severity: | high
Priority: | high
Reporter: | Issue Tracker <tao>
Assignee: | Steve Dickson <steved>
QA Contact: | Cluster QE <mspqa-list>
CC: | kanderso, lwang, petrides, rkenna, steved, tao
Fixed In Version: | RHSA-2006-0144
Doc Type: | Bug Fix
Last Closed: | 2006-03-15 16:25:19 UTC
Bug Blocks: | 168424
Description (Issue Tracker, 2005-08-19 16:01:11 UTC)
This issue has been boiled down to the following:

I. The problem starts with AIX NFS clients doing heavy NFS I/O, which brings down the NFS server *interface*; that is, the server itself is still responsive and accessing the filesystem locally on the server works fine, but the NFS exports are no longer accessible. According to the customer, it can be recreated at will in their environment.

II. From the sysrq-m output taken during the system fault, I don't see any memory issue on the "down" server.

III. From the sysrq-t output, three things to watch out for:

III-1: This box has the IBM multi-path driver (mpp). I would need IBM support to help explain the mpp thread backtraces (are they in a normal wait-for-work path or in a fault-handling path?). At this moment, I assume they are in a normal wait-for-work path.

```
Aug 22 13:30:20 fdxfs02 kernel: mppFailback S 00000001 4820 31 1 32 30 (L-TLB)
Aug 22 13:30:20 fdxfs02 kernel: Call Trace: [<c0123e24>] schedule [kernel] 0x2f4 (0xf6c09f50)
Aug 22 13:30:20 fdxfs02 kernel: [<f8933234>] mppLnx_failback_sem [mpp_Vhba] 0x0 (0xf6c09f84)
Aug 22 13:30:20 fdxfs02 kernel: [<f893323c>] mppLnx_failback_sem [mpp_Vhba] 0x8 (0xf6c09f90)
Aug 22 13:30:20 fdxfs02 kernel: [<c010ae9a>] __down_interruptible [kernel] 0x8a (0xf6c09f94)
Aug 22 13:30:20 fdxfs02 kernel: [<f8933240>] mppLnx_failback_sem [mpp_Vhba] 0xc (0xf6c09fa4)
Aug 22 13:30:20 fdxfs02 kernel: [<f8933240>] mppLnx_failback_sem [mpp_Vhba] 0xc (0xf6c09fa8)
Aug 22 13:30:20 fdxfs02 kernel: [<f8938750>] mppLnxFailbackScanContext [mpp_Vhba] 0x10 (0xf6c09fb4)
Aug 22 13:30:20 fdxfs02 kernel: [<c010af67>] __down_failed_interruptible [kernel] 0x7 (0xf6c09fcc)
Aug 22 13:30:20 fdxfs02 kernel: [<f8933234>] mppLnx_failback_sem [mpp_Vhba] 0x0 (0xf6c09fd0)
Aug 22 13:30:20 fdxfs02 kernel: [<f892d639>] mppLnx_setCheckCondition [mpp_Vhba] 0x249 (0xf6c09fd8)
Aug 22 13:30:20 fdxfs02 kernel: [<f8938750>] mppLnxFailbackScanContext [mpp_Vhba] 0x10 (0xf6c09fdc)
Aug 22 13:30:20 fdxfs02 kernel: [<f893039b>] .rodata.str1.1 [mpp_Vhba] 0x7c7 (0xf6c09fe0)
Aug 22 13:30:20 fdxfs02 kernel: [<f892c6a0>] mppLnx_failback_handler [mpp_Vhba] 0x0 (0xf6c09fe8)
Aug 22 13:30:20 fdxfs02 kernel: [<c01095ad>] kernel_thread_helper [kernel] 0x5 (0xf6c09ff0)
```

III-2: All nfsd threads are hanging, waiting for hash_lock, and the while loop below is unbreakable. This piece of code could certainly use some improvement, but I'm not going to fuss about it at this moment (a possible restructuring is sketched after the nfsd trace below). The real issue here is the lockd hang, described in III-3. Since all nfsd threads hang at exp_readlock(), no one can access this server.

```c
void exp_readlock(void)
{
	while (hash_lock || want_lock)
		sleep_on(&hash_wait);
	hash_count++;
}
```

```
Aug 22 13:30:25 fdxfs02 kernel: nfsd D 00000000 3392 2484 1 2485 2483 (L-TLB)
Aug 22 13:30:25 fdxfs02 kernel: Call Trace: [<c0123e24>] schedule [kernel] 0x2f4 (0xf6761f38)
Aug 22 13:30:25 fdxfs02 kernel: [<f8f43040>] hash_wait [nfsd] 0x0 (0xf6761f6c)
Aug 22 13:30:25 fdxfs02 kernel: [<c01246e2>] sleep_on [kernel] 0x52 (0xf6761f7c)
Aug 22 13:30:25 fdxfs02 kernel: [<f8f43040>] hash_wait [nfsd] 0x0 (0xf6761f9c)
Aug 22 13:30:25 fdxfs02 kernel: [<f8f372fa>] exp_readlock [nfsd] 0x2a (0xf6761fac)
Aug 22 13:30:25 fdxfs02 kernel: [<f8f2f3a4>] nfsd [nfsd] 0x1a4 (0xf6761fb0)
Aug 22 13:30:25 fdxfs02 kernel: [<f8f2f200>] nfsd [nfsd] 0x0 (0xf6761fe0)
Aug 22 13:30:25 fdxfs02 kernel: [<c01095ad>] kernel_thread_helper [kernel] 0x5 (0xf6761ff0)
```
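As a side note on the exp_readlock() loop above, here is a minimal sketch of one possible restructuring, assuming the hash_lock, want_lock, hash_count, and hash_wait variables from the snippet. This is illustrative only and is not the patch attached to this bug. The open-coded sleep_on() loop has the classic lost-wakeup race: a wake_up() that lands between the condition test and the sleep is missed, whereas the 2.4-era wait_event() macro re-tests the condition as part of the wait-queue discipline.

```c
#include <linux/sched.h>	/* wait_event() in 2.4 kernels */
#include <linux/wait.h>

/* Context roughly as in fs/nfsd/export.c; shown for self-containedness. */
static int hash_lock, want_lock, hash_count;
static DECLARE_WAIT_QUEUE_HEAD(hash_wait);

/*
 * Illustrative sketch only -- not the shipped fix.  wait_event()
 * re-checks the condition after every wakeup, so a wakeup cannot be
 * lost between the test and the sleep.  Note that hash_count++ is
 * still unserialized here, exactly as in the original snippet.
 */
void exp_readlock(void)
{
	wait_event(hash_wait, !(hash_lock || want_lock));
	hash_count++;
}
```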
III-3: lockd hangs; this looks like a deadlock! I haven't figured out which semaphore it is waiting on, or why.

```
Aug 22 13:30:23 fdxfs02 kernel: lockd D 00000001 3872 2262 1 2284 2261 (L-TLB)
Aug 22 13:30:23 fdxfs02 kernel: Call Trace: [<c0123e24>] schedule [kernel] 0x2f4 (0xf681ddc0)
Aug 22 13:30:23 fdxfs02 kernel: [<c010adb3>] __down [kernel] 0x73 (0xf681de04)
Aug 22 13:30:23 fdxfs02 kernel: [<f8ecb5ab>] rpc_call_sync_Rsmp_c357b490 [sunrpc] 0xcb (0xf681de1c)
Aug 22 13:30:23 fdxfs02 kernel: [<c010af5c>] __down_failed [kernel] 0x8 (0xf681de38)
Aug 22 13:30:23 fdxfs02 kernel: [<f8ee5b7f>] .text.lock.svclock [lockd] 0x5 (0xf681de48)
Aug 22 13:30:23 fdxfs02 kernel: [<c029f267>] vsnprintf [kernel] 0x207 (0xf681de50)
Aug 22 13:30:23 fdxfs02 kernel: [<f8ef03b8>] nlm_files [lockd] 0x18 (0xf681de58)
Aug 22 13:30:23 fdxfs02 kernel: [<f8ee7334>] nlm_traverse_files [lockd] 0x144 (0xf681de64)
Aug 22 13:30:23 fdxfs02 kernel: [<f8ee74c0>] nlmsvc_mark_resources [lockd] 0x20 (0xf681de84)
Aug 22 13:30:23 fdxfs02 kernel: [<f8ee3ff5>] nlm_gc_hosts [lockd] 0x45 (0xf681de90)
Aug 22 13:30:23 fdxfs02 kernel: [<f8eec662>] .rodata.str1.1 [lockd] 0x39 (0xf681de98)
Aug 22 13:30:23 fdxfs02 kernel: [<f8ee396b>] nlm_lookup_host [lockd] 0x8b (0xf681deb0)
Aug 22 13:30:23 fdxfs02 kernel: [<f8eec657>] .rodata.str1.1 [lockd] 0x2e (0xf681deb8)
Aug 22 13:30:23 fdxfs02 kernel: [<f8ef0138>] nlm_hosts [lockd] 0x78 (0xf681decc)
Aug 22 13:30:23 fdxfs02 kernel: [<f8ee38d0>] nlmsvc_lookup_host [lockd] 0x30 (0xf681dee4)
Aug 22 13:30:23 fdxfs02 kernel: [<f8ee5a15>] nlmsvc_create_block [lockd] 0xb5 (0xf681def8)
Aug 22 13:30:23 fdxfs02 kernel: [<c0179f24>] posix_test_lock [kernel] 0x84 (0xf681df08)
Aug 22 13:30:23 fdxfs02 kernel: [<f8ee4e1a>] nlmsvc_lock [lockd] 0xca (0xf681df1c)
Aug 22 13:30:23 fdxfs02 kernel: [<f8eea4c7>] nlm4svc_retrieve_args [lockd] 0xc7 (0xf681df38)
Aug 22 13:30:23 fdxfs02 kernel: [<f8eef2b8>] nlmsvc_version4 [lockd] 0x0 (0xf681df5c)
Aug 22 13:30:23 fdxfs02 kernel: [<f8eea6dc>] nlm4svc_proc_lock [lockd] 0xac (0xf681df60)
Aug 22 13:30:23 fdxfs02 kernel: [<f8eefc88>] nlmsvc_procedures4 [lockd] 0x48 (0xf681df84)
Aug 22 13:30:23 fdxfs02 kernel: [<f8ed3548>] svc_process_Rsmp_462cdaea [sunrpc] 0x318 (0xf681df8c)
Aug 22 13:30:23 fdxfs02 kernel: [<f8ee43fb>] lockd [lockd] 0x1ab (0xf681dfc4)
Aug 22 13:30:23 fdxfs02 kernel: [<f8ee4250>] lockd [lockd] 0x0 (0xf681dfe0)
Aug 22 13:30:23 fdxfs02 kernel: [<c01095ad>] kernel_thread_helper [kernel] 0x5 (0xf681dff0)
```

Created attachment 118023 [details]
Upstream patch that fixes deadlock in lockd
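Reading the trace bottom-up (nlmsvc_lock -> nlmsvc_create_block -> nlmsvc_lookup_host -> nlm_gc_hosts -> nlmsvc_mark_resources -> nlm_traverse_files -> __down), the shape of the deadlock appears to be the single lockd thread re-taking a per-file semaphore it already holds. The following is a hedged sketch of that pattern with simplified stand-in structures and call chain; it is not the RHEL3 source or the attached patch.

```c
#include <asm/semaphore.h>	/* 2.4-era semaphores */

/* Simplified stand-in for lockd's per-file state (not the real layout). */
struct nlm_file {
	struct semaphore f_sema;	/* serializes lock handling per file */
};

/*
 * Stand-in for nlm_gc_hosts() -> nlmsvc_mark_resources() ->
 * nlm_traverse_files(): garbage collection walks every known file
 * and takes each file's semaphore.
 */
static void nlm_gc_hosts(struct nlm_file *file)
{
	down(&file->f_sema);	/* the same lockd thread already holds
				 * f_sema; 2.4 semaphores are not
				 * recursive, so this sleeps forever */
	/* ... mark or release resources ... */
	up(&file->f_sema);
}

/*
 * Stand-in for nlmsvc_lookup_host(): when the host table needs
 * pruning, it runs garbage collection inline, in lockd's context.
 */
static void nlmsvc_lookup_host(struct nlm_file *file)
{
	nlm_gc_hosts(file);
}

/* Stand-in for the nlmsvc_lock() -> nlmsvc_create_block() path. */
static void nlmsvc_lock(struct nlm_file *file)
{
	down(&file->f_sema);		/* first acquisition */
	nlmsvc_lookup_host(file);	/* blocks on f_sema above */
	up(&file->f_sema);		/* never reached */
}
```

The usual cure for this shape of bug is to do the host lookup before taking f_sema, or to drop the semaphore around the lookup, so that garbage collection never runs with a file semaphore held; the attached upstream patch is the authoritative change.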
Steve, I checked the patch and this is exactly one of the problems. Thanks.

The STONITH message appears because the customer is not using power switches. The cluster therefore disclaims all data-integrity guarantees, because it cannot ensure that the node has actually been cut off. (Not a bug.) The kernel panics, however, are definitely a bug.

Created attachment 118188 [details]
Updated Patch
During our internal review process, a locking inconsistency was found in the original patch, so please re-test with this updated patch. Thanks.
Created attachment 118248 [details]
Updated Patch
Again through our review process, it was determined that the extra locking around blocked locks is not needed, since the code that manipulates them runs in a single thread. Those locks were therefore removed. Please test to ensure that removing them does not cause any regression (the single-threaded reasoning is sketched below).
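To make the single-thread argument concrete, here is a minimal sketch, assuming a blocked-locks list like lockd's nlm_blocked that only the lockd kernel thread ever reads or writes. The structure and function names are simplified stand-ins, not the patched RHEL3 code.

```c
#include <linux/list.h>

/* Simplified stand-in for a blocked NLM lock request. */
struct nlm_block {
	struct list_head b_list;
	/* ... request state ... */
};

/* Only ever touched from the single lockd kernel thread. */
static LIST_HEAD(nlm_blocked);

/*
 * No spinlock or semaphore around the list operation: with exactly
 * one thread manipulating nlm_blocked, there is nothing to race
 * with, so extra locking only adds overhead and deadlock surface.
 */
static void nlmsvc_insert_block(struct nlm_block *block)
{
	list_add(&block->b_list, &nlm_blocked);
}
```

If a second context (a timer, a work queue, or another thread) ever starts touching the list, the locking has to come back; that is the regression risk the comment above asks testers to watch for.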
A fix for this problem is queued for the next interim U7 build.

A fix for this problem has just been committed to the RHEL3 U7 patch pool this evening (in kernel version 2.4.21-37.5.EL).

An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2006-0144.html