+++ This bug was initially created as a clone of Bug #809928 +++

Description of problem:
The for loop built on the list macro list_for_each_next_safe fails to enumerate RPC priority wait queue tasks stored on the tk_wait.links list, so rpc_wake_up and rpc_wake_up_status fail to wake up all tasks. This results in nasty hangs and poor performance. The NFSv4.1 session slot table wait queue implementation uses a lot of RPC priority queues, so v4.1 is especially affected.

The bug was noticed as I investigated Bug 756212 - Redirecting I/O through the MDS after a data server network partition is very slow.

We are currently running NFSv4.1 performance tests on RHEL 6.3 with the fix versus without the fix, as we suspect this bug is responsible for the poor performance results and for the wide standard deviation between test runs.

Version-Release number of selected component (if applicable):
All versions of RHEL.

How reproducible:
100% with the newly submitted NFSv4.1 file layout data server quick failover patch set.

Steps to Reproduce:
1. Start a large I/O on an NFSv4.1 pNFS mount.
2. Network partition a data server that is receiving a large amount of data.
3. Do not reconnect the data server.

The client will get a data server connection error, reset the failed RPC to go to the MDS, and mark the pNFS deviceid as bad. Marking the deviceid as bad resets RPC tasks going through the rpc_call_prepare state so that they go to the MDS instead of using pNFS. rpc_wake_up is then called to drain (wake up) all RPC tasks waiting for a session slot on the failed data server's session fore channel slot table wait queue.

Actual results:
We wait the RPC timeout for all in-flight RPCs to fail and be redirected. Because rpc_wake_up is broken without the fix and only wakes up one PRIORITY task per rpc_wake_up call, we only process as many of the RPC tasks waiting on the queue as there are slots, and the application hangs.

Expected results:
We wait the RPC timeout for all in-flight RPCs to fail, and all RPC tasks on the slot table wait queue immediately wake up and are redirected to the MDS. The application succeeds.

Additional info:
We are running other tests and should get a new reproducer that doesn't depend on a new patch set.

Here is the message from Trond:

Date: Mon, 19 Mar 2012 21:29:32 +0000
From: Myklebust, Trond <Trond.Myklebust>
To: Steve Dickson <SteveD>
CC: Adamson, Andy <William.Adamson>

Steve,

This bug probably explains a good chunk of the random hangs that we've been seeing (particularly on connection losses etc) in the past few years. Please queue it up for _all_ versions of RHEL asap.

Kudos to Andy for noticing the problem and working out the bug!

Cheers
Trond
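For readers who want to see why a "safe" list walk can miss tasks, here is a small user-space model of the failure described above. It is a sketch, not the kernel code: struct node, struct task, wake_one and both walk functions are hypothetical names invented for this illustration. It also assumes that removing a queued task promotes the first task parked on its links list into the queue position right behind it. Under that assumption, an iterator that saved its next pointer before the removal never visits the promoted task, while a loop that always takes the current head of the queue drains everything.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list, loosely modelled on the kernel's list_head. */
struct node { struct node *prev, *next; };

static void node_init(struct node *n) { n->prev = n->next = n; }
static int node_empty(const struct node *h) { return h->next == h; }
static void node_del(struct node *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
    node_init(n);
}
static void node_add_after(struct node *n, struct node *pos)
{
    n->next = pos->next;
    n->prev = pos;
    pos->next->prev = n;
    pos->next = n;
}
static void node_add_tail(struct node *n, struct node *head)
{
    node_add_after(n, head->prev);
}

/* Hypothetical stand-in for an RPC task on a priority wait queue. */
struct task {
    struct node list;   /* first member: position on the queue or on a leader's links */
    struct node links;  /* tasks parked behind this one */
    int awake;
};

/* Remove t from the queue. Assumption: the first parked task is promoted
 * into the queue right after t and inherits the remaining parked tasks. */
static void wake_one(struct task *t)
{
    assert(!t->awake);
    if (!node_empty(&t->links)) {
        struct task *heir = (struct task *)t->links.next;
        node_del(&heir->list);
        node_add_after(&heir->list, &t->list);
        while (!node_empty(&t->links)) {
            struct node *n = t->links.next;
            node_del(n);
            node_add_tail(n, &heir->links);
        }
    }
    node_del(&t->list);
    t->awake = 1;
}

/* "Safe" walk: next is saved before the body runs, so a task promoted
 * behind the current entry is skipped. */
static void wake_all_safe_iter(struct node *head)
{
    struct node *pos = head->next, *next = pos->next;
    for (; pos != head; pos = next, next = pos->next)
        wake_one((struct task *)pos);
}

/* Drain walk: always re-read the head, so promoted tasks are seen. */
static void wake_all_drain(struct node *head)
{
    while (!node_empty(head))
        wake_one((struct task *)head->next);
}

/* One queued task with two tasks parked on its links; returns how many woke. */
static int count_woken(void (*walk)(struct node *))
{
    struct node head;
    struct task t[3];
    int i, woken = 0;

    node_init(&head);
    for (i = 0; i < 3; i++) {
        node_init(&t[i].list);
        node_init(&t[i].links);
        t[i].awake = 0;
    }
    node_add_tail(&t[0].list, &head);
    node_add_tail(&t[1].list, &t[0].links);
    node_add_tail(&t[2].list, &t[0].links);
    walk(&head);
    for (i = 0; i < 3; i++)
        woken += t[i].awake;
    return woken;
}
```

In this model, count_woken(wake_all_safe_iter) returns 1 and count_woken(wake_all_drain) returns 3, which mirrors the "only wakes up one PRIORITY task per rpc_wake_up call" symptom reported above. Whether the actual fix uses a drain loop of this shape is an assumption of the sketch, not something stated in this report.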
*** Bug 809502 has been marked as a duplicate of this bug. ***
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux release for currently deployed products. This request is not yet committed for inclusion in a release.
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
A process scheduler did not handle RPC priority wait queues correctly. Consequently, the process scheduler failed to wake up all scheduled tasks as expected after RPC timeout, which caused the system to become unresponsive and could significantly decrease system performance. This update modifies the process scheduler to handle RPC priority wait queues as expected. All scheduled tasks are now properly woken up after RPC timeout and the system behaves as expected.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2013-0006.html