Bug 770250

Summary: readdir64_r calls fail with ELOOP
Product: Red Hat Enterprise Linux 6
Reporter: Harshavardhana <fharshav>
Component: kernel
Assignee: Steve Dickson <steved>
Status: CLOSED ERRATA
QA Contact: Boris Ranto <branto>
Severity: high
Docs Contact:
Priority: high
Version: 6.2
CC: admin, akarlsso, bfields, branto, Colin.Simpson, cww, dejohnso, dhoward, eguan, fgozalo0, jlayton, joe.lin, jwest, kgc, ltroan, marcus, mishu, moshiro, ndevos, nfs-maint, nragusa, pasteur, peteh, rdassen, rwheeler, saujain, steved, tim, toracat, torel, ukar, vinaraya, yanwang
Target Milestone: rc
Keywords: Reopened, ZStream
Target Release: 6.3
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: kernel-2.6.32-235.el6
Doc Type: Bug Fix
Doc Text:
On NFS, when a directory whose contents kept changing was read repeatedly, the client issued the same readdir request twice. Consequently, the following warning message was written to the dmesg output: NFS: directory A/B/C contains a readdir loop. This update fixes the bug by turning off loop detection in the described scenario and letting the NFS client try to recover, so the message is no longer logged.
Clone Of:
: 788514
Last Closed: 2012-06-20 08:12:16 UTC
Bug Depends On:    
Bug Blocks: 727267, 811135, 849300    
Attachments:
Description                      Flags
Kernel messages                  none
Fix NFS spurious readdir ELOOP   none

Description Harshavardhana 2011-12-24 22:42:52 UTC
Created attachment 549455
Kernel messages

The problem is described at the link below, along with steps to reproduce it.

http://wiki.linux-nfs.org/wiki/index.php/NFS:_directory_XXX_contains_a_readdir_loop_seems_to_be_triggered_by_well-behaving_server

Version-Release number of selected component (if applicable):

RHEL 6.2 - Kernel 2.6.32-220.el6.x86_64

A fix is provided in the upstream 2.6.39 kernel. A fix is needed for RHEL 6.2.

Error messages visible in /var/log/messages:

Dec 23 10:44:54 skimmer kernel: NFS: directory 51809410.2_imap/mail contains a readdir loop.  Please contact your server vendor.  Offending cookie: 4110
Dec 23 10:44:54 skimmer quotad[2161]: {140608674498304} readdir64_r failed: 40=Too many levels of symbolic links
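
The two log lines above carry the pieces needed for triage: the affected directory, the offending cookie, and errno 40. A minimal sketch of extracting them follows. The regular expression is inferred from the single kernel line quoted in this report and is an assumption, not a documented log format; `parse_readdir_loop` is a hypothetical helper name.

```python
import errno
import os
import re

# Pattern inferred from the one kernel log line quoted in this report.
# It is an assumption, not a documented or stable message format.
READDIR_LOOP_RE = re.compile(
    r"NFS: directory (?P<directory>.+?) contains a readdir loop\."
    r".*?Offending cookie: (?P<cookie>\d+)"
)


def parse_readdir_loop(line):
    """Return (directory, cookie) for a readdir-loop warning, else None."""
    match = READDIR_LOOP_RE.search(line)
    if match is None:
        return None
    return match.group("directory"), int(match.group("cookie"))


SAMPLE = (
    "Dec 23 10:44:54 skimmer kernel: NFS: directory 51809410.2_imap/mail "
    "contains a readdir loop.  Please contact your server vendor.  "
    "Offending cookie: 4110"
)

if __name__ == "__main__":
    print(parse_readdir_loop(SAMPLE))
    # The quotad message reports errno 40, which is ELOOP on Linux.
    print(errno.ELOOP, os.strerror(errno.ELOOP))
```

On Linux, `errno.ELOOP` is the value 40 that quotad reports, which is why a readdir failure surfaces with the unrelated-looking text "Too many levels of symbolic links".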


Expected results:

It shouldn't hang.
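
For illustration, this is roughly how an application can tell this failure apart from other listing errors. It is a sketch only: `list_directory` is a hypothetical helper, and it relies on Python's `os.listdir` raising `OSError` with the errno of the failed directory read.

```python
import errno
import os


def list_directory(path):
    """Return (names, looped); looped is True if the listing failed with ELOOP."""
    try:
        return sorted(os.listdir(path)), False
    except OSError as exc:
        if exc.errno == errno.ELOOP:
            # Same errno that the quotad log reports for readdir64_r.
            return [], True
        raise  # any other error is left for the caller to handle


if __name__ == "__main__":
    names, looped = list_directory(".")
    print(len(names), "entries; ELOOP seen:", looped)
```

ELOOP can also come from a genuine symbolic-link loop in the path, so a caller would still need the kernel log to confirm that the NFS readdir loop detection was the cause.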

Comment 2 Harshavardhana 2011-12-26 05:33:10 UTC
Created attachment 549542
Fix NFS spurious readdir ELOOP

Patch pulled in from 2.6.39 upstream kernel.

Comment 3 Harshavardhana 2012-01-03 18:02:07 UTC
Customer verified the fix works.

Comment 4 Steve Dickson 2012-01-05 01:07:54 UTC
commit 0c0308066ca53fdf1423895f3a42838b67b3a5a8
Author: Trond Myklebust <Trond.Myklebust>
Date:   Sat Jul 30 12:45:35 2011 -0400

    NFS: Fix spurious readdir cookie loop messages
    
    If the directory contents change, then we have to accept that the
    file->f_pos value may shrink if we do a 'search-by-cookie'. In that
    case, we should turn off the loop detection and let the NFS client
    try to recover.
    
    The patch also fixes a second loop detection bug by ensuring
    that after turning on the ctx->duped flag, we read at least one new
    cookie into ctx->dir_cookie before attempting to match with
    ctx->dup_cookie.
    
    Reported-by: Petr Vandrovec <petr>
    Cc: stable [2.6.39+]
    Signed-off-by: Trond Myklebust <Trond.Myklebust>
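
The two behaviours described in the commit message can be illustrated with a toy model. This is not the kernel code: it is a simplified Python sketch built only from the commit text above. `duped` and `dup_cookie` mirror the `ctx` fields the message names; the generation counter (`gen`, `dir_gen`), the three-state flag, and the function names are assumptions made for the illustration.

```python
from dataclasses import dataclass


class ReaddirLoop(Exception):
    """Raised where the real client would report a readdir loop (ELOOP)."""


@dataclass
class OpenDirContext:
    gen: int = 0         # hypothetical: directory generation last seen
    duped: int = 0       # 0 = off, -1 = armed, 1 = armed and a new cookie was read
    dup_cookie: int = 0  # cookie at which the position last moved backwards


def search_by_cookie(ctx, cookies, cookie, f_pos, dir_gen):
    """Find cookie in the cached listing and return the new file position."""
    new_pos = cookies.index(cookie)
    if ctx.gen != dir_gen:
        # Directory contents changed: a shrinking position is expected,
        # so loop detection is turned off and the client carries on.
        ctx.duped = 0
        ctx.gen = dir_gen
    elif new_pos < f_pos:
        # Position moved backwards in an unchanged directory.
        if ctx.duped > 0 and ctx.dup_cookie == cookie:
            raise ReaddirLoop(cookie)
        ctx.dup_cookie = cookie
        ctx.duped = -1
    return new_pos


def note_cookie_read(ctx):
    """Record that at least one new cookie was read since detection was armed."""
    if ctx.duped != 0:
        ctx.duped = 1


if __name__ == "__main__":
    ctx = OpenDirContext()
    listing = [10, 20, 30]
    print(search_by_cookie(ctx, listing, 20, f_pos=5, dir_gen=0), ctx.duped)
    note_cookie_read(ctx)
    # The directory changed (dir_gen differs), so no loop is reported.
    print(search_by_cookie(ctx, listing, 20, f_pos=5, dir_gen=1), ctx.duped)
```

In this model a loop is reported only when the position shrinks twice at the same cookie in an unchanged directory, with at least one new cookie read in between.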

Comment 5 RHEL Program Management 2012-01-05 01:10:01 UTC
This request was evaluated by Red Hat Product Management for inclusion
in a Red Hat Enterprise Linux maintenance release. Product Management has 
requested further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed 
products. This request is not yet committed for inclusion in an Update release.

Comment 16 Aristeu Rozanski 2012-02-21 19:49:16 UTC
Patch(es) available on kernel-2.6.32-235.el6
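
A quick way to compare a running kernel against the build named above is sketched below. The parsing assumes RHEL 6 release strings of the form `2.6.32-<build>[.<z>...].el6...`, and `has_mainline_fix` is a hypothetical helper. Z-stream builds lower than 235 may carry a backport, so the sketch reports them as undetermined rather than unfixed.

```python
import platform
import re

FIXED_BUILD = 235  # from "kernel-2.6.32-235.el6" in this report

# Assumed shape of a RHEL 6 kernel release string, e.g. "2.6.32-220.el6.x86_64".
RELEASE_RE = re.compile(r"^2\.6\.32-(\d+)((?:\.\d+)*)\.el6")


def has_mainline_fix(release):
    """Return True or False for a RHEL 6 kernel, None if it cannot be judged."""
    match = RELEASE_RE.match(release)
    if match is None:
        return None  # not a RHEL 6 2.6.32 kernel; this check does not apply
    build = int(match.group(1))
    if build >= FIXED_BUILD:
        return True
    if match.group(2):
        return None  # z-stream build: may include a backport, check its errata
    return False


if __name__ == "__main__":
    print(has_mainline_fix("2.6.32-220.el6.x86_64"))  # the kernel from the report
    print(has_mainline_fix(platform.release()))       # the running kernel
```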

Comment 20 Jeff Layton 2012-03-02 14:19:58 UTC
*** Bug 783938 has been marked as a duplicate of this bug. ***

Comment 25 Larry Troan 2012-03-06 15:12:20 UTC
Opening this bug up to the public at the request of Ric Wheeler. There is nothing confidential in it.

Also opening up Bug 783938 - NFS returns wrong cookies
which was marked Red Hat Confidential inappropriately and then CLOSED as a DUP of this bug.

Comment 26 Tim Niemueller 2012-03-19 15:01:34 UTC
What's the expected time of release for kernel-2.6.32-235.el6?

Comment 27 Niels de Vos 2012-03-19 15:35:50 UTC
Tim, we cannot give any dates for when an erratum will be released for this bug. The patch is currently planned for inclusion in the kernel package of the upcoming RHEL-6.3 release.

If you have an urgent need for a supported package, you will need to contact your support representative or file a case at the Customer Portal at https://access.redhat.com/support/cases/new.

Comment 28 Tim Niemueller 2012-03-19 22:46:57 UTC
Too bad, I was hoping for a shorter time frame. As stated in the duplicate bug #783938, this occurs on CentOS for me. Is there some way to get hands on a build earlier? Maybe giving such a detailed report as in bug #783938 is worth a treat?

Comment 29 Niels de Vos 2012-03-20 08:48:13 UTC
Tim, I'm sorry to hear that. Unfortunately we (as in Red Hat Global Support Services) will not be able to provide any test-packages if there is no customer case open with a suitable business justification and the impact the problem has.

You mentioned in bug #783938 that this problem is preventing you from migrating your RHEL-5 systems to RHEL-6. If that is the case, you seem to have RHEL subscriptions and a business case with which you can contact the support organisation.

There is currently no need for Red Hat to request any (non)customers to test a package that contains this patch.

Comment 30 Kelsey Cummings 2012-03-20 16:47:40 UTC
Can anyone provide a clue why this isn't being pushed to 6.2?  This is a severe bug that affects pretty much anyone running Maildir spools over NFS.

Comment 31 Sirius Rayner-Karlsson 2012-03-26 09:57:02 UTC
Kelsey,

If you contact your Red Hat Support representative and open a support request through Red Hat GSS, stating why you need this as a 6.2 async errata, it'll be sent through the correct channels for evaluation.
Bugzilla is not quite the correct forum for support requests. It is an engineering tool, so your question is addressed to somewhat the wrong audience.

Hope this helps,

/Anders

Comment 35 Tore H. Larsen 2012-04-16 11:57:26 UTC
cc

Comment 36 Tore H. Larsen 2012-04-20 14:11:26 UTC
Is there any progress on this?  A new kernel coming soon?

Comment 39 Jeff Layton 2012-05-01 13:28:59 UTC
*** Bug 789452 has been marked as a duplicate of this bug. ***

Comment 40 csb sysadmin 2012-05-01 13:41:21 UTC
Where are the new test kernels located? There's nothing in
http://people.redhat.com/dzickus/rhel6/

Comment 41 Jeff Layton 2012-05-01 13:52:34 UTC
I'm afraid that test kernels are not provided to the general public. You'll need to open a customer support case in order to get one.

Comment 46 Tore H. Larsen 2012-05-08 20:58:36 UTC
Will there be a backport to 6.2? I'm a paying customer, and we run RHEL on all production-critical hardware. We have other dependencies (weak modules), so upgrading to 6.3 is not an option.

Comment 47 Niels de Vos 2012-05-09 08:12:19 UTC
Hi Tore,

Bug 811135 is the one for RHEL-6.2. You should open a case at the customer portal (https://access.redhat.com) to let the support team know that you need an errata for that bug. They will likely need to double-check that you are hitting the same problem, and not something similar but different. When the errata is available, you will be informed through your case again.

Comment 48 Tomas Capek 2012-05-16 11:47:31 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
    On NFS, when a directory whose contents kept changing was read repeatedly, the client issued the same readdir request twice. Consequently, the following warning message was written to the dmesg output:

    NFS: directory A/B/C contains a readdir loop.

    This update fixes the bug by turning off loop detection in the described scenario and letting the NFS client try to recover, so the message is no longer logged.

Comment 50 errata-xmlrpc 2012-06-20 08:12:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2012-0862.html

Comment 51 Rajesh 2013-01-17 10:42:02 UTC
*** Bug 849300 has been marked as a duplicate of this bug. ***

Comment 52 Rajesh 2013-01-17 10:42:18 UTC
*** Bug 814052 has been marked as a duplicate of this bug. ***

Comment 53 Vijay Bellur 2013-02-26 08:34:25 UTC
*** Bug 854636 has been marked as a duplicate of this bug. ***