Bug 531016 - NFS: stale nfs_fattr passed to nfs_readdir_lookup()
Summary: NFS: stale nfs_fattr passed to nfs_readdir_lookup()
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.2
Hardware: All
OS: Linux
high
high
Target Milestone: rc
: ---
Assignee: Harshula Jayasuriya
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2009-10-26 14:35 UTC by Harshula Jayasuriya
Modified: 2018-10-27 12:26 UTC (History)
7 users (show)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-03-30 07:15:42 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
linux-2.6-nfs-initialise-nfs_fattr-in-nfs_do_filldir.patch (474 bytes, patch)
2009-10-26 14:38 UTC, Harshula Jayasuriya
no flags Details | Diff
Backport of commit 1f4eab7e7c1d90dcd8ca4d7c064ee78dfbb345eb (3.81 KB, patch)
2009-11-23 12:07 UTC, Harshula Jayasuriya
no flags Details | Diff


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2010:0178 0 normal SHIPPED_LIVE Important: Red Hat Enterprise Linux 5.5 kernel security and bug fix update 2010-03-29 12:18:21 UTC

Description Harshula Jayasuriya 2009-10-26 14:35:25 UTC
Description of problem:

This bug appears to be triggered during an nfs_readdir() when these two conditions are satisfied:

a) An uninitialised stale struct nfs_fattr is passed to nfs_readdir_lookup()
b) The dentry can not be found in the dentry cache

------------------------------------------------------------
nfs_readdir
- nfs_revalidate_mapping
- nfs_fattr_init
- readdir_search_pagecache
  - find_dirent_page
    - read_cache_page
      - nfs_readdir_filler
    - find_dirent
    - find_dirent_index
    - dir_page_release
- nfs_do_filldir
  - nfs_readdir_lookup
    - d_lookup
    - d_alloc
    - nfs_fhget
      - nfs_refresh_inode
        - nfs_update_inode
  - filldir
------------------------------------------------------------

Most of the time nfs_readdir_lookup() finds the dentry in the dcache when d_lookup() is called. On the rare occasions when the dentry is not found in the dcache, a new dentry is allocated, nfs_fhget() is called and eventually nfs_update_inode(). Unfortunately the struct nfs_fattr that is passed to nfs_update_inode() can be stale and results in the stale metadata being copied to the inode.

An updated nfs_entry pointing to the stale nfs_fattr is returned by readdir_search_pagecache(), where the size & valid fields are incorrect:
------------------------------------------------------------
Sep 22 12:12:12 kaedila-r53-7 kernel: nfs_readdir: 2.1: fattr->size: 6037821808640 fattr->valid: 0
Sep 22 12:12:12 kaedila-r53-7 kernel: NFS: readdir_search_pagecache() searching for offset 0
Sep 22 12:12:12 kaedila-r53-7 kernel: NFS: find_dirent_page: searching page 0 for target 0
Sep 22 12:12:12 kaedila-r53-7 kernel: NFS: found cookie 325517684 at index 0
Sep 22 12:12:12 kaedila-r53-7 kernel: NFS: find_dirent_page: returns 0
Sep 22 12:12:12 kaedila-r53-7 kernel: NFS: readdir_search_pagecache: returns 0
Sep 22 12:12:12 kaedila-r53-7 kernel: nfs_readdir: 2.2: fattr->size: 71236 fattr->valid: 6
...
Sep 22 12:12:13 kaedila-r53-7 kernel: NFS: nfs_update_inode(0:16/745512 ct=1 info=0x6)
Sep 22 12:12:13 kaedila-r53-7 kernel: NFS: mtime change on server for file 0:16/745512
Sep 22 12:12:13 kaedila-r53-7 kernel: NFS: isize change on server for file 0:16/745512
Sep 22 12:12:13 kaedila-r53-7 kernel: nfs_update_inode: cur_isize: 71753 new_isize: 71236
------------------------------------------------------------

There is a check to ensure that a new dentry is only allocated if we have an nfs_fattr populated after an RPC reply:
------------------------------------------------------------
 63 #define NFS_ATTR_FATTR          0x0002          /* post-op attributes */
------------------------------------------------------------
------------------------------------------------------------
1067 static struct dentry *nfs_readdir_lookup(nfs_readdir_descriptor_t *desc)
1068 {

1114         if (!desc->plus || !(entry->fattr->valid & NFS_ATTR_FATTR))
1115                 return NULL;

1140 }
------------------------------------------------------------
The entry->fattr->valid should be 0 because there has not been an RPC reply and should have returned at line 1115. However, since the nfs_fattr is stale and entry->fattr->valid is 6, this critical check is passed and d_alloc() & nfs_update_inode() are called. The end result is that stale metadata is assumed to be recent metadata from the NFS server and is copied to the inode.


Version-Release number of selected component (if applicable):
* kernel-2.6.18-92.el5-i686

How reproducible:
* Rarely

Steps to Reproduce:
* Customer provided a set of scripts to reproduce the problem, but it requires a few NFS clients. I'm trying to develop a simpler reproducer.

Actual results:
-rw-r--r-- root/root     71236 2009-09-22 05:35:03 ./kaedila-r53-2

Expected results:
-rw-r--r-- root/root     71764 2009-09-22 13:36:52 ./kaedila-r53-2


Additional info:
* Able to reproduce the problem on 2.6.18-168.
* I tested a patch that reinitialises the struct nfs_fattr in nfs_do_filldir() before calling nfs_readdir_lookup(). It fixes the problem.

Comment 1 Harshula Jayasuriya 2009-10-26 14:38:53 UTC
Created attachment 366109 [details]
linux-2.6-nfs-initialise-nfs_fattr-in-nfs_do_filldir.patch

Comment 14 Harshula Jayasuriya 2009-11-19 23:40:37 UTC
Hi Peter,

In your email conversation with Trond, the fields in nfs_readdir_descriptor_t he's referring to look like they were introduced in:
------------------------------------------------------------
commit 1f4eab7e7c1d90dcd8ca4d7c064ee78dfbb345eb
Author: Neil Brown <neilb>
Date:   Mon Apr 16 09:35:27 2007 +1000

    NFS: Set meaningful value for fattr->time_start in readdirplus results.
    
    Don't use uninitialsed value for fattr->time_start in readdirplus results.
    
    The 'fattr' structure filled in by nfs3_decode_direct does not get a
    value for ->time_start set.
    Thus if an entry is for an inode that we already have in cache,
    when nfs_readdir_lookup calls nfs_fhget, it will call nfs_refresh_inode
    and may update the inode with out-of-date information.
    
    Directories are read a page at a time, so each page could have a
    different timestamp that "should" be used to set the time_start for
    the fattr for info in that page.  However storing the timestamp per
    page is awkward.  (We could stick in the first 4 bytes and only read 4092
    bytes, but that is a bigger code change than I am interested it).
    
    This patch ignores the readdir_plus attributes if a readdir finds the
    information already in cache, and otherwise sets ->time_start to the time
    the readdir request was sent to the server.
    
    It might be nice to store - in the directory inode - the time stamp for
    the earliest readdir request that is still in the page cache, so that we
    don't ignore attribute data that we don't have to.  This patch doesn't do
    that.
    
    Signed-off-by: Neil Brown <neilb>
    Signed-off-by: Trond Myklebust <Trond.Myklebust>
------------------------------------------------------------

I'll try and backport the patch, build a kernel and test it. Fingers crossed ...

Comment 15 Harshula Jayasuriya 2009-11-23 12:07:11 UTC
Created attachment 373091 [details]
Backport of commit 1f4eab7e7c1d90dcd8ca4d7c064ee78dfbb345eb

The bug was not triggered over the weekend with this patch. I will keep testing the patch this week. However, this patch does modify nfs_readdir_descriptor_t.

Comment 18 RHEL Program Management 2009-11-25 23:10:38 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 20 Don Zickus 2009-12-02 21:14:46 UTC
in kernel-2.6.18-176.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5

Please do NOT transition this bugzilla state to VERIFIED until our QE team
has sent specific instructions indicating when to do so.  However feel free
to provide a comment indicating that this fix has been verified.

Comment 22 Chris Ward 2010-02-11 10:25:02 UTC
~~ Attention Customers and Partners - RHEL 5.5 Beta is now available on RHN ~~

RHEL 5.5 Beta has been released! There should be a fix present in this 
release that addresses your request. Please test and report back results 
here, by March 3rd 2010 (2010-03-03) or sooner.

Upon successful verification of this request, post your results and update 
the Verified field in Bugzilla with the appropriate value.

If you encounter any issues while testing, please describe them and set 
this bug into NEED_INFO. If you encounter new defects or have additional 
patch(es) to request for inclusion, please clone this bug per each request
and escalate through your support representative.

Comment 26 errata-xmlrpc 2010-03-30 07:15:42 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on therefore solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2010-0178.html


Note You need to log in before you can comment on or make changes to this bug.