Login
[x]
Log in using an account from:
Fedora Account System
Red Hat Associate
Red Hat Customer
Or login using a Red Hat Bugzilla account
Forgot Password
Login:
Hide Forgot
Create an Account
Red Hat Bugzilla – Attachment 309620 Details for
Bug 436132
Reduce NFS cache invalidations due to writes from same client
[?]
New
Simple Search
Advanced Search
My Links
Browse
Requests
Reports
Current State
Search
Tabular reports
Graphical reports
Duplicates
Other Reports
User Changes
Plotly Reports
Bug Status
Bug Severity
Non-Defaults
|
Product Dashboard
Help
Page Help!
Bug Writing Guidelines
What's new
Browser Support Policy
5.0.4.rh83 Release notes
FAQ
Guides index
User guide
Web Services
Contact
Legal
This site requires JavaScript to be enabled to function correctly, please enable it.
[patch]
5 patch set
nfs-overlapping-writes.patch (text/plain), 16.04 KB, created by
Jeff Layton
on 2008-06-17 14:39:23 UTC
(
hide
)
Description:
5 patch set
Filename:
MIME Type:
Creator:
Jeff Layton
Created:
2008-06-17 14:39:23 UTC
Size:
16.04 KB
patch
obsolete
>------------------------------------------------------------------------------- >time-add-time_in_range_exclusi >------------------------------------------------------------------------------- >jiffies: add time_between macro > >From: Jeff Layton <jlayton@redhat.com> > >The time_in_range macro was added fairly recently to detect when a >jiffies value is between or equal to two given jiffies values. Add >a time_between macro to do the same thing, but to exclude the edge >cases (the _eq part). Needed for some NFS calculations. > >Signed-off-by: Jeff Layton <jlayton@redhat.com> >--- > > include/linux/jiffies.h | 4 ++++ > 1 files changed, 4 insertions(+), 0 deletions(-) > > >diff --git a/include/linux/jiffies.h b/include/linux/jiffies.h >index abb6ac6..9b4a214 100644 >--- a/include/linux/jiffies.h >+++ b/include/linux/jiffies.h >@@ -119,6 +119,10 @@ static inline u64 get_jiffies_64(void) > (time_after_eq(a,b) && \ > time_before_eq(a,c)) > >+#define time_between(a,b,c) \ >+ (time_after(a,b) && \ >+ time_before(a,c)) >+ > /* Same as above, but does so with platform independent 64bit types. > * These must be used when utilizing jiffies_64 (i.e. return value of > * get_jiffies_64() */ >------------------------------------------------------------------------------- >nfs-don-t-use-post_op_attrs-fr >------------------------------------------------------------------------------- >NFS: don't use post_op_attrs from WRITE replies for cache revalidation > >From: Jeff Layton <jlayton@redhat.com> > >Only use post_op_attrs from write replies for updating the cached >attributes. Additionally, don't even take the cached attributes if >the mtime in them is older than the current mtime on the inode. > >Signed-off-by: Jeff Layton <jlayton@redhat.com> >--- > > fs/nfs/inode.c | 60 +++++++++++++++++++++++++++++++++++++++++++++++ > fs/nfs/nfs3proc.c | 6 +++-- > fs/nfs/nfs4proc.c | 2 +- > fs/nfs/proc.c | 6 +++-- > include/linux/nfs_fs.h | 1 + > include/linux/nfs_xdr.h | 1 + > 6 files changed, 71 insertions(+), 5 deletions(-) > > >diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c >index 596c5d8..40a7a5f 100644 >--- a/fs/nfs/inode.c >+++ b/fs/nfs/inode.c >@@ -953,6 +953,66 @@ int nfs_post_op_update_inode(struct inode *inode, struct nfs_fattr *fattr) > } > > /** >+ * nfs_update_inode_attributes(struct inode *inode, struct nfs_fattr *fattr) >+ * @inode - pointer to inode >+ * @fattr - updated attributes >+ * >+ * Some operations (writes, in particular) will change inode metadata, but >+ * because of ordering problems we can't really trust that the data returned >+ * is current enough to use to revalidate the cache. This is also a problem >+ * for other calls that return attributes when they overlap with a write (or >+ * another call that changes inode metadata). In this case, we just want to >+ * use the fattr to update the cached attributes. For some calls (again, mostly >+ * writes), we don't even want to do that if the mtime in fattr is older than >+ * the one that the inode currently has. >+ **/ >+void >+nfs_update_inode_attributes(struct inode *inode, struct nfs_fattr *fattr) >+{ >+ struct nfs_inode *nfsi = NFS_I(inode); >+ loff_t cur_size, new_size; >+ unsigned long now = jiffies; >+ >+ /* mark attrcache as suspect since we're not revalidating */ >+ nfsi->cache_validity |= NFS_INO_INVALID_ATTR; >+ if (S_ISDIR(inode->i_mode)) >+ nfsi->cache_validity |= NFS_INO_INVALID_DATA; >+ >+ /* only update attrs if mtime isn't older on v2/3 write replies */ >+ if ((fattr->valid & NFS_ATTR_NO_MTIME_REVERSE) && >+ timespec_compare(&inode->i_mtime, &fattr->mtime) > 0) >+ return; >+ >+ /* has the file grown beyond our last write? */ >+ cur_size = i_size_read(inode); >+ new_size = nfs_size_to_loff_t(fattr->size); >+ if (new_size > cur_size) { >+ inode->i_size = new_size; >+ nfsi->cache_validity |= NFS_INO_INVALID_DATA; >+ } >+ >+ inode->i_mode = fattr->mode; >+ inode->i_nlink = fattr->nlink; >+ inode->i_uid = fattr->uid; >+ inode->i_gid = fattr->gid; >+ nfsi->change_attr = fattr->change_attr; >+ memcpy(&inode->i_mtime, &fattr->mtime, sizeof(inode->i_mtime)); >+ memcpy(&inode->i_ctime, &fattr->ctime, sizeof(inode->i_ctime)); >+ memcpy(&inode->i_atime, &fattr->atime, sizeof(inode->i_atime)); >+ >+ if (fattr->valid & (NFS_ATTR_FATTR_V3 | NFS_ATTR_FATTR_V4)) >+ inode->i_blocks = nfs_calc_block_size(fattr->du.nfs3.used); >+ else >+ inode->i_blocks = fattr->du.nfs2.blocks; >+ >+ /* set the attrcache timestamps so that we reval soon */ >+ nfsi->read_cache_jiffies = fattr->time_start; >+ nfsi->attrtimeo = NFS_MINATTRTIMEO(inode); >+ nfsi->attrtimeo_timestamp = now; >+ nfsi->last_updated = now; >+} >+ >+/** > * nfs_post_op_update_inode_force_wcc - try to update the inode attribute cache > * @inode - pointer to inode > * @fattr - updated attributes >diff --git a/fs/nfs/nfs3proc.c b/fs/nfs/nfs3proc.c >index c3523ad..4e4bdc5 100644 >--- a/fs/nfs/nfs3proc.c >+++ b/fs/nfs/nfs3proc.c >@@ -738,8 +738,10 @@ static int nfs3_write_done(struct rpc_task *task, struct nfs_write_data *data) > { > if (nfs3_async_handle_jukebox(task, data->inode)) > return -EAGAIN; >- if (task->tk_status >= 0) >- nfs_post_op_update_inode_force_wcc(data->inode, data->res.fattr); >+ if (task->tk_status >= 0) { >+ data->res.fattr->valid |= NFS_ATTR_NO_MTIME_REVERSE; >+ nfs_update_inode_attributes(data->inode, data->res.fattr); >+ } > return 0; > } > >diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c >index 1293e0a..433f89a 100644 >--- a/fs/nfs/nfs4proc.c >+++ b/fs/nfs/nfs4proc.c >@@ -2440,7 +2440,7 @@ static int nfs4_write_done(struct rpc_task *task, struct nfs_write_data *data) > } > if (task->tk_status >= 0) { > renew_lease(NFS_SERVER(inode), data->timestamp); >- nfs_post_op_update_inode_force_wcc(inode, data->res.fattr); >+ nfs_update_inode_attributes(inode, data->res.fattr); > } > return 0; > } >diff --git a/fs/nfs/proc.c b/fs/nfs/proc.c >index 03599bf..96c10dc 100644 >--- a/fs/nfs/proc.c >+++ b/fs/nfs/proc.c >@@ -572,8 +572,10 @@ static void nfs_proc_read_setup(struct nfs_read_data *data, struct rpc_message * > > static int nfs_write_done(struct rpc_task *task, struct nfs_write_data *data) > { >- if (task->tk_status >= 0) >- nfs_post_op_update_inode_force_wcc(data->inode, data->res.fattr); >+ if (task->tk_status >= 0) { >+ data->res.fattr->valid |= NFS_ATTR_NO_MTIME_REVERSE; >+ nfs_update_inode_attributes(data->inode, data->res.fattr); >+ } > return 0; > } > >diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h >index 27d6a8d..051aa04 100644 >--- a/include/linux/nfs_fs.h >+++ b/include/linux/nfs_fs.h >@@ -320,6 +320,7 @@ extern struct inode *nfs_fhget(struct super_block *, struct nfs_fh *, > struct nfs_fattr *); > extern int nfs_refresh_inode(struct inode *, struct nfs_fattr *); > extern int nfs_post_op_update_inode(struct inode *inode, struct nfs_fattr *fattr); >+extern void nfs_update_inode_attributes(struct inode *inode, struct nfs_fattr *fattr); > extern int nfs_post_op_update_inode_force_wcc(struct inode *inode, struct nfs_fattr *fattr); > extern int nfs_getattr(struct vfsmount *, struct dentry *, struct kstat *); > extern int nfs_permission(struct inode *, int, struct nameidata *); >diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h >index 24263bb..4236823 100644 >--- a/include/linux/nfs_xdr.h >+++ b/include/linux/nfs_xdr.h >@@ -64,6 +64,7 @@ struct nfs_fattr { > #define NFS_ATTR_FATTR_V4 0x0008 /* NFSv4 change attribute */ > #define NFS_ATTR_WCC_V4 0x0010 /* pre-op change attribute */ > #define NFS_ATTR_FATTR_V4_REFERRAL 0x0020 /* NFSv4 referral */ >+#define NFS_ATTR_NO_MTIME_REVERSE 0x0040 /* no backward mtime updates */ > > /* > * Info on the file system >------------------------------------------------------------------------------- >nfs-track-number-of-writes-on- >------------------------------------------------------------------------------- >NFS: track number of writes on the wire and when last one started/ended > >From: Jeff Layton <jlayton@redhat.com> > >Track the number of writes currently outstanding in the RPC queue, and >also track when the last write call started and ended. Tracking this >should give us information that we can use to determine whether an RPC >call overlapped with a write. > >Signed-off-by: Jeff Layton <jlayton@redhat.com> >--- > > fs/nfs/write.c | 13 +++++++++++++ > include/linux/nfs_fs.h | 3 +++ > 2 files changed, 16 insertions(+), 0 deletions(-) > > >diff --git a/fs/nfs/write.c b/fs/nfs/write.c >index 6d8ace3..abd2c09 100644 >--- a/fs/nfs/write.c >+++ b/fs/nfs/write.c >@@ -785,6 +785,7 @@ static int nfs_write_rpcsetup(struct nfs_page *req, > int how) > { > struct inode *inode = req->wb_context->path.dentry->d_inode; >+ struct nfs_inode *nfsi = NFS_I(inode); > int flags = (how & FLUSH_SYNC) ? 0 : RPC_TASK_ASYNC; > int priority = flush_task_priority(how); > struct rpc_task *task; >@@ -832,6 +833,10 @@ static int nfs_write_rpcsetup(struct nfs_page *req, > /* Set up the initial task struct. */ > NFS_PROTO(inode)->write_setup(data, &msg); > >+ /* update overlapping write detection info */ >+ nfsi->nwrites++; >+ nfsi->last_write_start = jiffies; >+ > dprintk("NFS: %5u initiated write call " > "(req %s/%Ld, %u bytes @ offset %Lu)\n", > data->task.tk_pid, >@@ -994,6 +999,10 @@ static void nfs_writeback_release_partial(void *calldata) > struct nfs_page *req = data->req; > struct page *page = req->wb_page; > int status = data->task.tk_status; >+ struct nfs_inode *nfsi = NFS_I(data->inode); >+ >+ nfsi->nwrites--; >+ nfsi->last_write_done = jiffies; > > if (status < 0) { > nfs_set_pageerror(page); >@@ -1049,6 +1058,10 @@ static void nfs_writeback_release_full(void *calldata) > { > struct nfs_write_data *data = calldata; > int status = data->task.tk_status; >+ struct nfs_inode *nfsi = NFS_I(data->inode); >+ >+ nfsi->nwrites--; >+ nfsi->last_write_done = jiffies; > > /* Update attributes as result of writeback. */ > while (!list_empty(&data->pages)) { >diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h >index 051aa04..ae4394b 100644 >--- a/include/linux/nfs_fs.h >+++ b/include/linux/nfs_fs.h >@@ -165,6 +165,9 @@ struct nfs_inode { > /* List of deferred sillydelete requests */ > struct hlist_head silly_list; > wait_queue_head_t waitqueue; >+ unsigned int nwrites; /* # writes on wire */ >+ unsigned long last_write_start; >+ unsigned long last_write_done; > > #ifdef CONFIG_NFS_V4 > struct nfs4_cached_acl *nfs4_acl; >------------------------------------------------------------------------------- >nfs-change-nfs_refresh_inode-t >------------------------------------------------------------------------------- >NFS: change nfs_refresh_inode to detect overlapping writes > >From: Jeff Layton <jlayton@redhat.com> > >When a call that returns post_op_attrs overlaps with a write, then >the attributes returned are suspect. It could reflect the status of >the inode at any time before, during or after the write and we can't >predict which. > >When this happens we don't want to invalidate the cached data since >that can be very expensive. Instead, just update the inode attributes >and mark the attrcache as invalid. Also in this case, if the mtime >would move backward, then don't even update the attributes. Just mark >the attrcache as invalid. > >Signed-off-by: Jeff Layton <jlayton@redhat.com> >--- > > fs/nfs/inode.c | 37 +++++++++++++++++++++++++++++++++---- > 1 files changed, 33 insertions(+), 4 deletions(-) > > >diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c >index 40a7a5f..2cf3140 100644 >--- a/fs/nfs/inode.c >+++ b/fs/nfs/inode.c >@@ -55,6 +55,7 @@ > static int enable_ino64 = NFS_64_BIT_INODE_NUMBERS_ENABLED; > > static void nfs_invalidate_inode(struct inode *); >+static bool nfs_overlapping_write(struct inode *inode, struct nfs_fattr *fattr); > static int nfs_update_inode(struct inode *, struct nfs_fattr *); > > static void nfs_zap_acl_cache(struct inode *); >@@ -675,7 +676,12 @@ __nfs_revalidate_inode(struct nfs_server *server, struct inode *inode) > } > > spin_lock(&inode->i_lock); >- status = nfs_update_inode(inode, &fattr); >+ >+ if (nfs_overlapping_write(inode, &fattr)) >+ nfs_update_inode_attributes(inode, &fattr); >+ else >+ status = nfs_update_inode(inode, &fattr); >+ > if (status) { > spin_unlock(&inode->i_lock); > dfprintk(PAGECACHE, "nfs_revalidate_inode: (%s/%Ld) refresh failed, error=%d\n", >@@ -899,6 +905,28 @@ static int nfs_check_inode_attributes(struct inode *inode, struct nfs_fattr *fat > return 0; > } > >+/* >+ * nfs_overlapping_write - determine whether a write overlapped with this call >+ * @inode - pointer to inode >+ * @fattr - updated attributes >+ * >+ * Did the call that returned these attributes overlap with a write to the >+ * same inode? >+ */ >+static bool >+nfs_overlapping_write(struct inode *inode, struct nfs_fattr *fattr) >+{ >+ struct nfs_inode *nfsi = NFS_I(inode); >+ unsigned long now = jiffies; >+ >+ if (nfsi->nwrites != 0 || >+ time_between(nfsi->last_write_start, fattr->time_start, now) || >+ time_between(nfsi->last_write_done, fattr->time_start, now)) >+ return true; >+ >+ return false; >+} >+ > /** > * nfs_refresh_inode - try to update the inode attribute cache > * @inode - pointer to inode >@@ -912,16 +940,17 @@ static int nfs_check_inode_attributes(struct inode *inode, struct nfs_fattr *fat > int nfs_refresh_inode(struct inode *inode, struct nfs_fattr *fattr) > { > struct nfs_inode *nfsi = NFS_I(inode); >- int status; >+ int status = 0; > > if ((fattr->valid & NFS_ATTR_FATTR) == 0) > return 0; > spin_lock(&inode->i_lock); >- if (time_after(fattr->time_start, nfsi->last_updated)) >+ if (nfs_overlapping_write(inode, fattr)) >+ nfs_update_inode_attributes(inode, fattr); >+ else if (time_after(fattr->time_start, nfsi->last_updated)) > status = nfs_update_inode(inode, fattr); > else > status = nfs_check_inode_attributes(inode, fattr); >- > spin_unlock(&inode->i_lock); > return status; > } >------------------------------------------------------------------------------- >nfs-remove-nfs_post_op_update_ >------------------------------------------------------------------------------- >NFS: remove nfs_post_op_update_inode_force_wcc > >From: Jeff Layton <jlayton@redhat.com> > >This has been superceded by nfs_update_inode_attributes() > >Signed-off-by: Jeff Layton <jlayton@redhat.com> >--- > > fs/nfs/inode.c | 28 ---------------------------- > include/linux/nfs_fs.h | 1 - > 2 files changed, 0 insertions(+), 29 deletions(-) > > >diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c >index 2cf3140..283e583 100644 >--- a/fs/nfs/inode.c >+++ b/fs/nfs/inode.c >@@ -1041,34 +1041,6 @@ nfs_update_inode_attributes(struct inode *inode, struct nfs_fattr *fattr) > nfsi->last_updated = now; > } > >-/** >- * nfs_post_op_update_inode_force_wcc - try to update the inode attribute cache >- * @inode - pointer to inode >- * @fattr - updated attributes >- * >- * After an operation that has changed the inode metadata, mark the >- * attribute cache as being invalid, then try to update it. Fake up >- * weak cache consistency data, if none exist. >- * >- * This function is mainly designed to be used by the ->write_done() functions. >- */ >-int nfs_post_op_update_inode_force_wcc(struct inode *inode, struct nfs_fattr *fattr) >-{ >- if ((fattr->valid & NFS_ATTR_FATTR_V4) != 0 && >- (fattr->valid & NFS_ATTR_WCC_V4) == 0) { >- fattr->pre_change_attr = NFS_I(inode)->change_attr; >- fattr->valid |= NFS_ATTR_WCC_V4; >- } >- if ((fattr->valid & NFS_ATTR_FATTR) != 0 && >- (fattr->valid & NFS_ATTR_WCC) == 0) { >- memcpy(&fattr->pre_ctime, &inode->i_ctime, sizeof(fattr->pre_ctime)); >- memcpy(&fattr->pre_mtime, &inode->i_mtime, sizeof(fattr->pre_mtime)); >- fattr->pre_size = inode->i_size; >- fattr->valid |= NFS_ATTR_WCC; >- } >- return nfs_post_op_update_inode(inode, fattr); >-} >- > /* > * Many nfs protocol calls return the new file attributes after > * an operation. Here we update the inode to reflect the state >diff --git a/include/linux/nfs_fs.h b/include/linux/nfs_fs.h >index ae4394b..fa66a5a 100644 >--- a/include/linux/nfs_fs.h >+++ b/include/linux/nfs_fs.h >@@ -324,7 +324,6 @@ extern struct inode *nfs_fhget(struct super_block *, struct nfs_fh *, > extern int nfs_refresh_inode(struct inode *, struct nfs_fattr *); > extern int nfs_post_op_update_inode(struct inode *inode, struct nfs_fattr *fattr); > extern void nfs_update_inode_attributes(struct inode *inode, struct nfs_fattr *fattr); >-extern int nfs_post_op_update_inode_force_wcc(struct inode *inode, struct nfs_fattr *fattr); > extern int nfs_getattr(struct vfsmount *, struct dentry *, struct kstat *); > extern int nfs_permission(struct inode *, int, struct nameidata *); > extern int nfs_open(struct inode *, struct file *);
You cannot view the attachment while viewing its details because your browser does not support IFRAMEs.
View the attachment on a separate page
.
View Attachment As Diff
View Attachment As Raw
Actions:
View
|
Diff
Attachments on
bug 436132
:
297190
|
297191
|
297250
|
297251
|
297252
|
309488
|
309489
|
309490
| 309620