Bug 1191100 - Data corruption due to lack of cache revalidation on open
Summary: Data corruption due to lack of cache revalidation on open
Keywords:
Status: CLOSED DUPLICATE of bug 1158120
Alias: None
Product: GlusterFS
Classification: Community
Component: core
Version: mainline
Hardware: All
OS: Linux
unspecified
high
Target Milestone: ---
Assignee: bugs@gluster.org
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks: 1158120
TreeView+ depends on / blocked
 
Reported: 2015-02-10 12:56 UTC by Niels de Vos
Modified: 2017-08-10 13:04 UTC (History)
7 users (show)

Fixed In Version:
Clone Of: 1158120
Environment:
Last Closed: 2017-08-10 13:04:16 UTC
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Embargoed:


Attachments (Terms of Use)

Description Niels de Vos 2015-02-10 12:56:22 UTC
+++ This bug was initially created as a clone of Bug #1158120 +++
+++                                                           +++
+++ Use this bug to post the patch for mainline.              +++

We've started to see data corruption happening when processes on different hosts are accessing the same SQLite database, coordinating their efforts through proper locking.

We eventually found the following simple scenario that reproduces the problem. Here $D is a directory mounted via a native glusterfs mount, and $OTH is another host that also has $D mounted and can be ssh'd to without a password and reachable quickly (in < 0.3 seconds or so).

$ echo init > $D/f; cat $D/f; ssh $OTHER_HOST "echo second > $D/$f"; cat $D/f
init
init

If one does a directory listing before the final cat, then one gets the expected result:

$ echo init > $D/f; cat $D/f; ssh $OTHER_HOST "echo second > $D/$f"; ls $D > /dev/null; cat $D/f
init
second

Also, if one omits the first cat, or does another modification to the file after it, then one also gets the expected result:

$ echo init > $D/f; ssh $OTHER_HOST "echo second > $D/$f"; cat $D/f
second

There seem to be several different bugs contributing to this problem and I will post separate bug reports and patches for each of them.

This bug report concerns the case when the default volume options are used on mount. Then fopen-keep-cache is used by default, and fuse does not flush its cache from the first "cat $D/f" before doing the second "cat $D/f", and nothing else has told fuse to invalidate its cache. That is why the incorrect data is displayed.

When one does an "ls $D" prior to the second cat, then code in md-cache.c invalidates the cache for every file within the directory whose contents have changed. This then gets passed up to fuse-bridge to tell fuse to flush its page cache.

However, no such check is done when opening a file, and I think it should be, and also should be done when obtaining a lock on part of a file that is already open.

The attached patch is what we are now using to avoid this problem. It may not be implemented in the best way, as I was not sure how to access the iatt data for the file on an open, so the patch, after a successful open callback, winds an fstat fop down the stack, and, on return, checks the iatt values from that. Better would be for the open fop to return the iatt value directly, the way other fops do, since I think the storage backend does a stat anyway when responding to an open call, and if it were to return that stat information there would be no performance hit in using it. As it stands, the patch does incur a slight performance penalty.

Even with this patch, the behaviour is still not always entirely correct due to other bugs which I will report separately.

The patch does a similar thing for lock calls.

The patch is against 3.6.0beta3.

--- Additional comment from Philip Spencer on 2014-10-28 21:36:24 CET ---

Oops ... pretty fundamental typo in the first patch that forgot the "op_errno !=0" to skip doing the iatt check if op_errno was nonzero! Would result in locks appearing to succeed (returning 0) when in fact they failed!

Revised patch is attached.

--- Additional comment from krishnan parthasarathi on 2014-11-04 13:44:49 CET ---

Philip,

Thanks for opening this issue and posting a patch.  We use Gerrit (review.gluster.org) as our collaborative code review tool. It would be helpful if you could submit your patches there (Ref: http://www.gluster.org/community/documentation/index.php/Simplified_dev_workflow).

Comment 1 Amar Tumballi 2017-08-10 13:04:16 UTC

*** This bug has been marked as a duplicate of bug 1158120 ***


Note You need to log in before you can comment on or make changes to this bug.