1158120 – Data corruption due to lack of cache revalidation on open

Bug 1158120 - Data corruption due to lack of cache revalidation on open

Summary: Data corruption due to lack of cache revalidation on open

Keywords:
Status:	CLOSED WORKSFORME
Alias:	None
Product:	GlusterFS
Classification:	Community
Component:	md-cache
Sub Component:
Version:	mainline
Hardware:	All
OS:	Linux
Priority:	unspecified
Severity:	high
Target Milestone:	---
Assignee:	bugs@gluster.org
QA Contact:
Docs Contact:
URL:
Whiteboard:	gluster-stale-cache, GLUSTERFS_METADA...
Duplicates (1):	1191100 (view as bug list)
Depends On:	1191100
Blocks:
TreeView+	depends on / blocked

Reported:	2014-10-28 15:55 UTC by Philip Spencer
Modified:	2019-04-30 10:00 UTC (History)
CC List:	10 users (show)
Fixed In Version:	glusterfs-6.1
Clone Of:
Clones:	1191100 (view as bug list)
Environment:
Last Closed:	2019-04-30 10:00:41 UTC
Regression:	---
Mount Type:	---
Documentation:	---
CRM:
Verified Versions:
Embargoed:
Dependent Products:

Attachments	(Terms of Use)
Patch to check cache validity when opening or locking a file (3.46 KB, patch) 2014-10-28 15:55 UTC, Philip Spencer	no flags	Details \| Diff
Patch to check cache validity when opening or locking a file (corrected) (3.48 KB, patch) 2014-10-28 20:36 UTC, Philip Spencer	no flags	Details \| Diff
Show Obsolete (1) View All

Description Philip Spencer 2014-10-28 15:55:14 UTC

Created attachment 951437 [details]
Patch to check cache validity when opening or locking a file

We've started to see data corruption happening when processes on different hosts are accessing the same SQLite database, coordinating their efforts through proper locking.

We eventually found the following simple scenario that reproduces the problem. Here $D is a directory mounted via a native glusterfs mount, and $OTH is another host that also has $D mounted and can be ssh'd to without a password and reachable quickly (in < 0.3 seconds or so).

$ echo init > $D/f; cat $D/f; ssh $OTHER_HOST "echo second > $D/$f"; cat $D/f
init
init

If one does a directory listing before the final cat, then one gets the expected result:

$ echo init > $D/f; cat $D/f; ssh $OTHER_HOST "echo second > $D/$f"; ls $D > /dev/null; cat $D/f
init
second

Also, if one omits the first cat, or does another modification to the file after it, then one also gets the expected result:

$ echo init > $D/f; ssh $OTHER_HOST "echo second > $D/$f"; cat $D/f
second

There seem to be several different bugs contributing to this problem and I will post separate bug reports and patches for each of them.

This bug report concerns the case when the default volume options are used on mount. Then fopen-keep-cache is used by default, and fuse does not flush its cache from the first "cat $D/f" before doing the second "cat $D/f", and nothing else has told fuse to invalidate its cache. That is why the incorrect data is displayed.

When one does an "ls $D" prior to the second cat, then code in md-cache.c invalidates the cache for every file within the directory whose contents have changed. This then gets passed up to fuse-bridge to tell fuse to flush its page cache.

However, no such check is done when opening a file, and I think it should be, and also should be done when obtaining a lock on part of a file that is already open.

The attached patch is what we are now using to avoid this problem. It may not be implemented in the best way, as I was not sure how to access the iatt data for the file on an open, so the patch, after a successful open callback, winds an fstat fop down the stack, and, on return, checks the iatt values from that. Better would be for the open fop to return the iatt value directly, the way other fops do, since I think the storage backend does a stat anyway when responding to an open call, and if it were to return that stat information there would be no performance hit in using it. As it stands, the patch does incur a slight performance penalty.

Even with this patch, the behaviour is still not always entirely correct due to other bugs which I will report separately.

The patch does a similar thing for lock calls.

The patch is against 3.6.0beta3.

Comment 1 Philip Spencer 2014-10-28 20:36:24 UTC

Created attachment 951526 [details]
Patch to check cache validity when opening or locking a file (corrected)

Oops ... pretty fundamental typo in the first patch that forgot the "op_errno !=0" to skip doing the iatt check if op_errno was nonzero! Would result in locks appearing to succeed (returning 0) when in fact they failed!

Revised patch is attached.

Comment 2 krishnan parthasarathi 2014-11-04 12:44:49 UTC

Philip,

Thanks for opening this issue and posting a patch.  We use Gerrit (review.gluster.org) as our collaborative code review tool. It would be helpful if you could submit your patches there (Ref: http://www.gluster.org/community/documentation/index.php/Simplified_dev_workflow).

Comment 3 Amar Tumballi 2017-08-10 13:04:16 UTC

*** Bug 1191100 has been marked as a duplicate of this bug. ***

Comment 4 Vijay Bellur 2018-11-19 08:01:29 UTC

Raghavendra, Poornima - can you please comment on the validity of this patch now? Thanks!

Comment 5 Yaniv Kaul 2019-04-27 19:34:50 UTC

Status?

Comment 6 Amar Tumballi 2019-04-30 10:00:41 UTC

This is no more relevant as we did multiple fixes in our caching layers and fixed may issues for hosting DB workload on GlusterFS. Closing as WORKSFORME on GlusterFS-6.1 release.

Note You need to log in before you can comment on or make changes to this bug.