Bug 761738 (GLUSTER-6)

Summary: SQLite does not work on a replicate mountpoint
Product: [Community] GlusterFS
Reporter: Vikas Gorur <vikas>
Component: replicate
Assignee: Pavan Vilas Sondur <pavan>
Status: CLOSED WORKSFORME
Severity: medium
Priority: low
Version: mainline
CC: amarts, gluster-bugs, gordan, gowda, hauser, vijay
Hardware: All
OS: Linux
Doc Type: Bug Fix

Description Vikas Gorur 2009-06-08 06:48:21 UTC
Reported by Gordan Bobic <gordan> on the mailing list:

Bugs Affecting SQLite
=====================
There seems to be an issue that reliably (but very subtly) affects some
of the SQLite functionality. This is evident in the way the RPM database
behaves. (It was converted to SQLite because, as far as I can tell, BDB
needs writable mmap(), which means it won't work on any FUSE-based fs.)
For example, RPM just won't find some of the packages even though they
are installed. Here is an example (a somewhat ironic one, you might say):

# ls -la /usr/lib64/libfuse.so.*
lrwxrwxrwx 1 root root     16 May 25 12:39 /usr/lib64/libfuse.so.2 -> libfuse.so.2.7.4
-rwxr-xr-x 1 root root 134256 Feb 19 21:40 /usr/lib64/libfuse.so.2.7.4

# rpm -q --whatprovides /usr/lib64/libfuse.so.2
fuse-libs-2.7.4glfs11-1

# rpm -Uvh glusterfs-client-2.0.2-1.el5.x86_64.rpm glusterfs-server-2.0.2-1.el5.x86_64.rpm glusterfs-client-2.0.2-1.el5.x86_64.rpm
warning: package glusterfs-client = 2.0.2-1.el5 was already added, skipping glusterfs-client < 2.0.2-1.el5
error: Failed dependencies:
         libfuse.so.2()(64bit) is needed by glusterfs-client-2.0.2-1.el5.x86_64
         libfuse.so.2(FUSE_2.4)(64bit) is needed by glusterfs-client-2.0.2-1.el5.x86_64
         libfuse.so.2(FUSE_2.5)(64bit) is needed by glusterfs-client-2.0.2-1.el5.x86_64
         libfuse.so.2(FUSE_2.6)(64bit) is needed by glusterfs-client-2.0.2-1.el5.x86_64

So libfuse is there, and RPM knows that the fuse-libs-2.7.4glfs11-1
package provides it, and yet when glusterfs tries to install, RPM fails
to find it. This _only_ happens when the RPM DB (/var/lib/rpm) is on
glusterfs. The same package set installs just fine on machines that
aren't rooted on glusterfs. rpm --rebuilddb doesn't alter the situation
at all; the issue is still present after the DB rebuild.

If the above is deemed difficult to set up, there is another way to
easily recreate an SQLite-related issue. Mount /home via glusterfs, log
into X, and fire up Firefox 3.0.x (I'm using 3.0.10 on x86_64, but this
has been reproducible for a very long time with older versions, too).
Add a bookmark. It'll show up in the bookmarks menu. Now exit Firefox,
wait a few seconds for it to shut down, and fire it up again. Check the
bookmarks - the page you have just added won't be there.

I only tested this (/home) with both nodes being up, I haven't tried it
with one node being down.
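
The bookmark symptom can probably be checked without Firefox by committing a row, closing the database, and reopening it. The following is a minimal Python sketch and not part of the original report; the function name `sqlite_roundtrip`, the file name `probe.sqlite`, and the table layout are assumptions made for illustration. Point `directory` at a directory on the glusterfs mount.

```python
import os
import sqlite3


def sqlite_roundtrip(directory, title="probe bookmark"):
    """Commit one row, close the database, reopen it, and report
    whether exactly that row is read back."""
    db_path = os.path.join(directory, "probe.sqlite")  # hypothetical file name

    # First connection: create the table and commit a single row.
    conn = sqlite3.connect(db_path)
    try:
        conn.execute("CREATE TABLE IF NOT EXISTS bookmarks (title TEXT)")
        conn.execute("DELETE FROM bookmarks")
        conn.execute("INSERT INTO bookmarks (title) VALUES (?)", (title,))
        conn.commit()
    finally:
        conn.close()

    # Second connection: stands in for restarting the application.
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute("SELECT title FROM bookmarks").fetchall()
    finally:
        conn.close()
    return rows == [(title,)]
```

A return value of False on the glusterfs mount, with True on a local filesystem, would match the behaviour described above. A True result does not rule the bug out, since the failure may depend on timing or on the application's access pattern.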

Has anybody got any ideas on what could be causing this or any
workarounds? In the RPM DB case, the FS is mounted with the following
parameters (from ps, after startup):
/usr/sbin/glusterfs --log-level=NONE --log-file=/dev/null --disable-direct-io-mode --volfile=/etc/glusterfs/root.vol /mnt/newroot

Home is mounted with the following:
/usr/sbin/glusterfs --log-level=NORMAL --volfile=/etc/glusterfs/home.vol /home

If these are the same bug, then this implies that direct-io-mode has no
effect on it.
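
The description assumes that BDB needs a writable mmap() and that this does not work on a FUSE mount. Whether a given mount accepts a shared writable mapping can be probed directly. This is a hedged Python sketch, not something from the original report; `writable_mmap_works` and the file name `mmap_probe.bin` are hypothetical, and the result may differ depending on kernel version and mount options such as direct-io-mode.

```python
import mmap
import os


def writable_mmap_works(directory):
    """Write through a shared writable mapping and check that the
    bytes are visible when the file is read back normally."""
    path = os.path.join(directory, "mmap_probe.bin")  # hypothetical file name
    try:
        with open(path, "w+b") as f:
            # The file must be at least as large as the mapping.
            f.write(b"\0" * mmap.PAGESIZE)
            f.flush()
            try:
                m = mmap.mmap(f.fileno(), mmap.PAGESIZE,
                              access=mmap.ACCESS_WRITE)
            except OSError:
                return False  # the filesystem refused the mapping
            try:
                m[:5] = b"hello"
                m.flush()  # push the dirty page back to the file
            finally:
                m.close()
        with open(path, "rb") as f:
            return f.read(5) == b"hello"
    finally:
        if os.path.exists(path):
            os.remove(path)
```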

Comment 1 Shehjar Tikoo 2010-03-02 08:30:58 UTC
*** Bug 65 has been marked as a duplicate of this bug. ***

Comment 2 Pavan Vilas Sondur 2010-03-23 02:48:32 UTC
*** Bug 186 has been marked as a duplicate of this bug. ***

Comment 3 Pavan Vilas Sondur 2010-03-23 02:54:45 UTC
SQLite and all the other Firefox-related bugs seem to be the same issue (Firefox uses SQLite to store bookmarks). It works on the mainline release-3.0 branch of GlusterFS.

This was probably an issue with POSIX locks in the 2.0.x releases, especially in supporting server-side AFR and flock calls. The locks code has changed considerably in the 3.0.x releases. There were also a couple of issues in 3.0.x with flock calls and with server-side AFR configurations hanging, which have been fixed.
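
Since the suspected cause is locking, and SQLite on Unix relies on POSIX advisory locks, a simple two-process probe can show whether a mount enforces exclusive locks at all. The sketch below is an illustration only, not the test used to close this bug; `lock_is_exclusive` and the child script are hypothetical names. The parent takes an exclusive lock and a second Python process then tries to take the same lock without blocking, which should be refused.

```python
import fcntl
import subprocess
import sys

# Child program: try to take the same lock without blocking.
# Exit status 0 means the lock was refused, 1 means it was granted.
_CHILD = """
import fcntl, sys
kind, path = sys.argv[1], sys.argv[2]
lock = fcntl.flock if kind == "flock" else fcntl.lockf
with open(path, "r+b") as f:
    try:
        lock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        sys.exit(0)
    sys.exit(1)
"""


def lock_is_exclusive(path, kind):
    """Hold an exclusive lock on `path` and check that a second
    process cannot obtain it.

    kind is "flock" (BSD-style flock) or "posix" (fcntl record
    lock taken through lockf).
    """
    lock = fcntl.flock if kind == "flock" else fcntl.lockf
    with open(path, "a+b") as f:  # creates the file if it is missing
        lock(f, fcntl.LOCK_EX)
        result = subprocess.run([sys.executable, "-c", _CHILD, kind, path])
        lock(f, fcntl.LOCK_UN)
    return result.returncode == 0
```

Running `lock_is_exclusive("/home/lockprobe", "posix")` and the same call with `"flock"` on the affected mount (the path here is only an example) should return True for both if locks are honoured. This checks a single client only; a lock conflict between two clients of the same replicated volume would need the child to run on another machine.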

I am marking this as "works for me". Please reopen the bug if it is hit again on the latest releases. In that case, also capture the log file output after turning on tracing in posix-locks ("option trace on") and, if possible, a process state dump of the glusterfs processes (kill -USR1 <glusterfs pid>; the dump is stored in /tmp/glusterdump.<pid>).
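
For anyone scripting the state dump request, the step above amounts to sending SIGUSR1 and then looking for the dump file. A small Python sketch follows, assuming the dump location is exactly as stated in this comment; the helper names are hypothetical.

```python
import os
import signal


def statedump_path(pid):
    """Dump location as given in the comment above."""
    return "/tmp/glusterdump.%d" % pid


def request_statedump(pid):
    """Send SIGUSR1 to the glusterfs process with this pid and
    return the path where the dump is expected to appear."""
    os.kill(pid, signal.SIGUSR1)
    return statedump_path(pid)
```

Only call `request_statedump` with the pid of a glusterfs process; the default action of SIGUSR1 terminates a process that does not handle it.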

Comment 4 Gordan Bobic 2010-03-23 12:06:19 UTC
I can confirm that I haven't observed this with 3.0.2 using server-side AFR, and I have had it in use since a few days after the 3.0.2 release.

Having said that - the fact that the cause and fix (if there is in fact a fix, rather than something incidental that makes the issue occur considerably less often) haven't actually been explicitly identified is terrifyingly uninspiring of confidence for something that can cause data loss/corruption.