Reported by Gordan Bobic <gordan> on the mailing list:

SQLite Affecting Bugs
=====================

There seems to be an issue that reliably (but very subtly) affects some of the SQLite functionality. This is evident in the way the RPM database behaves (converted to SQLite because, as far as I can tell, BDB needs writable mmap(), which means it won't work on any fuse based fs). For example, it just won't find some of the packages even though they are installed. Here is an example (a somewhat ironic one, you might say):

    # ls -la /usr/lib64/libfuse.so.*
    lrwxrwxrwx 1 root root     16 May 25 12:39 /usr/lib64/libfuse.so.2 -> libfuse.so.2.7.4
    -rwxr-xr-x 1 root root 134256 Feb 19 21:40 /usr/lib64/libfuse.so.2.7.4

    # rpm -q --whatprovides /usr/lib64/libfuse.so.2
    fuse-libs-2.7.4glfs11-1

    # rpm -Uvh glusterfs-client-2.0.2-1.el5.x86_64.rpm glusterfs-server-2.0.2-1.el5.x86_64.rpm glusterfs-client-2.0.2-1.el5.x86_64.rpm
    warning: package glusterfs-client = 2.0.2-1.el5 was already added, skipping glusterfs-client < 2.0.2-1.el5
    error: Failed dependencies:
            libfuse.so.2()(64bit) is needed by glusterfs-client-2.0.2-1.el5.x86_64
            libfuse.so.2(FUSE_2.4)(64bit) is needed by glusterfs-client-2.0.2-1.el5.x86_64
            libfuse.so.2(FUSE_2.5)(64bit) is needed by glusterfs-client-2.0.2-1.el5.x86_64
            libfuse.so.2(FUSE_2.6)(64bit) is needed by glusterfs-client-2.0.2-1.el5.x86_64

So libfuse is there, RPM knows that the fuse-libs-2.7.4glfs11-1 package provides it, and yet when glusterfs-client tries to install, it fails to find it.

This _only_ happens when the RPM DB (/var/lib/rpm) is on glusterfs. The same package sets on machines that aren't rooted on glusterfs deal with this package combination just fine. rpm --rebuilddb doesn't alter the situation at all; the issue is still present after the DB rebuild.

If the above is deemed difficult to set up, there is another way to easily recreate an SQLite related issue.
Mount /home via glusterfs, log into X, and fire up Firefox 3.0.x (I'm using 3.0.10 on x86_64, but this has been reproducible for a very long time with older versions, too). Add a bookmark. It'll show up in the bookmarks menu. Now exit Firefox, wait a few seconds for it to shut down, and fire it up again. Check the bookmarks: the page you have just added won't be there.

I only tested this (/home) with both nodes being up; I haven't tried it with one node being down.

Has anybody got any ideas on what could be causing this, or any workarounds?

In the RPM DB case, the FS is mounted with the following parameters (from ps, after startup):

    /usr/sbin/glusterfs --log-level=NONE --log-file=/dev/null --disable-direct-io-mode --volfile=/etc/glusterfs/root.vol /mnt/newroot

Home is mounted with the following:

    /usr/sbin/glusterfs --log-level=NORMAL --volfile=/etc/glusterfs/home.vol /home

If these are the same bug, then this implies that direct-io-mode has no effect on it.
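For anyone who wants to check the write-then-reopen behaviour without Firefox or RPM, the bookmark test above can be approximated with a short script. This is only a sketch and not part of the original report: the function name, the file name persistence_check.sqlite and the table layout are made up for illustration, and it assumes Python with its standard sqlite3 module is available on the client. It has not been confirmed to trigger the same bug.

```python
import os
import sqlite3


def check_sqlite_persistence(directory):
    """Write one row, close the database, reopen it, and report whether the row survived.

    `directory` is assumed to be a writable directory on the filesystem under test
    (for example, somewhere under a glusterfs mount).
    """
    # Hypothetical scratch file name, chosen only for this illustration.
    db_path = os.path.join(directory, "persistence_check.sqlite")
    if os.path.exists(db_path):
        os.remove(db_path)

    # First "session": create a table, add a row, commit, and close.
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE bookmarks (id INTEGER PRIMARY KEY, url TEXT NOT NULL)")
    conn.execute("INSERT INTO bookmarks (url) VALUES (?)", ("http://example.com/",))
    conn.commit()
    conn.close()

    # Second "session": reopen the same file and read the row back.
    conn = sqlite3.connect(db_path)
    rows = conn.execute("SELECT url FROM bookmarks").fetchall()
    conn.close()

    os.remove(db_path)
    return rows == [("http://example.com/",)]
```

Calling check_sqlite_persistence() with a directory on the glusterfs mount and again with a directory on a local disk gives a like-for-like comparison; a False result on the glusterfs side only would point at the same class of problem.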
*** Bug 65 has been marked as a duplicate of this bug. ***
*** Bug 186 has been marked as a duplicate of this bug. ***
SQLite and all the other Firefox related bugs seem to be the same issue (Firefox uses SQLite to store bookmarks). It is working on the mainline release-3.0 branch of GlusterFS.

This was probably an issue with posix locks in the 2.0.x releases, especially in supporting server side AFR and flock calls. Locks have undergone quite a lot of code change in the 3.0.x releases. There were a couple of issues with flock calls in 3.0.x and with server side AFR configs hanging, which have also been fixed.

I am marking this as "works for me". Please re-open the bug if it is hit again on the latest releases. Also, capture the output of the log file by turning on an option in posix-locks, "option trace on", and, if possible, the process state dump of the processes (kill -USR1 <glusterfs pid>), which is stored in /tmp/glusterdump.<pid>.
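Since the cause is only suspected to be lock handling, a direct check of lock behaviour on the mount may help when re-testing. The following is a rough sketch under that assumption, not a GlusterFS tool: the function names are made up, it uses only Python's standard fcntl and os modules, and it assumes a Linux client. It checks that an exclusive flock() and an exclusive POSIX (fcntl) lock are each refused to a second contender while the first holder still has them.

```python
import fcntl
import os


def flock_conflict_detected(path):
    """Return True if a second open of `path` is refused an exclusive flock()."""
    with open(path, "w") as first, open(path, "w") as second:
        fcntl.flock(first, fcntl.LOCK_EX)
        try:
            # flock() locks belong to the open file description, so a second
            # open() of the same file should conflict with the first.
            fcntl.flock(second, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError:
            return True  # the lock was honoured
        return False  # both locks were granted: the lock was not enforced


def posix_lock_conflict_detected(path):
    """Return True if a child process is refused a POSIX lock held by the parent."""
    with open(path, "w") as held:
        fcntl.lockf(held, fcntl.LOCK_EX)
        pid = os.fork()
        if pid == 0:
            # Child: POSIX locks are per process, so this is a real contender.
            status = 1  # 1 means the lock was wrongly granted
            try:
                with open(path, "w") as contender:
                    fcntl.lockf(contender, fcntl.LOCK_EX | fcntl.LOCK_NB)
            except OSError:
                status = 0  # 0 means the lock was correctly refused
            os._exit(status)
        _, wait_status = os.waitpid(pid, 0)
        return os.WIFEXITED(wait_status) and os.WEXITSTATUS(wait_status) == 0
```

Both functions take the path of a scratch file on the filesystem under test and are expected to return True on a filesystem that enforces locks; a False from either on the glusterfs mount would be worth attaching to the bug together with the trace log and state dump requested above.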
I can confirm that I haven't observed this with 3.0.2 using server-side AFR, and I have had it in use since a few days after the 3.0.2 release. Having said that, the fact that the cause and fix (if there is in fact a fix, rather than something incidental that makes the issue occur considerably less often) haven't actually been explicitly identified is terrifyingly uninspiring of confidence for something that can cause data loss/corruption.