To describe the read methods used to test this: if we run "md5sum ./filename" we get AAAAAAAAAAA, but if we run "md5sum ./*" we get BBBBBBBB for the same file. We get the content of the 'client_path' file (from the "paths differ for inode" message) if we run "md5sum ./filename && cat ./filename", but not with just "cat ./filename".
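For anyone trying to reproduce this, here is a sketch of a check that flags files whose two read paths disagree (the mountpoint is a placeholder, not our actual path):

  #!/bin/sh
  # Compare each file's checksum from a single wildcard pass
  # against a checksum from a fresh direct open.
  cd /mnt/gluster || exit 1
  md5sum ./* > /tmp/wildcard.md5                      # one pass over everything
  for f in ./*; do
      direct=$(md5sum "$f" | cut -d' ' -f1)           # fresh, direct open
      wild=$(grep " $f\$" /tmp/wildcard.md5 | cut -d' ' -f1)
      [ "$direct" != "$wild" ] && echo "MISMATCH: $f ($direct vs $wild)"
  done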
This bug has similar symptoms to #173, #192, and #126 (according to the 2.0.6 release notes). There are no segfaults, but file contents are redirected. We saw this before 2.0.6 was released and thought it was fixed. Reading the glusterfs code, it is clear that the (dis)connection problem is fixed, but I doubt that's what we saw, since we monitor the processes and network and nothing like that ever happened. However, a few times we saw that a file, when opened, had the contents of a different file, which crashed a network. The content we saw depended on the method we used to open the file, and each method consistently produced one of two contents. The problem was demonstrably worse when we read and closed multiple files rapidly (but sequentially).

Each of our installations runs two machines, each with a replicating client and a server backend. There are typically fewer than 200 files, each under 10 MB. Each client replicates to both servers, but we guarantee that only one machine writes at a time (it will write multiple files in bursts). However, both machines read the files rapidly, with most reads clustered around the writes. We know that the files won't replicate while they're open, but that's not really an issue since close() triggers it. The issue might be aggravated by reading the files too soon after writing, or vice versa. The effects are seen soon after committing changes to the files.

We have tried every known method to replace the contents once the error appears, but only a gluster-server restart allowed any of them to work. The gluster-server debug log says "paths differ for inode(xxxxxx)" for the files with the bad content, and the message only stops repeating after restarting gluster-server. I saw that Anand mentioned in some forums from 2008 that the log message would be fixed, but it's not the message that concerns us; it's that the file contents oscillate, crashing networks. Gluster is great, but this bug will cause it to be promptly removed from our systems until it is fixed. We read the gluster code, but could not devise a patch.

The client and server volfiles for one of the machines (machine1) are attached; the other machine (machine2) has a mirror image. We saw this bug with and without the read-subvolume option. We also changed the client type from 'afr' to 'replicate' without success (the execution path seems identical anyway). We don't use caching or write-behind. We have not tried version 2.0.7; nothing about this was mentioned in its release notes, and since this bug is hard to reproduce consistently (we have not found the precise trigger), its absence from the notes doesn't mean much. We hope it's somehow our volfiles that trigger this, but if not, thank you for looking into it.
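Since the attached volfiles are only described here, this is the general shape of a two-server replicate client volfile in 2.0.x syntax; the volume names, hosts, and brick names below are illustrative placeholders, not our attached configuration:

  volume remote1
    type protocol/client
    option transport-type tcp
    option remote-host machine1          # first server backend
    option remote-subvolume brick
  end-volume

  volume remote2
    type protocol/client
    option transport-type tcp
    option remote-host machine2          # second server backend
    option remote-subvolume brick
  end-volume

  volume mirror
    type cluster/replicate               # formerly 'afr'; execution path seems identical
    subvolumes remote1 remote2
  end-volume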
Adam, are your files ever deleted and recreated? From the symptoms you mention, I have a feeling that this is related to the 2.0.x releases not having support for "generation" numbers for inodes. Basically, what this means is that when a file is deleted and another file with the same name is created in quick succession, the inode number of the old file might be re-used by the backend filesystem for the new file. This can confuse the GlusterFS server. The 3.0.0 release fixes these kinds of errors by keeping track of (possible) inode number re-use. Can you try 3.0.0 and tell us if you still face this problem?
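(For illustration, a hypothetical demonstration of inode re-use on a backend filesystem; not taken from the report:)

  # touch oldfile && ls -i oldfile    (note the inode number)
  # rm oldfile
  # touch newfile && ls -i newfile    (many filesystems hand the freed inode number straight to the new file)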
There are no files being deleted and recreated. The initial state has no files. Then about 100 files are created by another program in quick succession on machine_1. Then, when I ask WinSCP to refresh the directory contents repeatedly (on machine_2 only), the file sizes change for a random selection of files (the last file to get written always messes up). I don't know how it's actually triggered (I tried many things), but once it is, the following tests exploit this bug until restart (timing between syscalls is important).

Test 1) Note which files have "paths differ for inode" (after refreshing from WinSCP).
  # ls fname_1 && ls fname_2 && cat fname_1    (from machine_2)
  -> see contents of fname_2

Test 2)
  # echo one > fname_1 && echo two > fname_2   (from machine_1; this seems to tie the two files together forever)
  wait any amount of time
  # ls fname_1 && ls fname_2 && cat fname_1    (from machine_2)
  -> see contents of fname_2

The program that can reproduce this is complex, but I've narrowed the search to when the files are actually written. There is lots of reading (directory and file) going on around the writes, and the files are written very quickly. The problem exists immediately after the files are written. Also, machine_2 has to be up at the time of the writes; if it boots later, it does not see the problem.
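In case it helps, the read side of Test 2 can be scripted; a sketch to run on machine_2, assuming the mountpoint and the "one"/"two" contents from the echo commands above:

  #!/bin/sh
  # Run on machine_2 after machine_1 writes fname_1 and fname_2
  # (mountpoint is a placeholder).
  cd /mnt/gluster || exit 1
  ls fname_1 > /dev/null && ls fname_2 > /dev/null    # the lookups that precede the bad read
  actual=$(cat fname_1)
  if [ "$actual" = "two" ]; then
      echo "BUG: fname_1 returned the contents of fname_2"
  else
      echo "ok: fname_1 contains '$actual'"
  fi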
Created attachment 128 [details]
glusterfs 3.0.0 client log showing the startup segfault

I tried reproducing this with version 3.0.0, but the client keeps segfaulting at startup. Attached is the log. Sorry, but I cannot spend any more time debugging this, since we've now dropped gluster.
The crash should have happened because of a version mismatch; it shouldn't happen in the latest version. The original bug was fixed in 3.0.x.