Background: We use GlusterFS configured with DHT only (data loss is acceptable for us, because we keep other copies outside the system). When a server goes down, we do not want to stop our service; failing to read or write a file that is hashed to the failed node is acceptable. In other words, we want to keep serving from the remaining servers until the failed node is recovered.

Sometimes when a node is down, access from the client ('ls' on the mountpoint) still works, and reads/writes succeed unless the target file is hashed to the failed node. That is exactly what we want. But there are also cases where, with one node down, 'ls MOUNTPOINT' prints "transport endpoint is not connected" and every read/write on the mount fails. So I wonder whether there is a "key node" in a DHT-only setup, and if there is, whether that is a bug, since it affects all access to the system. Below is the procedure I used to find the "key node".

Test environment: Server: 3 nodes, each with one brick. Client: configured with DHT.

In my environment:
(1) Kill the glusterfsd process on node2. 'ls MOUNTPOINT' prints "transport endpoint is not connected", and reads/writes also return errors.
(2) Kill the glusterfsd process on node1, node3, or both, leaving node2 up. 'ls MOUNTPOINT' works and lists the files stored on node2. Reads and writes succeed as long as the file is not hashed to a failed node (failing reads/writes for files on node1 and node3 is acceptable, because those nodes are down).

To find out which node is the "key node", I did some debugging and read the DHT sources. I found that after the glusterfsd process on node2 is killed, 'ls MOUNTPOINT' triggers a call to dht_lookup(). dht_lookup() first finds the cached subvolume of the mountpoint and performs the lookup on that cached subvolume. In my environment the cached subvolume corresponds to node2; because node2 is down, the lookup returns -1 with errno 107 (ENOTCONN, "transport endpoint is not connected").
dht_revalidate_cbk() then simply unwinds the stack and returns the error to FUSE, which is what shows up on the command line. In situation (2), the cached subvolume (node2) is up, so dht_lookup() succeeds and 'ls' runs normally.

Conclusion: if the cached subvolume of the mountpoint is down when dht_lookup() runs, the operation returns an error: 'ls' prints the error, and reads/writes fail as well.

Finally: when the cached subvolume is down, if the lookup instead tried all the other nodes, we might avoid the "key node" problem. Removing the brick on the failed node and restarting the client does make things work again, but most of the time we cannot stop our own service that reads from and writes to the client, so that is not acceptable.

I don't know whether my understanding is correct; please correct me if there is any error. If you have any advice on the problem described at the beginning, please tell me. Thanks in advance.
Hi, this bug was fixed as part of bug 764264. The fix is part of the 3.2 release, and a 3.1 release with this patch will be out soon. Patches are available here: release.3-1 http://patches.gluster.com/patch/6729/ release.3-2 http://patches.gluster.com/patch/6728/ With this patch, the lookup goes to all subvolumes, not just the cached subvolume. *** This bug has been marked as a duplicate of bug 2532 ***