Description of problem:
========================
When add-brick is performed on a 2-brick distribute volume to convert it to a 2x2 distribute-replicate volume, and a rebalance is then run, file operations on the NFS mount fail with Input/output error.

Version-Release number of selected component (if applicable):
=============================================================
[11/09/12 - 06:38:37 root@king ~]# rpm -qa | grep gluster
glusterfs-fuse-3.3.0.5rhs-37.el6rhs.x86_64

[11/09/12 - 06:38:44 root@king ~]# gluster --version
glusterfs 3.3.0.5rhs built on Nov 8 2012 22:30:35

How reproducible:
=================
Often

Steps to Reproduce:
=====================
1. Create a distribute volume with 2 bricks. Start the volume.
2. Create NFS and FUSE mounts to the volume on a client machine.
3. On the FUSE mount execute:
   mkdir testdir1 ; cd testdir1 ; for i in `seq 1 100`; do mkdir dir.$i ; for j in `seq 1 100`; do dd if=/dev/input_file of=dir.$i/file.$j bs=1k count=1024 ; done ; done
4. On the NFS mount execute:
   mkdir testdir2 ; cd testdir2 ; for i in `seq 1 100`; do mkdir dir.$i ; for j in `seq 1 100`; do dd if=/dev/input_file of=dir.$i/file.$j bs=1k count=1024 ; done ; done
5. Add bricks to the distribute volume to make it a distribute-replicate volume (commands sketched at the end of this comment).
6. Perform rebalance.

Actual results:
================
The dd on the NFS mount fails with "Input/output error".

NFS log message for one of the failures:
========================================
[2012-11-09 05:53:34.869753] W [client3_1-fops.c:418:client3_1_open_cbk] 0-distribute-client-1: remote operation failed: No such file or directory. Path: <gfid:d06f37d2-327b-49b9-a926-efaac2c4d39e> (00000000-0000-0000-0000-000000000000)
[2012-11-09 05:53:34.869801] E [afr-self-heal-data.c:1311:afr_sh_data_open_cbk] 0-distribute-replicate-0: open of <gfid:d06f37d2-327b-49b9-a926-efaac2c4d39e> failed on child distribute-client-1 (No such file or directory)
[2012-11-09 05:53:34.869822] E [afr-self-heal-common.c:2160:afr_self_heal_completion_cbk] 0-distribute-replicate-0: background meta-data data entry self-heal failed on <gfid:d06f37d2-327b-49b9-a926-efaac2c4d39e>
[2012-11-09 05:53:34.869850] E [nfs3-helpers.c:3603:nfs3_fh_resolve_inode_lookup_cbk] 0-nfs-nfsv3: Lookup failed: <gfid:d06f37d2-327b-49b9-a926-efaac2c4d39e>: Input/output error
[2012-11-09 05:53:34.869940] E [nfs3.c:2195:nfs3_write_resume] 0-nfs-nfsv3: Unable to resolve FH: (10.70.34.110:795) distribute : d06f37d2-327b-49b9-a926-efaac2c4d39e
[2012-11-09 05:53:34.869965] W [nfs3-helpers.c:3389:nfs3_log_common_res] 0-nfs-nfsv3: XID: fcb62144, WRITE: NFS: 5(I/O error), POSIX: 14(Bad address)

Expected results:
===================
dd on all the files should be successful.
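For steps 5 and 6, a minimal sketch of the likely commands, assuming the volume is named "distribute" (as suggested by the log prefixes) and using hypothetical server names and brick paths; this is illustrative, not the exact command line used in the test:

# convert the 2-brick distribute volume into a 2x2 distribute-replicate volume
gluster volume add-brick distribute replica 2 server3:/rhs/brick3 server4:/rhs/brick4

# start the rebalance and check its progress
gluster volume rebalance distribute start
gluster volume rebalance distribute status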
Created attachment 641484
NFS Log
This test case passed on the update2 build:

[11/09/12 - 07:31:30 root@king ~]# rpm -qa | grep gluster
glusterfs-server-3.3.0.2rhs-30.el6rhs.x86_64
This seems to be an issue with self-heal not being considered for entries: in this particular case, the file was present on only one node and missing on the other.
Pranith, assigning this to you; please check whether the issue is related to self-heal. If not, re-assign it to me.
This also happens with GlusterFS 3.3.1 using the FUSE client to mount the gluster volume. I am using these RPMs from the main website:

glusterfs-swift-plugin-3.3.1-1.fc17.noarch
glusterfs-server-3.3.1-1.fc17.x86_64
glusterfs-devel-3.3.1-1.fc17.x86_64
glusterfs-debuginfo-3.3.1-1.fc17.x86_64
glusterfs-swift-proxy-3.3.1-1.fc17.noarch
glusterfs-swift-container-3.3.1-1.fc17.noarch
glusterfs-3.3.1-1.fc17.x86_64
glusterfs-swift-account-3.3.1-1.fc17.noarch
glusterfs-fuse-3.3.1-1.fc17.x86_64
glusterfs-swift-3.3.1-1.fc17.noarch
glusterfs-geo-replication-3.3.1-1.fc17.x86_64
glusterfs-rdma-3.3.1-1.fc17.x86_64
glusterfs-swift-doc-3.3.1-1.fc17.noarch
glusterfs-swift-object-3.3.1-1.fc17.noarch
glusterfs-vim-3.2.7-2.fc17.x86_64

I have the same issue when adding the 2nd brick to a 1-brick distribute volume, and I also get an issue when adding bricks 3 and 4 to a replicated volume of just two bricks. In my experience it's something to do with the volume going from a single sub-volume to two sub-volumes. Adding sub-volume 3 goes OK, and so does adding any after the 3rd one.

To reproduce the error, create a gluster volume with one sub-volume, mount it with the normal "mount -t glusterfs ip:/vol /path" command, and then run this in another session:

watch --interval=0 find /path

With that watch running every 0.1s, go and add your 2nd sub-volume of either 1 brick (for distribute) or two bricks (for replicate) and you should also get the error. It may help to set up some VirtualBox nodes to test with, as I've found that slower systems hit this problem more quickly. (A rough end-to-end sketch of these steps follows this comment.)

Hope this helps, and please shout if you want me to test some RPMs or something?

Rich
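A minimal sketch of the reproducer described above, with hypothetical host names and brick paths (the exact values should not matter):

# on a gluster server: create and start a single-brick distribute volume
gluster volume create md0 node1:/mnt/md0/brick1
gluster volume start md0

# on the client: mount it and keep a metadata workload running
mount -t glusterfs node1:/md0 /mnt/gluster
watch --interval=0 find /mnt/gluster

# back on the server, while the watch loop is running: add the 2nd sub-volume
gluster volume add-brick md0 node2:/mnt/md0/brick1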
I don't know if this will help, but I get lots of these in my logs when it happens. For reference, "md0" is the name of the gluster volume and "recon" is the only file in the volume.

[2013-01-15 16:34:40.795592] W [fuse-bridge.c:292:fuse_entry_cbk] 0-glusterfs-fuse: 16499: LOOKUP() /recon => -1 (Invalid argument)
[2013-01-15 16:34:40.795660] W [dht-layout.c:186:dht_layout_search] 1-md0-dht: no subvolume for hash (value) = 3228047937
[2013-01-15 16:34:40.795670] E [dht-common.c:1372:dht_lookup] 1-md0-dht: Failed to get hashed subvol for /recon
[2013-01-15 16:34:40.795680] W [fuse-bridge.c:292:fuse_entry_cbk] 0-glusterfs-fuse: 16500: LOOKUP() /recon => -1 (Invalid argument)
[2013-01-15 16:34:40.795827] W [dht-layout.c:186:dht_layout_search] 1-md0-dht: no subvolume for hash (value) = 3228047937
[2013-01-15 16:34:40.795843] E [dht-common.c:1372:dht_lookup] 1-md0-dht: Failed to get hashed subvol for /recon
[2013-01-15 16:34:40.795854] W [fuse-bridge.c:292:fuse_entry_cbk] 0-glusterfs-fuse: 16501: LOOKUP() /recon => -1 (Invalid argument)
[2013-01-15 16:34:40.892750] I [client-handshake.c:1636:select_server_supported_programs] 1-md0-client-1: Using Program GlusterFS 3.3.1, Num (1298437), Version (330)
[2013-01-15 16:34:40.920663] I [client-handshake.c:1433:client_setvolume_cbk] 1-md0-client-1: Connected to 169.254.0.44:24009, attached to remote volume '/mnt/md0/brick1'.
[2013-01-15 16:34:40.920744] I [client-handshake.c:1445:client_setvolume_cbk] 1-md0-client-1: Server and Client lk-version numbers are not same, reopening the fds
[2013-01-15 16:34:40.921550] I [client-handshake.c:453:client_set_lk_version_cbk] 1-md0-client-1: Server lk version = 1
Hmmm, I've updated to Fedora 18 and installed Gluster with these RPMs:

# rpm -qa | grep gluster
glusterfs-fuse-3.3.1-4.fc18.x86_64
glusterfs-rdma-3.3.1-4.fc18.x86_64
glusterfs-3.3.1-4.fc18.x86_64
glusterfs-geo-replication-3.3.1-4.fc18.x86_64
glusterfs-server-3.3.1-4.fc18.x86_64

And the problem doesn't seem to happen any more... well, at least I've not been able to reproduce it yet. I don't know what changed between 3.3.1-1 and 3.3.1-4, but it may have resolved my issue. I'll come back and post an update if I get it to error again.

Rich
OK, small update: the problem is still there, but it only lasts for a second or two now rather than the 20+ seconds before.
Conversion of a non-distribute volume to a distribute volume leads to the above errors. Fix http://review.gluster.org/3838 for bug 815227 handles this by adding the distribute xlator by default for any volume created (illustrated below). The fix should be available in release-3.4 or upstream master.

*** This bug has been marked as a duplicate of bug 815227 ***
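For context, a rough illustration of what that change means for the client volume graph; the volume and translator names below are hypothetical and not taken from the fix itself. With the fix, even a single-brick volume's client volfile carries a cluster/distribute translator over its one subvolume, so adding a second sub-volume later only extends the subvolume list of an already-present DHT xlator instead of introducing a brand-new layer:

volume md0-client-0
    type protocol/client
    option remote-host node1
    option remote-subvolume /mnt/md0/brick1
end-volume

volume md0-dht
    type cluster/distribute
    subvolumes md0-client-0
end-volume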