Created attachment 574405 [details]
rebalancing dist-rep

Description of problem:
Brought down one of the children (bricks) while a rebalance was in progress, then brought it back up. After the rebalance finishes, there are I/O errors on the mount point, along with a crash in rebalance.

Version-Release number of selected component (if applicable):
3.3.0qa32

How reproducible:

Steps to Reproduce:
1. Create a 2x2 distribute-replicate volume.
2. Start creating files on the mount point; in my case I created 5000 files of 50MB each.
3. Add 2 more bricks, so the volume is now 3x2 dist-rep.
4. Initiate the rebalance.
5. While the rebalance is happening, bring down one brick of any pair (in my case the first child of a pair).
6. After some time, bring the brick back up with volume start force.
7. Let the rebalance finish (status complete), then try I/O on the mount point.

Actual results:
I/O errors on the mount point.

Expected results:

Additional info:
Attached the logs.
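For reference, the reproduction steps above can be sketched as a command sequence. This is a dry-run sketch only: it builds and prints the gluster CLI commands rather than running them. The volume name dist-rep is assumed from the attachment name, and the host name server1 and the brick paths under /bricks are hypothetical placeholders, not the setup actually used. Step 2 (creating the files) and step 5 (bringing a brick down) are left as comments because the report does not say how they were done.

```python
def repro_commands(volume="dist-rep", host="server1", brick_root="/bricks"):
    """Return the gluster CLI commands for the reproduction steps (dry run).

    The host and brick paths are hypothetical placeholders.
    """
    def brick(i):
        return f"{host}:{brick_root}/b{i}"

    return [
        # Step 1: 2x2 distribute-replicate volume (4 bricks, replica 2).
        ["gluster", "volume", "create", volume, "replica", "2"]
        + [brick(i) for i in range(1, 5)],
        ["gluster", "volume", "start", volume],
        # Step 2 (manual): mount the volume and create the files.
        # Step 3: add one more replica pair, making the volume 3x2.
        ["gluster", "volume", "add-brick", volume, brick(5), brick(6)],
        # Step 4: initiate the rebalance.
        ["gluster", "volume", "rebalance", volume, "start"],
        # Step 5 (manual): bring down one brick process of any pair.
        # Step 6: bring the brick back up.
        ["gluster", "volume", "start", volume, "force"],
        # Step 7: poll until the rebalance completes, then try I/O on the mount.
        ["gluster", "volume", "rebalance", volume, "status"],
    ]


if __name__ == "__main__":
    for cmd in repro_commands():
        print(" ".join(cmd))
```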
After that, if I try to remount the volume, the mount fails, saying the file type differs on subvolumes.

[2012-04-01 11:05:02.239511] I [afr-common.c:1866:afr_set_root_inode_on_first_lookup] 0-dist-rep-replicate-0: added root inode
[2012-04-01 11:05:02.239707] E [afr-common.c:1115:afr_lookup_update_lk_counts] (-->/usr/local/lib/glusterfs/3.3.0qa32/xlator/protocol/client.so(client3_1_lookup_cbk+0x6f1) [0x7effee03afd3] (-->/usr/local/lib/glusterfs/3.3.0qa32/xlator/cluster/replicate.so(afr_lookup_cbk+0xb5) [0x7effeddf284a] (-->/usr/local/lib/glusterfs/3.3.0qa32/xlator/cluster/replicate.so(+0x6973d) [0x7effeddf273d]))) 0-: Assertion failed: xattr
[2012-04-01 11:05:02.239749] W [dict.c:458:dict_ref] (-->/usr/local/lib/glusterfs/3.3.0qa32/xlator/cluster/replicate.so(afr_lookup_cbk+0xb5) [0x7effeddf284a] (-->/usr/local/lib/glusterfs/3.3.0qa32/xlator/cluster/replicate.so(+0x6975d) [0x7effeddf275d] (-->/usr/local/lib/glusterfs/3.3.0qa32/xlator/cluster/replicate.so(+0x69496) [0x7effeddf2496]))) 0-dict: dict is NULL
[2012-04-01 11:05:02.239764] W [afr-common.c:1400:afr_conflicting_iattrs] 0-dist-rep-replicate-0: /: filetype differs on subvolumes (0, 1)
[2012-04-01 11:05:02.244120] E [afr-common.c:1115:afr_lookup_update_lk_counts] (-->/usr/local/lib/glusterfs/3.3.0qa32/xlator/protocol/client.so(client3_1_lookup_cbk+0x6f1) [0x7effee03afd3] (-->/usr/local/lib/glusterfs/3.3.0qa32/xlator/cluster/replicate.so(afr_lookup_cbk+0xb5) [0x7effeddf284a] (-->/usr/local/lib/glusterfs/3.3.0qa32/xlator/cluster/replicate.so(+0x6973d) [0x7effeddf273d]))) 0-: Assertion failed: xattr
[2012-04-01 11:05:02.244162] W [dict.c:458:dict_ref] (-->/usr/local/lib/glusterfs/3.3.0qa32/xlator/cluster/replicate.so(afr_lookup_cbk+0xb5) [0x7effeddf284a] (-->/usr/local/lib/glusterfs/3.3.0qa32/xlator/cluster/replicate.so(+0x6975d) [0x7effeddf275d] (-->/usr/local/lib/glusterfs/3.3.0qa32/xlator/cluster/replicate.so(+0x69496) [0x7effeddf2496]))) 0-dict: dict is NULL
[2012-04-01 11:05:02.244175] W [afr-common.c:1400:afr_conflicting_iattrs] 0-dist-rep-replicate-0: /: filetype differs on subvolumes (0, 1)
[2012-04-01 11:05:02.244245] W [fuse-bridge.c:490:fuse_attr_cbk] 0-glusterfs-fuse: 2: LOOKUP() / => -1 (Input/output error)
[2012-04-01 11:05:02.252143] I [fuse-bridge.c:3980:fuse_thread_proc] 0-fuse: unmounting /mnt
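To pick the relevant entries out of a longer client log, something like the following could be used. It is a minimal sketch that assumes every entry follows the layout seen in the excerpt above ([timestamp] level [file:line:function] message) and sits on a single line. The names LOG_LINE, problems and SAMPLE are made up for this illustration and are not part of GlusterFS.

```python
import re

# Assumed entry layout: [timestamp] LEVEL [file:line:function] message
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) "
    r"\[(?P<src>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\] (?P<msg>.*)$"
)


def problems(log_text):
    """Return (level, function, message) for every E or W entry, in order."""
    found = []
    for raw in log_text.splitlines():
        match = LOG_LINE.match(raw.strip())
        if match and match.group("level") in ("E", "W"):
            found.append(
                (match.group("level"), match.group("func"), match.group("msg"))
            )
    return found


# Three entries copied from the mount log in this comment.
SAMPLE = """\
[2012-04-01 11:05:02.239511] I [afr-common.c:1866:afr_set_root_inode_on_first_lookup] 0-dist-rep-replicate-0: added root inode
[2012-04-01 11:05:02.239764] W [afr-common.c:1400:afr_conflicting_iattrs] 0-dist-rep-replicate-0: /: filetype differs on subvolumes (0, 1)
[2012-04-01 11:05:02.244245] W [fuse-bridge.c:490:fuse_attr_cbk] 0-glusterfs-fuse: 2: LOOKUP() / => -1 (Input/output error)
"""

if __name__ == "__main__":
    for level, func, msg in problems(SAMPLE):
        print(level, func, msg)
```

On the three sample entries this keeps the two W entries (afr_conflicting_iattrs and fuse_attr_cbk) and drops the informational one.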
CHANGE: http://review.gluster.com/3263 (glusterd/rebalance: Switch off afr self heal in rebalance process.) merged in master by Vijay Bellur (vijay)
*** Bug 810103 has been marked as a duplicate of this bug. ***
No I/O errors will be seen on the mount point.