Description of problem:

On a 6x2 distributed-replicate volume used as a Storage Domain for a VM image
store, two more replica pairs were added to make it 8x2, and a rebalance
operation was performed. The VMs stayed online, and the Storage Domain and
Data Center were healthy after the operation.

-----------------------------------------------------------
[root@rhs-client45 ~]# gluster volume add-brick virtVOL rhs-client45.lab.eng.blr.redhat.com:/VM_brick4 rhs-client37.lab.eng.blr.redhat.com:/VM_brick4 rhs-client15.lab.eng.blr.redhat.com:/VM_brick4 rhs-client10.lab.eng.blr.redhat.com:/VM_brick4
Add Brick successful

[root@rhs-client45 ~]# gluster volume info

Volume Name: virtVOL
Type: Distributed-Replicate
Volume ID: 689aa65d-b49a-42f8-a20f-6bac6e116d6b
Status: Started
Number of Bricks: 8 x 2 = 16
Transport-type: tcp
Bricks:
Brick1: rhs-client45.lab.eng.blr.redhat.com:/VM_brick1
Brick2: rhs-client37.lab.eng.blr.redhat.com:/VM_brick1
Brick3: rhs-client15.lab.eng.blr.redhat.com:/VM_brick1
Brick4: rhs-client10.lab.eng.blr.redhat.com:/VM_brick1
Brick5: rhs-client45.lab.eng.blr.redhat.com:/VM_brick2
Brick6: rhs-client37.lab.eng.blr.redhat.com:/VM_brick2
Brick7: rhs-client15.lab.eng.blr.redhat.com:/VM_brick2
Brick8: rhs-client10.lab.eng.blr.redhat.com:/VM_brick2
Brick9: rhs-client45.lab.eng.blr.redhat.com:/VM_brick3
Brick10: rhs-client37.lab.eng.blr.redhat.com:/VM_brick3
Brick11: rhs-client15.lab.eng.blr.redhat.com:/VM_brick3
Brick12: rhs-client10.lab.eng.blr.redhat.com:/VM_brick3
Brick13: rhs-client45.lab.eng.blr.redhat.com:/VM_brick4
Brick14: rhs-client37.lab.eng.blr.redhat.com:/VM_brick4
Brick15: rhs-client15.lab.eng.blr.redhat.com:/VM_brick4
Brick16: rhs-client10.lab.eng.blr.redhat.com:/VM_brick4
Options Reconfigured:
storage.owner-gid: 36
storage.owner-uid: 36
network.remote-dio: on
cluster.eager-lock: enable
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off

[root@rhs-client45 ~]# gluster volume rebalance virtVOL start
Starting rebalance on volume virtVOL has been successful

[root@rhs-client45 ~]# gluster volume rebalance virtVOL status
Node                                 Rebalanced-files          size   scanned   failures        status
---------                                 -----------   -----------   -------   --------   -----------
localhost                                           0             0         8          0   in progress
rhs-client10.lab.eng.blr.redhat.com                 0             0        24          0   in progress
rhs-client15.lab.eng.blr.redhat.com                 0             0         1          0   in progress
rhs-client37.lab.eng.blr.redhat.com                 0             0        24          0   in progress

[root@rhs-client45 ~]# gluster volume rebalance virtVOL status
Node                                 Rebalanced-files          size   scanned   failures        status
---------                                 -----------   -----------   -------   --------   -----------
localhost                                           2       2097152        12          0   in progress
rhs-client15.lab.eng.blr.redhat.com                 5       3150839        30          4   completed
rhs-client37.lab.eng.blr.redhat.com                 0             0        25          0   completed
rhs-client10.lab.eng.blr.redhat.com                 0             0        25          0   completed

[root@rhs-client45 ~]# gluster volume rebalance virtVOL status
Node                                 Rebalanced-files          size   scanned   failures        status
---------                                 -----------   -----------   -------   --------   -----------
localhost                                          10   48322580942        36          1   completed
rhs-client37.lab.eng.blr.redhat.com                 0             0        25          0   completed
rhs-client15.lab.eng.blr.redhat.com                 5       3150839        30          4   completed
rhs-client10.lab.eng.blr.redhat.com                 0             0        25          0   completed
-----------------------------------------------------------

Then the remove-brick operation was performed to remove one replica pair.
Judging from the status messages, it seemed to go well, and the VMs stayed
online.

-----------------------------------------------------------
[root@rhs-client45 ~]# gluster volume remove-brick virtVOL rhs-client45.lab.eng.blr.redhat.com:/VM_brick4 rhs-client37.lab.eng.blr.redhat.com:/VM_brick4 rhs-client15.lab.eng.blr.redhat.com:/VM_brick4 rhs-client10.lab.eng.blr.redhat.com:/VM_brick4
Removing brick(s) can result in data loss. Do you want to Continue?
(y/n) n

[root@rhs-client45 ~]# gluster volume remove-brick virtVOL rhs-client45.lab.eng.blr.redhat.com:/VM_brick4 rhs-client37.lab.eng.blr.redhat.com:/VM_brick4 rhs-client15.lab.eng.blr.redhat.com:/VM_brick4 rhs-client10.lab.eng.blr.redhat.com:/VM_brick4 start
Bricks not from same subvol for replica

[root@rhs-client45 ~]# gluster volume remove-brick virtVOL rhs-client45.lab.eng.blr.redhat.com:/VM_brick4 rhs-client37.lab.eng.blr.redhat.com:/VM_brick4 start
Remove Brick start successful

[root@rhs-client45 ~]# gluster volume remove-brick virtVOL rhs-client45.lab.eng.blr.redhat.com:/VM_brick4 rhs-client37.lab.eng.blr.redhat.com:/VM_brick4 status
Node                                 Rebalanced-files          size   scanned   failures        status
---------                                 -----------   -----------   -------   --------   -----------
localhost                                           2       1048576        20          0   in progress
rhs-client37.lab.eng.blr.redhat.com                 0             0        28          0   completed
rhs-client15.lab.eng.blr.redhat.com                 5       3150839        30          4   not started
rhs-client10.lab.eng.blr.redhat.com                 0             0        25          0   not started

[root@rhs-client45 ~]# gluster volume remove-brick virtVOL rhs-client45.lab.eng.blr.redhat.com:/VM_brick4 rhs-client37.lab.eng.blr.redhat.com:/VM_brick4 status
Node                                 Rebalanced-files          size   scanned   failures        status
---------                                 -----------   -----------   -------   --------   -----------
localhost                                           9   10739796172        31          0   completed
rhs-client37.lab.eng.blr.redhat.com                 0             0        28          0   completed
rhs-client15.lab.eng.blr.redhat.com                 5       3150839        30          4   not started
rhs-client10.lab.eng.blr.redhat.com                 0             0        25          0   not started

[root@rhs-client45 ~]# gluster volume remove-brick virtVOL rhs-client45.lab.eng.blr.redhat.com:/VM_brick4 rhs-client37.lab.eng.blr.redhat.com:/VM_brick4 commit
Removing brick(s) can result in data loss. Do you want to Continue?
(y/n) y
Remove Brick commit successful

[root@rhs-client45 ~]# gluster volume remove-brick virtVOL rhs-client45.lab.eng.blr.redhat.com:/VM_brick4 rhs-client37.lab.eng.blr.redhat.com:/VM_brick4 status
Node                                 Rebalanced-files          size   scanned   failures        status
---------                                 -----------   -----------   -------   --------   -----------
localhost                                           9   10739796172        31          0   not started
rhs-client37.lab.eng.blr.redhat.com                 0             0        28          0   not started
rhs-client10.lab.eng.blr.redhat.com                 0             0        25          0   not started
rhs-client15.lab.eng.blr.redhat.com                 5       3150839        30          4   not started

[root@rhs-client45 ~]# gluster volume info

Volume Name: virtVOL
Type: Distributed-Replicate
Volume ID: 689aa65d-b49a-42f8-a20f-6bac6e116d6b
Status: Started
Number of Bricks: 7 x 2 = 14
Transport-type: tcp
Bricks:
Brick1: rhs-client45.lab.eng.blr.redhat.com:/VM_brick1
Brick2: rhs-client37.lab.eng.blr.redhat.com:/VM_brick1
Brick3: rhs-client15.lab.eng.blr.redhat.com:/VM_brick1
Brick4: rhs-client10.lab.eng.blr.redhat.com:/VM_brick1
Brick5: rhs-client45.lab.eng.blr.redhat.com:/VM_brick2
Brick6: rhs-client37.lab.eng.blr.redhat.com:/VM_brick2
Brick7: rhs-client15.lab.eng.blr.redhat.com:/VM_brick2
Brick8: rhs-client10.lab.eng.blr.redhat.com:/VM_brick2
Brick9: rhs-client45.lab.eng.blr.redhat.com:/VM_brick3
Brick10: rhs-client37.lab.eng.blr.redhat.com:/VM_brick3
Brick11: rhs-client15.lab.eng.blr.redhat.com:/VM_brick3
Brick12: rhs-client10.lab.eng.blr.redhat.com:/VM_brick3
Brick13: rhs-client15.lab.eng.blr.redhat.com:/VM_brick4
Brick14: rhs-client10.lab.eng.blr.redhat.com:/VM_brick4
Options Reconfigured:
storage.owner-gid: 36
storage.owner-uid: 36
network.remote-dio: on
cluster.eager-lock: enable
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off

[root@rhs-client45 ~]# gluster volume status
Status of volume: virtVOL
Gluster process                                          Port    Online  Pid
------------------------------------------------------------------------------
Brick rhs-client45.lab.eng.blr.redhat.com:/VM_brick1     24009   Y       6694
Brick rhs-client37.lab.eng.blr.redhat.com:/VM_brick1     24009   Y       6650
Brick rhs-client15.lab.eng.blr.redhat.com:/VM_brick1     24009   Y       6658
Brick rhs-client10.lab.eng.blr.redhat.com:/VM_brick1     24009   Y       6666
Brick rhs-client45.lab.eng.blr.redhat.com:/VM_brick2     24010   Y       6700
Brick rhs-client37.lab.eng.blr.redhat.com:/VM_brick2     24010   Y       6655
Brick rhs-client15.lab.eng.blr.redhat.com:/VM_brick2     24010   Y       6664
Brick rhs-client10.lab.eng.blr.redhat.com:/VM_brick2     24010   Y       6671
Brick rhs-client45.lab.eng.blr.redhat.com:/VM_brick3     24011   Y       6705
Brick rhs-client37.lab.eng.blr.redhat.com:/VM_brick3     24011   Y       6661
Brick rhs-client15.lab.eng.blr.redhat.com:/VM_brick3     24011   Y       6669
Brick rhs-client10.lab.eng.blr.redhat.com:/VM_brick3     24011   Y       6677
Brick rhs-client15.lab.eng.blr.redhat.com:/VM_brick4     24012   Y       7276
Brick rhs-client10.lab.eng.blr.redhat.com:/VM_brick4     24012   Y       7284
NFS Server on localhost                                  38467   Y       8433
Self-heal Daemon on localhost                            N/A     Y       8439
NFS Server on rhs-client37.lab.eng.blr.redhat.com        38467   Y       8271
Self-heal Daemon on rhs-client37.lab.eng.blr.redhat.com  N/A     Y       8277
NFS Server on rhs-client15.lab.eng.blr.redhat.com        38467   Y       8349
Self-heal Daemon on rhs-client15.lab.eng.blr.redhat.com  N/A     Y       8355
NFS Server on rhs-client10.lab.eng.blr.redhat.com        38467   Y       8340
Self-heal Daemon on rhs-client10.lab.eng.blr.redhat.com  N/A     Y       8346
[root@rhs-client45 ~]#
-----------------------------------------------------------

Suddenly, the Data Center and the Storage Domain went down with invalid-status
messages, and one of the hypervisors went non-responsive. However, the VMs
were still accessible.

-----------------------------------------------------------
2013-Mar-26, 23:49 Invalid status on Data Center SpaceMan. Setting status to Non-Responsive.
2013-Mar-26, 23:49 Invalid status on Data Center SpaceMan. Setting status to Non-Responsive.
2013-Mar-26, 23:48 Invalid status on Data Center SpaceMan. Setting status to Non-Responsive.
2013-Mar-26, 23:48 Invalid status on Data Center SpaceMan. Setting status to Non-Responsive.
2013-Mar-26, 23:47 Invalid status on Data Center SpaceMan. Setting status to Non-Responsive.
2013-Mar-26, 23:47 Invalid status on Data Center SpaceMan. Setting status to Non-Responsive.
2013-Mar-26, 23:46 Invalid status on Data Center SpaceMan. Setting status to Non-Responsive.
2013-Mar-26, 23:46 Invalid status on Data Center SpaceMan. Setting status to Non-Responsive.
2013-Mar-26, 23:45 Invalid status on Data Center SpaceMan. Setting status to Non-Responsive.
2013-Mar-26, 23:45 Failed to connect Host RHEV-H-6.4-rhs-gp-srv12 to Storage Pool SpaceMan
2013-Mar-26, 23:45 Host RHEV-H-6.4-rhs-gp-srv12 cannot access one of the Storage Domains attached to the Data Center SpaceMan. Setting Host state to Non-Operational.
2013-Mar-26, 23:45 Invalid status on Data Center SpaceMan. Setting status to Non-Responsive.
2013-Mar-26, 23:45 Detected new Host RHEV-H-6.4-rhs-gp-srv12. Host state was set to Up.
2013-Mar-26, 23:43 Invalid status on Data Center SpaceMan. Setting status to Non-Responsive.
2013-Mar-26, 23:43 Invalid status on Data Center SpaceMan. Setting status to Non-Responsive.
2013-Mar-26, 23:42 Invalid status on Data Center SpaceMan. Setting status to Non-Responsive.
2013-Mar-26, 23:42 Invalid status on Data Center SpaceMan. Setting status to Non-Responsive.
2013-Mar-26, 23:41 Invalid status on Data Center SpaceMan. Setting status to Non-Responsive.
2013-Mar-26, 23:41 Invalid status on Data Center SpaceMan. Setting status to Non-Responsive.
2013-Mar-26, 23:40 Invalid status on Data Center SpaceMan. Setting status to Non-Responsive.
2013-Mar-26, 23:40 Invalid status on Data Center SpaceMan. Setting status to Non-Responsive.
2013-Mar-26, 23:40 Failed to connect Host RHEV-H-6.4-rhs-gp-srv12 to Storage Pool SpaceMan
2013-Mar-26, 23:40 Detected new Host RHEV-H-6.4-rhs-gp-srv12. Host state was set to Up.
2013-Mar-26, 23:40 Host RHEV-H-6.4-rhs-gp-srv12 was autorecovered.
2013-Mar-26, 23:39 VM yabadaba03 is down. Exit message: Migration succeeded
2013-Mar-26, 23:39 VM writebackVM is down. Exit message: Migration succeeded
2013-Mar-26, 23:39 Migration complete (VM: yabadaba03, Source Host: RHEV-H-6.4-rhs-gp-srv12).
2013-Mar-26, 23:39 Migration complete (VM: writebackVM, Source Host: RHEV-H-6.4-rhs-gp-srv12).
2013-Mar-26, 23:39 Invalid status on Data Center SpaceMan. Setting status to Non-Responsive.
2013-Mar-26, 23:39 Failed to connect Host RHEV-H-6.4-rhs-gp-srv12 to Storage Pool SpaceMan
2013-Mar-26, 23:39 Power Management test failed for Host RHEV-H-6.4-rhs-gp-srv12. There is no other host in the data center that can be used to test the power management settings.
2013-Mar-26, 23:39 Host RHEV-H-6.4-rhs-gp-srv12 cannot access one of the Storage Domains attached to the Data Center SpaceMan. Setting Host state to Non-Operational.
2013-Mar-26, 23:39 Invalid status on Data Center SpaceMan. Setting status to Non-Responsive.
2013-Mar-26, 23:39 Detected new Host RHEV-H-6.4-rhs-gp-srv12. Host state was set to Up.
2013-Mar-26, 23:39 Host RHEV-H-6.4-rhs-gp-srv12 is initializing. Message: Recovering from crash or Initializing
2013-Mar-26, 23:39 Invalid status on Data Center SpaceMan. Setting Data Center status to Non-Responsive (On host RHEV-H-6.4-rhs-gp-srv12, Error: Error marking master storage domain).
-----------------------------------------------------------

All attempts to revive the Storage Domain and Data Center were futile. I then
shut down all the VMs and rebooted the hypervisors, but that did not bring any
relief either; the VMs could no longer be booted, since the master Storage
Domain was down. After about 40 minutes, the Storage Domain and the Data
Center recovered on their own, and I brought the VMs back online. To try to
reproduce the issue, I went ahead and removed another replica pair, but
everything went smoothly this time, with no issues at all.
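As a side note on the "Bricks not from same subvol for replica" error seen in the first remove-brick attempt: on a replica-2 volume, consecutive bricks in the order shown by `gluster volume info` form the replica subvolumes, and remove-brick on this build only accepted a set of bricks making up one whole subvolume (one replica pair). An illustrative sketch of that grouping, modeled on the behavior observed in this transcript (hypothetical helper names, not GlusterFS code):

```python
def replica_subvols(bricks, replica_count):
    """Group a volume's brick list into replica subvolumes.

    Consecutive bricks (in `gluster volume info` order) form one
    replica set of `replica_count` bricks each.
    """
    return [tuple(bricks[i:i + replica_count])
            for i in range(0, len(bricks), replica_count)]


def removable(bricks, replica_count, to_remove):
    """A remove-brick set is accepted only if it is exactly one whole
    replica subvolume (as observed with this glusterfs build)."""
    return tuple(to_remove) in replica_subvols(bricks, replica_count)
```

With the 8x2 brick order above, (rhs-client45:/VM_brick4, rhs-client37:/VM_brick4) is one subvolume and was accepted as a pair, while the four-brick set passed in the first attempt was rejected.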
-----------------------------------------------------------
[root@rhs-client45 ~]# gluster volume info

Volume Name: virtVOL
Type: Distributed-Replicate
Volume ID: 689aa65d-b49a-42f8-a20f-6bac6e116d6b
Status: Started
Number of Bricks: 7 x 2 = 14
Transport-type: tcp
Bricks:
Brick1: rhs-client45.lab.eng.blr.redhat.com:/VM_brick1
Brick2: rhs-client37.lab.eng.blr.redhat.com:/VM_brick1
Brick3: rhs-client15.lab.eng.blr.redhat.com:/VM_brick1
Brick4: rhs-client10.lab.eng.blr.redhat.com:/VM_brick1
Brick5: rhs-client45.lab.eng.blr.redhat.com:/VM_brick2
Brick6: rhs-client37.lab.eng.blr.redhat.com:/VM_brick2
Brick7: rhs-client15.lab.eng.blr.redhat.com:/VM_brick2
Brick8: rhs-client10.lab.eng.blr.redhat.com:/VM_brick2
Brick9: rhs-client45.lab.eng.blr.redhat.com:/VM_brick3
Brick10: rhs-client37.lab.eng.blr.redhat.com:/VM_brick3
Brick11: rhs-client15.lab.eng.blr.redhat.com:/VM_brick3
Brick12: rhs-client10.lab.eng.blr.redhat.com:/VM_brick3
Brick13: rhs-client15.lab.eng.blr.redhat.com:/VM_brick4
Brick14: rhs-client10.lab.eng.blr.redhat.com:/VM_brick4
Options Reconfigured:
storage.owner-gid: 36
storage.owner-uid: 36
network.remote-dio: on
cluster.eager-lock: enable
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off

[root@rhs-client45 ~]# gluster volume remove-brick virtVOL rhs-client15.lab.eng.blr.redhat.com:/VM_brick4 rhs-client10.lab.eng.blr.redhat.com:/VM_brick4 start
Remove Brick start successful

[root@rhs-client45 ~]# gluster volume remove-brick virtVOL rhs-client15.lab.eng.blr.redhat.com:/VM_brick4 rhs-client10.lab.eng.blr.redhat.com:/VM_brick4 status
Node                                 Rebalanced-files          size   scanned   failures        status
---------                                 -----------   -----------   -------   --------   -----------
localhost                                           9   10739796172        31          0   not started
rhs-client10.lab.eng.blr.redhat.com                 0             0        28          0   completed
rhs-client37.lab.eng.blr.redhat.com                 0             0        28          0   not started
rhs-client15.lab.eng.blr.redhat.com                 4       3146298         8          0   in progress

[root@rhs-client45 ~]# gluster volume remove-brick virtVOL rhs-client15.lab.eng.blr.redhat.com:/VM_brick4 rhs-client10.lab.eng.blr.redhat.com:/VM_brick4 status
Node                                 Rebalanced-files          size   scanned   failures        status
---------                                 -----------   -----------   -------   --------   -----------
localhost                                           9   10739796172        31          0   not started
rhs-client15.lab.eng.blr.redhat.com                16   21499751897        28          0   completed
rhs-client37.lab.eng.blr.redhat.com                 0             0        28          0   not started
rhs-client10.lab.eng.blr.redhat.com                 0             0        28          0   completed

[root@rhs-client45 ~]# gluster volume remove-brick virtVOL rhs-client15.lab.eng.blr.redhat.com:/VM_brick4 rhs-client10.lab.eng.blr.redhat.com:/VM_brick4 commit
Removing brick(s) can result in data loss. Do you want to Continue? (y/n) y
Remove Brick commit successful

[root@rhs-client45 ~]# gluster volume info

Volume Name: virtVOL
Type: Distributed-Replicate
Volume ID: 689aa65d-b49a-42f8-a20f-6bac6e116d6b
Status: Started
Number of Bricks: 6 x 2 = 12
Transport-type: tcp
Bricks:
Brick1: rhs-client45.lab.eng.blr.redhat.com:/VM_brick1
Brick2: rhs-client37.lab.eng.blr.redhat.com:/VM_brick1
Brick3: rhs-client15.lab.eng.blr.redhat.com:/VM_brick1
Brick4: rhs-client10.lab.eng.blr.redhat.com:/VM_brick1
Brick5: rhs-client45.lab.eng.blr.redhat.com:/VM_brick2
Brick6: rhs-client37.lab.eng.blr.redhat.com:/VM_brick2
Brick7: rhs-client15.lab.eng.blr.redhat.com:/VM_brick2
Brick8: rhs-client10.lab.eng.blr.redhat.com:/VM_brick2
Brick9: rhs-client45.lab.eng.blr.redhat.com:/VM_brick3
Brick10: rhs-client37.lab.eng.blr.redhat.com:/VM_brick3
Brick11: rhs-client15.lab.eng.blr.redhat.com:/VM_brick3
Brick12: rhs-client10.lab.eng.blr.redhat.com:/VM_brick3
Options Reconfigured:
storage.owner-gid: 36
storage.owner-uid: 36
network.remote-dio: on
cluster.eager-lock: enable
performance.stat-prefetch: off
performance.io-cache: off
performance.read-ahead: off
performance.quick-read: off
[root@rhs-client45 ~]#
-----------------------------------------------------------

Version-Release number of selected component (if applicable):

RHEV-M:
3.1.0-50.el6ev

Hypervisors:
RHEV-H 6.4
RHEL 6.4
RHEL 6.3

RHS:
RHS-2.0-20130320.2-RHS-x86_64-DVD1.iso
glusterfs-server-3.3.0.7rhs-1.el6rhs.x86_64

How reproducible:
Occurred once; not sure if it is reproducible.

Actual results:
The remove-brick operation brought the Data Center and the Storage Domain down for about 40 minutes.

Expected results:
The remove-brick operation must not impact the availability of the Data Center and the Storage Domain.

Additional info:
Operations are failing with permission-denied errors, which is what leads to the failures:

[2013-03-27 00:21:11.045151] W [fuse-bridge.c:725:fuse_fd_cbk] 0-glusterfs-fuse: 2401: OPEN() /79f2acbd-6f1c-4976-a8e6-c82a0073b6bb/dom_md/ids => -1 (Invalid argument)
[2013-03-27 00:21:17.454194] I [afr-self-heal-entry.c:2309:afr_sh_entry_fix] 0-virtVOL-replicate-2: /79f2acbd-6f1c-4976-a8e6-c82a0073b6bb/dom_md: Performing conservative merge
[2013-03-27 00:21:17.454300] I [afr-self-heal-entry.c:2309:afr_sh_entry_fix] 0-virtVOL-replicate-0: /79f2acbd-6f1c-4976-a8e6-c82a0073b6bb/dom_md: Performing conservative merge
[2013-03-27 00:21:17.466916] I [dht-common.c:997:dht_lookup_everywhere_cbk] 0-virtVOL-dht: deleting stale linkfile /79f2acbd-6f1c-4976-a8e6-c82a0073b6bb/dom_md/ids on virtVOL-replicate-2
[2013-03-27 00:21:17.467590] W [client3_1-fops.c:651:client3_1_unlink_cbk] 0-virtVOL-client-4: remote operation failed: Permission denied
[2013-03-27 00:21:17.467630] W [client3_1-fops.c:651:client3_1_unlink_cbk] 0-virtVOL-client-5: remote operation failed: Permission denied
[2013-03-27 00:21:17.468303] W [client3_1-fops.c:258:client3_1_mknod_cbk] 0-virtVOL-client-0: remote operation failed: Permission denied. Path: /79f2acbd-6f1c-4976-a8e6-c82a0073b6bb/dom_md/ids (e5572401-ce56-4c82-a4c1-5f54f6948f44)
[2013-03-27 00:21:17.468345] W [client3_1-fops.c:258:client3_1_mknod_cbk] 0-virtVOL-client-1: remote operation failed: Permission denied. Path: /79f2acbd-6f1c-4976-a8e6-c82a0073b6bb/dom_md/ids (e5572401-ce56-4c82-a4c1-5f54f6948f44)
The problem lies in dht_discover_cbk returning EINVAL on ENOENT errors on the newly added bricks; it should instead trigger self-heal on those bricks. Looks like a duplicate of bug 924572.

[2013-03-26 18:15:03.372833] I [dht-layout.c:611:dht_layout_normalize] 1-virtVOL-dht: found anomalies in /79f2acbd-6f1c-4976-a8e6-c82a0073b6bb/dom_md. holes=1 overlaps=1
[2013-03-26 18:15:03.377632] I [dht-layout.c:611:dht_layout_normalize] 1-virtVOL-dht: found anomalies in <gfid:4099f439-fc15-4379-bf25-8c15c401952d>. holes=0 overlaps=1
[2013-03-26 18:15:03.377691] W [fuse-resolve.c:152:fuse_resolve_gfid_cbk] 0-fuse: 4099f439-fc15-4379-bf25-8c15c401952d: failed to resolve (Invalid argument)
[2013-03-26 18:15:03.377707] E [fuse-bridge.c:555:fuse_getattr_resume] 0-glusterfs-fuse: 689633: GETATTR 139816465486164 (4099f439-fc15-4379-bf25-8c15c401952d) resolution failed
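For context on the "holes=1 overlaps=1" message: DHT assigns each directory a layout in which every subvolume owns a range of the 32-bit hash space, and the ranges should tile the space exactly once. A "hole" is hash space no subvolume covers (typical right after add-brick or remove-brick, until a fix-layout runs), and an "overlap" is space covered more than once. A simplified illustrative check, not the actual dht_layout_normalize code:

```python
def layout_anomalies(ranges, space=2**32):
    """Count holes and overlaps in a DHT-style directory layout.

    ranges: list of (start, stop) hash ranges, stop exclusive.
    A healthy layout tiles [0, space) exactly once: no gaps, no
    double coverage.
    """
    holes = overlaps = 0
    prev_stop = 0
    for start, stop in sorted(ranges):
        if start > prev_stop:      # uncovered gap before this range
            holes += 1
        elif start < prev_stop:    # this range re-covers assigned space
            overlaps += 1
        prev_stop = max(prev_stop, stop)
    if prev_stop < space:          # uncovered tail of the hash space
        holes += 1
    return holes, overlaps
```

A layout with both a hole and an overlap, as logged above for dom_md, is one the client cannot trust until it is healed, which fits the EINVAL failures that follow.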
Marking this as a duplicate of bug 924572, as the root cause is dht_discover_complete returning EINVAL when layout anomalies are found.

*** This bug has been marked as a duplicate of bug 924572 ***