Bug 859387 - [RHEV-RHS] Rebalance migration failures are seen when replicate bricks are brought down and restarted
Status: CLOSED ERRATA
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: glusterfs
Version: unspecified
Hardware: x86_64 Linux
Priority: high
Severity: medium
Assigned To: shishir gowda
QA Contact: shylesh
Keywords: Reopened
Duplicates: 820518
Reported: 2012-09-21 08:12 EDT by shylesh
Modified: 2013-12-08 20:33 EST (History)
Fixed In Version: glusterfs-3.3.0.5rhs-40
Doc Type: Bug Fix
Doc Text:
Cause: In an AFR setup, a readdir request is handled by one of the subvolumes. If that brick goes down, the readdir request is handled by the other brick. The file entry offsets might differ on the new subvolume, which can make rebalance revisit files that were already migrated.
Consequence: Rebalance errors are reported through the failure count in the CLI status output.
Fix: AFR failover to the other subvolume is disabled for the rebalance process. If the subvolume goes down, the operation errors out.
Result:
Last Closed: 2013-03-28 18:25:50 EDT
Type: Bug


Attachments
rebalance-migration failure logs (118.46 KB, text/x-log)
2012-09-21 08:12 EDT, shylesh
rebalance failures logs (1.89 MB, text/x-log)
2012-10-25 05:54 EDT, shylesh

Description shylesh 2012-09-21 08:12:25 EDT
Created attachment 615411 [details]
rebalance-migration failure logs

Description of problem:
Created a 2x2 distribute-replicate volume and used it as a VM store. After one more replica pair was added, file migrations failed during rebalance.

Version-Release number of selected component (if applicable):
RHS 2.0

How reproducible:


Steps to Reproduce:
1. Created a 2x2 distribute-replicate volume and used it as a VM store.
2. Added 2 more bricks; the configuration becomes 3x2 distribute-replicate.
3. Started rebalance.
4. While rebalance was in progress, brought down one brick of the newly added pair.
5. Brought the brick back up after some time.
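
For reference, steps 1 to 3 could be scripted roughly as follows. This is only a sketch: the volume name distrep2 is taken from the logs in this report, but the host names and brick paths are hypothetical, and the function only builds the gluster command lines without running them. Steps 4 and 5 are left out because the report does not say how the brick was stopped and restarted.

```python
def repro_commands(volume, old_bricks, new_bricks, replica=2):
    """Build the gluster CLI invocations for steps 1-3 (nothing is executed)."""
    return [
        # Step 1: create and start the 2x2 distribute-replicate volume.
        ["gluster", "volume", "create", volume, "replica", str(replica), *old_bricks],
        ["gluster", "volume", "start", volume],
        # Step 2: add one more replica pair (2x2 becomes 3x2).
        ["gluster", "volume", "add-brick", volume, *new_bricks],
        # Step 3: start rebalance and poll its status.
        ["gluster", "volume", "rebalance", volume, "start"],
        ["gluster", "volume", "rebalance", volume, "status"],
    ]


# Hypothetical hosts and brick paths, for illustration only.
old = ["host1:/disk1", "host2:/disk1", "host1:/disk2", "host2:/disk2"]
new = ["host1:/disk3", "host2:/disk3"]
for cmd in repro_commands("distrep2", old, new):
    print(" ".join(cmd))
```

Building the argument lists separately from running them makes it easy to review the commands before passing each one to subprocess.run on a test cluster.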
  
Actual results:
Rebalance migration started failing 

Expected results:


Additional info:

attached the complete log
failures: 1
[2012-09-21 07:01:10.526936] C [client-handshake.c:126:rpc_client_ping_timer_expired] 0-distrep2-client-1: server 10.70.36.15:24009 has not responded in the last 42 seconds, disconnecting.
[2012-09-21 07:01:10.527129] E [rpc-clnt.c:373:saved_frames_unwind] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x78) [0x7f411e5fc818] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xb0) [0x7f411e5fc4d0] (-->/usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe) [0x7f411e5fbf3e]))) 0-distrep2-client-1: forced unwinding frame type(GlusterFS 3.1) op(READ(12)) called at 2012-09-21 06:59:46.574034 (xid=0x363x)
[2012-09-21 07:01:10.527168] W [client3_1-fops.c:2700:client3_1_readv_cbk] 0-distrep2-client-1: remote operation failed: Transport endpoint is not connected
[2012-09-21 07:01:10.527218] E [rpc-clnt.c:373:saved_frames_unwind] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x78) [0x7f411e5fc818] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xb0) [0x7f411e5fc4d0] (-->/usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe) [0x7f411e5fbf3e]))) 0-distrep2-client-1: forced unwinding frame type(GlusterFS 3.1) op(SETXATTR(17)) called at 2012-09-21 06:59:46.574489 (xid=0x366x)
[2012-09-21 07:01:10.527252] W [client3_1-fops.c:992:client3_1_setxattr_cbk] 0-distrep2-client-1: remote operation failed: Transport endpoint is not connected
[2012-09-21 07:01:10.527275] E [dht-rebalance.c:721:dht_migrate_file] 0-distrep2-dht: /9c17fd91-2e28-463d-b9b3-93fcd9a77679/dom_md/metadata: failed to migrate data
[2012-09-21 07:01:10.531314] I [socket.c:2315:socket_submit_request] 0-distrep2-client-1: not connected (priv->connected = 0)
[2012-09-21 07:01:10.531345] W [rpc-clnt.c:1498:rpc_clnt_submit] 0-distrep2-client-1: failed to submit rpc-request (XID: 0x373x Program: GlusterFS 3.1, ProgVers: 330, Proc: 29) to rpc-transport (distrep2-client-1)
[2012-09-21 07:01:10.531378] W [client3_1-fops.c:1495:client3_1_inodelk_cbk] 0-distrep2-client-1: remote operation failed: Transport endpoint is not connected
[2012-09-21 07:01:10.531426] E [rpc-clnt.c:373:saved_frames_unwind] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x78) [0x7f411e5fc818] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xb0) [0x7f411e5fc4d0] (-->/usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe) [0x7f411e5fbf3e]))) 0-distrep2-client-1: forced unwinding frame type(GlusterFS 3.1) op(READDIR(28)) called at 2012-09-21 06:59:46.574516 (xid=0x367x)
[2012-09-21 07:01:10.531443] W [client3_1-fops.c:2250:client3_1_readdir_cbk] 0-distrep2-client-1: remote operation failed: Transport endpoint is not connected remote_fd = 20
[2012-09-21 07:01:10.531453] I [afr-dir-read.c:117:afr_examine_dir_readdir_cbk] 0-distrep2-replicate-0: /9c17fd91-2e28-463d-b9b3-93fcd9a77679/images/8179fad6-6fe8-4f19-9fdc-76592eef53d8: failed to do opendir on distrep2-client-1
[2012-09-21 07:01:10.531501] E [rpc-clnt.c:373:saved_frames_unwind] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x78) [0x7f411e5fc818] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xb0) [0x7f411e5fc4d0] (-->/usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe) [0x7f411e5fbf3e]))) 0-distrep2-client-1: forced unwinding frame type(GlusterFS 3.1) op(OPEN(11)) called at 2012-09-21 06:59:46.574574 (xid=0x368x)
[2012-09-21 07:01:10.531519] W [client3_1-fops.c:418:client3_1_open_cbk] 0-distrep2-client-1: remote operation failed: Transport endpoint is not connected. Path: /9c17fd91-2e28-463d-b9b3-93fcd9a77679/dom_md/metadata (00000000-0000-0000-0000-000000000000)
[2012-09-21 07:01:10.531580] E [rpc-clnt.c:373:saved_frames_unwind] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x78) [0x7f411e5fc818] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xb0) [0x7f411e5fc4d0] (-->/usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe) [0x7f411e5fbf3e]))) 0-distrep2-client-1: forced unwinding frame type(GlusterFS 3.1) op(LOOKUP(27)) called at 2012-09-21 06:59:46.575343 (xid=0x369x)
[2012-09-21 07:01:10.531598] W [client3_1-fops.c:2630:client3_1_lookup_cbk] 0-distrep2-client-1: remote operation failed: Transport endpoint is not connected. Path: /9c17fd91-2e28-463d-b9b3-93fcd9a77679/dom_md/metadata (00000000-0000-0000-0000-000000000000)
[2012-09-21 07:01:10.531625] W [rpc-clnt.c:1498:rpc_clnt_submit] 0-distrep2-client-1: failed to submit rpc-request (XID: 0x374x Program: GlusterFS 3.1, ProgVers: 330, Proc: 29) to rpc-transport (distrep2-client-1)
[2012-09-21 07:01:10.531663] W [client3_1-fops.c:1495:client3_1_inodelk_cbk] 0-distrep2-client-1: remote operation failed: Transport endpoint is not connected
[2012-09-21 07:01:10.531672] E [rpc-clnt.c:373:saved_frames_unwind] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x78) [0x7f411e5fc818] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xb0) [0x7f411e5fc4d0] (-->/usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe) [0x7f411e5fbf3e]))) 0-distrep2-client-1: forced unwinding frame type(GlusterFS 3.1) op(STATFS(14)) called at 2012-09-21 06:59:46.576251 (xid=0x370x)
[2012-09-21 07:01:10.531704] W [client3_1-fops.c:763:client3_1_statfs_cbk] 0-distrep2-client-1: remote operation failed: Transport endpoint is not connected
[2012-09-21 07:01:10.531745] E [rpc-clnt.c:373:saved_frames_unwind] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x78) [0x7f411e5fc818] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xb0) [0x7f411e5fc4d0] (-->/usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe) [0x7f411e5fbf3e]))) 0-distrep2-client-1: forced unwinding frame type(GlusterFS 3.1) op(LOOKUP(27)) called at 2012-09-21 06:59:53.524024 (xid=0x371x)
[2012-09-21 07:01:10.531761] W [client3_1-fops.c:2630:client3_1_lookup_cbk] 0-distrep2-client-1: remote operation failed: Transport endpoint is not connected. Path: / (00000000-0000-0000-0000-000000000001)
[2012-09-21 07:01:10.531844] W [rpc-clnt.c:1498:rpc_clnt_submit] 0-distrep2-client-1: failed to submit rpc-request (XID: 0x375x Program: GlusterFS 3.1, ProgVers: 330, Proc: 27) to rpc-transport (distrep2-client-1)
[2012-09-21 07:01:10.531864] W [client3_1-fops.c:2630:client3_1_lookup_cbk] 0-distrep2-client-1: remote operation failed: Transport endpoint is not connected. Path: / (00000000-0000-0000-0000-000000000001)
[2012-09-21 07:01:10.532065] E [rpc-clnt.c:373:saved_frames_unwind] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x78) [0x7f411e5fc818] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xb0) [0x7f411e5fc4d0] (-->/usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe) [0x7f411e5fbf3e]))) 0-distrep2-client-1: forced unwinding frame type(GlusterFS Handshake) op(PING(3)) called at 2012-09-21 07:00:28.521070 (xid=0x372x)
[2012-09-21 07:01:10.532099] W [client-handshake.c:275:client_ping_cbk] 0-distrep2-client-1: timer must have expired
[2012-09-21 07:01:10.532119] I [client.c:2090:client_rpc_notify] 0-distrep2-client-1: disconnected
[2012-09-21 07:01:10.532529] I [client-handshake.c:1636:select_server_supported_programs] 0-distrep2-client-1: Using Program GlusterFS 3.3.0rhsvirt1, Num (1298437), Version (330)
[2012-09-21 07:01:10.532806] I [dht-common.c:2337:dht_setxattr] 0-distrep2-dht: fixing the layout of /
[2012-09-21 07:01:10.532848] I [client-handshake.c:1433:client_setvolume_cbk] 0-distrep2-client-1: Connected to 10.70.36.15:24009, attached to remote volume '/disk2'.
[2012-09-21 07:01:10.532869] I [client-handshake.c:1445:client_setvolume_cbk] 0-distrep2-client-1: Server and Client lk-version numbers are not same, reopening the fds
[2012-09-21 07:01:10.532879] I [client-handshake.c:1282:client_post_handshake] 0-distrep2-client-1: 21 fds open - Delaying child_up until they are re-opened
[2012-09-21 07:01:10.532959] E [inode.c:1090:__inode_path] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(client_post_handshake+0x145) [0x7f411a2b3fa5] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(protocol_client_reopen+0xd7) [0x7f411a2b3657] (-->/usr/lib64/libglusterfs.so.0(inode_path+0x4a) [0x7f411e830b0a]))) 0-: Assertion failed: 0
[2012-09-21 07:01:10.533005] W [inode.c:1091:__inode_path] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(client_post_handshake+0x145) [0x7f411a2b3fa5] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(protocol_client_reopen+0xd7) [0x7f411a2b3657] (-->/usr/lib64/libglusterfs.so.0(inode_path+0x4a) [0x7f411e830b0a]))) 0-distrep2-client-1: invalid inode
[2012-09-21 07:01:10.533019] W [client-handshake.c:1187:protocol_client_reopen] 0-distrep2-client-1: couldn't build path from inode 00000000-0000-0000-0000-000000000000
[2012-09-21 07:01:10.533172] E [inode.c:1090:__inode_path] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(client_post_handshake+0x110) [0x7f411a2b3f70] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(protocol_client_reopendir+0x107) [0x7f411a2b3be7] (-->/usr/lib64/libglusterfs.so.0(inode_path+0x4a) [0x7f411e830b0a]))) 0-: Assertion failed: 0
[2012-09-21 07:01:10.533227] W [inode.c:1091:__inode_path] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(client_post_handshake+0x110) [0x7f411a2b3f70] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(protocol_client_reopendir+0x107) [0x7f411a2b3be7] (-->/usr/lib64/libglusterfs.so.0(inode_path+0x4a) [0x7f411e830b0a]))) 0-distrep2-client-1: invalid inode
[2012-09-21 07:01:10.533264] W [client-handshake.c:1108:protocol_client_reopendir] 0-distrep2-client-1: couldn't build path from inode 00000000-0000-0000-0000-000000000000
[2012-09-21 07:01:10.533322] E [inode.c:1090:__inode_path] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(client_post_handshake+0x110) [0x7f411a2b3f70] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(protocol_client_reopendir+0x107) [0x7f411a2b3be7] (-->/usr/lib64/libglusterfs.so.0(inode_path+0x4a) [0x7f411e830b0a]))) 0-: Assertion failed: 0
[2012-09-21 07:01:10.533380] W [inode.c:1091:__inode_path] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(client_post_handshake+0x110) [0x7f411a2b3f70] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(protocol_client_reopendir+0x107) [0x7f411a2b3be7] (-->/usr/lib64/libglusterfs.so.0(inode_path+0x4a) [0x7f411e830b0a]))) 0-distrep2-client-1: invalid inode
[2012-09-21 07:01:10.533397] W [client-handshake.c:1108:protocol_client_reopendir] 0-distrep2-client-1: couldn't build path from inode 00000000-0000-0000-0000-000000000000
[2012-09-21 07:01:10.533430] E [inode.c:1090:__inode_path] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(client_post_handshake+0x110) [0x7f411a2b3f70] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(protocol_client_reopendir+0x107) [0x7f411a2b3be7] (-->/usr/lib64/libglusterfs.so.0(inode_path+0x4a) [0x7f411e830b0a]))) 0-: Assertion failed: 0
[2012-09-21 07:01:10.533476] W [inode.c:1091:__inode_path] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(client_post_handshake+0x110) [0x7f411a2b3f70] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(protocol_client_reopendir+0x107) [0x7f411a2b3be7] (-->/usr/lib64/libglusterfs.so.0(inode_path+0x4a) [0x7f411e830b0a]))) 0-distrep2-client-1: invalid inode
[2012-09-21 07:01:10.533491] W [client-handshake.c:1108:protocol_client_reopendir] 0-distrep2-client-1: couldn't build path from inode 00000000-0000-0000-0000-000000000000
[2012-09-21 07:01:10.533522] E [inode.c:1090:__inode_path] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(client_post_handshake+0x110) [0x7f411a2b3f70] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(protocol_client_reopendir+0x107) [0x7f411a2b3be7] (-->/usr/lib64/libglusterfs.so.0(inode_path+0x4a) [0x7f411e830b0a]))) 0-: Assertion failed: 0
[2012-09-21 07:01:10.533555] W [inode.c:1091:__inode_path] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(client_post_handshake+0x110) [0x7f411a2b3f70] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(protocol_client_reopendir+0x107) [0x7f411a2b3be7] (-->/usr/lib64/libglusterfs.so.0(inode_path+0x4a) [0x7f411e830b0a]))) 0-distrep2-client-1: invalid inode
[2012-09-21 07:01:10.533570] W [client-handshake.c:1108:protocol_client_reopendir] 0-distrep2-client-1: couldn't build path from inode 00000000-0000-0000-0000-000000000000
[2012-09-21 07:01:10.533601] E [inode.c:1090:__inode_path] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(client_post_handshake+0x110) [0x7f411a2b3f70] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(protocol_client_reopendir+0x107) [0x7f411a2b3be7] (-->/usr/lib64/libglusterfs.so.0(inode_path+0x4a) [0x7f411e830b0a]))) 0-: Assertion failed: 0
[2012-09-21 07:01:10.533640] W [inode.c:1091:__inode_path] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(client_post_handshake+0x110) [0x7f411a2b3f70] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(protocol_client_reopendir+0x107) [0x7f411a2b3be7] (-->/usr/lib64/libglusterfs.so.0(inode_path+0x4a) [0x7f411e830b0a]))) 0-distrep2-client-1: invalid inode
[2012-09-21 07:01:10.533655] W [client-handshake.c:1108:protocol_client_reopendir] 0-distrep2-client-1: couldn't build path from inode 00000000-0000-0000-0000-000000000000
[2012-09-21 07:01:10.533686] E [inode.c:1090:__inode_path] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(client_post_handshake+0x110) [0x7f411a2b3f70] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(protocol_client_reopendir+0x107) [0x7f411a2b3be7] (-->/usr/lib64/libglusterfs.so.0(inode_path+0x4a) [0x7f411e830b0a]))) 0-: Assertion failed: 0
[2012-09-21 07:01:10.533726] W [inode.c:1091:__inode_path] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(client_post_handshake+0x110) [0x7f411a2b3f70] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(protocol_client_reopendir+0x107) [0x7f411a2b3be7] (-->/usr/lib64/libglusterfs.so.0(inode_path+0x4a) [0x7f411e830b0a]))) 0-distrep2-client-1: invalid inode
[2012-09-21 07:01:10.533739] W [client-handshake.c:1108:protocol_client_reopendir] 0-distrep2-client-1: couldn't build path from inode 00000000-0000-0000-0000-000000000000
[2012-09-21 07:01:10.533769] E [inode.c:1090:__inode_path] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(client_post_handshake+0x110) [0x7f411a2b3f70] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(protocol_client_reopendir+0x107) [0x7f411a2b3be7] (-->/usr/lib64/libglusterfs.so.0(inode_path+0x4a) [0x7f411e830b0a]))) 0-: Assertion failed: 0
[2012-09-21 07:01:10.533802] W [inode.c:1091:__inode_path] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(client_post_handshake+0x110) [0x7f411a2b3f70] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(protocol_client_reopendir+0x107) [0x7f411a2b3be7] (-->/usr/lib64/libglusterfs.so.0(inode_path+0x4a) [0x7f411e830b0a]))) 0-distrep2-client-1: invalid inode
[2012-09-21 07:01:10.533816] W [client-handshake.c:1108:protocol_client_reopendir] 0-distrep2-client-1: couldn't build path from inode 00000000-0000-0000-0000-000000000000
[2012-09-21 07:01:10.533845] E [inode.c:1090:__inode_path] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(client_post_handshake+0x110) [0x7f411a2b3f70] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(protocol_client_reopendir+0x107) [0x7f411a2b3be7] (-->/usr/lib64/libglusterfs.so.0(inode_path+0x4a) [0x7f411e830b0a]))) 0-: Assertion failed: 0
[2012-09-21 07:01:10.533876] W [inode.c:1091:__inode_path] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(client_post_handshake+0x110) [0x7f411a2b3f70] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(protocol_client_reopendir+0x107) [0x7f411a2b3be7] (-->/usr/lib64/libglusterfs.so.0(inode_path+0x4a) [0x7f411e830b0a]))) 0-distrep2-client-1: invalid inode
[2012-09-21 07:01:10.533890] W [client-handshake.c:1108:protocol_client_reopendir] 0-distrep2-client-1: couldn't build path from inode 00000000-0000-0000-0000-000000000000
[2012-09-21 07:01:10.533918] E [inode.c:1090:__inode_path] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(client_post_handshake+0x110) [0x7f411a2b3f70] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(protocol_client_reopendir+0x107) [0x7f411a2b3be7] (-->/usr/lib64/libglusterfs.so.0(inode_path+0x4a) [0x7f411e830b0a]))) 0-: Assertion failed: 0
[2012-09-21 07:01:10.533949] W [inode.c:1091:__inode_path] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(client_post_handshake+0x110) [0x7f411a2b3f70] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(protocol_client_reopendir+0x107) [0x7f411a2b3be7] (-->/usr/lib64/libglusterfs.so.0(inode_path+0x4a) [0x7f411e830b0a]))) 0-distrep2-client-1: invalid inode
[2012-09-21 07:01:10.533962] W [client-handshake.c:1108:protocol_client_reopendir] 0-distrep2-client-1: couldn't build path from inode 00000000-0000-0000-0000-000000000000
[2012-09-21 07:01:10.533991] E [inode.c:1090:__inode_path] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(client_post_handshake+0x110) [0x7f411a2b3f70] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(protocol_client_reopendir+0x107) [0x7f411a2b3be7] (-->/usr/lib64/libglusterfs.so.0(inode_path+0x4a) [0x7f411e830b0a]))) 0-: Assertion failed: 0
[2012-09-21 07:01:10.534022] W [inode.c:1091:__inode_path] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(client_post_handshake+0x110) [0x7f411a2b3f70] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(protocol_client_reopendir+0x107) [0x7f411a2b3be7] (-->/usr/lib64/libglusterfs.so.0(inode_path+0x4a) [0x7f411e830b0a]))) 0-distrep2-client-1: invalid inode
[2012-09-21 07:01:10.534040] W [client-handshake.c:1108:protocol_client_reopendir] 0-distrep2-client-1: couldn't build path from inode 00000000-0000-0000-0000-000000000000
[2012-09-21 07:01:10.534069] E [inode.c:1090:__inode_path] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(client_post_handshake+0x110) [0x7f411a2b3f70] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(protocol_client_reopendir+0x107) [0x7f411a2b3be7] (-->/usr/lib64/libglusterfs.so.0(inode_path+0x4a) [0x7f411e830b0a]))) 0-: Assertion failed: 0
[2012-09-21 07:01:10.534100] W [inode.c:1091:__inode_path] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(client_post_handshake+0x110) [0x7f411a2b3f70] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(protocol_client_reopendir+0x107) [0x7f411a2b3be7] (-->/usr/lib64/libglusterfs.so.0(inode_path+0x4a) [0x7f411e830b0a]))) 0-distrep2-client-1: invalid inode
[2012-09-21 07:01:10.534113] W [client-handshake.c:1108:protocol_client_reopendir] 0-distrep2-client-1: couldn't build path from inode 00000000-0000-0000-0000-000000000000
[2012-09-21 07:01:10.534141] E [inode.c:1090:__inode_path] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(client_post_handshake+0x145) [0x7f411a2b3fa5] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(protocol_client_reopen+0xd7) [0x7f411a2b3657] (-->/usr/lib64/libglusterfs.so.0(inode_path+0x4a) [0x7f411e830b0a]))) 0-: Assertion failed: 0
[2012-09-21 07:01:10.534173] W [inode.c:1091:__inode_path] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(client_post_handshake+0x145) [0x7f411a2b3fa5] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(protocol_client_reopen+0xd7) [0x7f411a2b3657] (-->/usr/lib64/libglusterfs.so.0(inode_path+0x4a) [0x7f411e830b0a]))) 0-distrep2-client-1: invalid inode
[2012-09-21 07:01:10.534186] W [client-handshake.c:1187:protocol_client_reopen] 0-distrep2-client-1: couldn't build path from inode 00000000-0000-0000-0000-000000000000
[2012-09-21 07:01:10.534215] E [inode.c:1090:__inode_path] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(client_post_handshake+0x145) [0x7f411a2b3fa5] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(protocol_client_reopen+0xd7) [0x7f411a2b3657] (-->/usr/lib64/libglusterfs.so.0(inode_path+0x4a) [0x7f411e830b0a]))) 0-: Assertion failed: 0
[2012-09-21 07:01:10.534245] W [inode.c:1091:__inode_path] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(client_post_handshake+0x145) [0x7f411a2b3fa5] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(protocol_client_reopen+0xd7) [0x7f411a2b3657] (-->/usr/lib64/libglusterfs.so.0(inode_path+0x4a) [0x7f411e830b0a]))) 0-distrep2-client-1: invalid inode
[2012-09-21 07:01:10.534259] W [client-handshake.c:1187:protocol_client_reopen] 0-distrep2-client-1: couldn't build path from inode 00000000-0000-0000-0000-000000000000
[2012-09-21 07:01:10.534287] E [inode.c:1090:__inode_path] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(client_post_handshake+0x110) [0x7f411a2b3f70] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(protocol_client_reopendir+0x107) [0x7f411a2b3be7] (-->/usr/lib64/libglusterfs.so.0(inode_path+0x4a) [0x7f411e830b0a]))) 0-: Assertion failed: 0
[2012-09-21 07:01:10.534318] W [inode.c:1091:__inode_path] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(client_post_handshake+0x110) [0x7f411a2b3f70] (-->/usr/lib64/glusterfs/3.3.0rhsvirt1/xlator/protocol/client.so(protocol_client_reopendir+0x107) [0x7f411a2b3be7] (-->/usr/lib64/libglusterfs.so.0(inode_path+0x4a) [0x7f411e830b0a]))) 0-distrep2-client-1: invalid inode
[2012-09-21 07:01:10.534336] W [client-handshake.c:1108:protocol_client_reopendir] 0-distrep2-client-1: couldn't build path from inode 00000000-0000-0000-0000-000000000000
[2012-09-21 07:01:10.534574] I [client-handshake.c:1041:client3_1_reopendir_cbk] 0-distrep2-client-1: reopendir on / succeeded (fd = 0)
[2012-09-21 07:01:10.534665] I [client-handshake.c:1041:client3_1_reopendir_cbk] 0-distrep2-client-1: reopendir on / succeeded (fd = 1)
[2012-09-21 07:01:10.534737] I [client-handshake.c:1041:client3_1_reopendir_cbk] 0-distrep2-client-1: reopendir on / succeeded (fd = 2)
[2012-09-21 07:01:10.534766] E [dht-rebalance.c:1202:gf_defrag_migrate_data] 0-distrep2-dht: migrate-data failed for /9c17fd91-2e28-463d-b9b3-93fcd9a77679/dom_md/metadata
[2012-09-21 07:01:10.534820] I [client-handshake.c:1041:client3_1_reopendir_cbk] 0-distrep2-client-1: reopendir on / succeeded (fd = 3)
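
The migration failures are easy to miss among the reconnect messages in a log like the one above. The following is a minimal sketch of a filter that keeps only the dht_migrate_file errors. The line layout is inferred from the excerpt in this report rather than from any GlusterFS specification, and the function name is made up for this example.

```python
import re

# Layout inferred from the excerpt above:
# [timestamp] LEVEL [file:line:function] domain: message
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) "
    r"\[(?P<src>[^:\]]+):(?P<line>\d+):(?P<func>[^\]]+)\] "
    r"(?P<domain>\S+): (?P<msg>.*)$"
)


def migration_failures(lines):
    """Return (timestamp, path) for each 'failed to migrate' error."""
    failures = []
    for raw in lines:
        m = LOG_LINE.match(raw.strip())
        # Lines carrying a backtrace do not fit the simple layout and are skipped.
        if not m or m["level"] != "E" or m["func"] != "dht_migrate_file":
            continue
        path, sep, reason = m["msg"].partition(": ")
        if sep and reason.startswith("failed to migrate"):
            failures.append((m["ts"], path))
    return failures


sample = [
    "[2012-09-21 07:01:10.527275] E [dht-rebalance.c:721:dht_migrate_file] "
    "0-distrep2-dht: /9c17fd91-2e28-463d-b9b3-93fcd9a77679/dom_md/metadata: "
    "failed to migrate data",
    "[2012-09-21 07:01:10.532119] I [client.c:2090:client_rpc_notify] "
    "0-distrep2-client-1: disconnected",
]
print(migration_failures(sample))
```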
Comment 2 shylesh 2012-09-21 09:08:13 EDT
[root@rhs-gp-srv4 ~]# rpm -qa | grep glus
glusterfs-server-3.3.0rhsvirt1-5.el6rhs.x86_64
vdsm-gluster-4.9.6-14.el6rhs.noarch
gluster-swift-plugin-1.0-5.noarch
gluster-swift-container-1.4.8-4.el6.noarch
org.apache.hadoop.fs.glusterfs-glusterfs-0.20.2_0.2-1.noarch
glusterfs-fuse-3.3.0rhsvirt1-5.el6rhs.x86_64
glusterfs-rdma-3.3.0rhsvirt1-5.el6rhs.x86_64
gluster-swift-proxy-1.4.8-4.el6.noarch
gluster-swift-account-1.4.8-4.el6.noarch
gluster-swift-doc-1.4.8-4.el6.noarch
glusterfs-3.3.0rhsvirt1-5.el6rhs.x86_64
glusterfs-geo-replication-3.3.0rhsvirt1-5.el6rhs.x86_64
gluster-swift-1.4.8-4.el6.noarch
gluster-swift-object-1.4.8-4.el6.noarch
Comment 3 shishir gowda 2012-09-25 03:39:22 EDT
This looks like a duplicate of bug 826080. The bug was fixed in upstream. We need to back-port it.

*** This bug has been marked as a duplicate of bug 826080 ***
Comment 4 shishir gowda 2012-09-25 04:32:51 EDT
Re-opened the bug. Will update it once the patch for bug 826080 is back-ported
Comment 5 shishir gowda 2012-09-26 02:02:14 EDT
The fix (https://code.engineering.redhat.com/gerrit/#/c/26/) has gone into glusterfs-3.3.0rhsvirt1-6.el6rhs release.
Comment 6 shylesh 2012-10-25 05:54:49 EDT
Created attachment 633259 [details]
rebalance failures logs
Comment 7 shylesh 2012-10-25 05:58:13 EDT
This bug is still reproducible.

Here is a snippet from the rebalance logs; the complete logs are attached.


[2012-10-25 04:58:27.712656] W [client3_1-fops.c:1114:client3_1_getxattr_cbk] 0-unique-client-0: remote operation failed: No such file or directory. Path: /2ec80488-67ed-4746-9baf-812429133e1d/images/a4af645c-7d45-4a08-85c8-600169f5cd00/d5bc5297-fd81-4639-8e14-b8d7fa438e19.meta (00000000-0000-0000-0000-000000000000). Key: (null)
[2012-10-25 04:58:27.712703] W [dht-rebalance.c:739:dht_migrate_file] 0-unique-dht: /2ec80488-67ed-4746-9baf-812429133e1d/images/a4af645c-7d45-4a08-85c8-600169f5cd00/d5bc5297-fd81-4639-8e14-b8d7fa438e19.meta: failed to get xattr from unique-replicate-0 (No such file or directory)
[2012-10-25 04:58:27.712726] E [afr-inode-write.c:1489:afr_setxattr] 0-unique-replicate-2: setxattr dict is null
[2012-10-25 04:58:27.712741] W [dht-rebalance.c:745:dht_migrate_file] 0-unique-dht: /2ec80488-67ed-4746-9baf-812429133e1d/images/a4af645c-7d45-4a08-85c8-600169f5cd00/d5bc5297-fd81-4639-8e14-b8d7fa438e19.meta: failed to set xattr on unique-replicate-2 (Invalid argument)
[2012-10-25 04:58:27.716365] I [dht-rebalance.c:1620:gf_defrag_status_get] 0-glusterfs: Rebalance is completed
[2012-10-25 04:58:27.716392] I [dht-rebalance.c:1623:gf_defrag_status_get] 0-glusterfs: Files migrated: 4, size: 214748365440, lookups: 93, failures: 7
[2012-10-25 04:58:27.716584] E [dht-rebalance.c:1374:gf_defrag_fix_layout] 0-unique-dht: Fix layout failed for /2ec80488-67ed-4746-9baf-812429133e1d/images/a4af645c-7d45-4a08-85c8-600169f5cd00
[2012-10-25 04:58:27.716651] W [glusterfsd.c:906:cleanup_and_exit] (-->/lib64/libc.so.6(clone+0x6d) [0x3910ae5ccd] (-->/lib64/libpthread.so.0() [0x39112077f1] (-->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xdd) [0x405d2d]))) 0-: received signum (15), shutting down
[2012-10-25 04:58:27.716810] E [dht-rebalance.c:1374:gf_defrag_fix_layout] 0-unique-dht: Fix layout failed for /2ec80488-67ed-4746-9baf-812429133e1d/images
[2012-10-25 04:58:27.716988] E [dht-rebalance.c:1374:gf_defrag_fix_layout] 0-unique-dht: Fix layout failed for /2ec80488-67ed-4746-9baf-812429133e1d
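
The gf_defrag_status_get summary line in the snippet above carries the counters that also show up in the CLI status output. Below is a small sketch that extracts them. It assumes the size field is in bytes, which the log does not state, and the helper name is hypothetical.

```python
import re

SUMMARY = re.compile(
    r"Files migrated: (?P<migrated>\d+), size: (?P<size>\d+), "
    r"lookups: (?P<lookups>\d+), failures: (?P<failures>\d+)"
)


def parse_summary(line):
    """Return the rebalance counters as integers, or None if absent."""
    m = SUMMARY.search(line)
    if m is None:
        return None
    return {name: int(value) for name, value in m.groupdict().items()}


line = (
    "[2012-10-25 04:58:27.716392] I [dht-rebalance.c:1623:gf_defrag_status_get] "
    "0-glusterfs: Files migrated: 4, size: 214748365440, lookups: 93, failures: 7"
)
stats = parse_summary(line)
print(stats)
# Assumption: size is a byte count, so dividing by 2**30 gives GiB.
print(f"{stats['size'] / 2**30:.1f} GiB migrated, {stats['failures']} failures")
```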
Comment 8 shylesh 2012-10-25 06:00:19 EDT
I was verifying the bug on the following version:
[root@rhs-gp-srv4 /]# rpm -qa | grep gluster
glusterfs-server-3.3.0rhsvirt1-7.el6rhs.x86_64
glusterfs-debuginfo-3.3.0rhsvirt1-7.el6rhs.x86_64
vdsm-gluster-4.9.6-14.el6rhs.noarch
gluster-swift-plugin-1.0-5.noarch
gluster-swift-container-1.4.8-4.el6.noarch
org.apache.hadoop.fs.glusterfs-glusterfs-0.20.2_0.2-1.noarch
glusterfs-fuse-3.3.0rhsvirt1-7.el6rhs.x86_64
glusterfs-geo-replication-3.3.0rhsvirt1-7.el6rhs.x86_64
glusterfs-devel-3.3.0rhsvirt1-7.el6rhs.x86_64
gluster-swift-proxy-1.4.8-4.el6.noarch
gluster-swift-account-1.4.8-4.el6.noarch
gluster-swift-doc-1.4.8-4.el6.noarch
glusterfs-3.3.0rhsvirt1-7.el6rhs.x86_64
glusterfs-rdma-3.3.0rhsvirt1-7.el6rhs.x86_64
gluster-swift-1.4.8-4.el6.noarch
gluster-swift-object-1.4.8-4.el6.noarch
Comment 9 shishir gowda 2012-10-25 06:15:34 EDT
This seems to be a duplicate of bug 820518.
We seem to be getting incorrect entries from readdirp, resulting in visiting the
files/directories more than once as logged below:

[2012-10-25 01:44:17.512660] I [dht-common.c:2337:dht_setxattr] 0-unique-dht: fixing the layout of /
[2012-10-25 01:44:17.514223] I [dht-rebalance.c:1063:gf_defrag_migrate_data] 0-unique-dht: migrate data called on /

[2012-10-25 01:45:40.059143] I [dht-common.c:2337:dht_setxattr] 0-unique-dht: fixing the layout of /
[2012-10-25 01:45:40.062599] I [dht-rebalance.c:1063:gf_defrag_migrate_data] 0-unique-dht: migrate data called on /

Additional failure messages:

[2012-10-25 04:58:27.712289] W [client3_1-fops.c:1114:client3_1_getxattr_cbk] 0-unique-client-1: remote operation failed: No such file or directory. Path: /2ec80488-67ed-4746-9baf-812429133e1d/images/a4af645c-7d45-4a08-85c8-600169f5cd00/d5bc5297-fd81-4639-8e14-b8d7fa438e19.meta (00000000-0000-0000-0000-000000000000). Key: (null)
[2012-10-25 04:58:27.712656] W [client3_1-fops.c:1114:client3_1_getxattr_cbk] 0-unique-client-0: remote operation failed: No such file or directory. Path: /2ec80488-67ed-4746-9baf-812429133e1d/images/a4af645c-7d45-4a08-85c8-600169f5cd00/d5bc5297-fd81-4639-8e14-b8d7fa438e19.meta (00000000-0000-0000-0000-000000000000). Key: (null)
[2012-10-25 04:58:27.712703] W [dht-rebalance.c:739:dht_migrate_file] 0-unique-dht: /2ec80488-67ed-4746-9baf-812429133e1d/images/a4af645c-7d45-4a08-85c8-600169f5cd00/d5bc5297-fd81-4639-8e14-b8d7fa438e19.meta: failed to get xattr from unique-replicate-0 (No such file or directory)
[2012-10-25 04:58:27.712726] E [afr-inode-write.c:1489:afr_setxattr] 0-unique-replicate-2: setxattr dict is null
[2012-10-25 04:58:27.712741] W [dht-rebalance.c:745:dht_migrate_file] 0-unique-dht: /2ec80488-67ed-4746-9baf-812429133e1d/images/a4af645c-7d45-4a08-85c8-600169f5cd00/d5bc5297-fd81-4639-8e14-b8d7fa438e19.meta: failed to set xattr on unique-replicate-2 (Invalid argument)
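
The repeated visits noted in this comment can be spotted mechanically in a rebalance log. The sketch below counts how often "migrate data called on" appears for each path and reports the paths seen more than once. It assumes each log record sits on a single line (wrapped lines need to be rejoined first), and the function name is made up for this example.

```python
import re
from collections import Counter

# Matches the gf_defrag_migrate_data message quoted in this comment.
# Paths containing spaces would be cut at the first space.
CALLED_ON = re.compile(
    r"gf_defrag_migrate_data\] \S+: migrate data called on (?P<path>\S+)"
)


def revisited_dirs(lines):
    """Map each directory crawled more than once to its visit count."""
    counts = Counter()
    for line in lines:
        m = CALLED_ON.search(line)
        if m:
            counts[m["path"]] += 1
    return {path: n for path, n in counts.items() if n > 1}


sample = [
    "[2012-10-25 01:44:17.514223] I [dht-rebalance.c:1063:gf_defrag_migrate_data] "
    "0-unique-dht: migrate data called on /",
    "[2012-10-25 01:45:40.062599] I [dht-rebalance.c:1063:gf_defrag_migrate_data] "
    "0-unique-dht: migrate data called on /",
]
print(revisited_dirs(sample))
```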
Comment 10 shishir gowda 2012-10-29 09:08:42 EDT
These are replica-related bugs. Each subvolume can return directory entries at different offsets, so readdir results become inconsistent when one brick of a replica pair goes down and the request fails over to the other.
Comment 11 Pranith Kumar K 2012-11-05 21:53:19 EST
Posted the following patch, which adds the ability to disable readdir failover in AFR:
http://review.gluster.org/#change,4159

Could you please integrate with this patch?

Pranith.
Comment 12 Pranith Kumar K 2012-11-07 08:43:59 EST
Re-assigning to Shishir for integrating with the patch.
Comment 16 Amar Tumballi 2012-11-29 03:45:29 EST
http://review.gluster.org/4159 needs to go in before the DHT-specific changes can be made. Marking as POST to indicate that the patch is under review.
Comment 17 Vijay Bellur 2012-12-03 03:11:14 EST
CHANGE: http://review.gluster.org/4159 (cluster/afr: Provide option to disable readdir failover) merged in master by Vijay Bellur (vbellur@redhat.com)
Comment 18 Pranith Kumar K 2012-12-11 03:26:26 EST
*** Bug 820518 has been marked as a duplicate of this bug. ***
Comment 19 Vijay Bellur 2012-12-16 22:46:05 EST
CHANGE: http://review.gluster.org/4300 (cluster/dht: Add "afr.readdir-failover=off" option the rebalance process) merged in master by Anand Avati (avati@redhat.com)
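
To illustrate why disabling readdir failover helps the rebalance crawl, here is a toy model of the behaviour described in this report. It is not GlusterFS code: bricks are plain lists, an offset is simply a list index, and the failover parameter is a hypothetical stand-in for the option named in the change above.

```python
def readdir(brick, offset, count):
    """Return up to `count` entries of `brick` starting at `offset`."""
    return brick[offset:offset + count]


def crawl(bricks, fail_after=None, failover=True, batch=2):
    """Walk a replicated directory in batches; brick 0 may go down mid-walk."""
    active, offset, seen = 0, 0, []
    while True:
        if fail_after is not None and active == 0 and offset >= fail_after:
            if not failover:
                raise ConnectionError("subvolume down, readdir failover disabled")
            active = 1  # continue on the other replica with the old offset
        entries = readdir(bricks[active], offset, batch)
        if not entries:
            return seen
        seen.extend(entries)
        offset += len(entries)


brick_a = ["f1", "f2", "f3", "f4"]
brick_b = ["f3", "f1", "f4", "f2"]  # same names, different on-disk order

# With failover, the walk resumes on brick_b at an offset that only made
# sense for brick_a, so one file is visited twice and another is skipped.
print(crawl([brick_a, brick_b], fail_after=2))

# With failover disabled, the walk stops with an error instead.
try:
    crawl([brick_a, brick_b], fail_after=2, failover=False)
except ConnectionError as err:
    print("crawl aborted:", err)
```

In this model the failed-over walk revisits f2 and never sees f3, which mirrors the revisits of already migrated files described above. Erroring out trades a partial rebalance for a consistent one that can be restarted.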
Comment 22 shylesh 2013-01-23 09:25:55 EST
Verified on 3.3.0.5rhs-40.el6rhs.x86_64.
Comment 23 Divya 2013-02-12 05:03:13 EST
Hi Shishir,

This bug has been added to the Update 4 errata. Could you provide your input in the doc text field so that I can update the errata?

Thanks,
Divya
Comment 25 errata-xmlrpc 2013-03-28 18:25:50 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-0691.html
