Description of problem:
-------------------
After hitting the issue raised in BZ#1567100, I ran a rebalance on the now-empty volume. The rebalance succeeded. When I then created a new directory on this setup, I saw the errors below. This happens only for the first directory created; it is not seen for subsequent directory creates.
To reproduce this on the same setup, either issue a rebalance and create a new directory again, or create a new directory from a new client.

[2018-04-13 13:23:46.008976] E [MSGID: 114031] [client-rpc-fops.c:295:client3_3_mkdir_cbk] 0-zen-client-1: remote operation failed. Path: /dir3
[2018-04-13 13:23:46.009066] E [MSGID: 114031] [client-rpc-fops.c:295:client3_3_mkdir_cbk] 0-zen-client-3: remote operation failed. Path: /dir3
[2018-04-13 13:23:46.009102] E [MSGID: 114031] [client-rpc-fops.c:295:client3_3_mkdir_cbk] 0-zen-client-4: remote operation failed. Path: /dir3
[2018-04-13 13:23:46.009191] E [MSGID: 114031] [client-rpc-fops.c:295:client3_3_mkdir_cbk] 0-zen-client-0: remote operation failed. Path: /dir3
[2018-04-13 13:23:46.009247] E [MSGID: 114031] [client-rpc-fops.c:295:client3_3_mkdir_cbk] 0-zen-client-5: remote operation failed. Path: /dir3
[2018-04-13 13:23:46.009395] E [MSGID: 114031] [client-rpc-fops.c:295:client3_3_mkdir_cbk] 0-zen-client-2: remote operation failed. Path: /dir3
[2018-04-13 13:23:46.009445] I [MSGID: 109114] [dht-common.c:8775:dht_mkdir_hashed_cbk] 0-zen-dht: mkdir (00000000-0000-0000-0000-000000000001/dir3) (path: /dir3): parent layout changed. Attempting a refresh and then a retry

Version-Release number of selected component (if applicable):
----------
3.12.2-7

How reproducible:
--------------
always

Steps to Reproduce:
1. Have the setup as mentioned in BZ#1567100.
2. As there are no more files on this setup, do a rebalance and wait for it to complete.
3. Issue a "mkdir dir1" from the client. You will see the RPC errors.

Yet to assess the functional impact.
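For triage, the client log excerpt above can be summarized per translator to confirm that every client subvolume reported the same mkdir failure. The following is only a minimal sketch: the regular expression is inferred from the log lines quoted in this report, and the names `LOG_LINE` and `summarize_errors` are hypothetical helpers, not part of GlusterFS.

```python
import re
from collections import Counter

# Pattern inferred from the log lines quoted in this report (an assumption,
# not an official description of the GlusterFS log format).
LOG_LINE = re.compile(
    r"\[(?P<ts>[^\]]+)\] (?P<level>[A-Z]) \[MSGID: (?P<msgid>\d+)\] "
    r"\[(?P<origin>[^\]]+)\] (?P<xlator>\S+): (?P<msg>.*)"
)


def summarize_errors(lines):
    """Count error-level ("E") lines per (translator, MSGID) pair."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and match["level"] == "E":
            counts[(match["xlator"], match["msgid"])] += 1
    return counts


# Three lines copied from the excerpt above: two errors and one info message.
sample = [
    "[2018-04-13 13:23:46.008976] E [MSGID: 114031] "
    "[client-rpc-fops.c:295:client3_3_mkdir_cbk] 0-zen-client-1: "
    "remote operation failed. Path: /dir3",
    "[2018-04-13 13:23:46.009066] E [MSGID: 114031] "
    "[client-rpc-fops.c:295:client3_3_mkdir_cbk] 0-zen-client-3: "
    "remote operation failed. Path: /dir3",
    "[2018-04-13 13:23:46.009445] I [MSGID: 109114] "
    "[dht-common.c:8775:dht_mkdir_hashed_cbk] 0-zen-dht: mkdir "
    "(00000000-0000-0000-0000-000000000001/dir3) (path: /dir3): "
    "parent layout changed. Attempting a refresh and then a retry",
]

for (xlator, msgid), count in sorted(summarize_errors(sample).items()):
    print(f"{xlator} MSGID {msgid}: {count} error(s)")
```

Run against the full client log, this would show whether the failure is limited to the first mkdir after a rebalance, as described above.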
The DHT layout of the newly created directory folder1 is as below:

dht-subvol1
# file: gluster/brick1/zen/folder1
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.ec.version=0x00000000000000000000000000000001
trusted.gfid=0xb71633380fc74dbfb8552708e1b6e40c
trusted.glusterfs.dht=0x0000000000000000000000007ffffffe

dht-subvol2
# file: gluster/brick2/zen/folder1
security.selinux=0x73797374656d5f753a6f626a6563745f723a676c7573746572645f627269636b5f743a733000
trusted.ec.version=0x00000000000000000000000000000001
trusted.gfid=0xb71633380fc74dbfb8552708e1b6e40c
trusted.glusterfs.dht=0x00000000000000007fffffffffffffff
trusted.glusterfs.dht.mds=0x00000000
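To judge the functional impact, it helps to check whether the two trusted.glusterfs.dht values together cover the whole hash space. The sketch below assumes the xattr is four big-endian 32-bit words, with the last two holding the start and stop of the hash range; that field order is my reading of the on-disk layout and should be verified against the DHT source for this release. The function names are hypothetical.

```python
import struct


def decode_dht_xattr(value):
    """Decode a trusted.glusterfs.dht value as printed in hex by getfattr.

    Assumption: 16 bytes, four big-endian uint32 words, where the third and
    fourth words are the start and stop of the hash range.
    """
    raw = bytes.fromhex(value[2:] if value.startswith("0x") else value)
    if len(raw) != 16:
        raise ValueError(f"expected 16 bytes, got {len(raw)}")
    word0, word1, start, stop = struct.unpack(">4I", raw)
    return {"word0": word0, "word1": word1, "start": start, "stop": stop}


def covers_full_hash_space(ranges):
    """True if the (start, stop) ranges tile 0..0xffffffff with no gap or overlap."""
    expected = 0
    for start, stop in sorted(ranges):
        if start != expected or stop < start:
            return False
        expected = stop + 1
    return expected == 0x1_0000_0000


# Values copied from the getfattr output in this comment.
layouts = {
    "dht-subvol1": "0x0000000000000000000000007ffffffe",
    "dht-subvol2": "0x00000000000000007fffffffffffffff",
}

decoded = {name: decode_dht_xattr(value) for name, value in layouts.items()}
for name, fields in decoded.items():
    print(f"{name}: {fields['start']:#010x} - {fields['stop']:#010x}")

ranges = [(f["start"], f["stop"]) for f in decoded.values()]
print("full coverage:", covers_full_hash_space(ranges))
```

Under that assumption, dht-subvol1 holds 0x00000000 to 0x7ffffffe and dht-subvol2 holds 0x7fffffff to 0xffffffff, so the layout of folder1 has no holes or overlaps.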
Considering that the bug mentioned in comment #1 is now in the fixed state, should we retry the setup? We don't see any issue here.
Continuing comment #8: if this BZ is being planned for re-validation, please collect and attach logs to the BZ. If not, I'd like to see a reason for delegating this BZ to the RPC team/component. Since there are:
* no connection failures
* no RPC message drops
* no call bails
* no ping-timer expiry
* no re-connection attempts
reported, this does not seem like an RPC issue. Please clarify.
Not going to fix. Original BZ was fixed and released.