Description of problem:
=======================
Healing of one of the files is not happening after updating the RHGS nodes from version 3.0.4 to version 3.1.3 using the in-service steps. Heal info returned a non-zero value for one file entry (see the output in the next comment) after the second node update in a two-node cluster (heal worked successfully after updating the first node), and the brick log was filled with the error messages below:

[2016-07-13 14:40:15.099868] W [trash.c:1861:trash_mkdir] 0-Dis-Rep-trash: mkdir issued on /.trashcan/, which is not permitted
[2016-07-13 14:40:15.099915] E [MSGID: 115056] [server-rpc-fops.c:515:server_mkdir_cbk] 0-Dis-Rep-server: 219138: MKDIR /.trashcan/internal_op (00000000-0000-0000-0000-000000000005/internal_op) client: dhcp37-82.lab.eng.blr.redhat.com-1742-2016/07/13-11:25:47:240601-Dis-Rep-client-1-0-1 [Operation not permitted]
[2016-07-13 14:41:08.400320] W [trash.c:1861:trash_mkdir] 0-Dis-Rep-trash: mkdir issued on /.trashcan/, which is not permitted
[2016-07-13 14:41:08.400387] E [MSGID: 115056] [server-rpc-fops.c:515:server_mkdir_cbk] 0-Dis-Rep-server: 280117: MKDIR /.trashcan/internal_op (00000000-0000-0000-0000-000000000005/internal_op) client: dhcp37-82.lab.eng.blr.redhat.com-1742-2016/07/13-11:25:47:240601-Dis-Rep-client-1-0-1 [Operation not permitted]
[2016-07-13 14:41:22.227228] W [trash.c:1861:trash_mkdir] 0-Dis-Rep-trash: mkdir issued on /.trashcan/, which is not permitted
[2016-07-13 14:41:22.227329] E [MSGID: 115056] [server-rpc-fops.c:515:server_mkdir_cbk] 0-Dis-Rep-server: 294673: MKDIR /.trashcan/internal_op (00000000-0000-0000-0000-000000000005/internal_op) client: dhcp37-82.lab.eng.blr.redhat.com-1742-2016/07/13-11:25:47:240601-Dis-Rep-client-1-0-1 [Operation not permitted]
[2016-07-13 14:41:23.407611] W [trash.c:1861:trash_mkdir] 0-Dis-Rep-trash: mkdir issued on /.trashcan/, which is not permitted
[2016-07-13 14:41:23.407693] E [MSGID: 115056] [server-rpc-fops.c:515:server_mkdir_cbk] 0-Dis-Rep-server: 294859: MKDIR /.trashcan/internal_op (00000000-0000-0000-0000-000000000005/internal_op) client: dhcp37-82.lab.eng.blr.redhat.com-1742-2016/07/13-11:25:47:240601-Dis-Rep-client-1-0-1 [Operation not permitted]
[2016-07-13 14:41:24.437168] W [trash.c:1861:trash_mkdir] 0-Dis-Rep-trash: mkdir issued on /.trashcan/, which is not permitted
[2016-07-13 14:41:24.437240] E [MSGID: 115056] [server-rpc-fops.c:515:server_mkdir_cbk] 0-Dis-Rep-server: 294876: MKDIR /.trashcan/internal_op (00000000-0000-0000-0000-000000000005/internal_op) client: dhcp37-82.lab.eng.blr.redhat.com-1742-2016/07/13-11:25:47:240601-Dis-Rep-client-1-0-1 [Operation not permitted]
[2016-07-13 14:42:23.007917] I [login.c:81:gf_auth] 0-auth/login: allowed user names: a95f8f04-e073-4133-bb56-30db37b736d0
[root@dhcp37-84 ~]#

Version-Release number of selected component (if applicable):
=============================================================
In-service update from 3.0.4 to 3.1.3. Followed the in-service update steps mentioned in the RHGS installation guide.

How reproducible:
=================
One time

Steps to Reproduce:
===================
1. Do an in-service update from 3.0.4 to 3.1.3 using the steps mentioned in the installation guide.

Actual results:
===============
Heal is not happening successfully.

Expected results:
=================
Heal should happen successfully.

Additional info:
================
Heal happened successfully after upgrading the first node of the two-node cluster; on repeating the same steps on the other node, heal did not happen for one of the files.

Note: Used the glusterd workaround to regenerate the vol files during the update.
Created attachment 1180039 [details] sos report from node-1
Created attachment 1180040 [details] sos report from node-2
Pasting the observations I shared over email after examining the setup:
=================================================
I looked at the setup. Node1 has .trashcan to be entry-healed to Node2 (verified from getfattr on the bricks). The entry in question is the '.trashcan/internal_op' folder. Node1 was updated first, then Node2. I/O was happening from the client all the time. I have a theory (and the logs too), but there is one observation by Byreddy that I'm not able to explain.

1. Node1 is updated to 3.1.3, and its bricks have .trashcan and .trashcan/internal_op after the upgrade.

2. The client is doing I/O all this while, which triggered entry self-heal and created .trashcan on Node2 (still running 3.0.4), with a new-entry mark for it on Node1:

[root@dhcp37-92 glusterfs]# grep -rine completed mnt.log
334:[2016-07-13 14:23:13.732508] I [afr-self-heal-common.c:2874:afr_log_self_heal_completion_status] 2-Dis-Rep-replicate-1: entry self heal is successfully completed, on /
335:[2016-07-13 14:23:13.734510] I [afr-self-heal-common.c:2874:afr_log_self_heal_completion_status] 2-Dis-Rep-replicate-0: entry self heal is successfully completed, on /

3. Index heal completes metadata heal of .trashcan (having gfid 5) but not entry heal:

[root@dhcp37-82 glusterfs]# grep -rine Completed /var/log/glusterfs/glustershd.log | grep 00000000
650:[2016-07-13 14:37:21.644343] I [MSGID: 108026] [afr-self-heal-common.c:770:afr_log_selfheal] 0-Dis-Rep-replicate-0: Completed metadata selfheal on 00000000-0000-0000-0000-000000000005. sources=[0] sinks=1
868:[2016-07-13 14:37:25.093057] I [MSGID: 108026] [afr-self-heal-common.c:770:afr_log_selfheal] 0-Dis-Rep-replicate-1: Completed metadata selfheal on 00000000-0000-0000-0000-000000000005. sources=[0] sinks=1

`heal info` output should still show .trashcan for pending entry heal.

4. Node2 is also upgraded. Normally .trashcan and .trashcan/internal_op are created by the trash xlator, but if .trashcan is already present (here, due to heal from Node1), it does not create 'internal_op'.

5. Now, when shd entry heal tries a mkdir of .trashcan/internal_op, trash_mkdir() blocks it.

The only thing I'm not able to explain: Byreddy says that at step 3 he waited for heal-info to be zero before going to step 4. But if that were the case, the shd logs should also contain information about entry heal of .trashcan, something like:

[MSGID: 108026] [afr-self-heal-common.c:770:afr_log_selfheal] 0-testvol-replicate-0: Completed entry selfheal on 00000000-0000-0000-0000-000000000005. sources=[0] sinks=1
[MSGID: 108026] [afr-self-heal-metadata.c:56:__afr_selfheal_metadata_do] 0-testvol-replicate-0: performing metadata selfheal on 00000000-0000-0000-0000-000000000006
[MSGID: 108026] [afr-self-heal-common.c:770:afr_log_selfheal] 0-testvol-replicate-0: Completed metadata selfheal on 00000000-0000-0000-0000-000000000006. sources=[0] sinks=1
[MSGID: 108026] [afr-self-heal-entry.c:589:afr_selfheal_entry_do] 0-testvol-replicate-0: performing entry selfheal on 00000000-0000-0000-0000-000000000006
[MSGID: 108026] [afr-self-heal-common.c:770:afr_log_selfheal] 0-testvol-replicate-0: Completed entry selfheal on 00000000-0000-0000-0000-000000000006. sources=[0] sinks=1

But I don't see any such message. Do you have any pointers on how heal-info could show zero entries when entry self-heal hasn't completed yet?
=================================================
After discussing with Jiffin and Pranith, it was decided that the trash xlator needs to unconditionally create the .trashcan/internal_op folder. Assigning the bug to Jiffin as there is already a patch upstream which aims to do this.
Can this be targeted for 3.2?
(In reply to Atin Mukherjee from comment #7)
> Can this be targeted for 3.2?

Yup, the patch is posted upstream for review: http://review.gluster.org/#/c/14938/
Upstream mainline patch http://review.gluster.org/14938 is merged.
Upstream mainline : http://review.gluster.org/14938
Upstream 3.8 : http://review.gluster.org/14965

The fix is available in rhgs-3.2.0 as part of the rebase to GlusterFS 3.8.4.
Have done an in-service upgrade from 3.0.4 to 3.2.0 (3.8.4-14). I see that all files are getting healed and there are no pending entries in heal info. Checked 3.1.3 to 3.2.0 as well; no issues seen. Hence moving to VERIFIED.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2017-0486.html