Description of problem:
=======================
Healing of one of the files is not happening after updating the RHGS nodes from version 3.0.4 to version 3.1.3 using the in-service steps. Heal info returned a non-zero value for one file entry (see the output in the next comment) after the second node update in a two-node cluster (heal worked successfully after updating the first node), and the brick log was filled with the error messages below:

[2016-07-13 14:40:15.099868] W [trash.c:1861:trash_mkdir] 0-Dis-Rep-trash: mkdir issued on /.trashcan/, which is not permitted
[2016-07-13 14:40:15.099915] E [MSGID: 115056] [server-rpc-fops.c:515:server_mkdir_cbk] 0-Dis-Rep-server: 219138: MKDIR /.trashcan/internal_op (00000000-0000-0000-0000-000000000005/internal_op) client: dhcp37-82.lab.eng.blr.redhat.com-1742-2016/07/13-11:25:47:240601-Dis-Rep-client-1-0-1 [Operation not permitted]
[2016-07-13 14:41:08.400320] W [trash.c:1861:trash_mkdir] 0-Dis-Rep-trash: mkdir issued on /.trashcan/, which is not permitted
[2016-07-13 14:41:08.400387] E [MSGID: 115056] [server-rpc-fops.c:515:server_mkdir_cbk] 0-Dis-Rep-server: 280117: MKDIR /.trashcan/internal_op (00000000-0000-0000-0000-000000000005/internal_op) client: dhcp37-82.lab.eng.blr.redhat.com-1742-2016/07/13-11:25:47:240601-Dis-Rep-client-1-0-1 [Operation not permitted]
[2016-07-13 14:41:22.227228] W [trash.c:1861:trash_mkdir] 0-Dis-Rep-trash: mkdir issued on /.trashcan/, which is not permitted
[2016-07-13 14:41:22.227329] E [MSGID: 115056] [server-rpc-fops.c:515:server_mkdir_cbk] 0-Dis-Rep-server: 294673: MKDIR /.trashcan/internal_op (00000000-0000-0000-0000-000000000005/internal_op) client: dhcp37-82.lab.eng.blr.redhat.com-1742-2016/07/13-11:25:47:240601-Dis-Rep-client-1-0-1 [Operation not permitted]
[2016-07-13 14:41:23.407611] W [trash.c:1861:trash_mkdir] 0-Dis-Rep-trash: mkdir issued on /.trashcan/, which is not permitted
[2016-07-13 14:41:23.407693] E [MSGID: 115056] [server-rpc-fops.c:515:server_mkdir_cbk] 0-Dis-Rep-server: 294859: MKDIR /.trashcan/internal_op (00000000-0000-0000-0000-000000000005/internal_op) client: dhcp37-82.lab.eng.blr.redhat.com-1742-2016/07/13-11:25:47:240601-Dis-Rep-client-1-0-1 [Operation not permitted]
[2016-07-13 14:41:24.437168] W [trash.c:1861:trash_mkdir] 0-Dis-Rep-trash: mkdir issued on /.trashcan/, which is not permitted
[2016-07-13 14:41:24.437240] E [MSGID: 115056] [server-rpc-fops.c:515:server_mkdir_cbk] 0-Dis-Rep-server: 294876: MKDIR /.trashcan/internal_op (00000000-0000-0000-0000-000000000005/internal_op) client: dhcp37-82.lab.eng.blr.redhat.com-1742-2016/07/13-11:25:47:240601-Dis-Rep-client-1-0-1 [Operation not permitted]
[2016-07-13 14:42:23.007917] I [login.c:81:gf_auth] 0-auth/login: allowed user names: a95f8f04-e073-4133-bb56-30db37b736d0
[root@dhcp37-84 ~]#

Version-Release number of selected component (if applicable):
=============================================================
In-service update from 3.0.4 to 3.1.3. Followed the in-service update steps mentioned in the RHGS installation guide.

How reproducible:
=================
One time

Steps to Reproduce:
===================
1. Do an in-service update from 3.0.4 to 3.1.3 using the steps mentioned in the installation guide.

Actual results:
===============
Heal is not happening successfully.

Expected results:
=================
Heal should happen successfully.

Additional info:
================
Heal happened successfully after upgrading the first node of the two-node cluster; on repeating the same steps on the other node, heal did not happen for one of the files.

Note: Used the glusterd workaround to regenerate the vol files during the update.
Created attachment 1180039 [details] sos report from node-1
Created attachment 1180040 [details] sos report from node-2
Pasting the observations I shared over email after examining the setup:
=================================================
I looked at the setup. Node1 has .trashcan to be entry-healed to Node2 (verified from getfattr on the bricks). The entry in question is the '.trashcan/internal_op' folder. Node1 was updated first, then Node2. I/O was happening from the client all the time. I have a theory (and the logs too), but there is one observation by Byreddy that I'm not able to explain.

1. Node1 is updated to 3.1.3, and its bricks have .trashcan and .trashcan/internal_op after the upgrade.

2. The client is doing I/O all this while, which triggered entry self-heal and created .trashcan on Node2 (still running 3.0.4), with a new-entry mark for it on Node1:

[root@dhcp37-92 glusterfs]# grep -rine completed mnt.log
334:[2016-07-13 14:23:13.732508] I [afr-self-heal-common.c:2874:afr_log_self_heal_completion_status] 2-Dis-Rep-replicate-1: entry self heal is successfully completed, on /
335:[2016-07-13 14:23:13.734510] I [afr-self-heal-common.c:2874:afr_log_self_heal_completion_status] 2-Dis-Rep-replicate-0: entry self heal is successfully completed, on /

3. Index heal completes metadata heal of .trashcan (having gfid 5) but not entry heal:

[root@dhcp37-82 glusterfs]# grep -rine Completed /var/log/glusterfs/glustershd.log | grep 00000000
650:[2016-07-13 14:37:21.644343] I [MSGID: 108026] [afr-self-heal-common.c:770:afr_log_selfheal] 0-Dis-Rep-replicate-0: Completed metadata selfheal on 00000000-0000-0000-0000-000000000005. sources=[0] sinks=1
868:[2016-07-13 14:37:25.093057] I [MSGID: 108026] [afr-self-heal-common.c:770:afr_log_selfheal] 0-Dis-Rep-replicate-1: Completed metadata selfheal on 00000000-0000-0000-0000-000000000005. sources=[0] sinks=1

`heal info` output should still show .trashcan for pending entry heal.

4. Node2 is also upgraded. Normally .trashcan and .trashcan/internal_op are created by the trash xlator, but if .trashcan is already present (here, due to heal from Node1), it does not create 'internal_op'.

5. Now, when shd entry heal tries a mkdir of .trashcan/internal_op, trash_mkdir() blocks it.

The only thing I'm not able to explain: Byreddy says that at step 3 he waited for heal-info to be zero before going to step 4. But if that were the case, the shd logs should also contain information about entry heal of .trashcan, something like:

[MSGID: 108026] [afr-self-heal-common.c:770:afr_log_selfheal] 0-testvol-replicate-0: Completed entry selfheal on 00000000-0000-0000-0000-000000000005. sources=[0] sinks=1
[MSGID: 108026] [afr-self-heal-metadata.c:56:__afr_selfheal_metadata_do] 0-testvol-replicate-0: performing metadata selfheal on 00000000-0000-0000-0000-000000000006
[MSGID: 108026] [afr-self-heal-common.c:770:afr_log_selfheal] 0-testvol-replicate-0: Completed metadata selfheal on 00000000-0000-0000-0000-000000000006. sources=[0] sinks=1
[MSGID: 108026] [afr-self-heal-entry.c:589:afr_selfheal_entry_do] 0-testvol-replicate-0: performing entry selfheal on 00000000-0000-0000-0000-000000000006
[MSGID: 108026] [afr-self-heal-common.c:770:afr_log_selfheal] 0-testvol-replicate-0: Completed entry selfheal on 00000000-0000-0000-0000-000000000006. sources=[0] sinks=1

But I don't see any such message. Do you have any pointers on how heal-info could show zero entries when entry self-heal hasn't completed yet?
=================================================
After discussing with Jiffin and Pranith, it was decided that the trash xlator needs to unconditionally create the .trashcan/internal_op folder. Assigning the bug to Jiffin as there is already a patch upstream which aims to do this.
Can this be targeted for 3.2?
(In reply to Atin Mukherjee from comment #7)
> Can this be targeted for 3.2?

Yup, the patch is posted upstream for review: http://review.gluster.org/#/c/14938/
Upstream mainline patch http://review.gluster.org/14938 is merged.
Upstream mainline : http://review.gluster.org/14938
Upstream 3.8 : http://review.gluster.org/14965

The fix is available in rhgs-3.2.0 as part of the rebase to GlusterFS 3.8.4.
Have done an in-service upgrade from 3.0.4 to 3.2.0 (3.8.4-14). I see that all files are getting healed and there are no pending entries in heal info. Checked 3.1.3 to 3.2.0 as well; no issues seen. Hence moving to VERIFIED.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2017-0486.html