Bug 1740968
Summary: | glustershd cannot decide healed_sinks and skips repair, so some entries linger in volume heal info | |
---|---|---|---
Product: | [Community] GlusterFS | Reporter: | zhou lin <zz.sh.cynthia>
Component: | replicate | Assignee: | Karthik U S <ksubrahm>
Status: | CLOSED NEXTRELEASE | QA Contact: | Nag Pavan Chilakam <nchilaka>
Severity: | high | Docs Contact: |
Priority: | unspecified | |
Version: | 4.1 | CC: | bugs, rhs-bugs, shujun.huang, storage-qa-internal
Target Milestone: | --- | Keywords: | Reopened
Target Release: | --- | |
Hardware: | Unspecified | |
OS: | Linux | |
Whiteboard: | | |
Fixed In Version: | | Doc Type: | If docs needed, set a value
Doc Text: | | Story Points: | ---
Clone Of: | | |
: | 1749322 (view as bug list) | Environment: |
Last Closed: | 2019-10-11 08:34:04 UTC | Type: | Bug
Regression: | --- | Mount Type: | ---
Documentation: | --- | CRM: |
Verified Versions: | | Category: | ---
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: |
Cloudforms Team: | --- | Target Upstream Version: |
Embargoed: | | |
Bug Depends On: | 1749322 | |
Bug Blocks: | | |
Attachments: | mn-0 node glustershd log + services brick glusterfsd log (attachment 1605200) | |
Description
zhou lin
2019-08-14 02:48:18 UTC
According to the gdb output, healed_sinks is not set to zero; rather, the values are optimized out, which is why you are not able to see what is set in it. Please attach the gluster logs from all the nodes to debug this further. Since the gluster version is 3.12.15, please file a new bug under https://bugzilla.redhat.com/enter_bug.cgi?product=GlusterFS with the logs attached. I'm closing this bug as this is a community project.

Each time we find this issue the changelogs look very alike: two nodes mutually accuse each other and the third node does not accuse anyone. On the node that shows this entry in "gluster volume heal info", the glustershd log looks like the following. It seems that each round of the scan does not complete; the shd drops out because it cannot find any healed_sinks, and according to afr-self-heal-entry.c, when the count of healed_sinks is zero it simply skips the subsequent repair steps.

```
[root@mn-0:/mnt/bricks/ccs/brick/.glusterfs/indices/xattrop]
# gluster v heal ccs info
Brick mn-0.local:/mnt/bricks/ccs/brick
/
Status: Connected
Number of entries: 1

Brick mn-1.local:/mnt/bricks/ccs/brick
Status: Connected
Number of entries: 0

Brick dbm-0.local:/mnt/bricks/ccs/brick
/
Status: Connected
Number of entries: 1
```

```
[root@mn-0:/root]
# getfattr -m . -d -e hex /mnt/bricks/ccs/brick/
getfattr: Removing leading '/' from absolute path names
# file: mnt/bricks/ccs/brick/
trusted.afr.ccs-client-1=0x000000000000000000000000
trusted.afr.ccs-client-2=0x000000000000000000000002
trusted.afr.dirty=0x000000000000000000000000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x000000010000000000000000ffffffff
trusted.glusterfs.volume-id=0x15301766900e4e9fb0b87c6b3f6e90f0

[root@mn-1:/home/robot]
# getfattr -m . -d -e hex /mnt/bricks/ccs/brick
getfattr: Removing leading '/' from absolute path names
# file: mnt/bricks/ccs/brick
trusted.afr.ccs-client-0=0x000000000000000000000000
trusted.afr.ccs-client-2=0x000000000000000000000000
trusted.afr.dirty=0x000000000000000000000000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x000000010000000000000000ffffffff
trusted.glusterfs.volume-id=0x15301766900e4e9fb0b87c6b3f6e90f0

[root@dbm-0:/root]
# getfattr -m . -d -e hex /mnt/bricks/ccs/brick/
getfattr: Removing leading '/' from absolute path names
# file: mnt/bricks/ccs/brick/
trusted.afr.ccs-client-0=0x000000000000000000000004
trusted.afr.ccs-client-1=0x000000000000000000000000
trusted.afr.dirty=0x000000000000000000000000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x000000010000000000000000ffffffff
trusted.glusterfs.volume-id=0x15301766900e4e9fb0b87c6b3f6e90f0
```

glustershd log on the affected node:

```
[2019-07-13 02:05:28.001785] I [MSGID: 108026] [afr-self-heald.c:432:afr_shd_index_heal] 0-ccs-replicate-0: got entry: 00000000-0000-0000-0000-000000000001 from ccs-client-0
[2019-07-13 02:05:28.002066] I [MSGID: 108026] [afr-self-heald.c:341:afr_shd_selfheal] 0-ccs-replicate-0: entry: path /, gfid: 00000000-0000-0000-0000-000000000001
[2019-07-13 02:06:29.001650] I [MSGID: 108026] [afr-self-heald.c:432:afr_shd_index_heal] 0-ccs-replicate-0: got entry: 00000000-0000-0000-0000-000000000001 from ccs-client-0
[2019-07-13 02:06:29.001986] I [MSGID: 108026] [afr-self-heald.c:341:afr_shd_selfheal] 0-ccs-replicate-0: entry: path /, gfid: 00000000-0000-0000-0000-000000000001
[2019-07-13 02:07:30.003468] I [MSGID: 108026] [afr-self-heald.c:432:afr_shd_index_heal] 0-ccs-replicate-0: got entry: 00000000-0000-0000-0000-000000000001 from ccs-client-0
[2019-07-13 02:07:30.004325] I [MSGID: 108026] [afr-self-heald.c:341:afr_shd_selfheal] 0-ccs-replicate-0: entry: path /, gfid: 00000000-0000-0000-0000-000000000001
[2019-07-13 02:08:31.001744] I [MSGID: 108026] [afr-self-heald.c:432:afr_shd_index_heal] 0-ccs-replicate-0: got entry: 00000000-0000-0000-0000-000000000001 from ccs-client-0
[2019-07-13 02:08:31.002067] I [MSGID: 108026] [afr-self-heald.c:341:afr_shd_selfheal] 0-ccs-replicate-0: entry: path /, gfid: 00000000-0000-0000-0000-000000000001
[2019-07-13 02:09:15.002043] I [MSGID: 108026] [afr-self-heald.c:432:afr_shd_index_heal] 0-encryptfile-replicate-0: got entry: 00000000-0000-0000-0000-000000000001 from encryptfile-client-0
[2019-07-13 02:09:15.002679] I [MSGID: 108026] [afr-self-heald.c:341:afr_shd_selfheal] 0-encryptfile-replicate-0: entry: path /, gfid: 00000000-0000-0000-0000-000000000001
[2019-07-13 02:09:32.001682] I [MSGID: 108026] [afr-self-heald.c:432:afr_shd_index_heal] 0-ccs-replicate-0: got entry: 00000000-0000-0000-0000-000000000001 from ccs-client-0
[2019-07-13 02:09:32.002015] I [MSGID: 108026] [afr-self-heald.c:341:afr_shd_selfheal] 0-ccs-replicate-0: entry: path /, gfid: 00000000-0000-0000-0000-000000000001
```

The volume for which heal info was given in the description and the one given in the last comment are different.
- How many volumes have this problem?
- Please provide the gluster logs to debug this further.
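For reference when reading the getfattr dumps above: each trusted.afr.<volume>-client-N value packs three big-endian 32-bit pending counters (data, metadata, entry) that the local brick holds against brick N. Below is a minimal standalone sketch of the decoding (illustrative only, not part of the glusterfs tree; the helper name is made up):

```c
/* Decode a trusted.afr.* changelog value such as 0x000000000000000000000002
 * into its three big-endian 32-bit pending counters: data, metadata, entry.
 * Standalone illustration only -- not glusterfs code. */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static void decode_afr_xattr(const char *hex)
{
    uint32_t pending[3] = {0, 0, 0}; /* data, metadata, entry */
    const char *p = hex;

    if (strncmp(p, "0x", 2) == 0)
        p += 2;
    for (int i = 0; i < 3; i++) {
        char buf[9] = {0};
        memcpy(buf, p + i * 8, 8); /* 8 hex digits per counter */
        pending[i] = (uint32_t)strtoul(buf, NULL, 16);
    }
    printf("%s -> data=%u metadata=%u entry=%u\n",
           hex, pending[0], pending[1], pending[2]);
}

int main(void)
{
    /* Values taken from the getfattr output above. */
    decode_afr_xattr("0x000000000000000000000002"); /* mn-0 blames dbm-0: 2 pending entry ops */
    decode_afr_xattr("0x000000000000000000000004"); /* dbm-0 blames mn-0: 4 pending entry ops */
    decode_afr_xattr("0x000000000000000000000000"); /* mn-1 blames nobody */
    return 0;
}
```

So mn-0 and dbm-0 each hold pending entry operations against the other for the root directory, while mn-1 blames nobody, which matches the "two nodes mutually accuse each other" pattern described above.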
Created attachment 1605200 [details]
mn-0 node glustershd log + services brick glusterfsd log
This issue does not happen very often, but if we keep running the reboot case there is a chance for it to occur; both the volume services and the volume ccs have hit it, and each time the changelog shows the same pattern: two storage nodes accuse each other and the third one blames nothing. I studied the glusterfs source code, and it seems that in this case __afr_selfheal_entry_prepare cannot decide the healed_sinks; even with the latest code I think the behaviour would be the same. I do not have the logs right now; maybe next time it happens I can collect them.

Hi Cynthia, appreciate your efforts on finding the root cause for this issue. Yes, you are right. In __afr_selfheal_entry_prepare() it is not setting the bricks which need heal as healed_sinks in this case. I reproduced this locally by setting the required xattrs and creating the gfid entries manually on the backend. Will work on the fix for this.

Thanks! Looking forward to your fix patch :)!

healed_sinks is empty because afr_selfheal_find_direction does not find any "sink". In that function, only a node accused by a source node can be decided as a sink; a node accused by a non-source will not be identified as a sink. Is this rule valid or not? Any reason?

```c
for (i = 0; i < priv->child_count; i++) {
    if (!sources[i]) /* ---> the accusation is not taken into account
                      *      when the accusing node is not a source */
        continue;
    if (self_accused[i])
        continue;
    for (j = 0; j < priv->child_count; j++) {
        if (matrix[i][j])
            sinks[j] = 1;
    }
}
```

(In reply to Hunang Shujun from comment #9)
> healed_sinks is empty because afr_selfheal_find_direction does not find any
> "sink". In that function, only a node accused by a source node can be decided
> as a sink; a node accused by a non-source will not be identified as a sink.
> Is this rule valid or not? Any reason?

This is valid code. Here we consider as sinks only those bricks which are blamed by the non-accused (source) bricks. Then in __afr_selfheal_entry_prepare() we intersect locked_on and sinks to populate healed_sinks. After that __afr_selfheal_entry_finalize_source() is called, which attempts to mark all the bricks which are not sources as healed_sinks:

```c
sources_count = AFR_COUNT(sources, priv->child_count);

if ((AFR_CMP(locked_on, healed_sinks, priv->child_count) == 0) ||
    !sources_count || afr_does_witness_exist(this, witness)) {
    /* -------> these conditions do not hold true in this case, so it
     * fails to mark the non-sources as sinks */
    memset(sources, 0, sizeof(*sources) * priv->child_count);
    afr_mark_active_sinks(this, sources, locked_on, healed_sinks);
    return -1;
}

source = afr_choose_source_by_policy(priv, sources, AFR_ENTRY_TRANSACTION);
return source;
```

We need to handle separately the case where a source is set but no brick is marked as a sink. Since this is happening for entry heal we cannot directly consider all the other bricks as sinks, which might lead to data loss, so the best way would be to do a conservative merge here. I will check whether this happens for the data and metadata heal cases as well (ideally it should not) and then send a patch to fix this.

Thanks for your detailed explanation. :)
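To make the failure mode concrete, here is a small standalone simulation (illustrative only, not glusterfs code, and deliberately simplified: it ignores self-accusation and witness handling) of the accusation pattern from the getfattr dumps in this report: brick 0 (mn-0) blames brick 2 (dbm-0), brick 2 blames brick 0, and brick 1 (mn-1) blames nobody. Applying the sink-marking rule quoted above yields exactly one source and an empty sink set, so healed_sinks stays empty and the heal is skipped:

```c
/* Simplified simulation of the source/sink decision for the accusation
 * pattern in this bug. Not glusterfs code. */
#include <stdio.h>

#define CHILD_COUNT 3

int main(void)
{
    /* matrix[i][j] != 0 means brick i holds pending entry ops against brick j.
     * From the getfattr output: mn-0 (0) blames dbm-0 (2), dbm-0 blames mn-0,
     * and mn-1 (1) blames nobody. */
    int matrix[CHILD_COUNT][CHILD_COUNT] = {
        {0, 0, 2},
        {0, 0, 0},
        {4, 0, 0},
    };
    int accused[CHILD_COUNT] = {0};
    int sources[CHILD_COUNT] = {0};
    int sinks[CHILD_COUNT] = {0};
    int i, j;

    /* A brick is accused if any other brick blames it. */
    for (i = 0; i < CHILD_COUNT; i++)
        for (j = 0; j < CHILD_COUNT; j++)
            if (i != j && matrix[i][j])
                accused[j] = 1;

    /* A brick that nobody accuses counts as a source. */
    for (i = 0; i < CHILD_COUNT; i++)
        sources[i] = !accused[i];

    /* Sink-marking rule quoted above: only blames made by a source mark a sink. */
    for (i = 0; i < CHILD_COUNT; i++) {
        if (!sources[i])
            continue;
        for (j = 0; j < CHILD_COUNT; j++)
            if (matrix[i][j])
                sinks[j] = 1;
    }

    for (i = 0; i < CHILD_COUNT; i++)
        printf("brick %d: source=%d sink=%d\n", i, sources[i], sinks[i]);
    /* Prints: only brick 1 is a source and no brick is a sink, so
     * healed_sinks ends up empty and the entry heal round bails out. */
    return 0;
}
```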
Hi, this bug is fixed in the latest master by the patch [1], which is tracked by BZ #1749322. I have done the backports to the other maintained branches, and the fix should be part of the next releases. Since the 4.1 branch is no longer maintained, I am closing this bug.

[1] https://review.gluster.org/#/c/glusterfs/+/23364/

Regards,
Karthik

good! thanks!
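As a rough illustration of the conservative-merge direction described in the discussion above (hypothetical sketch only; the actual change is in the patch linked above and may differ in detail), the finalize step could fall back like this when a source exists but no brick was marked as a sink:

```c
/* Hypothetical sketch of a "source set but no sinks" fallback: forget the
 * source and treat every locked brick as a sink, so that entry self-heal
 * performs a conservative merge instead of skipping the entry.
 * Illustration only -- see the linked patch for the real fix. */
#include <stdio.h>

#define CHILD_COUNT 3

static int finalize_source_sketch(unsigned char *sources,
                                  unsigned char *healed_sinks,
                                  const unsigned char *locked_on)
{
    int i, sources_count = 0, sinks_count = 0;

    for (i = 0; i < CHILD_COUNT; i++) {
        sources_count += sources[i];
        sinks_count += healed_sinks[i];
    }

    if (sources_count && !sinks_count) {
        for (i = 0; i < CHILD_COUNT; i++) {
            sources[i] = 0;                 /* no single source any more */
            healed_sinks[i] = locked_on[i]; /* merge between all locked bricks */
        }
        return -1; /* -1 signals "no source": do a conservative merge */
    }
    return 0;      /* normal source selection would continue here */
}

int main(void)
{
    /* State from this bug: brick 1 is the only source, nothing is a sink,
     * and all three bricks are locked for the entry heal. */
    unsigned char sources[CHILD_COUNT]      = {0, 1, 0};
    unsigned char healed_sinks[CHILD_COUNT] = {0, 0, 0};
    unsigned char locked_on[CHILD_COUNT]    = {1, 1, 1};

    int ret = finalize_source_sketch(sources, healed_sinks, locked_on);
    printf("ret=%d healed_sinks=%d,%d,%d\n", ret,
           healed_sinks[0], healed_sinks[1], healed_sinks[2]);
    return 0;
}
```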