Bug 1414456

Summary: [GSS] Entry heal pending for directories that have symlinks to a different replica set
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: Riyas Abdulrasak <rnalakka>
Component: quota Assignee: Sanoj Unnikrishnan <sunnikri>
Status: CLOSED ERRATA QA Contact: Vinayak Papnoi <vpapnoi>
Severity: medium Docs Contact:
Priority: medium    
Version: rhgs-3.1 CC: amukherj, bkunal, nchilaka, ravishankar, rcyriac, rhinduja, rhs-bugs, sheggodu, srmukher, storage-qa-internal, sunnikri
Target Milestone: ---   
Target Release: RHGS 3.4.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: rebase
Fixed In Version: glusterfs-3.12.2-1 Doc Type: Bug Fix
Doc Text:
The path ancestry was not populated accurately when a symbolic link file had multiple hard links to it, which left entry heals pending. This fix populates the ancestry correctly by handling the case of a symbolic link file with multiple hard links.
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-09-04 06:32:03 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1429198, 1429402, 1429405, 1436673    
Bug Blocks: 1408949, 1472361, 1503135    
Attachments:
Description                         Flags
Brick logs from the n6 server       none
glustershd.log from n7 server       none
tcpdump from the source server      none
gdb script to print ancestry        none

Description Riyas Abdulrasak 2017-01-18 14:39:19 UTC
Description of problem:

Entry heal is pending for directories that have symlinks to a different replica set. The customer noticed this after a rebalance failure.

~~~~
[2016-12-23 02:35:36.425400] I [MSGID: 109028] [dht-rebalance.c:3872:gf_defrag_status_get] 0-nfs-vol1-dht: Rebalance is completed. Time taken is 594617.00 secs
[2016-12-23 02:35:36.425418] I [MSGID: 109028] [dht-rebalance.c:3876:gf_defrag_status_get] 0-nfs-vol1-dht: Files migrated: 942363, size: 863115246048, lookups: 6538531, failures: 18981, skipped: 1281102
~~~~

* More than 1000 directories are shown as pending heal from n7-gluster1-qh2 to n6-gluster1-qh2

~~~~
<snip> from gluster v heal info
Brick n6-gluster1-qh2:/rhgs/bricks/brick1/bricksrv1
Status: Connected
Number of entries: 0

Brick n7-gluster1-qh2:/rhgs/bricks/brick1/bricksrv1
/6000-science/6040-RWJ44/space/cchung/cchung/ldg/gt4.0.7-all-source-installer/source-trees-thr/gsi/proxy/proxy_core/source/autom4te.cache 
/6000-science/6040-RWJ44/space/scarassou/anaconda/pkgs/openssl-1.0.1c-0/lib 
/6000-science/6040-RWJ44/space/cchung/cchung/tmp/fftw-3.0.1/.libs 
/6000-science/6040-RWJ44/space/cchung/cchung/ldg/gt4.0.7-all-source-installer/source-trees-thr/gsi/proxy/proxy_ssl/source/autom4te.cache 
/6000-science/6040-RWJ44/space/cchung/cchung/ldg/gt4.0.7-all-source-installer/source-trees-thr/gsi/proxy/proxy_ssl/source/doxygen 
/6000-science/6040-RWJ44/space/cchung/cchung/ldg/gt4.0.7-all-source-installer/source-trees-thr/gsi/sasl/gssplugins 
/6000-science/6040-RWJ44/space/scarassou/anaconda/pkgs/opencv-2.4.2-np17py27_1/lib 
[.....]

/6000-science/6040-RWJ44/space/cmagoulas/LAPTOP/Library/Application Support/iDVD/Installed Themes/iDVD 6/Travel-Main+.theme/Contents/Resources 
Status: Connected
Number of entries: 1027

Brick n8-gluster1-qh2:/rhgs/bricks/brick1/bricksrv1
Status: Connected
Number of entries: 0
~~~~


* The glustershd log shows messages like the ones below:

~~~~
[2016-12-31 16:51:15.007765] I [MSGID: 108026] [afr-self-heal-entry.c:589:afr_selfheal_entry_do] 0-nfs-vol1-replicate-3: performing entry selfheal on f1f3a846-f07d-49bb-999f-d3ab78568cce
[2016-12-31 16:51:15.011284] W [MSGID: 114031] [client-rpc-fops.c:2812:client3_3_link_cbk] 0-nfs-vol1-client-6: remote operation failed: (<gfid:f3c7f4ee-9db8-4126-b9f7-14de175c5f02> -> (null)) [Invalid argument]
~~~~

* The gfid in the above error points to a symlink that exists on "n7-gluster1-qh2:/rhgs/bricks/brick1/bricksrv1". The symlink (file) does not exist on its replica, but the gfid link does.

* The brick logs on n6-gluster1-qh2 show the errors below:

~~~~
[2017-01-08 16:13:56.011298] I [MSGID: 115062] [server-rpc-fops.c:1208:server_link_cbk] 0-nfs-vol1-server: 211804157: LINK <gfid:f3c7f4ee-9db8-4126-b9f7-14de175c5f02> (f3c7f4ee-9db8-4126-b9f7-14de175c5f02) -> f1f3a846-f07d-49bb-999f-d3ab78568cce/output.0 ==> (Invalid argument) [Invalid argument]
~~~~


Version-Release number of selected component (if applicable):

glusterfs-3.7.9-10.el7rhgs.x86_64


How reproducible:

Happened once for the customer. 

Actual results:

A large number of directories are shown as pending heal.

Expected results:

Engineering help is needed to resolve the pending heal issue.

Additional info:

Volume Name: nfs-vol1
Type: Distributed-Replicate
Volume ID: 3c0b3e98-ef93-4502-a0e4-63d5da5963f6
Status: Started
Number of Bricks: 10 x 2 = 20
Transport-type: tcp
Bricks:
Brick1: n0-gluster1-qh2:/rhgs/bricks/brick1/bricksrv1
Brick2: n1-gluster1-qh2:/rhgs/bricks/brick1/bricksrv1
Brick3: n2-gluster1-qh2:/rhgs/bricks/brick1/bricksrv1
Brick4: n3-gluster1-qh2:/rhgs/bricks/brick1/bricksrv1
Brick5: n4-gluster1-qh2:/rhgs/bricks/brick1/bricksrv1
Brick6: n5-gluster1-qh2:/rhgs/bricks/brick1/bricksrv1
Brick7: n6-gluster1-qh2:/rhgs/bricks/brick1/bricksrv1
Brick8: n7-gluster1-qh2:/rhgs/bricks/brick1/bricksrv1
Brick9: n8-gluster1-qh2:/rhgs/bricks/brick1/bricksrv1
Brick10: n9-gluster1-qh2:/rhgs/bricks/brick1/bricksrv1
Brick11: n10-gluster1-qh2:/rhgs/bricks/brick1/bricksrv1
Brick12: n11-gluster1-qh2:/rhgs/bricks/brick1/bricksrv1
Brick13: n10-gluster1-qh2:/rhgs/bricks/brick4/bricksrv4
Brick14: n11-gluster1-qh2:/rhgs/bricks/brick4/bricksrv4
Brick15: n8-gluster1-qh2:/rhgs/bricks/brick4/bricksrv4
Brick16: n9-gluster1-qh2:/rhgs/bricks/brick4/bricksrv4
Brick17: n6-gluster1-qh2:/rhgs/bricks/brick4/bricksrv4
Brick18: n7-gluster1-qh2:/rhgs/bricks/brick4/bricksrv4
Brick19: n4-gluster1-qh2:/rhgs/bricks/brick4/bricksrv4
Brick20: n5-gluster1-qh2:/rhgs/bricks/brick4/bricksrv4
Options Reconfigured:
diagnostics.client-log-level: INFO
cluster.quorum-type: auto
cluster.server-quorum-type: server
performance.readdir-ahead: on
performance.cache-size: 1GB
features.cache-invalidation: off
ganesha.enable: on
nfs.disable: on
performance.read-ahead-page-count: 8
cluster.read-hash-mode: 2
client.event-threads: 4
server.event-threads: 4
server.outstanding-rpc-limit: 256
performance.io-thread-count: 64
network.ping-timeout: 42
features.uss: disable
features.barrier: disable
features.quota: on
features.inode-quota: on
features.quota-deem-statfs: on
cluster.self-heal-daemon: on
nfs.outstanding-rpc-limit: 16
diagnostics.brick-log-level: INFO
nfs-ganesha: enable
cluster.enable-shared-storage: enable
snap-activate-on-create: enable
auto-delete: enable
cluster.server-quorum-ratio: 51%

Comment 2 Riyas Abdulrasak 2017-01-18 15:09:40 UTC
Created attachment 1242190 [details]
Brick logs from the n6 server

Comment 3 Riyas Abdulrasak 2017-01-18 15:13:12 UTC
Created attachment 1242197 [details]
glustershd.log from n7 server

Comment 7 Riyas Abdulrasak 2017-01-27 06:53:40 UTC
Created attachment 1245019 [details]
tcpdump from the source server

Comment 22 Sanoj Unnikrishnan 2017-04-06 06:18:42 UTC
Created attachment 1269167 [details]
gdb script to print ancestry

I was not able to do this with SystemTap, so I am attaching a gdb script for the same purpose.

The script prints the dentry list and the gfid on which the inode->parent call failed.
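
As a purely illustrative aside, here is a minimal C sketch of the kind of walk such a script performs: follow each inode's dentry to its parent, print one "name<GFID:...>" line per step, and report the gfid at which the parent lookup stops. The sk_inode_t/sk_dentry_t types and print_ancestry() below are hypothetical stand-ins, not the attached gdb script and not the real GlusterFS structures.

~~~~
/* Hypothetical sketch only: simplified stand-ins for the inode/dentry
 * structures, not the attached gdb script or real GlusterFS code. */
#include <stdio.h>

typedef struct sk_inode sk_inode_t;

typedef struct sk_dentry {
    const char *name;    /* basename under which the inode is linked  */
    sk_inode_t *parent;  /* parent directory inode, NULL if unlinked  */
} sk_dentry_t;

struct sk_inode {
    const char  *gfid;   /* GFID rendered as a hex string             */
    sk_dentry_t *dentry; /* first dentry of this inode, NULL if none  */
};

/* Walk from an inode towards the root, printing one "name<GFID:...>"
 * line per step, and report where the parent lookup stops. */
static void print_ancestry(const sk_inode_t *inode)
{
    while (inode) {
        const sk_dentry_t *d = inode->dentry;
        printf(" --> [%p]%s<GFID:%s>\n",
               (const void *)inode, d ? d->name : "(no dentry)", inode->gfid);
        if (!d || !d->parent) {
            printf("inode->parent lookup stopped at <GFID:%s>\n", inode->gfid);
            return;
        }
        inode = d->parent;
    }
}

int main(void)
{
    /* Example chain using gfids from this bug: sym3 -> 2 -> /
     * (the root's dentry has no parent, so the walk stops there). */
    sk_inode_t root = { "00000000000000000000000000000001", NULL };
    sk_inode_t dir2 = { "04b5d81439b045cf9824d1f2adadd4ef", NULL };
    sk_inode_t sym3 = { "1c613a545ed341f7853ffe2f184fd783", NULL };
    sk_dentry_t d_root = { "/",    NULL  };
    sk_dentry_t d_dir2 = { "2",    &root };
    sk_dentry_t d_sym3 = { "sym3", &dir2 };

    root.dentry = &d_root;
    dir2.dentry = &d_dir2;
    sym3.dentry = &d_sym3;

    print_ancestry(&sym3);
    return 0;
}
~~~~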

Comment 25 Sanoj Unnikrishnan 2017-04-07 05:15:34 UTC
An update on the RCA of the issue.
The issue was seen when a hard link is attempted to a symlink file.

Attempting the same scenario with the above script:
sym1 is a symlink under "/1", and sym3, sym4, sym9, sym11, ... were hard links created for sym1 under "/2"

[0x7eff987a5a40]
 --> [0x7effa00f8a30]/<GFID:00000000000000000000000000000001>
 --> [0x7effa00f2020]2<GFID:04b5d81439b045cf9824d1f2adadd4ef>
 --> [0x7effa00f03b0]sym3<GFID:1c613a545ed341f7853ffe2f184fd783>
 --> [0x7effa00e7de0]sym4<GFID:1c613a545ed341f7853ffe2f184fd783>
 --> [0x7effa00ad1e0]sym9<GFID:1c613a545ed341f7853ffe2f184fd783>
 --> [0x7effa00f2e40]sym11<GFID:1c613a545ed341f7853ffe2f184fd783>
 --> [0x7effa00d9550]sym12<GFID:1c613a545ed341f7853ffe2f184fd783>
 --> [0x7effa0003ab0]sym13<GFID:1c613a545ed341f7853ffe2f184fd783>
 --> [0x7effa00d8fb0]sym17<GFID:1c613a545ed341f7853ffe2f184fd783>
 --> [0x7effa00f1250]/<GFID:00000000000000000000000000000001>
 --> [0x7effa00f1350]sym1<GFID:1c613a545ed341f7853ffe2f184fd783>
 --> [0x7effa0001990]sym2<GFID:1c613a545ed341f7853ffe2f184fd783>

The quota_build_ancestry_cbk callback expects successive entries to be ancestors along the path and attempts to link them. So, in the above case, we attempt to link sym4 to sym3, sym9 to sym4, and so on.

In the inode_link code we have,
...
                if (parent->ia_type != IA_IFDIR) {
                        GF_ASSERT (!"link attempted on non-directory parent");
                        return NULL;
                }
...
So the parent is not linked. 
This seems to be an issue that needs to be fixed.
However, this must have errored out in the quota_build_ancestry_cbk code and not reached the statement where "parent is null" is logged. Looking further into this.
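
To make the failure mode concrete, here is a minimal, self-contained sketch of what happens when the ancestry list returned for a hard-linked symlink contains sibling names instead of true directory ancestors. The entry_t type and the link_entry() helper are hypothetical stand-ins, not the real quota_build_ancestry_cbk or inode_link code; link_entry() only mirrors the non-directory-parent check quoted above.

~~~~
/* Minimal sketch only: entry_t and link_entry() are illustrative
 * stand-ins, not the real GlusterFS inode_t or inode_link(). */
#include <stdio.h>

typedef enum { IA_IFDIR, IA_IFLNK } ia_type_t;

typedef struct entry {
    const char   *name;
    ia_type_t     type;
    struct entry *next;   /* next entry in the ancestry list */
} entry_t;

/* Mirrors the check quoted above from inode_link(): linking a child
 * under a non-directory parent is rejected. */
static int link_entry(entry_t *child, entry_t *parent)
{
    if (parent->type != IA_IFDIR) {
        fprintf(stderr, "link of %s under non-directory %s rejected\n",
                child->name, parent->name);
        return -1;   /* the parent is not linked */
    }
    printf("linked %s under %s\n", child->name, parent->name);
    return 0;
}

int main(void)
{
    /* Ancestry list as reported by the gdb script: "/", "2", and then
     * the sibling hard links sym3, sym4, ... instead of true ancestors. */
    entry_t sym4 = { "sym4", IA_IFLNK, NULL  };
    entry_t sym3 = { "sym3", IA_IFLNK, &sym4 };
    entry_t dir2 = { "2",    IA_IFDIR, &sym3 };
    entry_t root = { "/",    IA_IFDIR, &dir2 };

    /* The callback treats each entry as the parent of the next one, so
     * "/" and "2" link fine, but sym4 is then linked under sym3 and the
     * non-directory check fires. */
    for (entry_t *parent = &root; parent->next != NULL; parent = parent->next)
        link_entry(parent->next, parent);

    return 0;
}
~~~~

Running the sketch links "/" and "2" normally and rejects everything from sym4 onward, which is the same non-directory-parent condition the quoted inode_link check enforces.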

Comment 30 Sanoj Unnikrishnan 2017-08-04 09:55:17 UTC
The issue is solved by https://review.gluster.org/#/c/17730/, which has been merged upstream.

Comment 34 Vinayak Papnoi 2018-04-19 10:02:52 UTC
Build Number : glusterfs-3.12.2-7.el7rhgs.x86_64

With quota enabled on a distribute-replicate volume, volume heal is successful when files have symlinks and those symlinks have hard links.

Hence, moving the bug to VERIFIED.

Comment 36 errata-xmlrpc 2018-09-04 06:32:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2607

Comment 37 Red Hat Bugzilla 2023-09-14 03:52:26 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days