Description of problem:
======================
As part of verification of bz#1278284, I was trying an add-brick on an EC volume followed by a replace-brick. On doing that I observed "Too many levels of symbolic links" on the RHEL 6.7 client.

When we do an add-brick to a pure EC volume, then create a new dir and files under that new dir, and then do a replace-brick of any of the bricks (old set or new set), we see the above-mentioned error on the client. I observed this only on one of the clients (seen on RHEL 6.7). I was able to reproduce it consistently, and did not observe it when a replace-brick was issued on a pure EC volume without the prior add-brick.

Version-Release number of selected component (if applicable):
===================
server side:
[root@network bricks]# rpm -qa|grep gluster
glusterfs-client-xlators-3.7.9-1.el7rhgs.x86_64
glusterfs-server-3.7.9-1.el7rhgs.x86_64
python-gluster-3.7.5-19.el7rhgs.noarch
gluster-nagios-addons-0.2.5-1.el7rhgs.x86_64
vdsm-gluster-4.16.30-1.3.el7rhgs.noarch
glusterfs-3.7.9-1.el7rhgs.x86_64
glusterfs-api-3.7.9-1.el7rhgs.x86_64
glusterfs-cli-3.7.9-1.el7rhgs.x86_64
glusterfs-geo-replication-3.7.9-1.el7rhgs.x86_64
glusterfs-debuginfo-3.7.9-1.el7rhgs.x86_64
gluster-nagios-common-0.2.3-1.el7rhgs.noarch
glusterfs-libs-3.7.9-1.el7rhgs.x86_64
glusterfs-fuse-3.7.9-1.el7rhgs.x86_64
glusterfs-rdma-3.7.9-1.el7rhgs.x86_64
[root@network bricks]# cat /etc/redhat-*
cat: /etc/redhat-access-insights: Is a directory
Red Hat Enterprise Linux Server release 7.2 (Maipo)

client side:
[root@nchilaka-rhel6 67]# cat /etc/redhat-*
cat: /etc/redhat-access-insights: Is a directory
cat: /etc/redhat-lsb: Is a directory
Red Hat Enterprise Linux Server release 6.7 (Santiago)
(another client being used: RHEL 7.2)

Steps to Reproduce:
==================
1. Create a pure EC volume (or even distributed), say 1x(4+2).
2. Mount the volume on two clients (in my case one RHEL 7.2 and the other RHEL 6.7), both using different NFS servers.
3. Create some files and dirs from both mounts.
4. Add bricks, in this case (4+2), i.e. 6 bricks.
5. Create a new dir, and in that dir create a sub-dir for each client; pump in I/O from both clients into their respective dirs (use the dd command).
6. While I/O is in progress, replace any of the bricks; on the RHEL 6.7 client you then get "Too many levels of symbolic links".

Attached are sosreports.
By mistake I added "3.1.3" to the Fixed In Version field; I should be putting that in the internal whiteboard for tracking. Hence making the required changes.
Logs available @ rhsqe-repo.lab.eng.blr.redhat.com:/home/repo/sosreports/nchilaka/bug.1328451

I have also put a snippet of the log of the NFS server through which the volume was mounted on RHEL 6.7. There is also a log of the commands executed in the same location.
I was able to reproduce the issue (this time the client being 7.2).

Cluster nodes: 10.70.35.191/27/98/64/44/144
nfs clients:
  6.7: 10.70.42.106 (nfs server was 10.70.35.114)
  7.2: 10.70.35.103 (nfs server was 10.70.35.64)

Replaced the below brick:
gluster v replace-brick pure-ec 10.70.35.27:/rhs/brick4/pure-ec 10.70.35.27:/rhs/brick3/pure-ec commit force

Client-side dd output around the failure:
1024000 bytes (1.0 MB) copied, 0.081829 s, 12.5 MB/s
1000+0 records in
1000+0 records out
1024000 bytes (1.0 MB) copied, 0.0848642 s, 12.1 MB/s
1000+0 records in
1000+0 records out
1024000 bytes (1.0 MB) copied, 6.112 s, 168 kB/s
dd: failed to open ‘replace72.177’: Too many levels of symbolic links
1000+0 records in
1000+0 records out
1024000 bytes (1.0 MB) copied, 0.111717 s, 9.2 MB/s
1000+0 records in
This issue is happening with a plain distribute volume too. Somehow, NFS is propagating the wrong errno, ELOOP, instead of ENOTCONN, which gives this error message.
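For reference, the error string seen on the client maps directly to ELOOP. A quick Python check (illustrative only, not part of any fix; the message strings come from libc on Linux) shows the two errnos and their standard messages:

```python
import errno
import os

# ELOOP is what the client ends up reporting; ENOTCONN is the errno
# the underlying condition should have produced.
print(errno.ELOOP, os.strerror(errno.ELOOP))      # "Too many levels of symbolic links" on Linux
print(errno.ENOTCONN, os.strerror(errno.ENOTCONN))  # "Transport endpoint is not connected" on Linux
```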
The gluster-NFS server doesn't seem to be sending an ELOOP error at any point; it is not a standard NFSv3 error. Most probably this error is generated on the NFS client. One of the reasons could be that the NFS client detects a loop during readdir, as in bug 770250, i.e., if the directory cookie (dir_cookie) of a child directory is the same as any of the parent directory's cookies in the client page cache.

I am guessing that when the gluster-NFS server is restarted, a cookie generated and sent by the server post-restart could have collided with a value it had sent before. A packet trace collected from the client should help us here.

Also, please check if there is a log message like the below in the kernel logs on the client machine:

[root@skoduri n]# dmesg | grep readdir
[21615.224218] NFS: directory /n contains a readdir loop.Please contact your server vendor. The file: . has duplicate cookie 1457221137058955403
[root@skoduri n]#
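The duplicate-cookie check described above can be modeled with a small sketch (the cookie values are hypothetical; the real detection lives in the Linux NFS client's readdir page-cache code):

```python
def find_duplicate_cookie(cookies):
    """Return the first directory cookie seen twice, or None.

    Mirrors the loop-detection idea: while walking readdir results,
    a cookie that repeats means the client would revisit an entry,
    which the kernel reports as a readdir loop.
    """
    seen = set()
    for c in cookies:
        if c in seen:
            return c
        seen.add(c)
    return None

# Hypothetical cookie stream where the server reused a value after restart:
stream = [101, 102, 103, 102, 104]
print(find_duplicate_cookie(stream))  # -> 102
```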
Soumya,
This issue is happening with a plain distributed volume also; no need for an AFR or EC volume. Just create a distributed volume, mount it using NFS, and do I/O. Replace a brick as mentioned above and you can hit the issue. I could have provided all the info you asked for, but I think if you recreate it, you can check all the things from the nfs-ganesha point of view in real time.
Please provide a test case
After doing some experiments, here is what seems to have happened: the client throws this error when the server doesn't respond (correctly) to its LOOKUP request for a file. I ran the following test to confirm that.

Created a 1*2 volume:

Status of volume: vol1
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 192.168.122.201:/vol1                 49152     0          Y       4249
Brick 192.168.122.201:/vol1_r               49153     0          Y       4254
Self-heal Daemon on localhost               N/A       N/A        N       N/A

[root@skoduri mnt]# for i in `seq 1 100`; do mkdir dir.$i ; for j in `seq 1 100`; do dd if=/dev/urandom of=dir.$i/testfile.$i.$j bs=1k count=$j & done ;wait ; done

Now put a breakpoint in the nfs process in the lookup code path, "nfs_lookup()":

(gdb) b nfs_lookup
Breakpoint 1 at 0x7f603dae81a0: file nfs-generics.c, line 82.
(gdb) c
Continuing.
[Switching to Thread 0x7f603e48e700 (LWP 7737)]

Breakpoint 1, nfs_lookup (nfsx=0x7f6038019610, xl=0x7f603800ebc0, nfu=nfu@entry=0x7f603e48bae0, pathloc=pathloc@entry=0x7f60340b9d10, cbk=cbk@entry=0x7f603daf4030 <nfs3svc_lookup_cbk>, local=local@entry=0x7f60340b9860) at nfs-generics.c:82
warning: Source file is more recent than executable.
82              if ((!nfsx) || (!xl) || (!pathloc) || (!nfu))
(gdb) p *pathloc
$1 = {path = 0x7f60381d1d60 "/dir.4/testfile.4.10", name = 0x7f60381d1d67 "testfile.4.10", inode = 0x7f603515c9f4, parent = 0x7f603515bfac, gfid = '\000' <repeats 15 times>, pargfid = "[Z-\317$/LV\262\374\005\214\360Y\246g"}
(gdb)

Now restart the nfs process:

[root@dhcp35-197 glusterfs]# gluster v start vol1 force
volume start: vol1: success
[root@dhcp35-197 glusterfs]#

I could see the ELOOP error thrown:

[skoduri@skoduri ~]$ sudo tail -f /var/log/messages | grep nfs | grep loop
Jul 12 15:32:30 skoduri kernel: [149340.627189] VFS: Lookup of 'testfile.4.10' in nfs 0:57 would have caused loop

[root@skoduri skoduri]# tshark -r /tmp/nfs-ELOOP.pcap nfs.name == "testfile.4.10"
Running as user "root" and group "root". This could be dangerous.
11807  21.644349 192.168.122.1 -> 192.168.122.201 NFS 232 V3 LOOKUP Call, DH: 0xac64f17e/testfile.4.10
12032  41.336656 192.168.122.1 -> 192.168.122.201 NFS 14548 V3 WRITE Call, FH: 0x98e32213 Offset: 0 Len: 18432 FILE_SYNC
[root@skoduri skoduri]#

Note that, for some weird reason, the second WRITE call seems to be an amalgamation of two different packets (the first being the WRITE on FH 0x98e32213, followed by the lookup of 0xac64f17e/testfile.4.10). As can be seen, there is no reply sent to the LOOKUP on 0xac64f17e/testfile.4.10.

Will attach the packet trace and rpcdebug log messages.
Inputs from Niels -
>>>
The affected file is testfile.4.10. In Wireshark we can find the LOOKUP procedure with a filter like 'nfs.name == "testfile.4.10"'. This results in the LOOKUP Call, twice...
- frame 11807 is a plain LOOKUP
- frame 12032 is a WRITE Call, but it also carries the same LOOKUP again

The reply on the LOOKUP, found with "rpc.xid == 0xf4ef4e62", is in frame 12049. The contents of the LOOKUP Reply in the obj_attributes have the details from a directory, whereas testfile.4.10 really should be a file, or probably NFS3ERR_NOENT. The directory has the same 'fileid' (used as inode by the NFS-client) as the parent directory of testfile.4.10. This obviously is an incorrect reply. The fileid/inode from the LOOKUP Reply matches the one in the messages file too (12897189535768225383).

This looks like a bug in Gluster/NFS to me... Hooray?

If we can capture a tcpdump on the NFS-server, we can see the GlusterFS protocol too. This then makes it possible to dissect the filehandle into volume-id + GFID, which we can then match with the replies that the bricks send.
<<<
As suggested by Niels above, I have taken a packet trace on the server side (copied to the same machine mentioned above, "/root/bug1328451/nfs-ELOOP-server.pcap"):

568 152.036090 192.168.122.1   -> 192.168.122.201 NFS 228 V3 LOOKUP Call, DH: 0x84b243c9/testfile.1
569 152.036116 192.168.122.201 -> 192.168.122.1   TCP 68 2049→666 [ACK] Seq=1 Ack=161 Win=30080 Len=0 TSval=4495843 TSecr=10310504
570 152.036717 192.168.122.201 -> 192.168.122.201 GlusterFS 340 V330 LOOKUP Call, Filename: (nameless, by GFID)
571 152.037099 192.168.122.201 -> 192.168.122.201 GlusterFS 360 V330 LOOKUP Reply (Call In 570)
572 152.037126 192.168.122.201 -> 192.168.122.201 TCP 68 49145→49155 [ACK] Seq=2045 Ack=1549 Win=50176 Len=0 TSval=4495845 TSecr=4495845
573 152.037331 192.168.122.201 -> 192.168.122.201 GlusterFS 296 V330 LOOKUP Call, Filename: (nameless, by GFID)
574 152.037643 192.168.122.201 -> 192.168.122.201 GlusterFS 360 V330 LOOKUP Reply (Call In 573)
575 152.037845 192.168.122.201 -> 192.168.122.1   NFS 332 V3 LOOKUP Reply (Call In 568), FH: 0x84b243c9

From the packet trace, it looks like when the nfs server received the LOOKUP of 0x84b243c9/testfile.1, it performed a glusterfs nameless LOOKUP of the root gfid followed by the dir1 gfid, but there is no subsequent LOOKUP done on the testfile.
That's the reason we could see dir1 attributes in the NFS LOOKUP reply.

From gdb (/code), post nameless lookup of the dir1 gfid:

(gdb) bt
#0  nfs3_lookup_resume (carg=0x7fffe07b906c) at nfs3.c:1438
#1  0x00007fffea5358f4 in nfs3_fh_resolve_entry_hard (cs=cs@entry=0x7fffe07b906c) at nfs3-helpers.c:3789
#2  0x00007fffea535b19 in nfs3_fh_resolve_inode_lookup_cbk (frame=<optimized out>, cookie=<optimized out>, this=0x7fffe4019610, op_ret=<optimized out>, op_errno=<optimized out>, inode=<optimized out>, buf=0x7fffe82a6104, xattr=0x7ffff55581e4, postparent=0x7fffe82a6334) at nfs3-helpers.c:3714
#3  0x00007fffea50d611 in nfs_fop_lookup_cbk (frame=0x7ffff5db8a5c, cookie=0x7fffe4018650, this=<optimized out>, op_ret=<optimized out>, op_errno=<optimized out>, inode=<optimized out>, buf=0x7fffe82a6104, xattr=0x7ffff55581e4, postparent=0x7fffe82a6334) at nfs-fops.c:430
#4  0x00007ffff7e1a897 in io_stats_lookup_cbk (frame=0x7ffff5db8b30, cookie=<optimized out>, this=<optimized out>, op_ret=0, op_errno=117, inode=0x7fffe8f97128, buf=0x7fffe82a6104, xdata=0x7ffff55581e4, postparent=0x7fffe82a6334) at io-stats.c:2116
#5  0x00007fffea59a710 in dht_discover_complete (this=this@entry=0x7fffe4016a30, discover_frame=discover_frame@entry=0x7ffff5db87e0) at dht-common.c:383
#6  0x00007fffea59b38b in dht_discover_cbk (frame=0x7ffff5db87e0, cookie=<optimized out>, this=0x7fffe4016a30, op_ret=<optimized out>, op_errno=22, inode=0x7fffe8f97128, stbuf=0x7fffdaefda70, xattr=0x7ffff55581e4, postparent=0x7fffdaefdae0) at dht-common.c:512
#7  0x00007fffea6832dc in client3_3_lookup_cbk (req=<optimized out>, iov=<optimized out>, count=<optimized out>, myframe=0x7ffff5db8490) at client-rpc-fops.c:2937
#8  0x00007ffff7efde90 in rpc_clnt_handle_reply (clnt=clnt@entry=0x7fffe405b0f0, pollin=pollin@entry=0x7fffd4003aa0) at rpc-clnt.c:759
#9  0x00007ffff7efe16c in rpc_clnt_notify (trans=<optimized out>, mydata=0x7fffe405b120, event=<optimized out>, data=0x7fffd4003aa0) at rpc-clnt.c:920
#10 0x00007ffff7efa543 in rpc_transport_notify (this=this@entry=0x7fffe406ae30, event=event@entry=RPC_TRANSPORT_MSG_RECEIVED, data=data@entry=0x7fffd4003aa0) at rpc-transport.c:541
#11 0x00007ffff7e431a3 in socket_event_poll_in (this=this@entry=0x7fffe406ae30) at socket.c:2238
#12 0x00007ffff7e45b68 in socket_event_handler (fd=<optimized out>, idx=9, data=0x7fffe406ae30, poll_in=1, poll_out=0, poll_err=0) at socket.c:2351
#13 0x00007ffff7f8388a in event_dispatch_epoll_handler (event=0x7fffdaefdf10, event_pool=0x44ceb0) at event-epoll.c:571
#14 event_dispatch_epoll_worker (data=0x7fffe4041bf0) at event-epoll.c:674
#15 0x00007ffff72a260a in start_thread (arg=0x7fffdaefe700) at pthread_create.c:334
#16 0x00007ffff6b8fbbd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Stepping through nfs3_lookup_resume() (repeated line hops from the optimized build trimmed):

(gdb) n
1441            nfs_user_t nfu = {0, };
1443            struct nfs3_fh newfh = {{0},};
1445            if (!carg) {
1452            nfs3_check_fh_auth_status (cs, stat, _gf_false, nfs3err);
1453            nfs3_check_fh_resolve_status (cs, stat, nfs3err);
(gdb) p cs->resolventry
$3 = 0x7fffe4131a70 "testfile.1"
(gdb) p cs->resolvefh
$4 = {ident = ":OGL", exportid = "\205\376E\272\361+HI\241\256\060t\005\024\066\242", gfid = "\374K\332D\t\310Ho\263-\263\241%R\260\277", mountid = "\265B\202\327", '\000' <repeats 11 times>, padding = '\000' <repeats 11 times>}
(gdb) n
1454            cs->parent = cs->resolvefh;
1456            if (cs->hardresolved) {
(gdb) p cs->hardresolved
$5 = 1
(gdb) n
1457            stat = NFS3_OK;
1458            nfs3_fh_build_child_fh (&cs->parent, &cs->stbuf, &newfh);
1459            goto nfs3err;
1470            nfs3_log_common_res (rpcsvc_request_xid (cs->req),
1473            nfs3_lookup_reply (cs->req, stat, &newfh, &cs->stbuf,
1475            nfs3_call_state_wipe (cs);
(gdb)

In nfs3_fh_resolve_entry_hard():

int
nfs3_fh_resolve_entry_hard (nfs3_call_state_t *cs)
{
        int             ret = -EFAULT;
        nfs_user_t      nfu = {0, };

        if (!cs)
                return ret;

        nfs_loc_wipe (&cs->resolvedloc);
        nfs_user_root_create (&nfu);
        gf_msg_trace (GF_NFS3, 0, "FH hard resolution: gfid: %s "
                      ", entry: %s", uuid_utoa (cs->resolvefh.gfid),
                      cs->resolventry);

        ret = nfs_entry_loc_fill (cs->nfsx, cs->vol->itable,
                                  cs->resolvefh.gfid, cs->resolventry,
                                  &cs->resolvedloc, NFS_RESOLVE_CREATE);

>>>>> Since the entry is not present in the inode table, ret is '-2'.

        if (ret == -2) {
                gf_msg_trace (GF_NFS3, 0, "Entry needs lookup: %s",
                              cs->resolvedloc.path);
                /* If the NFS op is lookup, let the resume callback
                 * handle the sending of the lookup fop. Similarly,
                 * if the NFS op is create, let the create call
                 * go ahead in the resume callback so that an EEXIST gets
                 * handled at posix without an extra fop at this point.
                 */
                if (nfs3_lookup_op (cs) ||
                    (nfs3_create_op (cs) && !nfs3_create_exclusive_op (cs))) {
                        cs->lookuptype = GF_NFS3_FRESH;
                        cs->resolve_ret = 0;
                        nfs3_call_resume (cs);

>>>>> Since it is a LOOKUP fop, we call the resume_fn, nfs3_lookup_resume(),
>>>>> which responds with the stat of the pargfid ('dir1' in this case).

                } else {
                        cs->hardresolved = 1;
                        nfs_lookup (cs->nfsx, cs->vol, &nfu, &cs->resolvedloc,
                                    nfs3_fh_resolve_entry_lookup_cbk, cs);
                }
                ret = 0;

The fix seems to be either to fall through to the above else case for the LOOKUP fop, or to check for cs->resolventry and resolve it in "nfs3_lookup_resume" as well.
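To make the faulty control flow concrete, here is a small Python model of the resolution logic described above. This is only an illustration of the bug and the proposed fix, not the actual Gluster/NFS C code; the names `inode_table`, `resolve_entry`, and the fileid scheme are invented for the sketch:

```python
# Model: resolving "parent-fh/name" for an NFS3 LOOKUP.
# When the entry is missing from the inode table (the ret == -2 case),
# the buggy path resumed the LOOKUP immediately and replied with the
# *parent's* stat, so the client received a directory's fileid for a
# regular file. The fixed path performs a real lookup on the child first.

def resolve_entry(inode_table, parent, name, op, fixed=True):
    if (parent, name) in inode_table:
        return inode_table[(parent, name)]  # entry already cached
    # Entry needs lookup (ret == -2 in the C code).
    if op == "LOOKUP" and not fixed:
        # Bug: reply built from the parent's attributes.
        return inode_table[parent]
    # Fix: do a hard lookup on the child entry instead.
    child = {"name": name, "fileid": hash((parent, name)) & 0xFFFF}
    inode_table[(parent, name)] = child
    return child

table = {"dir1": {"name": "dir1", "fileid": 7}}
buggy = resolve_entry(dict(table), "dir1", "testfile.1", "LOOKUP", fixed=False)
good = resolve_entry(dict(table), "dir1", "testfile.1", "LOOKUP", fixed=True)
print(buggy["fileid"] == 7)  # True: the parent dir's fileid leaked into the reply
print(good["name"])          # testfile.1
```

In the buggy path the duplicate fileid is exactly what the NFS client's loop detection then trips over, producing ELOOP.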
Upstream mainline: http://review.gluster.org/14911
Upstream 3.8: http://review.gluster.org/14941

The fix is available in rhgs-3.2.0 as part of the rebase to GlusterFS 3.8.4.
QATP:
======
TC#1 (from my understanding of http://review.gluster.org/#/c/14941/):
1) Create a distributed volume with 2 bricks and mount the volume on two gNFS clients, say c1 and c2.
2) Create a dir, say dir1.
3) Now create files f{1..10}, and on the backend check which file hashes to which brick.
4) Delete all files.
5) Now from the backend create a file f1 on the right brick (the one it hashes to, as found in step 3).
6) Now from clients c1 and c2 do an "ls".
   Expected: ls must not return anything.
7) Now on c1 do "ls f1".
   Expected: ls must not report that the file is missing, or any other error; instead it should display f1, and the file must be visible on all clients post this op.

TC#2:
Same as steps 1, 2, 3, 4 from TC#1.
5) Now from the backend create a file f1 on the brick which it DOESN'T hash to.
6) Now from clients c1 and c2 do "ls f1".
   Expected: the lookup must pass, and the backend brick which the file was supposed to hash to must have a linkto file created.
7) Now delete f1 from a client.
8) Then do "ls f1".
   Expected: no EIO error must be seen.

TC#3:
====
1) Mount a distributed volume on gNFS clients c1 and c2.
2) Now from c1 create a file f1 under dirs a/b/c/d.
3) Now restart the gNFS servers using a volume force start.
4) Now from c2 do an "ls a/b/c/d/f1".
   Expected: no "too many levels of symbolic links" errors must be seen.

TC#4 (the case on which the bz is raised):
=====
1) Create a pure EC volume (or even distributed), say 1x(4+2).
2) Mount the volume on two clients (in my case one RHEL 7.2 and the other RHEL 6.7), both using different NFS servers.
3) Now create some files and dirs from both mounts.
4) Now add bricks, in this case (4+2), i.e. 6 bricks.
5) Now create a new dir, and in that dir create another dir for each client; pump in I/O from both clients into their respective dirs (use the dd command).
6) While I/O is in progress, replace any of the bricks.
   Expected: no symbolic link errors must be seen.

TC#5:
Just create a distributed volume, mount it using NFS, and do I/O. Replace a brick while I/O is happening.
Expected: no symbolic link errors must be seen.
Validation of QATP:
=================
TC#1 ==> passed (however, doing just an "ls" doesn't display the file, unlike a FUSE mount --> will be raising a bug for the same)
TC#2 ==> fail, as doing an "rm -rf" of that file then displays an input/output error on c2; will raise a bug
TC#3 ==> pass
TC#4 ==> pass
TC#5 ==> pass

As 3, 4, 5 passed, which are mainly the steps for the bug, moving to verified.

[root@dhcp35-37 ~]# rpm -qa|grep gluster
gluster-nagios-common-0.2.4-1.el7rhgs.noarch
glusterfs-server-3.8.4-3.el7rhgs.x86_64
glusterfs-ganesha-3.8.4-3.el7rhgs.x86_64
glusterfs-api-3.8.4-3.el7rhgs.x86_64
glusterfs-libs-3.8.4-3.el7rhgs.x86_64
glusterfs-client-xlators-3.8.4-3.el7rhgs.x86_64
nfs-ganesha-gluster-2.3.1-8.el7rhgs.x86_64
glusterfs-cli-3.8.4-3.el7rhgs.x86_64
python-gluster-3.8.4-3.el7rhgs.noarch
glusterfs-devel-3.8.4-3.el7rhgs.x86_64
glusterfs-events-3.8.4-3.el7rhgs.x86_64
glusterfs-geo-replication-3.8.4-3.el7rhgs.x86_64
glusterfs-fuse-3.8.4-3.el7rhgs.x86_64
glusterfs-api-devel-3.8.4-3.el7rhgs.x86_64
glusterfs-rdma-3.8.4-3.el7rhgs.x86_64
gluster-nagios-addons-0.2.7-1.el7rhgs.x86_64
glusterfs-3.8.4-3.el7rhgs.x86_64
1396355  stat of a deleted file throwing Input/output error
1396354  negative case: lookup using ls is not detecting a file created from backend (unlike fuse mount)
1396113  many file related inconsistencies with gnfs (probably due to caching and invalidating cache)

Raised the above bugs.
For the EC test case in Polarion (TC#4), refer to Polarion TC id RHG3-11788 - BZ#1328451 - after adding bricks and then issuing a replace brick, IO must pass without any errors like observing "Too many levels of symbolic links".
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHSA-2017-0486.html