1177927 – [AFR] getfattr on fuse mount gives error : Software caused connection abort

Summary: [AFR] getfattr on fuse mount gives error : Software caused connection abort

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Gluster Storage
Classification:	Red Hat Storage
Component:	replicate
Sub Component:
Version:	rhgs-3.0
Hardware:	x86_64
OS:	Linux
Priority:	urgent
Severity:	urgent
Target Milestone:	---
Target Release:	RHGS 3.0.3
Assignee:	Krutika Dhananjay
QA Contact:	Anil Shah
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	1162694 1178079 1180073

Reported:	2014-12-31 11:30 UTC by Anil Shah
Modified:	2016-09-17 12:20 UTC
CC List:	7 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Clones:	1178079
Environment:
Last Closed:	2015-01-15 13:43:17 UTC
Embargoed:
Dependent Products:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2015:0038	0	normal	SHIPPED_LIVE	Red Hat Storage 3.0 enhancement and bug fix update #3	2015-01-15 18:35:28 UTC

Description Anil Shah 2014-12-31 11:30:31 UTC

Description of problem:

Executing getfattr on a file on the FUSE mount fails with "Software caused connection abort", and a subsequent ls fails with "Transport endpoint is not connected".

Version-Release number of selected component (if applicable):

[root@node1 b1]# rpm -qa | grep glusterfs
glusterfs-rdma-3.6.0.40-1.el6rhs.x86_64
glusterfs-3.6.0.40-1.el6rhs.x86_64
glusterfs-libs-3.6.0.40-1.el6rhs.x86_64
glusterfs-geo-replication-3.6.0.40-1.el6rhs.x86_64
samba-glusterfs-3.6.509-169.4.el6rhs.x86_64
glusterfs-api-3.6.0.40-1.el6rhs.x86_64
glusterfs-fuse-3.6.0.40-1.el6rhs.x86_64
glusterfs-server-3.6.0.40-1.el6rhs.x86_64
glusterfs-cli-3.6.0.40-1.el6rhs.x86_64

How reproducible:

100%

Steps to Reproduce:

1. Create a 2 x 2 distributed-replicate volume.
2. FUSE-mount the volume.
3. Set the volume options 'metadata-self-heal', 'entry-self-heal' and 'data-self-heal' to "off".
4. Set self-heal-daemon to off.
5. Create a file on the mount point.
6. Run getfattr on the file.
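
The steps above can be sketched as a shell script. This is an illustration, not the exact command history used by the reporter. The brick addresses, volume name, option names and file name are taken from the "gluster v info" output and the "Actual results" section of this report. The mount point /mnt/glusterfs is an assumption inferred from the client log file name. The features.uss step is not in the original step list; it is included because the volume info shows it enabled and the comments identify it as the trigger. The helper names run and repro_steps are hypothetical. By default the script only prints the commands; set RUN=1 to execute them on a disposable test cluster.

```shell
#!/bin/sh
# Reproduction sketch for this bug. Dry run by default: commands are printed,
# and executed only when RUN=1 is set in the environment.

VOL=testvol
MNT=/mnt/glusterfs   # assumed mount point, inferred from the client log name
BRICKS="10.70.47.143:/rhs/brick1/b1 10.70.47.145:/rhs/brick1/b2 10.70.47.150:/rhs/brick1/b3 10.70.47.151:/rhs/brick1/b4"

# Print a command; execute it only when RUN=1.
run() {
    printf '%s\n' "+ $*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

repro_steps() {
    # Create and start a 2 x 2 distributed-replicate volume.
    # $BRICKS is intentionally unquoted so it splits into four arguments.
    run gluster volume create "$VOL" replica 2 $BRICKS
    run gluster volume start "$VOL"

    # FUSE-mount the volume on the client.
    run mount -t glusterfs "10.70.47.143:/$VOL" "$MNT"

    # Turn off client-side self-heal and the self-heal daemon.
    for opt in metadata-self-heal entry-self-heal data-self-heal; do
        run gluster volume set "$VOL" "cluster.$opt" off
    done
    run gluster volume set "$VOL" cluster.self-heal-daemon off

    # Enable uss (shown as "features.uss: on" in the volume info).
    run gluster volume set "$VOL" features.uss on

    # Create a file and query its extended attributes.
    run touch "$MNT/testfile"
    run getfattr -d -m . -e hex "$MNT/testfile"
}

repro_steps
```

On an affected build, the final getfattr is the command expected to fail with "Software caused connection abort".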

Actual results:

[root@client glusterfs]# getfattr -d -m . -e hex testfile 
getfattr: testfile: Software caused connection abort
[root@client glusterfs]# ll
ls: cannot open directory .: Transport endpoint is not connected
================================================
Logs from /var/log/glusterfs/mnt-glusterfs-.log:

package-string: glusterfs 3.6.0.40
/usr/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xb6)[0x38b3620106]
/usr/lib64/libglusterfs.so.0(gf_print_trace+0x33f)[0x38b363ad5f]
/lib64/libc.so.6[0x352d2326a0]
/lib64/libc.so.6[0x352d28aab6]
/lib64/libc.so.6[0x352d275710]
/lib64/libc.so.6(vsscanf+0x65)[0x352d2696b5]
/lib64/libc.so.6(_IO_sscanf+0x88)[0x352d263728]
/usr/lib64/glusterfs/3.6.0.40/xlator/features/snapview-client.so(svc_getxattr+0xdf)[0x7fa7cddd1e8f]
/usr/lib64/glusterfs/3.6.0.40/xlator/debug/io-stats.so(io_stats_getxattr+0x167)[0x7fa7cdbb3707]
/usr/lib64/libglusterfs.so.0(default_getxattr+0x7b)[0x38b3625a8b]
/usr/lib64/glusterfs/3.6.0.40/xlator/mount/fuse.so(fuse_listxattr_resume+0x4c1)[0x7fa7d1527061]
/usr/lib64/glusterfs/3.6.0.40/xlator/mount/fuse.so(+0x88a6)[0x7fa7d15218a6]
/usr/lib64/glusterfs/3.6.0.40/xlator/mount/fuse.so(+0x85d6)[0x7fa7d15215d6]
/usr/lib64/glusterfs/3.6.0.40/xlator/mount/fuse.so(+0x88ee)[0x7fa7d15218ee]
/usr/lib64/glusterfs/3.6.0.40/xlator/mount/fuse.so(fuse_resolve_continue+0x41)[0x7fa7d1521971]
/usr/lib64/glusterfs/3.6.0.40/xlator/mount/fuse.so(fuse_resolve_gfid_cbk+0x1c1)[0x7fa7d1521c41]
/usr/lib64/glusterfs/3.6.0.40/xlator/debug/io-stats.so(io_stats_lookup_cbk+0x113)[0x7fa7cdbbde73]
/usr/lib64/glusterfs/3.6.0.40/xlator/features/snapview-client.so(svc_lookup_cbk+0x218)[0x7fa7cddd3d48]
/usr/lib64/glusterfs/3.6.0.40/xlator/performance/md-cache.so(mdc_lookup_cbk+0x14c)[0x7fa7cdfdf7cc]
/usr/lib64/glusterfs/3.6.0.40/xlator/cluster/distribute.so(dht_discover_complete+0x173)[0x7fa7ce41aa53]
/usr/lib64/glusterfs/3.6.0.40/xlator/cluster/distribute.so(dht_discover_cbk+0x273)[0x7fa7ce4226c3]
/usr/lib64/glusterfs/3.6.0.40/xlator/cluster/replicate.so(afr_lookup_cbk+0x558)[0x7fa7ce6a1e08]
/usr/lib64/glusterfs/3.6.0.40/xlator/protocol/client.so(client3_3_lookup_cbk+0x647)[0x7fa7ce8e1307]
/usr/lib64/libgfrpc.so.0(rpc_clnt_handle_reply+0xa5)[0x38b320e775]
/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x142)[0x38b320fc02]
/usr/lib64/libgfrpc.so.0(rpc_transport_notify+0x28)[0x38b320b3e8]
/usr/lib64/glusterfs/3.6.0.40/rpc-transport/socket.so(+0x92ed)[0x7fa7c7df62ed]
/usr/lib64/glusterfs/3.6.0.40/rpc-transport/socket.so(+0xaced)[0x7fa7c7df7ced]
/usr/lib64/libglusterfs.so.0[0x38b3676be7]
/usr/sbin/glusterfs(main+0x603)[0x407e93]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x352d21ed5d]
/usr/sbin/glusterfs[0x4049a9]
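
To read a trace like the one above more quickly, the resolved symbol names can be pulled out of each frame. The following is a small sketch; the function name trace_symbols is hypothetical, and it assumes frames have the "library(symbol+0xoffset)[address]" shape seen in this log. Frames without a symbol name are skipped.

```shell
#!/bin/sh
# Print the symbol name of every resolved frame in a glusterfs backtrace.
# Reads the named files, or standard input when no file is given.
trace_symbols() {
    sed -n 's/^.*(\([A-Za-z_][A-Za-z0-9_]*\)+0x[0-9a-f]*).*$/\1/p' "$@"
}

# Example using three frames copied from the trace in this report.
trace_symbols <<'EOF'
/lib64/libc.so.6(vsscanf+0x65)[0x352d2696b5]
/lib64/libc.so.6(_IO_sscanf+0x88)[0x352d263728]
/usr/lib64/glusterfs/3.6.0.40/xlator/features/snapview-client.so(svc_getxattr+0xdf)[0x7fa7cddd1e8f]
EOF
# prints:
# vsscanf
# _IO_sscanf
# svc_getxattr
```

Read this way, the frames nearest the crash show libc's sscanf being called from svc_getxattr in the snapview-client xlator, which is consistent with the later comments that the crash occurs when uss is enabled.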




Expected results:

getfattr on the client should succeed.

Additional info:

[root@node1 b1]# gluster v info
 
Volume Name: testvol
Type: Distributed-Replicate
Volume ID: a6941fe7-ecc0-4b45-91c4-73fd1e37795f
Status: Started
Snap Volume: no
Number of Bricks: 2 x 2 = 4
Transport-type: tcp
Bricks:
Brick1: 10.70.47.143:/rhs/brick1/b1
Brick2: 10.70.47.145:/rhs/brick1/b2
Brick3: 10.70.47.150:/rhs/brick1/b3
Brick4: 10.70.47.151:/rhs/brick1/b4
Options Reconfigured:
features.barrier: disable
cluster.self-heal-daemon: off
features.uss: on
performance.open-behind: off
performance.quick-read: off
performance.io-cache: off
performance.read-ahead: off
performance.write-behind: off
cluster.entry-self-heal: off
cluster.data-self-heal: off
cluster.metadata-self-heal: off
performance.readdir-ahead: on
auto-delete: disable
snap-max-soft-limit: 90
snap-max-hard-limit: 256

Comment 2 Sachidananda Urs 2014-12-31 12:36:59 UTC

Enabling uss and running getfattr on a file causes the crash.

Comment 6 Anil Shah 2015-01-02 09:05:28 UTC

The behaviour occurs when uss is turned on.
Yes, this is a regression: I checked, and the behaviour also occurs on build .38.

Comment 7 senaik 2015-01-02 09:46:39 UTC

It worked fine in the earlier build, glusterfs 3.6.0.35.

Comment 8 Krutika Dhananjay 2015-01-02 10:34:35 UTC

Patch merged

Comment 10 Sachidananda Urs 2015-01-02 19:13:05 UTC

Verified on the release 3.6.0.41. Thanks to Seema Naik <senaik> and Anil Shah <ashah>

Comment 12 errata-xmlrpc 2015-01-15 13:43:17 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-0038.html
