762489 – (GLUSTER-757) [NFS-Alpha] Crash in nfs3_call_state_wipe

Summary: [NFS-Alpha] Crash in nfs3_call_state_wipe

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	GLUSTER-757
Product:	GlusterFS
Classification:	Community
Component:	nfs
Sub Component:
Version:	nfs-alpha
Hardware:	All
OS:	Linux
Priority:	low
Severity:	medium
Target Milestone:	---
Assignee:	Shehjar Tikoo
QA Contact:
Docs Contact:
URL:
Whiteboard:
Duplicates (5):	GLUSTER-726 GLUSTER-731 GLUSTER-732 GLUSTER-738 GLUSTER-817
Depends On:
Blocks:

Reported:	2010-03-25 08:59 UTC by Anush Shetty
Modified:	2015-12-01 16:45 UTC
CC List:	2 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:
Regression:	RTP
Mount Type:	nfs
Documentation:	---
CRM:
Verified Versions:
Embargoed:
Dependent Products:

Description Shehjar Tikoo 2010-03-25 06:52:10 UTC

How is this NFS over posix? I see performance translators loaded too.

Please confirm and, more importantly, build your glusterfs binaries with debug symbols using -g -O0. That's what makes backtraces useful.

Comment 1 Anush Shetty 2010-03-25 08:59:58 UTC

I ran dd (dd if=/dev/zero of=ddfile bs=128K count=10000) on a simple posix + nfs setup and got this crash. The log files and the complete core are at /share/tickets/bugid/


bt:
#0  nfs3_call_state_wipe (cs=0x2aaaab0c13e0) at nfs3.c:195
#1  0x00002ac60833d4ed in nfs3svc_write_cbk (frame=<value optimized out>, cookie=<value optimized out>, this=<value optimized out>, 
    op_ret=<value optimized out>, op_errno=0, prebuf=0x7ffff970e980, postbuf=0x7ffff970e8f0) at nfs3.c:1634
#2  0x00002ac60832df7b in nfs_fop_writev_cbk (frame=0x2aaab0000c18, cookie=0xbf76760, this=0xbf76760, op_ret=65536, op_errno=0, prebuf=0x7ffff970e980, 
    postbuf=0x7ffff970e8f0) at nfs-fops.c:1104
#3  0x00002ac608119f6c in ra_writev_cbk (frame=0x2aaab0000c80, cookie=<value optimized out>, this=<value optimized out>, op_ret=65536, op_errno=0, 
    prebuf=0x7ffff970e980, postbuf=0x7ffff970e8f0) at read-ahead.c:635
#4  0x00002ac607f0f24d in qr_writev_cbk (frame=0x2aaab0000ce0, cookie=<value optimized out>, this=<value optimized out>, op_ret=65536, op_errno=0, 
    prebuf=0x7ffff970e980, postbuf=0x7ffff970e8f0) at quick-read.c:984
#5  0x00002ac607d04c8f in ioc_writev_cbk (frame=0x2aaab0000d40, cookie=<value optimized out>, this=<value optimized out>, op_ret=65536, op_errno=0, 
    prebuf=0x7ffff970e980, postbuf=0x7ffff970e8f0) at io-cache.c:1097
#6  0x00002ac607af7b8d in wb_writev_cbk (frame=0x2aaab0000e70, cookie=<value optimized out>, this=<value optimized out>, op_ret=65536, op_errno=0, 
    prebuf=0x7ffff970e980, postbuf=0x7ffff970e8f0) at write-behind.c:1832
#7  0x00002ac6078ebb69 in client_write_cbk (frame=0x2aaab0000f30, hdr=<value optimized out>, hdrlen=<value optimized out>, iobuf=<value optimized out>)
    at client-protocol.c:4455
#8  0x00002ac6078d80ba in protocol_client_pollin (this=0xbf73e60, trans=0xbf78c70) at client-protocol.c:6827
#9  0x00002ac6078df4d2 in notify (this=0x2aaaac08e400, event=2, data=0xbf78c70) at client-protocol.c:6946
#10 0x00002ac606e171c3 in ?? () from /opt/glusterfs/gnfs/lib/libglusterfs.so.0

Comment 2 Anand Avati 2010-04-13 14:01:52 UTC

PATCH: http://patches.gluster.com/patch/3136 in master (nfs3: Use nfs3state in call_state to avoid getting from rpc request)

Comment 3 Shehjar Tikoo 2010-04-21 03:00:00 UTC

*** Bug 726 has been marked as a duplicate of this bug. ***

Comment 4 Shehjar Tikoo 2010-04-21 03:00:17 UTC

*** Bug 731 has been marked as a duplicate of this bug. ***

Comment 5 Shehjar Tikoo 2010-04-21 03:00:25 UTC

*** Bug 732 has been marked as a duplicate of this bug. ***

Comment 6 Shehjar Tikoo 2010-04-21 03:00:47 UTC

*** Bug 738 has been marked as a duplicate of this bug. ***

Comment 7 Shehjar Tikoo 2010-04-21 03:02:07 UTC

*** Bug 817 has been marked as a duplicate of this bug. ***

Comment 8 Shehjar Tikoo 2010-05-31 08:14:25 UTC

Regression Test:
Going by the other reports that are duplicates of this bug, I think the simplest regression test is to write a 1 TB file using dd.

Test case:
1. Export a simple posix+io-threads volume through nfsx.

2. On the NFS client mount point run the following test command:
dd if=/dev/zero of=testfile bs=64k count=16777216

Result:
The nfs server process should not crash throughout the dd run.
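
As a sanity check on the size, the dd arguments in the test case work out to exactly 1 TiB (2^40 bytes). A minimal shell sketch of that arithmetic follows; the function name dd_total_bytes is a hypothetical helper introduced here for illustration, not part of GlusterFS, and it assumes a shell with 64-bit arithmetic.

```shell
#!/bin/sh
# Hypothetical helper (not part of GlusterFS): total bytes written by dd
# for a block size given in KiB and a block count.
# Assumes the shell evaluates $(( )) with 64-bit integers.
dd_total_bytes() {
    bs_kib=$1
    count=$2
    echo $(( bs_kib * 1024 * count ))
}

# The test case uses bs=64k count=16777216.
dd_total_bytes 64 16777216   # prints 1099511627776
```

The same helper gives 1310720000 bytes (about 1.3 GB) for the bs=128K count=10000 run from the original report, so the regression test writes far more data than the run that first hit the crash.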

Comment 9 Lakshmipathi G 2010-06-18 01:59:23 UTC

(In reply to comment #8)
> Regression Test:
> Going by other reports of the bugs which are duplicates of this, I think the
> simplest way to test this bug's regression is by writing a 1Tb file using dd.
> 
> Test case:
> 1. Export a simple posix+io-threads through nfsx.
> 
> 2. On the NFS client mount point run the following test command:
> dd if=/dev/zero of=testfile bs=64k count=16777216
> 
> Result:
> The nfs server process should not crash throughout the dd run.

While reproducing, we'll be testing for a server crash. If the server does crash, the dd under normal circumstances will hang.

In order to force the NFS kernel module to return an error after a timeout, use the following mount command:

mount <nfsserver>:<export> -o soft,intr <mountpoint>

This ensures that on a timeout, an EIO is returned to the application.
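
With a soft mount in place, a server crash should show up as a failed write instead of a hang, so the dd exit status can be checked directly. The following is only a sketch under that assumption; run_dd_test is a hypothetical wrapper name, and the mount point in the usage comment is a placeholder.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of GlusterFS): run dd into a directory
# and report the outcome. On a soft mount, a server crash is expected to
# surface as a dd write error (nonzero exit) rather than an indefinite hang.
run_dd_test() {
    target_dir=$1
    bs=$2
    count=$3
    if dd if=/dev/zero of="$target_dir/testfile" bs="$bs" count="$count" 2>/dev/null; then
        echo "PASS"
    else
        echo "FAIL"
        return 1
    fi
}

# Usage, with <mountpoint> being the soft-mounted NFS export:
#   run_dd_test <mountpoint> 64k 16777216
```

The wrapper only reports whether dd completed; the NFS server log and any core file still have to be checked separately to confirm that a failure was caused by a crash.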

Comment 10 Lakshmipathi G 2010-06-29 08:41:07 UTC

Without the patch (http://patches.gluster.com/patch/3136), creating a 300GB file using dd results in a crash, but it is now fixed in nfs-beta-rc7.