Bug 762489 (GLUSTER-757) - [NFS-Alpha] Crash in nfs3_call_state_wipe
Summary: [NFS-Alpha] Crash in nfs3_call_state_wipe
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: GLUSTER-757
Product: GlusterFS
Classification: Community
Component: nfs
Version: nfs-alpha
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Shehjar Tikoo
QA Contact:
URL:
Whiteboard:
Duplicates: GLUSTER-726 GLUSTER-731 GLUSTER-732 GLUSTER-738 GLUSTER-817
Depends On:
Blocks:
Reported: 2010-03-25 08:59 UTC by Anush Shetty
Modified: 2015-12-01 16:45 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:
Regression: RTP
Mount Type: nfs
Documentation: ---
CRM:
Verified Versions:
Embargoed:



Description Shehjar Tikoo 2010-03-25 06:52:10 UTC
How is this nfs over posix? I see performance translators loaded too.

Please confirm and, more importantly, build your glusterfs binaries with debug symbols using -g -O0. That's what makes backtraces useful.
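
A minimal sketch of such a debug build, assuming an autotools-style source tree with a ./configure script in the current directory. The DEBUG_CFLAGS variable name is only illustrative, and any other configure options a given setup needs are not shown.

```shell
#!/bin/sh
# -g emits debug symbols; -O0 disables optimization so that gdb does not
# report arguments as "<value optimized out>".
DEBUG_CFLAGS="-g -O0"

# Assumption: we are in the top of the source tree and ./configure exists.
if [ -x ./configure ]; then
    ./configure CFLAGS="$DEBUG_CFLAGS" && make
else
    echo "no ./configure here; run this from the source tree"
fi
```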

Comment 1 Anush Shetty 2010-03-25 08:59:58 UTC
I ran dd (dd if=/dev/zero of=ddfile bs=128K count=10000) on a simple posix + nfs setup and got this crash. Log files and the complete core are at /share/tickets/bugid/


bt:
#0  nfs3_call_state_wipe (cs=0x2aaaab0c13e0) at nfs3.c:195
#1  0x00002ac60833d4ed in nfs3svc_write_cbk (frame=<value optimized out>, cookie=<value optimized out>, this=<value optimized out>, 
    op_ret=<value optimized out>, op_errno=0, prebuf=0x7ffff970e980, postbuf=0x7ffff970e8f0) at nfs3.c:1634
#2  0x00002ac60832df7b in nfs_fop_writev_cbk (frame=0x2aaab0000c18, cookie=0xbf76760, this=0xbf76760, op_ret=65536, op_errno=0, prebuf=0x7ffff970e980, 
    postbuf=0x7ffff970e8f0) at nfs-fops.c:1104
#3  0x00002ac608119f6c in ra_writev_cbk (frame=0x2aaab0000c80, cookie=<value optimized out>, this=<value optimized out>, op_ret=65536, op_errno=0, 
    prebuf=0x7ffff970e980, postbuf=0x7ffff970e8f0) at read-ahead.c:635
#4  0x00002ac607f0f24d in qr_writev_cbk (frame=0x2aaab0000ce0, cookie=<value optimized out>, this=<value optimized out>, op_ret=65536, op_errno=0, 
    prebuf=0x7ffff970e980, postbuf=0x7ffff970e8f0) at quick-read.c:984
#5  0x00002ac607d04c8f in ioc_writev_cbk (frame=0x2aaab0000d40, cookie=<value optimized out>, this=<value optimized out>, op_ret=65536, op_errno=0, 
    prebuf=0x7ffff970e980, postbuf=0x7ffff970e8f0) at io-cache.c:1097
#6  0x00002ac607af7b8d in wb_writev_cbk (frame=0x2aaab0000e70, cookie=<value optimized out>, this=<value optimized out>, op_ret=65536, op_errno=0, 
    prebuf=0x7ffff970e980, postbuf=0x7ffff970e8f0) at write-behind.c:1832
#7  0x00002ac6078ebb69 in client_write_cbk (frame=0x2aaab0000f30, hdr=<value optimized out>, hdrlen=<value optimized out>, iobuf=<value optimized out>)
    at client-protocol.c:4455
#8  0x00002ac6078d80ba in protocol_client_pollin (this=0xbf73e60, trans=0xbf78c70) at client-protocol.c:6827
#9  0x00002ac6078df4d2 in notify (this=0x2aaaac08e400, event=2, data=0xbf78c70) at client-protocol.c:6946
#10 0x00002ac606e171c3 in ?? () from /opt/glusterfs/gnfs/lib/libglusterfs.so.0

Comment 2 Anand Avati 2010-04-13 14:01:52 UTC
PATCH: http://patches.gluster.com/patch/3136 in master (nfs3: Use nfs3state in call_state to avoid getting from rpc request)

Comment 3 Shehjar Tikoo 2010-04-21 03:00:00 UTC
*** Bug 726 has been marked as a duplicate of this bug. ***

Comment 4 Shehjar Tikoo 2010-04-21 03:00:17 UTC
*** Bug 731 has been marked as a duplicate of this bug. ***

Comment 5 Shehjar Tikoo 2010-04-21 03:00:25 UTC
*** Bug 732 has been marked as a duplicate of this bug. ***

Comment 6 Shehjar Tikoo 2010-04-21 03:00:47 UTC
*** Bug 738 has been marked as a duplicate of this bug. ***

Comment 7 Shehjar Tikoo 2010-04-21 03:02:07 UTC
*** Bug 817 has been marked as a duplicate of this bug. ***

Comment 8 Shehjar Tikoo 2010-05-31 08:14:25 UTC
Regression Test:
Going by the reports of the bugs marked as duplicates of this one, I think the simplest regression test for this bug is to write a 1 TB file using dd.

Test case:
1. Export a simple posix+io-threads through nfsx.

2. On the NFS client mount point run the following test command:
dd if=/dev/zero of=testfile bs=64k count=16777216

Result:
The nfs server process should not crash throughout the dd run.
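
As a quick check of the size this test writes, the sketch below multiplies the dd block size by the block count from the command above. The variable names are illustrative only.

```shell
#!/bin/sh
# dd treats "64k" as 64 * 1024 bytes.
bs=$((64 * 1024))
count=16777216

# Total bytes written if dd runs to completion.
total=$((bs * count))
gib=$((total / 1024 / 1024 / 1024))

echo "total bytes: $total"
echo "total GiB:   $gib"
```

So the run writes 2^40 bytes, i.e. 1024 GiB (1 TiB).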

Comment 9 Lakshmipathi G 2010-06-18 01:59:23 UTC
(In reply to comment #8)
> Regression Test:
> Going by other reports of the bugs which are duplicates of this, I think the
> simplest way to test this bug's regression is by writing a 1Tb file using dd.
> 
> Test case:
> 1. Export a simple posix+io-threads through nfsx.
> 
> 2. On the NFS client mount point run the following test command:
> dd if=/dev/zero of=testfile bs=64k count=16777216
> 
> Result:
> The nfs server process should not crash throughout the dd run.

While reproducing, we'll be testing for a server crash. If the server does crash, dd will normally hang, because NFS mounts are hard by default and the client keeps retrying.

In order to force the NFS kernel module to return an error after a timeout, use the following mount command:

mount <nfsserver>:<export> -o soft,intr <mountpoint>

This ensures that on a timeout, an EIO is returned to the application.
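
With a soft mount in place, the test can be wrapped so that a dd failure is reported explicitly. This is only a sketch: run_dd_check is a hypothetical helper name, and the default count is taken from the test case in comment #8.

```shell
#!/bin/sh
# run_dd_check TARGET [COUNT]
# Writes COUNT blocks of 64k to TARGET and reports whether dd succeeded.
# On a soft mount, a server crash should show up as a dd error after the
# client times out, rather than as an indefinite hang.
run_dd_check() {
    target=$1
    count=${2:-16777216}
    if dd if=/dev/zero of="$target" bs=64k count="$count"; then
        echo "PASS: dd completed"
        return 0
    else
        status=$?
        echo "FAIL: dd exited with status $status"
        return 1
    fi
}
```

Usage would be along the lines of: run_dd_check <mountpoint>/testfile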

Comment 10 Lakshmipathi G 2010-06-29 08:41:07 UTC
Without the patch (http://patches.gluster.com/patch/3136), creating a 300 GB file using dd results in a crash, but it is now fixed in nfs-beta-rc7.

