Bug 762489 - (GLUSTER-757) [NFS-Alpha] Crash in nfs3_call_state_wipe
Status: CLOSED CURRENTRELEASE
Product: GlusterFS
Classification: Community
Component: nfs
Version: nfs-alpha
Hardware: All Linux
Priority: low
Severity: medium
Assigned To: Shehjar Tikoo
Duplicates: GLUSTER-726 GLUSTER-731 GLUSTER-732 GLUSTER-738 GLUSTER-817
Depends On:
Blocks:
Reported: 2010-03-25 04:59 EDT by Anush Shetty
Modified: 2015-12-01 11:45 EST

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: ---
Regression: RTP
Mount Type: nfs
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:


Attachments: None
Description Shehjar Tikoo 2010-03-25 02:52:10 EDT
How is this nfs over posix? I see performance translators loaded too.

Please confirm and, more importantly, build your glusterfs binaries with debug symbols using -g -O0. That's what makes backtraces useful.
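A rough sketch of the debug build requested here, assuming an autotools-style source tree with a ./configure script. The names DEBUG_CFLAGS and build_debug are made up for illustration and are not part of glusterfs.

```shell
# Compiler flags from the comment above: -g emits debug symbols,
# -O0 disables optimization so gdb can show real argument values
# instead of "<value optimized out>".
DEBUG_CFLAGS="-g -O0"

# Hypothetical helper, assuming an autotools-style tree with ./configure.
# Run it from the top of the source directory.
build_debug() {
    ./configure CFLAGS="$DEBUG_CFLAGS" && make
}
```

Calling build_debug from the source root would rebuild with symbols. Install steps are left out because they depend on the local setup.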
Comment 1 Anush Shetty 2010-03-25 04:59:58 EDT
I ran dd (dd if=/dev/zero of=ddfile bs=128K count=10000) on a simple posix + nfs setup and got this crash. Log files and the complete core are at /share/tickets/bugid/


bt:
#0  nfs3_call_state_wipe (cs=0x2aaaab0c13e0) at nfs3.c:195
#1  0x00002ac60833d4ed in nfs3svc_write_cbk (frame=<value optimized out>, cookie=<value optimized out>, this=<value optimized out>, 
    op_ret=<value optimized out>, op_errno=0, prebuf=0x7ffff970e980, postbuf=0x7ffff970e8f0) at nfs3.c:1634
#2  0x00002ac60832df7b in nfs_fop_writev_cbk (frame=0x2aaab0000c18, cookie=0xbf76760, this=0xbf76760, op_ret=65536, op_errno=0, prebuf=0x7ffff970e980, 
    postbuf=0x7ffff970e8f0) at nfs-fops.c:1104
#3  0x00002ac608119f6c in ra_writev_cbk (frame=0x2aaab0000c80, cookie=<value optimized out>, this=<value optimized out>, op_ret=65536, op_errno=0, 
    prebuf=0x7ffff970e980, postbuf=0x7ffff970e8f0) at read-ahead.c:635
#4  0x00002ac607f0f24d in qr_writev_cbk (frame=0x2aaab0000ce0, cookie=<value optimized out>, this=<value optimized out>, op_ret=65536, op_errno=0, 
    prebuf=0x7ffff970e980, postbuf=0x7ffff970e8f0) at quick-read.c:984
#5  0x00002ac607d04c8f in ioc_writev_cbk (frame=0x2aaab0000d40, cookie=<value optimized out>, this=<value optimized out>, op_ret=65536, op_errno=0, 
    prebuf=0x7ffff970e980, postbuf=0x7ffff970e8f0) at io-cache.c:1097
#6  0x00002ac607af7b8d in wb_writev_cbk (frame=0x2aaab0000e70, cookie=<value optimized out>, this=<value optimized out>, op_ret=65536, op_errno=0, 
    prebuf=0x7ffff970e980, postbuf=0x7ffff970e8f0) at write-behind.c:1832
#7  0x00002ac6078ebb69 in client_write_cbk (frame=0x2aaab0000f30, hdr=<value optimized out>, hdrlen=<value optimized out>, iobuf=<value optimized out>)
    at client-protocol.c:4455
#8  0x00002ac6078d80ba in protocol_client_pollin (this=0xbf73e60, trans=0xbf78c70) at client-protocol.c:6827
#9  0x00002ac6078df4d2 in notify (this=0x2aaaac08e400, event=2, data=0xbf78c70) at client-protocol.c:6946
#10 0x00002ac606e171c3 in ?? () from /opt/glusterfs/gnfs/lib/libglusterfs.so.0
Comment 2 Anand Avati 2010-04-13 10:01:52 EDT
PATCH: http://patches.gluster.com/patch/3136 in master (nfs3: Use nfs3state in call_state to avoid getting from rpc request)
Comment 3 Shehjar Tikoo 2010-04-20 23:00:00 EDT
*** Bug 726 has been marked as a duplicate of this bug. ***
Comment 4 Shehjar Tikoo 2010-04-20 23:00:17 EDT
*** Bug 731 has been marked as a duplicate of this bug. ***
Comment 5 Shehjar Tikoo 2010-04-20 23:00:25 EDT
*** Bug 732 has been marked as a duplicate of this bug. ***
Comment 6 Shehjar Tikoo 2010-04-20 23:00:47 EDT
*** Bug 738 has been marked as a duplicate of this bug. ***
Comment 7 Shehjar Tikoo 2010-04-20 23:02:07 EDT
*** Bug 817 has been marked as a duplicate of this bug. ***
Comment 8 Shehjar Tikoo 2010-05-31 04:14:25 EDT
Regression Test:
Going by other reports of the bugs which are duplicates of this, I think the simplest way to test this bug's regression is by writing a 1TB file using dd.

Test case:
1. Export a simple posix+io-threads through nfsx.

2. On the NFS client mount point run the following test command:
dd if=/dev/zero of=testfile bs=64k count=16777216

Result:
The nfs server process should not crash throughout the dd run.
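For reference, the size of the dd run above can be checked with shell arithmetic. The variable names below are illustrative only.

```shell
# Parameters taken from the dd command above.
bs_bytes=$((64 * 1024))      # bs=64k
count=16777216               # count=16777216

# Total bytes written if dd runs to completion.
total_bytes=$((bs_bytes * count))
total_gib=$((total_bytes / 1024 / 1024 / 1024))

echo "total: $total_bytes bytes ($total_gib GiB)"
```

This works out to 1024 GiB, i.e. the 1TB file mentioned in the test description.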
Comment 9 Lakshmipathi G 2010-06-17 21:59:23 EDT
(In reply to comment #8)
> Regression Test:
> Going by other reports of the bugs which are duplicates of this, I think the
> simplest way to test this bug's regression is by writing a 1TB file using dd.
> 
> Test case:
> 1. Export a simple posix+io-threads through nfsx.
> 
> 2. On the NFS client mount point run the following test command:
> dd if=/dev/zero of=testfile bs=64k count=16777216
> 
> Result:
> The nfs server process should not crash throughout the dd run.

While reproducing, we'll be testing for a server crash. If the server does crash, the dd under normal circumstances will hang.

In order to force the NFS kernel module to return an error after a timeout, use the following mount command:

mount <nfsserver>:<export> -o soft,intr <mountpoint>

This ensures that on a timeout, an EIO is returned to the application.
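With a soft mount, a server crash should show up as a failed write rather than a hang, so the dd exit status can be used as the pass/fail signal. A minimal sketch, assuming the test file lives on the NFS mount. The helper name check_write and the path in the usage comment are hypothetical.

```shell
# Hypothetical helper: write COUNT blocks of 64k to TARGET with dd and
# report whether the write succeeded. On a soft mount, a timed-out write
# returns EIO to dd, which then exits with a non-zero status.
check_write() {
    target=$1
    count=$2
    if dd if=/dev/zero of="$target" bs=64k count="$count" 2>/dev/null; then
        echo "write completed"
    else
        echo "write failed (possible server crash)"
        return 1
    fi
}

# Example usage (path is an assumption, not from this report):
# check_write /mnt/nfs/testfile 16777216
```

A failed write only indicates that something went wrong on the server side. The server logs and any core file still need to be checked to confirm it is this crash.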
Comment 10 Lakshmipathi G 2010-06-29 04:41:07 EDT
Without the patch (http://patches.gluster.com/patch/3136), creating a 300GB file using dd results in a crash, but it is now fixed in nfs-beta-rc7.
