Bug 763901 - (GLUSTER-2169) NFS crash in nfs-fops due to failed fop from subvolume
Status: CLOSED CURRENTRELEASE
Product: GlusterFS
Classification: Community
Component: nfs
Version: mainline
Hardware: All Linux
Priority: low
Severity: medium
Assigned To: Shehjar Tikoo
Reported: 2010-12-01 01:07 EST by Shehjar Tikoo
Modified: 2015-12-01 11:45 EST

Doc Type: Bug Fix
Regression: RTP
Mount Type: nfs
Documentation: DNR
Attachments: None

Description Shehjar Tikoo 2010-12-01 01:07:28 EST
In a 12-hour test case, the glusterfs process crashed, leaving an 800M+ core file on a system that had only 1G of memory. The fop being processed failed in qr_readv because of ENOMEM. The stack trace was:

Program terminated with signal 11, Segmentation fault.
#0  nfs_fop_readv_cbk (frame=0x7f7ccb2dc904, cookie=0x1acd140, this=0x1ad1b40, op_ret=-1, op_errno=22, vector=0x0, count=-1, stbuf=0x7f7cc9473c70, iobref=0x0)
    at nfs-fops.c:1273
1273    nfs-fops.c: No such file or directory.
        in nfs-fops.c
(gdb) bt
#0  nfs_fop_readv_cbk (frame=0x7f7ccb2dc904, cookie=0x1acd140, this=0x1ad1b40, op_ret=-1, op_errno=22, vector=0x0, count=-1, stbuf=0x7f7cc9473c70, iobref=0x0)
    at nfs-fops.c:1273
#1  0x00007f7cc995c77f in io_stats_readv_cbk (frame=0x7f7ccb4f9570, cookie=<value optimized out>, this=<value optimized out>, op_ret=-1, op_errno=<value optimized out>,
    vector=0x0, count=-1, buf=0x7f7cc9473c70, iobref=0x0) at io-stats.c:516
#2  0x00007f7cc9b6a028 in qr_readv (frame=<value optimized out>, this=<value optimized out>, fd=<value optimized out>, size=<value optimized out>,
    offset=<value optimized out>) at quick-read.c:1064
#3  0x00007f7cc9956eb4 in io_stats_readv (frame=<value optimized out>, this=0x1acd140, fd=0x7f7cc666b6b4, size=65536, offset=262144) at io-stats.c:1200
#4  0x00007f7cc94e6d04 in nfs_fop_read (nfsx=<value optimized out>, xl=0x1acd140, nfu=<value optimized out>, fd=0x7f7cc666b6b4, size=65536, offset=0,
    cbk=0x7f7cc94f83e0 <nfs3svc_read_cbk>, local=0x7f7cc3842c5c) at nfs-fops.c:1298
#5  0x00007f7cc94f8337 in nfs3_read_fd_resume (carg=0x7f7cc3842c5c) at nfs3.c:1669
#6  0x00007f7cc9504184 in nfs3_file_open_and_resume (cs=0x7f7cc3842c5c, resume=<value optimized out>) at nfs3-helpers.c:2223
#7  0x00007f7cc94f8218 in nfs3_read_resume (carg=0x7f7cc9473c70) at nfs3.c:1697
#8  0x00007f7cc950273f in nfs3_fh_resolve_inode_done (cs=0x7f7cc3842c5c, inode=<value optimized out>) at nfs3-helpers.c:2508
#9  0x00007f7cc95027c3 in nfs3_fh_resolve_inode (cs=0x7f7cc3842c5c) at nfs3-helpers.c:3064
#10 0x00007f7cc94fb13c in nfs3_read (req=0x7f7cbc010b68, fh=0x7f7cc9473f90, offset=262144, count=65536) at nfs3.c:1736
#11 0x00007f7cc94fb416 in nfs3svc_read (req=0x7f7cbc010b68) at nfs3.c:1770
#12 0x00007f7cc950cb0c in nfs_rpcsvc_handle_rpc_call (conn=0x7f7c90b1b850) at ../../../../xlators/nfs/lib/src/rpcsvc.c:1984
#13 0x00007f7cc950d2fd in nfs_rpcsvc_record_update_state (conn=0x7f7c90b1b850, dataread=0) at ../../../../xlators/nfs/lib/src/rpcsvc.c:2469
#14 0x00007f7cc950d648 in nfs_rpcsvc_conn_data_handler (fd=<value optimized out>, idx=28102976, data=0x7f7c90b1b850, poll_in=-1, poll_out=22, poll_err=0)
    at ../../../../xlators/nfs/lib/src/rpcsvc.c:2641
#15 0x00007f7ccc3c8b92 in event_dispatch_epoll_handler (i=<value optimized out>, events=<value optimized out>, event_pool=<value optimized out>) at event.c:812
#16 event_dispatch_epoll (i=<value optimized out>, events=<value optimized out>, event_pool=<value optimized out>) at event.c:876
#17 0x00007f7cc950eb72 in nfs_rpcsvc_stage_proc (arg=<value optimized out>) at ../../../../xlators/nfs/lib/src/rpcsvc.c:64
#18 0x0000003871c0685a in start_thread (arg=<value optimized out>) at pthread_create.c:297
#19 0x00000038710de22d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#20 0x0000000000000000 in ?? ()


quick-read returns an error because a memory allocation failed in its readv. It returns an op_ret of -1, with the other arguments pointing to undefined addresses. NFS nevertheless touches @stbuf even when op_ret is -1, and that access is where the crash occurs.

nfs_fop_readv_cbk (frame=0x7f7ccb2dc904, cookie=0x1acd140, this=0x1ad1b40, op_ret=-1, op_errno=22, vector=0x0, count=-1, stbuf=0x7f7cc9473c70, iobref=0x0)
    at nfs-fops.c:1273
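
The failure mode described above can be illustrated with a small, self-contained sketch of a callback that checks op_ret before reading the stat buffer. This is not the code from nfs-fops.c or from the patch: the names demo_iatt, demo_local and demo_readv_cbk are hypothetical stand-ins, and the only assumption carried over from this report is that a failed fop (op_ret == -1) may hand the callback a stbuf pointer that must not be dereferenced.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the stat structure passed to the callback;
 * only two illustrative fields are modelled here. */
struct demo_iatt {
    uint64_t ia_ino;
    uint64_t ia_size;
};

/* Hypothetical per-call state that the callback fills in. */
struct demo_local {
    int      op_ret;
    int      op_errno;
    uint64_t ino;   /* meaningful only when op_ret is not -1 */
    uint64_t size;  /* meaningful only when op_ret is not -1 */
};

/*
 * Sketch of a readv-style callback that never dereferences stbuf when the
 * fop failed. On failure the stat-derived fields are zeroed instead of
 * being copied from a pointer whose target is undefined.
 */
static void demo_readv_cbk(struct demo_local *local, int op_ret, int op_errno,
                           const struct demo_iatt *stbuf)
{
    local->op_ret = op_ret;
    local->op_errno = op_errno;

    if (op_ret == -1 || stbuf == NULL) {
        /* Failed fop (or no stat data): leave stbuf untouched. */
        local->ino = 0;
        local->size = 0;
        return;
    }

    /* Successful fop: stbuf is expected to be valid here. */
    local->ino = stbuf->ia_ino;
    local->size = stbuf->ia_size;
}
```

With this shape, a call such as demo_readv_cbk(&local, -1, ENOMEM, garbage_ptr) records the error and returns without reading through garbage_ptr, which is the behaviour the summary "do not touch iatt on failed fops" suggests.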
Comment 1 Anand Avati 2010-12-27 20:51:03 EST
PATCH: http://patches.gluster.com/patch/5915 in master (nfs: Do not touch iatt on failed fops)
