Bug 763449 (GLUSTER-1717) - dht_attr_cbk does not propagate op_ret on failed fop causing nfs crash
Summary: dht_attr_cbk does not propagate op_ret on failed fop causing nfs crash
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: GLUSTER-1717
Product: GlusterFS
Classification: Community
Component: distribute
Version: mainline
Hardware: All
OS: Linux
Priority: low
Severity: high
Target Milestone: ---
Assignee: Shehjar Tikoo
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2010-09-28 04:04 UTC by Shehjar Tikoo
Modified: 2015-12-01 16:45 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:
Regression: RTP
Mount Type: nfs
Documentation: ---
CRM:
Verified Versions:



Description Shehjar Tikoo 2010-09-28 04:04:04 UTC
The crash trace reported by Harsha:

package-string: glusterfs nfs_beta_rc14_ac84ead9f25c
/lib64/libc.so.6[0x33484332f0]
/lib64/libpthread.so.0(pthread_spin_lock+0x0)[0x334900bbd0]
/usr/lib64/libglusterfs.so.0(fd_unref+0x73)[0x7fad0b334083]
/usr/lib64/glusterfs/nfs_beta_rc14_ac84ead9f25c/xlator/nfs/server.so(nfs3_call_state_wipe+0xae)[0x7fad0a403257]
/usr/lib64/glusterfs/nfs_beta_rc14_ac84ead9f25c/xlator/nfs/server.so(nfs3svc_readdir_fstat_cbk+0x2c5)[0x7fad0a40e929]
/usr/lib64/glusterfs/nfs_beta_rc14_ac84ead9f25c/xlator/nfs/server.so(nfs_fop_fstat_cbk+0xbe)[0x7fad0a3f57a0]
/usr/lib64/glusterfs/nfs_beta_rc14_ac84ead9f25c/xlator/cluster/distribute.so(dht_attr_cbk+0x263)[0x7fad0a641de5]
/usr/lib64/glusterfs/nfs_beta_rc14_ac84ead9f25c/xlator/protocol/client.so(client_fstat_cbk+0x1a6)[0x7fad0a87f40c]
/usr/lib64/glusterfs/nfs_beta_rc14_ac84ead9f25c/xlator/protocol/client.so(saved_frames_unwind+0x1a9)[0x7fad0a887d2c]
/usr/lib64/glusterfs/nfs_beta_rc14_ac84ead9f25c/xlator/protocol/client.so(saved_frames_destroy+0x4c)[0x7fad0a887dd3]
/usr/lib64/glusterfs/nfs_beta_rc14_ac84ead9f25c/xlator/protocol/client.so(protocol_client_cleanup+0x114)[0x7fad0a885999]
/usr/lib64/glusterfs/nfs_beta_rc14_ac84ead9f25c/xlator/protocol/client.so(notify+0x17c)[0x7fad0a886f58]
/usr/lib64/libglusterfs.so.0(xlator_notify+0xd8)[0x7fad0b30fab9]
/usr/lib64/glusterfs/nfs_beta_rc14_ac84ead9f25c/transport/socket.so(socket_event_poll_err+0x81)[0x7fad093d1684]
/usr/lib64/glusterfs/nfs_beta_rc14_ac84ead9f25c/transport/socket.so(socket_event_handler+0xdf)[0x7fad093d23c0]
/usr/lib64/libglusterfs.so.0[0x7fad0b336ea4]
/usr/lib64/libglusterfs.so.0[0x7fad0b337096]
/usr/lib64/libglusterfs.so.0(event_dispatch+0x74)[0x7fad0b3373b8]
glusterfs(main+0x10a8)[0x4065a4]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x334841ea4d]
glusterfs[0x402769]
---------

and the corresponding gdb backtrace, annotated:

(gdb) bt
#0  0x000000334900bbd0 in pthread_spin_lock () from /lib64/libpthread.so.0
#1  0x00007fad0b334083 in fd_unref (fd=0x7fad047bfdc0) at fd.c:447
#2  0x00007fad0a403257 in nfs3_call_state_wipe (cs=0x7fad09081460) at nfs3.c:214
#3  0x00007fad0a40e929 in nfs3svc_readdir_fstat_cbk (frame=0x10e6588, cookie=0xf950c0,
    this=0xf98610, op_ret=0, op_errno=107, buf=0x10e6678) at nfs3.c:3737

##############################################################
NFS sees op_ret=0 (success), although op_errno=107 (ENOTCONN) is set
##############################################################
#4  0x00007fad0a3f57a0 in nfs_fop_fstat_cbk (frame=0x10e6588, cookie=0xf950c0, this=0xf98610,
    op_ret=0, op_errno=107, buf=0x10e6678) at nfs-fops.c:355

##############################################################
dht receives op_ret=-1 from the client but does not pass it up
##############################################################
#5  0x00007fad0a641de5 in dht_attr_cbk (frame=0x10b9450, cookie=0x10bb5e0, this=0xf950c0, op_ret=-1,
    op_errno=107, stbuf=0x7fff33bb8d50) at dht-common.c:1006
#6  0x00007fad0a87f40c in client_fstat_cbk (frame=0x10bb5e0, hdr=0x7fff33bb8e40, hdrlen=108,
    iobuf=0x0) at client-protocol.c:4121
#7  0x00007fad0a887d2c in saved_frames_unwind (this=0xf7b580, saved_frames=0xfb2dd0, head=0xfb2dd8,
    gf_ops=0x7fad0aa8e8a0, gf_op_list=0x7fad0b553b00) at saved-frames.c:174
#8  0x00007fad0a887dd3 in saved_frames_destroy (this=0xf7b580, frames=0xfb2dd0,
    gf_fops=0x7fad0aa8e8a0, gf_mops=0x7fad0aa8ea20, gf_cbks=0x7fad0aa8ea60) at saved-frames.c:186
#9  0x00007fad0a885999 in protocol_client_cleanup (trans=0xfb1710) at client-protocol.c:6004
#10 0x00007fad0a886f58 in notify (this=0xf7b580, event=4, data=0xfb1710) at client-protocol.c:6562
#11 0x00007fad0b30fab9 in xlator_notify (xl=0xf7b580, event=4, data=0xfb1710) at xlator.c:919
#12 0x00007fad093d1684 in socket_event_poll_err (this=0xfb1710) at socket.c:437
#13 0x00007fad093d23c0 in socket_event_handler (fd=72, idx=65, data=0xfb1710, poll_in=1, poll_out=0,
    poll_err=16) at socket.c:835
#14 0x00007fad0b336ea4 in event_dispatch_epoll_handler (event_pool=0xf372c0, events=0xfebb60, i=18)
    at event.c:804
#15 0x00007fad0b337096 in event_dispatch_epoll (event_pool=0xf372c0) at event.c:867
#16 0x00007fad0b3373b8 in event_dispatch (event_pool=0xf372c0) at event.c:975
#17 0x00000000004065a4 in main (argc=5, argv=0x7fff33bb9a88) at glusterfsd.c:1494


Setting to blocker because of customer PoC needs.

Comment 1 Vijay Bellur 2010-09-28 08:50:51 UTC
PATCH: http://patches.gluster.com/patch/5030 in master (distribute: Propagate -1 op_ret on failed fop)
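
For readers without access to the patch, here is a minimal sketch of the failure mode in frames #4 and #5 of the backtrace. It is not the actual dht code or the actual patch: struct local_state, attr_cbk_buggy, attr_cbk_fixed and parent_sees are hypothetical names, and the sketch assumes the per-call state starts out with op_ret = 0.

```c
#include <errno.h>

/* Hypothetical stand-in for the per-call state a translator keeps. */
struct local_state {
    int op_ret;   /* result handed up to the parent translator */
    int op_errno; /* errno handed up to the parent translator */
};

/* Buggy shape: a failed child updates op_errno only, so op_ret keeps
 * its initial value (assumed 0 here) and the parent sees "success". */
static void attr_cbk_buggy(struct local_state *local, int op_ret, int op_errno)
{
    if (op_ret == -1) {
        local->op_errno = op_errno;
        return;
    }
    local->op_ret = 0;
}

/* Fixed shape: a failed child sets both op_ret and op_errno. */
static void attr_cbk_fixed(struct local_state *local, int op_ret, int op_errno)
{
    if (op_ret == -1) {
        local->op_ret = -1;
        local->op_errno = op_errno;
        return;
    }
    local->op_ret = 0;
}

/* Returns the op_ret the parent would see after a single child fails
 * with ENOTCONN (errno 107 on Linux, as in the backtrace). */
static int parent_sees(void (*cbk)(struct local_state *, int, int))
{
    struct local_state local = { 0, 0 };
    cbk(&local, -1, ENOTCONN);
    return local.op_ret;
}
```

With these definitions, parent_sees(attr_cbk_buggy) yields 0 while parent_sees(attr_cbk_fixed) yields -1. The first case matches what nfs_fop_fstat_cbk received (op_ret=0, op_errno=107), which sent NFS down its success path on a disconnected brick.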

