Bug 763807 (GLUSTER-2075)

Summary: inode_from_path silently consumes 99% CPU causing nfs client disconnects
Product: [Community] GlusterFS Reporter: Shehjar Tikoo <shehjart>
Component: coreAssignee: Raghavendra G <raghavendra>
Status: CLOSED CURRENTRELEASE QA Contact:
Severity: medium Docs Contact:
Priority: low    
Version: mainlineCC: gluster-bugs, vijaykumar
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: Type: ---
Regression: RTP Mount Type: nfs
Documentation: DNR CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:

Description Shehjar Tikoo 2010-11-10 08:00:19 UTC
Raghu thinks there is a potential for an loop from the following code block in qr_writev:

out:
        if (need_unwind) {
                QR_STACK_UNWIND (writev, frame, op_ret, op_errno, NULL,
                                     NULL);
        } else if (can_wind) {
                STACK_WIND (frame, qr_writev_cbk, FIRST_CHILD (this),
                            FIRST_CHILD (this)->fops->writev, fd, vector, count,
                            off, iobref);
        } else if (need_open) {
                op_ret = qr_loc_fill (&loc, fd->inode, path);
                if (op_ret == -1) {
                        qr_resume_pending_ops (qr_fd_ctx);
                        goto out;
                }

                STACK_WIND (frame, qr_open_cbk, FIRST_CHILD(this),
                            FIRST_CHILD(this)->fops->open, &loc, flags, fd,
                            qr_fd_ctx->wbflags);

                qr_loc_wipe (&loc);
        }

He'll be looking into it.

Comment 1 Shehjar Tikoo 2010-11-10 10:38:17 UTC
On 3.1.1qa5, while running a fsstress load: fsstress -d /home/shehjart/mount/ -l 1 -n 1000 -p 2 -r

inode_from_path continues consuming CPU even after fsstress is killed on the command line.

Running gluster nfs daemon in valgrind --tool=callgrind does not help because the same behavior is not seen there.

############################################################
###############FIRST BACKTRACE FROM GDB#####################
#############################################################
#0  gf_strdup (src=0x7f89780e1af8 "/p1/d0", 'X' <repeats 194 times>...) at mem-pool.h:93
#1  0x00007f898ca8a695 in inode_from_path (itable=0x6bd738, path=0x7f89780e1af8 "/p1/d0", 'X' <repeats 194 times>...) at inode.c:1134
#2  0x00007f8989fbd05c in qr_loc_fill (loc=0x7f8989ae9890, inode=0x7f8988c66cdc, path=0x7f89780ca3a8 "/p1/d0", 'X' <repeats 194 times>...)
    at quick-read.c:103
#3  0x00007f8989fc17e7 in qr_writev (frame=0x7f898bba1a04, this=0x6b8ec8, fd=0x7f89883522dc, vector=0x7f89868f8850, count=1, off=327680, 
    iobref=0x7f89780e4a08) at quick-read.c:1263
#4  0x00007f8989db425c in io_stats_writev (frame=0x7f898bba21c0, this=0x6ba038, fd=0x7f89883522dc, vector=0x7f89868f8850, count=1, offset=327680, 
    iobref=0x7f89780e4a08) at io-stats.c:1224
#5  0x00007f8989b64152 in nfs_fop_write (nfsx=0x6bb1f8, xl=0x6ba038, nfu=0x7f8989ae9b60, fd=0x7f89883522dc, srciob=0x71f4f0, vector=0x7f89868f8850, 
    count=1, offset=327680, cbk=0x7f8989b71136 <nfs3svc_write_cbk>, local=0x7f89868f84a0) at nfs-fops.c:1232
#6  0x00007f8989b67321 in nfs_write (nfsx=0x6bb1f8, xl=0x6ba038, nfu=0x7f8989ae9b60, fd=0x7f89883522dc, srciob=0x71f4f0, vector=0x7f89868f8850, count=1, 
    offset=327680, cbk=0x7f8989b71136 <nfs3svc_write_cbk>, local=0x7f89868f84a0) at nfs-generics.c:164
#7  0x00007f8989b71518 in __nfs3_write_resume (cs=0x7f89868f84a0) at nfs3.c:1929
#8  0x00007f8989b71592 in nfs3_write_resume (carg=0x7f89868f84a0) at nfs3.c:1949
#9  0x00007f8989b82362 in nfs3_file_open_and_resume (cs=0x7f89868f84a0, resume=0x7f8989b71528 <nfs3_write_resume>) at nfs3-helpers.c:2223
#10 0x00007f8989b71697 in nfs3_write_open_resume (carg=0x7f89868f84a0) at nfs3.c:1976
#11 0x00007f8989b83510 in nfs3_fh_resolve_inode_done (cs=0x7f89868f84a0, inode=0x7f8988c66cdc) at nfs3-helpers.c:2508
#12 0x00007f8989b85410 in nfs3_fh_resolve_inode (cs=0x7f89868f84a0) at nfs3-helpers.c:3059
#13 0x00007f8989b85568 in nfs3_fh_resolve_and_resume (cs=0x7f89868f84a0, fh=0x7f8989ae9df0, entry=0x0, resum_fn=0x7f8989b71623 <nfs3_write_open_resume>)
    at nfs3-helpers.c:3103
#14 0x00007f8989b71ae4 in nfs3_write (req=0x7f897800d030, fh=0x7f8989ae9df0, offset=327680, count=65536, stable=UNSTABLE, payload=
      {iov_base = 0x7f8986540000, iov_len = 131072}, iob=0x71f4f0) at nfs3.c:2022
#15 0x00007f8989b71e70 in nfs3svc_write_vec (req=0x7f897800d030, iob=0x71f4f0) at nfs3.c:2110
#16 0x00007f8989b8dae0 in nfs_rpcsvc_record_vectored_call_actor (conn=0x7f897800c298) at ../../../../xlators/nfs/lib/src/rpcsvc.c:2282
#17 0x00007f8989b8dd64 in nfs_rpcsvc_update_vectored_state (conn=0x7f897800c298) at ../../../../xlators/nfs/lib/src/rpcsvc.c:2339
#18 0x00007f8989b8de5d in nfs_rpcsvc_handle_vectored_frag (conn=0x7f897800c298, dataread=0) at ../../../../xlators/nfs/lib/src/rpcsvc.c:2398
#19 0x00007f8989b8df4c in nfs_rpcsvc_record_update_state (conn=0x7f897800c298, dataread=65536) at ../../../../xlators/nfs/lib/src/rpcsvc.c:2435
#20 0x00007f8989b8e25b in nfs_rpcsvc_conn_data_poll_in (conn=0x7f897800c298) at ../../../../xlators/nfs/lib/src/rpcsvc.c:2504
#21 0x00007f8989b8e6d3 in nfs_rpcsvc_conn_data_handler (fd=15, idx=3, data=0x7f897800c298, poll_in=1, poll_out=0, poll_err=0)
    at ../../../../xlators/nfs/lib/src/rpcsvc.c:2633
#22 0x00007f898ca9eeb9 in event_dispatch_epoll_handler (event_pool=0x6bd3f8, events=0x7f89780008f8, i=0) at event.c:812
#23 0x00007f898ca9f0c9 in event_dispatch_epoll (event_pool=0x6bd3f8) at event.c:876
#24 0x00007f898ca9f431 in event_dispatch (event_pool=0x6bd3f8) at event.c:984
#25 0x00007f8989b88c4e in nfs_rpcsvc_stage_proc (arg=0x6bd3b8) at ../../../../xlators/nfs/lib/src/rpcsvc.c:64
#26 0x0000003871c0685a in start_thread (arg=<value optimized out>) at pthread_create.c:297
#27 0x00000038710de22d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#28 0x0000000000000000 in ?? ()


#############################################################
###############SECOND BACKTRACE FROM GDB#####################
#############################################################
#0  strtok_r () at ../sysdeps/x86_64/strtok.S:110
#1  0x00007f898ca8a7a4 in inode_from_path (itable=0x6bd738, path=0x7f89780e43c8 "/p1/d0", 'X' <repeats 194 times>...) at inode.c:1156
#2  0x00007f8989fbd05c in qr_loc_fill (loc=0x7f8989ae9890, inode=0x7f8988c66cdc, path=0x7f89780ca3a8 "/p1/d0", 'X' <repeats 194 times>...)
    at quick-read.c:103
#3  0x00007f8989fc17e7 in qr_writev (frame=0x7f898bba1a04, this=0x6b8ec8, fd=0x7f89883522dc, vector=0x7f89868f8850, count=1, off=327680, 
    iobref=0x7f89780e4a08) at quick-read.c:1263
#4  0x00007f8989db425c in io_stats_writev (frame=0x7f898bba21c0, this=0x6ba038, fd=0x7f89883522dc, vector=0x7f89868f8850, count=1, offset=327680, 
    iobref=0x7f89780e4a08) at io-stats.c:1224
#5  0x00007f8989b64152 in nfs_fop_write (nfsx=0x6bb1f8, xl=0x6ba038, nfu=0x7f8989ae9b60, fd=0x7f89883522dc, srciob=0x71f4f0, vector=0x7f89868f8850, 
    count=1, offset=327680, cbk=0x7f8989b71136 <nfs3svc_write_cbk>, local=0x7f89868f84a0) at nfs-fops.c:1232
#6  0x00007f8989b67321 in nfs_write (nfsx=0x6bb1f8, xl=0x6ba038, nfu=0x7f8989ae9b60, fd=0x7f89883522dc, srciob=0x71f4f0, vector=0x7f89868f8850, count=1, 
    offset=327680, cbk=0x7f8989b71136 <nfs3svc_write_cbk>, local=0x7f89868f84a0) at nfs-generics.c:164
#7  0x00007f8989b71518 in __nfs3_write_resume (cs=0x7f89868f84a0) at nfs3.c:1929
#8  0x00007f8989b71592 in nfs3_write_resume (carg=0x7f89868f84a0) at nfs3.c:1949
#9  0x00007f8989b82362 in nfs3_file_open_and_resume (cs=0x7f89868f84a0, resume=0x7f8989b71528 <nfs3_write_resume>) at nfs3-helpers.c:2223
#10 0x00007f8989b71697 in nfs3_write_open_resume (carg=0x7f89868f84a0) at nfs3.c:1976
#11 0x00007f8989b83510 in nfs3_fh_resolve_inode_done (cs=0x7f89868f84a0, inode=0x7f8988c66cdc) at nfs3-helpers.c:2508
#12 0x00007f8989b85410 in nfs3_fh_resolve_inode (cs=0x7f89868f84a0) at nfs3-helpers.c:3059
#13 0x00007f8989b85568 in nfs3_fh_resolve_and_resume (cs=0x7f89868f84a0, fh=0x7f8989ae9df0, entry=0x0, resum_fn=0x7f8989b71623 <nfs3_write_open_resume>)
    at nfs3-helpers.c:3103
#14 0x00007f8989b71ae4 in nfs3_write (req=0x7f897800d030, fh=0x7f8989ae9df0, offset=327680, count=65536, stable=UNSTABLE, payload=
      {iov_base = 0x7f8986540000, iov_len = 131072}, iob=0x71f4f0) at nfs3.c:2022
#15 0x00007f8989b71e70 in nfs3svc_write_vec (req=0x7f897800d030, iob=0x71f4f0) at nfs3.c:2110
#16 0x00007f8989b8dae0 in nfs_rpcsvc_record_vectored_call_actor (conn=0x7f897800c298) at ../../../../xlators/nfs/lib/src/rpcsvc.c:2282
#17 0x00007f8989b8dd64 in nfs_rpcsvc_update_vectored_state (conn=0x7f897800c298) at ../../../../xlators/nfs/lib/src/rpcsvc.c:2339
#18 0x00007f8989b8de5d in nfs_rpcsvc_handle_vectored_frag (conn=0x7f897800c298, dataread=0) at ../../../../xlators/nfs/lib/src/rpcsvc.c:2398
#19 0x00007f8989b8df4c in nfs_rpcsvc_record_update_state (conn=0x7f897800c298, dataread=65536) at ../../../../xlators/nfs/lib/src/rpcsvc.c:2435
#20 0x00007f8989b8e25b in nfs_rpcsvc_conn_data_poll_in (conn=0x7f897800c298) at ../../../../xlators/nfs/lib/src/rpcsvc.c:2504
#21 0x00007f8989b8e6d3 in nfs_rpcsvc_conn_data_handler (fd=15, idx=3, data=0x7f897800c298, poll_in=1, poll_out=0, poll_err=0)
    at ../../../../xlators/nfs/lib/src/rpcsvc.c:2633
#22 0x00007f898ca9eeb9 in event_dispatch_epoll_handler (event_pool=0x6bd3f8, events=0x7f89780008f8, i=0) at event.c:812
#23 0x00007f898ca9f0c9 in event_dispatch_epoll (event_pool=0x6bd3f8) at event.c:876
#24 0x00007f898ca9f431 in event_dispatch (event_pool=0x6bd3f8) at event.c:984
#25 0x00007f8989b88c4e in nfs_rpcsvc_stage_proc (arg=0x6bd3b8) at ../../../../xlators/nfs/lib/src/rpcsvc.c:64
#26 0x0000003871c0685a in start_thread (arg=<value optimized out>) at pthread_create.c:297
#27 0x00000038710de22d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#28 0x0000000000000000 in ?? ()


#############################################################
############### THIRD BACKTRACE FROM GDB#####################
#############################################################
#0  strtok_r () at ../sysdeps/x86_64/strtok.S:110
#1  0x00007f898ca8a72c in inode_from_path (itable=0x6bd738, path=0x7f89780e1af8 "/p1/d0", 'X' <repeats 194 times>...) at inode.c:1142
#2  0x00007f8989fbd05c in qr_loc_fill (loc=0x7f8989ae9890, inode=0x7f8988c66cdc, path=0x7f89780ca3a8 "/p1/d0", 'X' <repeats 194 times>...)
    at quick-read.c:103
#3  0x00007f8989fc17e7 in qr_writev (frame=0x7f898bba1a04, this=0x6b8ec8, fd=0x7f89883522dc, vector=0x7f89868f8850, count=1, off=327680, 
    iobref=0x7f89780e4a08) at quick-read.c:1263
#4  0x00007f8989db425c in io_stats_writev (frame=0x7f898bba21c0, this=0x6ba038, fd=0x7f89883522dc, vector=0x7f89868f8850, count=1, offset=327680, 
    iobref=0x7f89780e4a08) at io-stats.c:1224
#5  0x00007f8989b64152 in nfs_fop_write (nfsx=0x6bb1f8, xl=0x6ba038, nfu=0x7f8989ae9b60, fd=0x7f89883522dc, srciob=0x71f4f0, vector=0x7f89868f8850, 
    count=1, offset=327680, cbk=0x7f8989b71136 <nfs3svc_write_cbk>, local=0x7f89868f84a0) at nfs-fops.c:1232
#6  0x00007f8989b67321 in nfs_write (nfsx=0x6bb1f8, xl=0x6ba038, nfu=0x7f8989ae9b60, fd=0x7f89883522dc, srciob=0x71f4f0, vector=0x7f89868f8850, count=1, 
    offset=327680, cbk=0x7f8989b71136 <nfs3svc_write_cbk>, local=0x7f89868f84a0) at nfs-generics.c:164
#7  0x00007f8989b71518 in __nfs3_write_resume (cs=0x7f89868f84a0) at nfs3.c:1929
#8  0x00007f8989b71592 in nfs3_write_resume (carg=0x7f89868f84a0) at nfs3.c:1949
#9  0x00007f8989b82362 in nfs3_file_open_and_resume (cs=0x7f89868f84a0, resume=0x7f8989b71528 <nfs3_write_resume>) at nfs3-helpers.c:2223
#10 0x00007f8989b71697 in nfs3_write_open_resume (carg=0x7f89868f84a0) at nfs3.c:1976
#11 0x00007f8989b83510 in nfs3_fh_resolve_inode_done (cs=0x7f89868f84a0, inode=0x7f8988c66cdc) at nfs3-helpers.c:2508
#12 0x00007f8989b85410 in nfs3_fh_resolve_inode (cs=0x7f89868f84a0) at nfs3-helpers.c:3059
#13 0x00007f8989b85568 in nfs3_fh_resolve_and_resume (cs=0x7f89868f84a0, fh=0x7f8989ae9df0, entry=0x0, resum_fn=0x7f8989b71623 <nfs3_write_open_resume>)
    at nfs3-helpers.c:3103
#14 0x00007f8989b71ae4 in nfs3_write (req=0x7f897800d030, fh=0x7f8989ae9df0, offset=327680, count=65536, stable=UNSTABLE, payload=
      {iov_base = 0x7f8986540000, iov_len = 131072}, iob=0x71f4f0) at nfs3.c:2022
#15 0x00007f8989b71e70 in nfs3svc_write_vec (req=0x7f897800d030, iob=0x71f4f0) at nfs3.c:2110
#16 0x00007f8989b8dae0 in nfs_rpcsvc_record_vectored_call_actor (conn=0x7f897800c298) at ../../../../xlators/nfs/lib/src/rpcsvc.c:2282
#17 0x00007f8989b8dd64 in nfs_rpcsvc_update_vectored_state (conn=0x7f897800c298) at ../../../../xlators/nfs/lib/src/rpcsvc.c:2339
#18 0x00007f8989b8de5d in nfs_rpcsvc_handle_vectored_frag (conn=0x7f897800c298, dataread=0) at ../../../../xlators/nfs/lib/src/rpcsvc.c:2398
#19 0x00007f8989b8df4c in nfs_rpcsvc_record_update_state (conn=0x7f897800c298, dataread=65536) at ../../../../xlators/nfs/lib/src/rpcsvc.c:2435
#20 0x00007f8989b8e25b in nfs_rpcsvc_conn_data_poll_in (conn=0x7f897800c298) at ../../../../xlators/nfs/lib/src/rpcsvc.c:2504
#21 0x00007f8989b8e6d3 in nfs_rpcsvc_conn_data_handler (fd=15, idx=3, data=0x7f897800c298, poll_in=1, poll_out=0, poll_err=0)
    at ../../../../xlators/nfs/lib/src/rpcsvc.c:2633
#22 0x00007f898ca9eeb9 in event_dispatch_epoll_handler (event_pool=0x6bd3f8, events=0x7f89780008f8, i=0) at event.c:812
#23 0x00007f898ca9f0c9 in event_dispatch_epoll (event_pool=0x6bd3f8) at event.c:876
#24 0x00007f898ca9f431 in event_dispatch (event_pool=0x6bd3f8) at event.c:984
#25 0x00007f8989b88c4e in nfs_rpcsvc_stage_proc (arg=0x6bd3b8) at ../../../../xlators/nfs/lib/src/rpcsvc.c:64
#26 0x0000003871c0685a in start_thread (arg=<value optimized out>) at pthread_create.c:297
#27 0x00000038710de22d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#28 0x0000000000000000 in ?? ()

#############################################################
############### FOURTH BACKTRACE FROM GDB####################
#############################################################
(gdb) bt
#0  qr_resume_pending_ops (qr_fd_ctx=0x7f89780cef38) at quick-read.c:140
#1  0x00007f8989fc17ff in qr_writev (frame=0x7f898bba1a04, this=0x6b8ec8, fd=0x7f89883522dc, vector=0x7f89868f8850, count=1, off=327680, 
    iobref=0x7f89780e4a08) at quick-read.c:1265
#2  0x00007f8989db425c in io_stats_writev (frame=0x7f898bba21c0, this=0x6ba038, fd=0x7f89883522dc, vector=0x7f89868f8850, count=1, offset=327680, 
    iobref=0x7f89780e4a08) at io-stats.c:1224
#3  0x00007f8989b64152 in nfs_fop_write (nfsx=0x6bb1f8, xl=0x6ba038, nfu=0x7f8989ae9b60, fd=0x7f89883522dc, srciob=0x71f4f0, vector=0x7f89868f8850, 
    count=1, offset=327680, cbk=0x7f8989b71136 <nfs3svc_write_cbk>, local=0x7f89868f84a0) at nfs-fops.c:1232
#4  0x00007f8989b67321 in nfs_write (nfsx=0x6bb1f8, xl=0x6ba038, nfu=0x7f8989ae9b60, fd=0x7f89883522dc, srciob=0x71f4f0, vector=0x7f89868f8850, count=1, 
    offset=327680, cbk=0x7f8989b71136 <nfs3svc_write_cbk>, local=0x7f89868f84a0) at nfs-generics.c:164
#5  0x00007f8989b71518 in __nfs3_write_resume (cs=0x7f89868f84a0) at nfs3.c:1929
#6  0x00007f8989b71592 in nfs3_write_resume (carg=0x7f89868f84a0) at nfs3.c:1949
#7  0x00007f8989b82362 in nfs3_file_open_and_resume (cs=0x7f89868f84a0, resume=0x7f8989b71528 <nfs3_write_resume>) at nfs3-helpers.c:2223
#8  0x00007f8989b71697 in nfs3_write_open_resume (carg=0x7f89868f84a0) at nfs3.c:1976
#9  0x00007f8989b83510 in nfs3_fh_resolve_inode_done (cs=0x7f89868f84a0, inode=0x7f8988c66cdc) at nfs3-helpers.c:2508
#10 0x00007f8989b85410 in nfs3_fh_resolve_inode (cs=0x7f89868f84a0) at nfs3-helpers.c:3059
#11 0x00007f8989b85568 in nfs3_fh_resolve_and_resume (cs=0x7f89868f84a0, fh=0x7f8989ae9df0, entry=0x0, resum_fn=0x7f8989b71623 <nfs3_write_open_resume>)
    at nfs3-helpers.c:3103
#12 0x00007f8989b71ae4 in nfs3_write (req=0x7f897800d030, fh=0x7f8989ae9df0, offset=327680, count=65536, stable=UNSTABLE, payload=
      {iov_base = 0x7f8986540000, iov_len = 131072}, iob=0x71f4f0) at nfs3.c:2022
#13 0x00007f8989b71e70 in nfs3svc_write_vec (req=0x7f897800d030, iob=0x71f4f0) at nfs3.c:2110
#14 0x00007f8989b8dae0 in nfs_rpcsvc_record_vectored_call_actor (conn=0x7f897800c298) at ../../../../xlators/nfs/lib/src/rpcsvc.c:2282
#15 0x00007f8989b8dd64 in nfs_rpcsvc_update_vectored_state (conn=0x7f897800c298) at ../../../../xlators/nfs/lib/src/rpcsvc.c:2339
#16 0x00007f8989b8de5d in nfs_rpcsvc_handle_vectored_frag (conn=0x7f897800c298, dataread=0) at ../../../../xlators/nfs/lib/src/rpcsvc.c:2398
#17 0x00007f8989b8df4c in nfs_rpcsvc_record_update_state (conn=0x7f897800c298, dataread=65536) at ../../../../xlators/nfs/lib/src/rpcsvc.c:2435
#18 0x00007f8989b8e25b in nfs_rpcsvc_conn_data_poll_in (conn=0x7f897800c298) at ../../../../xlators/nfs/lib/src/rpcsvc.c:2504
#19 0x00007f8989b8e6d3 in nfs_rpcsvc_conn_data_handler (fd=15, idx=3, data=0x7f897800c298, poll_in=1, poll_out=0, poll_err=0)
    at ../../../../xlators/nfs/lib/src/rpcsvc.c:2633
#20 0x00007f898ca9eeb9 in event_dispatch_epoll_handler (event_pool=0x6bd3f8, events=0x7f89780008f8, i=0) at event.c:812
#21 0x00007f898ca9f0c9 in event_dispatch_epoll (event_pool=0x6bd3f8) at event.c:876
#22 0x00007f898ca9f431 in event_dispatch (event_pool=0x6bd3f8) at event.c:984
#23 0x00007f8989b88c4e in nfs_rpcsvc_stage_proc (arg=0x6bd3b8) at ../../../../xlators/nfs/lib/src/rpcsvc.c:64
#24 0x0000003871c0685a in start_thread (arg=<value optimized out>) at pthread_create.c:297
#25 0x00000038710de22d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#26 0x0000000000000000 in ?? ()


#############################################################
############### FIFTH BACKTRACE FROM GDB####################
#############################################################
(gdb) bt
#0  0x00007f898ca89323 in __dentry_grep (table=0x6bd738, parent=0x7f8988c612ac, name=0x7f89780e1a6c "d4", 'X' <repeats 11 times>) at inode.c:522
#1  0x00007f898ca8944d in inode_grep (table=0x6bd738, parent=0x7f8988c612ac, name=0x7f89780e1a6c "d4", 'X' <repeats 11 times>) at inode.c:549
#2  0x00007f898ca8a767 in inode_from_path (itable=0x6bd738, path=0x7f89780e43c8 "/p1/d0", 'X' <repeats 194 times>...) at inode.c:1149
#3  0x00007f8989fbd05c in qr_loc_fill (loc=0x7f8989ae9890, inode=0x7f8988c66cdc, path=0x7f89780ca3a8 "/p1/d0", 'X' <repeats 194 times>...)
    at quick-read.c:103
#4  0x00007f8989fc17e7 in qr_writev (frame=0x7f898bba1a04, this=0x6b8ec8, fd=0x7f89883522dc, vector=0x7f89868f8850, count=1, off=327680, 
    iobref=0x7f89780e4a08) at quick-read.c:1263
#5  0x00007f8989db425c in io_stats_writev (frame=0x7f898bba21c0, this=0x6ba038, fd=0x7f89883522dc, vector=0x7f89868f8850, count=1, offset=327680, 
    iobref=0x7f89780e4a08) at io-stats.c:1224
#6  0x00007f8989b64152 in nfs_fop_write (nfsx=0x6bb1f8, xl=0x6ba038, nfu=0x7f8989ae9b60, fd=0x7f89883522dc, srciob=0x71f4f0, vector=0x7f89868f8850, 
    count=1, offset=327680, cbk=0x7f8989b71136 <nfs3svc_write_cbk>, local=0x7f89868f84a0) at nfs-fops.c:1232
#7  0x00007f8989b67321 in nfs_write (nfsx=0x6bb1f8, xl=0x6ba038, nfu=0x7f8989ae9b60, fd=0x7f89883522dc, srciob=0x71f4f0, vector=0x7f89868f8850, count=1, 
    offset=327680, cbk=0x7f8989b71136 <nfs3svc_write_cbk>, local=0x7f89868f84a0) at nfs-generics.c:164
#8  0x00007f8989b71518 in __nfs3_write_resume (cs=0x7f89868f84a0) at nfs3.c:1929
#9  0x00007f8989b71592 in nfs3_write_resume (carg=0x7f89868f84a0) at nfs3.c:1949
#10 0x00007f8989b82362 in nfs3_file_open_and_resume (cs=0x7f89868f84a0, resume=0x7f8989b71528 <nfs3_write_resume>) at nfs3-helpers.c:2223
#11 0x00007f8989b71697 in nfs3_write_open_resume (carg=0x7f89868f84a0) at nfs3.c:1976
#12 0x00007f8989b83510 in nfs3_fh_resolve_inode_done (cs=0x7f89868f84a0, inode=0x7f8988c66cdc) at nfs3-helpers.c:2508
#13 0x00007f8989b85410 in nfs3_fh_resolve_inode (cs=0x7f89868f84a0) at nfs3-helpers.c:3059
#14 0x00007f8989b85568 in nfs3_fh_resolve_and_resume (cs=0x7f89868f84a0, fh=0x7f8989ae9df0, entry=0x0, resum_fn=0x7f8989b71623 <nfs3_write_open_resume>)
    at nfs3-helpers.c:3103
#15 0x00007f8989b71ae4 in nfs3_write (req=0x7f897800d030, fh=0x7f8989ae9df0, offset=327680, count=65536, stable=UNSTABLE, payload=
      {iov_base = 0x7f8986540000, iov_len = 131072}, iob=0x71f4f0) at nfs3.c:2022
#16 0x00007f8989b71e70 in nfs3svc_write_vec (req=0x7f897800d030, iob=0x71f4f0) at nfs3.c:2110
#17 0x00007f8989b8dae0 in nfs_rpcsvc_record_vectored_call_actor (conn=0x7f897800c298) at ../../../../xlators/nfs/lib/src/rpcsvc.c:2282
#18 0x00007f8989b8dd64 in nfs_rpcsvc_update_vectored_state (conn=0x7f897800c298) at ../../../../xlators/nfs/lib/src/rpcsvc.c:2339
#19 0x00007f8989b8de5d in nfs_rpcsvc_handle_vectored_frag (conn=0x7f897800c298, dataread=0) at ../../../../xlators/nfs/lib/src/rpcsvc.c:2398
#20 0x00007f8989b8df4c in nfs_rpcsvc_record_update_state (conn=0x7f897800c298, dataread=65536) at ../../../../xlators/nfs/lib/src/rpcsvc.c:2435
#21 0x00007f8989b8e25b in nfs_rpcsvc_conn_data_poll_in (conn=0x7f897800c298) at ../../../../xlators/nfs/lib/src/rpcsvc.c:2504
#22 0x00007f8989b8e6d3 in nfs_rpcsvc_conn_data_handler (fd=15, idx=3, data=0x7f897800c298, poll_in=1, poll_out=0, poll_err=0)
    at ../../../../xlators/nfs/lib/src/rpcsvc.c:2633
#23 0x00007f898ca9eeb9 in event_dispatch_epoll_handler (event_pool=0x6bd3f8, events=0x7f89780008f8, i=0) at event.c:812
#24 0x00007f898ca9f0c9 in event_dispatch_epoll (event_pool=0x6bd3f8) at event.c:876
#25 0x00007f898ca9f431 in event_dispatch (event_pool=0x6bd3f8) at event.c:984
#26 0x00007f8989b88c4e in nfs_rpcsvc_stage_proc (arg=0x6bd3b8) at ../../../../xlators/nfs/lib/src/rpcsvc.c:64
#27 0x0000003871c0685a in start_thread (arg=<value optimized out>) at pthread_create.c:297
#28 0x00000038710de22d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#29 0x0000000000000000 in ?? ()


#############################################################
############### SIXTH BACKTRACE FROM GDB####################
#############################################################
#0  0x00007f898ca671b8 in __gf_calloc@plt () from /usr/local/lib/libglusterfs.so.0
#1  0x00007f898ca8831b in gf_strdup (src=0x7f89780e1af8 "/p1/d0", 'X' <repeats 194 times>...) at mem-pool.h:91
#2  0x00007f898ca8a695 in inode_from_path (itable=0x6bd738, path=0x7f89780e1af8 "/p1/d0", 'X' <repeats 194 times>...) at inode.c:1134
#3  0x00007f8989fbd05c in qr_loc_fill (loc=0x7f8989ae9890, inode=0x7f8988c66cdc, path=0x7f89780ca3a8 "/p1/d0", 'X' <repeats 194 times>...)
    at quick-read.c:103
#4  0x00007f8989fc17e7 in qr_writev (frame=0x7f898bba1a04, this=0x6b8ec8, fd=0x7f89883522dc, vector=0x7f89868f8850, count=1, off=327680, 
    iobref=0x7f89780e4a08) at quick-read.c:1263
#5  0x00007f8989db425c in io_stats_writev (frame=0x7f898bba21c0, this=0x6ba038, fd=0x7f89883522dc, vector=0x7f89868f8850, count=1, offset=327680, 
    iobref=0x7f89780e4a08) at io-stats.c:1224
#6  0x00007f8989b64152 in nfs_fop_write (nfsx=0x6bb1f8, xl=0x6ba038, nfu=0x7f8989ae9b60, fd=0x7f89883522dc, srciob=0x71f4f0, vector=0x7f89868f8850, 
    count=1, offset=327680, cbk=0x7f8989b71136 <nfs3svc_write_cbk>, local=0x7f89868f84a0) at nfs-fops.c:1232
#7  0x00007f8989b67321 in nfs_write (nfsx=0x6bb1f8, xl=0x6ba038, nfu=0x7f8989ae9b60, fd=0x7f89883522dc, srciob=0x71f4f0, vector=0x7f89868f8850, count=1, 
    offset=327680, cbk=0x7f8989b71136 <nfs3svc_write_cbk>, local=0x7f89868f84a0) at nfs-generics.c:164
#8  0x00007f8989b71518 in __nfs3_write_resume (cs=0x7f89868f84a0) at nfs3.c:1929
#9  0x00007f8989b71592 in nfs3_write_resume (carg=0x7f89868f84a0) at nfs3.c:1949
#10 0x00007f8989b82362 in nfs3_file_open_and_resume (cs=0x7f89868f84a0, resume=0x7f8989b71528 <nfs3_write_resume>) at nfs3-helpers.c:2223
#11 0x00007f8989b71697 in nfs3_write_open_resume (carg=0x7f89868f84a0) at nfs3.c:1976
#12 0x00007f8989b83510 in nfs3_fh_resolve_inode_done (cs=0x7f89868f84a0, inode=0x7f8988c66cdc) at nfs3-helpers.c:2508
#13 0x00007f8989b85410 in nfs3_fh_resolve_inode (cs=0x7f89868f84a0) at nfs3-helpers.c:3059
#14 0x00007f8989b85568 in nfs3_fh_resolve_and_resume (cs=0x7f89868f84a0, fh=0x7f8989ae9df0, entry=0x0, resum_fn=0x7f8989b71623 <nfs3_write_open_resume>)
    at nfs3-helpers.c:3103
#15 0x00007f8989b71ae4 in nfs3_write (req=0x7f897800d030, fh=0x7f8989ae9df0, offset=327680, count=65536, stable=UNSTABLE, payload=
      {iov_base = 0x7f8986540000, iov_len = 131072}, iob=0x71f4f0) at nfs3.c:2022
#16 0x00007f8989b71e70 in nfs3svc_write_vec (req=0x7f897800d030, iob=0x71f4f0) at nfs3.c:2110
#17 0x00007f8989b8dae0 in nfs_rpcsvc_record_vectored_call_actor (conn=0x7f897800c298) at ../../../../xlators/nfs/lib/src/rpcsvc.c:2282
#18 0x00007f8989b8dd64 in nfs_rpcsvc_update_vectored_state (conn=0x7f897800c298) at ../../../../xlators/nfs/lib/src/rpcsvc.c:2339
#19 0x00007f8989b8de5d in nfs_rpcsvc_handle_vectored_frag (conn=0x7f897800c298, dataread=0) at ../../../../xlators/nfs/lib/src/rpcsvc.c:2398
#20 0x00007f8989b8df4c in nfs_rpcsvc_record_update_state (conn=0x7f897800c298, dataread=65536) at ../../../../xlators/nfs/lib/src/rpcsvc.c:2435
#21 0x00007f8989b8e25b in nfs_rpcsvc_conn_data_poll_in (conn=0x7f897800c298) at ../../../../xlators/nfs/lib/src/rpcsvc.c:2504
#22 0x00007f8989b8e6d3 in nfs_rpcsvc_conn_data_handler (fd=15, idx=3, data=0x7f897800c298, poll_in=1, poll_out=0, poll_err=0)
    at ../../../../xlators/nfs/lib/src/rpcsvc.c:2633
#23 0x00007f898ca9eeb9 in event_dispatch_epoll_handler (event_pool=0x6bd3f8, events=0x7f89780008f8, i=0) at event.c:812
#24 0x00007f898ca9f0c9 in event_dispatch_epoll (event_pool=0x6bd3f8) at event.c:876
#25 0x00007f898ca9f431 in event_dispatch (event_pool=0x6bd3f8) at event.c:984
#26 0x00007f8989b88c4e in nfs_rpcsvc_stage_proc (arg=0x6bd3b8) at ../../../../xlators/nfs/lib/src/rpcsvc.c:64
#27 0x0000003871c0685a in start_thread (arg=<value optimized out>) at pthread_create.c:297
#28 0x00000038710de22d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#29 0x0000000000000000 in ?? ()


These snapshots were taken over 15 mins while the nfs process continued using 99% CPU. In all backtraces, the frame address in qr_writev is always 0x7f898bba1a04, which points to a infinite loop that is hit when handling this frame, preventing further progress.

Comment 2 Anand Avati 2010-11-13 12:03:00 UTC
PATCH: http://patches.gluster.com/patch/5675 in master (performance/quick-read: handle the errors in qr_loc_fill properly when an fop needs to do open before continuing.)

Comment 3 Vijaykumar 2011-08-31 06:39:42 UTC
Its very difficult to reproduce the bug. According to the developer fix has gone properly. So i am closing it.