Bug 762757 (GLUSTER-1025)

Summary: mistargeted memory allocation in NFS
Product: [Community] GlusterFS
Reporter: Csaba Henk <csaba>
Component: core
Assignee: Shehjar Tikoo <shehjart>
Status: CLOSED CURRENTRELEASE
Severity: medium
Priority: low
Version: mainline
CC: gluster-bugs, shehjart, vbellur
Hardware: All
OS: Linux
Doc Type: Bug Fix

Description Shehjar Tikoo 2010-06-24 08:08:56 UTC
Just so nobody panics, let me emphasise that this is in mainline, not nfs-beta. It was probably caused by a bug in integrating NFS with the mem-accounting changes.

Comment 1 Csaba Henk 2010-06-24 10:58:55 UTC
The NFS server over posix aborts when a directory is listed with "ls", as follows:


(gdb) bt
#0  0xb7764424 in __kernel_vsyscall ()
#1  0xb75c2d01 in __open_catalog () from /lib/libc.so.6
#2  0xb75c458e in sigvec () from /lib/libc.so.6
#3  0xb75bb638 in ?? () from /lib/libc.so.6
#4  0xb774fe5d in gf_mem_set_acct_info (xl=0x9c768b8, alloc_ptr=0xb64c5210, size=124, type=68) at mem-pool.c:88
#5  0xb77500d8 in __gf_calloc (nmemb=1, size=124, type=68) at mem-pool.c:140
#6  0xb652b962 in nfs3_fill_entryp3 (entry=0x9ce5b30, dirfh=0xb5a9a03c) at nfs3-helpers.c:848
#7  0xb652bd61 in nfs3_fill_readdirp3res (res=0xb64c5424, stat=NFS3_OK, dirfh=0xb5a9a03c, cverf=164518632, dirstat=0xb64c55ac, entries=0x9ce5b30, dircount=512,
    maxcount=4096, is_eof=1) at nfs3-helpers.c:966
#8  0xb6525e61 in nfs3_readdirp_reply (req=0xb59a2020, stat=NFS3_OK, dirfh=0xb5a9a03c, cverf=164518632, dirstat=0xb64c55ac, entries=0xb5a9a130, dircount=512,
    maxcount=4096, is_eof=1) at nfs3.c:3628
#9  0xb65261bc in nfs3svc_readdir_fstat_cbk (frame=0x9cee04c, cookie=0x9c77828, this=0x9c768b8, op_ret=0, op_errno=0, buf=0xb64c55ac) at nfs3.c:3690
#10 0xb6510d60 in nfs_fop_fstat_cbk (frame=0x9cee04c, cookie=0x9c77828, this=0x9c768b8, op_ret=0, op_errno=0, buf=0xb64c55ac) at nfs-fops.c:331
#11 0xb654b836 in posix_fstat (frame=0x9ce5c70, this=0x9c768b8, fd=0x9ce5ae8) at posix.c:3848
#12 0xb65110aa in nfs_fop_fstat (nfsx=0x9c77828, xl=0x9c768b8, nfu=0xb64c5718, fd=0x9ce5ae8, cbk=0xb6525f66 <nfs3svc_readdir_fstat_cbk>, local=0xb5a9a020)
    at nfs-fops.c:354
#13 0xb6518030 in nfs_fstat (nfsx=0x9c77828, xl=0x9c768b8, nfu=0xb64c5718, fd=0x9ce5ae8, cbk=0xb6525f66 <nfs3svc_readdir_fstat_cbk>, local=0xb5a9a020)
    at nfs-generics.c:45
#14 0xb6526373 in nfs3svc_readdir_cbk (frame=0x9c726a4, cookie=0x9c77828, this=0x9c768b8, op_ret=3, op_errno=2, entries=0xb64c5980) at nfs3.c:3725
#15 0xb6511919 in nfs_fop_readdirp_cbk (frame=0x9c726a4, cookie=0x9c77828, this=0x9c768b8, op_ret=3, op_errno=2, entries=0xb64c5980) at nfs-fops.c:463
#16 0xb654c6d9 in posix_do_readdir (frame=0x9ce5d60, this=0x9c768b8, fd=0x9ce5ae8, size=512, off=0, whichop=41) at posix.c:4080
#17 0xb654c7b4 in posix_readdirp (frame=0x9ce5d60, this=0x9c768b8, fd=0x9ce5ae8, size=512, off=0) at posix.c:4101
#18 0xb6511c61 in nfs_fop_readdirp (nfsx=0x9c77828, xl=0x9c768b8, nfu=0xb64c5bb0, dirfd=0x9ce5ae8, bufsize=512, offset=0, cbk=0xb652629d <nfs3svc_readdir_cbk>,
    local=0xb5a9a020) at nfs-fops.c:487
#19 0xb651812c in nfs_readdirp (nfsx=0x9c77828, xl=0x9c768b8, nfu=0xb64c5bb0, dirfd=0x9ce5ae8, bufsize=512, offset=0, cbk=0xb652629d <nfs3svc_readdir_cbk>,
    local=0xb5a9a020) at nfs-generics.c:72
#20 0xb6526597 in nfs3_readdir_process (cs=0xb5a9a020) at nfs3.c:3768
#21 0xb6526696 in nfs3_readdir_read_resume (carg=0xb5a9a020) at nfs3.c:3793
#22 0xb652d999 in nfs3_dir_open_cbk (frame=0x9ce5eb4, cookie=0x9c77828, this=0x9c768b8, op_ret=0, op_errno=22, fd=0x9ce5ae8) at nfs3-helpers.c:1754
#23 0xb6517e10 in nfs_inode_opendir_cbk (frame=0x9ce5eb4, cookie=0x9c77828, this=0x9c768b8, op_ret=0, op_errno=22, fd=0x9ce5ae8) at nfs-inodes.c:539
#24 0xb6511173 in nfs_fop_opendir_cbk (frame=0x9ce5eb4, cookie=0x9c77828, this=0x9c768b8, op_ret=0, op_errno=22, fd=0x9ce5ae8) at nfs-fops.c:377
#25 0xb653fbe5 in posix_opendir (frame=0x9ce5dc0, this=0x9c768b8, loc=0xb5a9a37c, fd=0x9ce5ae8) at posix.c:956
#26 0xb65114b0 in nfs_fop_opendir (nfsx=0x9c77828, xl=0x9c768b8, nfu=0xb64c5ec0, pathloc=0xb5a9a37c, dirfd=0x9ce5ae8, cbk=0xb6517d63 <nfs_inode_opendir_cbk>,
    local=0x9c793d8) at nfs-fops.c:398
#27 0xb6517f97 in nfs_inode_opendir (nfsx=0x9c77828, xl=0x9c768b8, nfu=0xb64c5ec0, loc=0xb5a9a37c, cbk=0xb652d88d <nfs3_dir_open_cbk>, local=0xb5a9a020)
    at nfs-inodes.c:563
#28 0xb65189b0 in nfs_opendir (nfsx=0x9c77828, fopxl=0x9c768b8, nfu=0xb64c5ec0, pathloc=0xb5a9a37c, cbk=0xb652d88d <nfs3_dir_open_cbk>, local=0xb5a9a020)
    at nfs-generics.c:318
#29 0xb652da2b in __nfs3_dir_open_and_resume (cs=0xb5a9a020) at nfs3-helpers.c:1770
#30 0xb652db92 in nfs3_dir_open_and_resume (cs=0xb5a9a020, resume=0xb65265a8 <nfs3_readdir_read_resume>) at nfs3-helpers.c:1797
#31 0xb6526852 in nfs3_readdir_open_resume (carg=0xb5a9a020) at nfs3.c:3828
#32 0xb652f838 in nfs3_fh_resolve_inode_done (cs=0xb5a9a020, inode=0x9ccf6d8) at nfs3-helpers.c:2383
#33 0xb6531376 in nfs3_fh_resolve_inode (cs=0xb5a9a020) at nfs3-helpers.c:2880
#34 0xb65314f9 in nfs3_fh_resolve_and_resume (cs=0xb5a9a020, fh=0xb64c6110, entry=0x0, resum_fn=0xb65267de <nfs3_readdir_open_resume>) at nfs3-helpers.c:2917
#35 0xb6526d06 in nfs3_readdir (req=0xb59a2020, fh=0xb64c6110, cookie=0, cverf=0, dircount=512, maxcount=4096) at nfs3.c:3879
#36 0xb6527108 in nfs3svc_readdirp (req=0xb59a2020) at nfs3.c:3956
#37 0xb77151e0 in rpcsvc_handle_rpc_call (conn=0x9ce0838) at rpcsvc.c:1876
#38 0xb7716098 in rpcsvc_record_update_state (conn=0x9ce0838, dataread=0) at rpcsvc.c:2356
#39 0xb77161f2 in rpcsvc_conn_data_poll_in (conn=0x9ce0838) at rpcsvc.c:2399
#40 0xb7716612 in rpcsvc_conn_data_handler (fd=5, idx=3, data=0x9ce0838, poll_in=1, poll_out=0, poll_err=0) at rpcsvc.c:2528
#41 0xb774f751 in event_dispatch_epoll_handler (event_pool=0x9c79270, events=0x9cd5398, i=0) at event.c:809
#42 0xb774f924 in event_dispatch_epoll (event_pool=0x9c79270) at event.c:873
#43 0xb774fc64 in event_dispatch (event_pool=0x9c79270) at event.c:981
#44 0xb7711461 in rpcsvc_stage_proc (arg=0x9c77248) at rpcsvc.c:64
#45 0xb76e9e60 in start_thread () from /lib/libpthread.so.0
#46 0xb766725e in error_at_line () from /lib/libc.so.6
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

One can see:

(gdb) fr 6
#6  0xb652b962 in nfs3_fill_entryp3 (entry=0x9ce5b30, dirfh=0xb5a9a03c) at nfs3-helpers.c:848
848             ent = GF_CALLOC (1, sizeof (*ent), gf_nfs_mt_entryp3);
(gdb) fr 4
#4  0xb774fe5d in gf_mem_set_acct_info (xl=0x9c768b8, alloc_ptr=0xb64c5210, size=124, type=68) at mem-pool.c:88
88                      assert (0);
(gdb) l
83              if (!(xl->mem_acct.rec)) {
84                      assert (0);
85              }
86
87              if (type > xl->mem_acct.num_types) {
88                      assert (0);
89              }
90
91              LOCK(&xl->mem_acct.rec[type].lock);
92              {
(gdb) p xl->type
$1 = 0x9c77270 "storage/posix"

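That is, the problem is that GF_CALLOC is done for an NFS memory type (gf_nfs_mt_entryp3, type=68) while THIS is a posix instance. The accounting code therefore checks the NFS type against the posix translator's mem_acct table, the type is larger than posix's num_types, and the assert at mem-pool.c:88 fires.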
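The mismatch can be illustrated with a small model. This is only a sketch under stated assumptions, not the GlusterFS code: struct xlator_model, xlator_model_init and acct_record are hypothetical names, and the table sizes used in the usage note are made up. Only type=68, size=124 and the shape of the two checks come from the gdb session above.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-type accounting record (stand-in for one mem_acct.rec entry). */
struct acct_rec {
    size_t   total_bytes;
    unsigned num_allocs;
};

/* Hypothetical, heavily simplified model of a translator. */
struct xlator_model {
    const char      *type;      /* e.g. "storage/posix" */
    unsigned         num_types; /* how many memory types this translator registered */
    struct acct_rec *rec;       /* one record per registered type */
};

/* Allocate the accounting table. Returns 0 on success, -1 on allocation failure. */
int xlator_model_init(struct xlator_model *xl, const char *type, unsigned num_types)
{
    xl->type = type;
    xl->num_types = num_types;
    xl->rec = calloc(num_types, sizeof(*xl->rec));
    return xl->rec ? 0 : -1;
}

/*
 * Same two checks as in the listing above, but returning -1 instead of
 * calling assert (0). Note: the listing compares with '>'; this sketch
 * uses '>=' so that the index always stays inside the table allocated here.
 */
int acct_record(struct xlator_model *xl, size_t size, unsigned type)
{
    if (!xl->rec)
        return -1;
    if (type >= xl->num_types)  /* type belongs to some other translator's enum */
        return -1;

    xl->rec[type].total_bytes += size;
    xl->rec[type].num_allocs++;
    return 0;
}
```

With a model translator initialised with, say, 50 types (an invented number standing in for posix), acct_record (&xl, 124, 68) is rejected, which corresponds to the abort in frame 4. The same call against a model translator that registered more than 68 types succeeds. The type index is only meaningful relative to the translator whose table is consulted, which is why the current translator has to match the code doing the allocation.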

Comment 2 Csaba Henk 2010-10-05 08:19:16 UTC
Shehjar, please take a look at it.

IIRC it has been fixed since then, but I cannot recall the bug id.

If that's the case, I think the best option is to mark this as a duplicate of that one.

Comment 3 Shehjar Tikoo 2010-10-05 08:20:52 UTC
Yeah, consider it resolved. This was from the days when NFS did not behave properly with mem-accounting.