Description of problem:
Executing the command "rpcinfo -T tcp <server> nfs" results in a SEGV.

Version-Release number of selected component (if applicable):
rpcbind-0.1.7-1.fc10.x86_64
libtirpc-0.1.10-2.fc10.x86_64

How reproducible:
Always

Additional info:
gdb backtrace of core gives:

Core was generated by `rpcinfo -T tcp shark.themaw.net nfs'.
Program terminated with signal 11, Segmentation fault.
[New process 22395]
#0  strlen () at ../sysdeps/x86_64/strlen.S:37
37      0:      cmpb $0x0,(%rax)        /* is byte NUL? */
(gdb) bt
#0  strlen () at ../sysdeps/x86_64/strlen.S:37
#1  0x000000000082eafb in xdr_string (xdrs=0x1de34a8, cpp=0x7fff7112a128, maxsize=4294967295) at xdr.c:674
#2  0x0000000000828281 in xdr_rpcb (xdrs=0x1de34a8, objp=0x7fff7112a120) at rpcb_prot.c:57
#3  0x00000000008219f1 in clnt_vc_call (cl=0x1de2290, proc=3, xdr_args=0x828230 <xdr_rpcb>, args_ptr=0x7fff7112a120, xdr_results=0x82ebe0 <xdr_wrapstring>, results_ptr=0x7fff7112a178, timeout={tv_sec = 60, tv_usec = 0}) at clnt_vc.c:365
#4  0x0000000000827638 in __rpcb_findaddr_timed (program=100003, version=0, nconf=0x1de1580, host=0x7fff7112b62f "shark.themaw.net", clpp=0x7fff7112a1e8, tp=0xa39010) at rpcb_clnt.c:785
#5  0x000000000081f9b3 in clnt_tp_create_timed (hostname=0x7fff71129c10 "", prog=100003, vers=0, nconf=0x1de1580, tp=<value optimized out>) at clnt_generic.c:300
#6  0x000000000024f66a in progping () at rpcinfo.c:1625
#7  main (argc=<value optimized out>, argv=0x7fff7112a488) at rpcinfo.c:301
Current language: auto; currently asm
Works fine under i386. Smells like a tirpc / legacy rpc data type size mismatch...
Created attachment 333608
patch -- set r_netid to value of nc_netid

This libtirpc patch fixes the problem for me, and looks correct. We never set r_netid, so it is left uninitialized, and that is what causes the segfault. Before I propose this upstream, though, I still want to investigate why I don't see a segfault on i386.
Ok, playing with this on i386 it's still not clear to me why we don't see a crash there, but the data structures involved still don't look right:

(gdb) p parms
$12 = {r_prog = 100003, r_vers = 0, r_netid = 0xb7fef2a8 "", r_addr = 0x76bb20 "192.168.1.2.0.111", r_owner = 0xd6cfc4 "�\016\002"}
(gdb) info symbol 0xb7fef2a8
No symbol matches 0xb7fef2a8.

...no real symbol there...

(gdb) x 0xb7fef2a8
0xb7fef2a8:     0x005de000

...not even a completely blank string, but has some other junk after it. I think it's just luck that we're not hitting a segfault here. I'll plan to push the patch in comment #2 out this week as I'm pretty sure it's correct.
Yes, it is strange the problem isn't seen on i386. Thanks Jeff.
Created attachment 333754
patch -- set r_netid and r_owner in RPCB

Updated patch -- also set r_owner. This patch is a little more complete. I've never seen a segfault from r_owner not being set, but I think that's just luck. I sent this patch to the upstream libtirpc list (which hasn't had a post since 2007) and directly to steved. I also sent a cleanup patch to remove some bogus files from the upstream git repo.
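For anyone reading along, here is a minimal sketch of the failure mode and the shape of the fix. It is not the attached patch. The demo_* types and functions are hypothetical stand-ins for struct rpcb, struct netconfig and the string encoding step; only the field names r_prog, r_vers, r_netid, r_addr, r_owner and nc_netid come from the gdb output and patch titles in this bug. The NULL check is there only so the sketch reports the problem instead of crashing; a pointer holding stack garbage cannot be detected that way.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct rpcb; field names taken from the
 * "p parms" gdb output earlier in this bug. */
struct demo_rpcb {
    unsigned long r_prog;
    unsigned long r_vers;
    const char *r_netid;
    const char *r_addr;
    const char *r_owner;
};

/* Hypothetical stand-in for struct netconfig; only nc_netid matters here. */
struct demo_netconfig {
    const char *nc_netid;
};

/* Mimics the part of string encoding that shows up in the backtrace:
 * strlen() is called on every string pointer in the struct. A garbage
 * pointer makes strlen() fault (or, with luck, find a NUL byte nearby).
 * Returns the total string length, or -1 if any pointer is NULL. */
long demo_encoded_string_bytes(const struct demo_rpcb *p)
{
    if (p->r_netid == NULL || p->r_addr == NULL || p->r_owner == NULL)
        return -1;
    return (long)(strlen(p->r_netid) + strlen(p->r_addr) + strlen(p->r_owner));
}

/* Shape of the fix: zero the whole struct first, then set every string
 * field explicitly, taking r_netid from the netconfig entry. */
void demo_fill_parms(struct demo_rpcb *parms,
                     const struct demo_netconfig *nconf,
                     unsigned long prog, unsigned long vers,
                     const char *addr, const char *owner)
{
    *parms = (struct demo_rpcb){0};
    parms->r_prog = prog;
    parms->r_vers = vers;
    parms->r_netid = nconf->nc_netid;
    parms->r_addr = addr;
    parms->r_owner = owner;
}
```

With every string field assigned before the struct is handed to the encoder, the result no longer depends on whatever happened to be on the stack, which would also explain why i386 only appeared to work.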
Patches committed upstream... This should be fixed now in rawhide with libtirpc-0.1.10-6.fc11