Bug 206483 - rpc.mountd segfaults when NFS mount is attempted from remote system
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: glibc
Version: 4.4
Hardware: x86_64 Linux
Priority: medium
Severity: urgent
Assigned To: Jakub Jelinek
QA Contact: Brian Brock
Duplicates: 208718
Reported: 2006-09-14 13:04 EDT by sa@tmt.ca.boeing.com
Modified: 2017-04-28 02:39 EDT
Fixed In Version: RHBA-2007-0210
Doc Type: Bug Fix
Last Closed: 2007-05-01 19:07:03 EDT

Attachments:
Core file for segfault of rpc.mountd (292.00 KB, application/octet-stream)
2006-10-09 10:59 EDT, sa@tmt.ca.boeing.com

Description sa@tmt.ca.boeing.com 2006-09-14 13:04:00 EDT
Description of problem:
After booting, NFS starts properly with an rpc.mountd daemon
running, but as soon as a remote RHEL4 system attempts to NFS
mount an exported filesystem, rpc.mountd exits with a segfault
and we are unable to mount any of the filesystems.

Version-Release number of selected component (if applicable):
RHEL4 Update 4

How reproducible:
Consistent

Steps to Reproduce:
1. Boot System A running RHEL4 Update 4
2. Mount systema:/directory from a RHEL4 Update 2 system
3. rpc.mountd segfaults on System A
  
Actual results:
No mount (RPC error)

Expected results:
systema:/directory should have been NFS mounted

Additional info:
From /var/log/messages:
kernel: rpc.mountd[5515]: segfault at 0000000000000000 rip 0000002a958fd560 rsp 0000007fbfffc648 error 4
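
The kernel line above can be read mechanically. What follows is a minimal sketch, assuming the 2.6-era x86_64 message format shown in this report; `parse_segfault_line`, `struct segv_info`, and the `fault_*` helpers are hypothetical names, not part of any tool mentioned here. The error value is the x86 page-fault error code: bit 0 is set when the page was present, bit 1 when the access was a write, bit 2 when the fault came from user mode.

```c
#include <stdio.h>
#include <string.h>

/* Fields of a kernel "segfault at ..." log line (format assumed from this report). */
struct segv_info {
    unsigned long long addr;   /* faulting address */
    unsigned long long rip;    /* instruction pointer at the fault */
    unsigned long long rsp;    /* stack pointer at the fault */
    unsigned int error;        /* x86 page-fault error code */
};

/* Parse the line; returns 0 on success, -1 if the format does not match. */
int parse_segfault_line(const char *line, struct segv_info *out)
{
    const char *p = strstr(line, "segfault at ");
    if (p == NULL)
        return -1;
    if (sscanf(p, "segfault at %llx rip %llx rsp %llx error %u",
               &out->addr, &out->rip, &out->rsp, &out->error) != 4)
        return -1;
    return 0;
}

/* Decode the individual bits of the x86 page-fault error code. */
int fault_page_was_present(unsigned int error) { return (error & 1u) != 0; }
int fault_was_write(unsigned int error)        { return (error & 2u) != 0; }
int fault_from_user_mode(unsigned int error)   { return (error & 4u) != 0; }
```

For the line in this report, the address is 0 and the error code is 4, i.e. a user-mode read of a page that is not present, which is what a read through a NULL pointer looks like.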
Comment 1 sa@tmt.ca.boeing.com 2006-09-14 13:37:09 EDT
One additional piece of information: the exports file on this
RHEL4 Update 4 system contains netgroups.  If we remove the netgroups
and list individual node names instead, rpc.mountd does NOT crash.
As a test we added a netgroup that contained only two node names,
and it still segfaults.  So the problem appears to lie in how
rpc.mountd processes netgroups.

Versions of NFS packages loaded:
nfs-utils-1.0.6-70.EL4
nfs-utils-lib-1.0.6-3

running kernel 2.6.9-42.0.2.ELsmp
Comment 2 sa@tmt.ca.boeing.com 2006-09-14 15:51:00 EDT
We've found a workaround, so we can probably drop the severity level of this
down to High.

In /etc/nsswitch.conf, the default entry for netgroup is:
netgroup:   files nisplus nis

If we change this to:
netgroup:   nis

then rpc.mountd will not segfault when we NFS mount a filesystem.
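
Whether a given host is exposed to this can be checked by looking for `nisplus` on the `netgroup` line. A minimal sketch, assuming the simple "database: source source ..." layout quoted above; `nss_line_lists_source` is a hypothetical helper and deliberately ignores action items such as `[NOTFOUND=return]`.

```c
#include <ctype.h>
#include <string.h>

/*
 * Return 1 if `line` is an nsswitch.conf entry for database `db`
 * (e.g. "netgroup") that lists `source` (e.g. "nisplus"), else 0.
 * Simplified: action items in square brackets are not interpreted.
 */
int nss_line_lists_source(const char *line, const char *db, const char *source)
{
    size_t dblen = strlen(db);
    size_t srclen = strlen(source);
    const char *p = line;

    while (isspace((unsigned char)*p))
        p++;
    /* The entry must start with "<db>:". */
    if (strncmp(p, db, dblen) != 0 || p[dblen] != ':')
        return 0;
    p += dblen + 1;

    /* Walk the whitespace-separated source names, stopping at a comment. */
    while (*p != '\0' && *p != '#') {
        while (isspace((unsigned char)*p))
            p++;
        const char *start = p;
        while (*p != '\0' && *p != '#' && !isspace((unsigned char)*p))
            p++;
        if ((size_t)(p - start) == srclen && strncmp(start, source, srclen) == 0)
            return 1;
    }
    return 0;
}
```

With the default entry quoted above, the helper reports that `nisplus` is listed; with the workaround entry it does not.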

I also thought I'd mention that I did, one time only (not reproducible),
get a segfault with exportfs during a reboot:

exportfs[3923]: segfault at 0000000000000000 rip 0000003f0fc70560 rsp 0000007fbfffd888 error 4

which might be worth a look as well.  Thanks!
Comment 3 Steve Dickson 2006-09-22 20:54:41 EDT
Is SELinux enabled?
Comment 4 sa@tmt.ca.boeing.com 2006-09-25 10:31:35 EDT
No. SELinux is disabled.
Comment 5 Steve Dickson 2006-09-26 08:38:00 EDT
Would it be possible to install the nfs-utils-debuginfo package and
then start up rpc.mountd through gdb (i.e. gdb rpc.mountd),
then type r at the gdb prompt?  Hopefully this will show where
the segfault is happening...
Comment 6 sa@tmt.ca.boeing.com 2006-09-26 11:49:11 EDT
Steve:
Can you enlighten me on how to go about installing the nfs-utils-debuginfo
package, and I'll give it a shot?  Thanks!
Comment 7 sa@tmt.ca.boeing.com 2006-09-26 12:12:01 EDT
Steve:  After some Google searching I found and installed nfs-utils-debuginfo,
but it didn't give the information we hoped for.  Here's the output:

gdb /usr/sbin/rpc.mountd
GNU gdb Red Hat Linux (6.3.0.0-1.132.EL4rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu"...(no debugging symbols found)
Using host libthread_db library "/lib64/tls/libthread_db.so.1".

(gdb) r
Starting program: /usr/sbin/rpc.mountd
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
Detaching after fork from child process 6184.

Program exited normally.
(gdb)


Then when I did an NFS mount, the mount failed, but there was no
debug info in the shell window where gdb rpc.mountd was running.  The only
thing that showed up was the segfault in the messages file:

Sep 26 08:56:10 xyz kernel: rpc.mountd[4315]: segfault at 0000000000000000 rip 0000002a958fd560 rsp 0000007fbfffca98 error 4

Is there anything else I should be doing in gdb to get traceback info?  Other
than installing the nfs-utils-debuginfo RPM, is there anything else
I need to get the debugger enabled?
Comment 8 Steve Dickson 2006-10-03 10:29:50 EDT
Well, congratulations! It appears you have uncovered another bug :-\
I'm sure that if you do an 'rpm -qli nfs-utils-debuginfo' it will show *no* files...

Please grab the rpms out of http://people.redhat.com/steved/bz209121/
to see if the problem goes away and, if not, install the debuginfo from
that directory and start up rpc.mountd using gdb...
Comment 9 sa@tmt.ca.boeing.com 2006-10-04 14:00:57 EDT
Sorry, new version failed to solve the problem.

What we have loaded:
nfs-utils-debuginfo-1.0.6-72.EL4
nfs-utils-lib-1.0.6-3
nfs-utils-1.0.6-72.EL4

Rebooted, then killed rpc.mountd and manually started it as follows:
gdb /usr/sbin/rpc.mountd
GNU gdb Red Hat Linux (6.3.0.0-1.132.EL4rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu"...Using host libthread_db
library "/lib64/tls/libthread_db.so.1".
(gdb) r
Starting program: /usr/sbin/rpc.mountd
Detaching after fork from child process 6023.

Program exited normally.
(gdb)

From another shell window, I logged into a remote system and
tried (and failed) to mount a filesystem from the local machine.

The following error appeared in messages:
Oct  4 00:25:22 xyz kernel: rpc.mountd[6023]: segfault at 0000000000000000 rip 0000002a958fd560 rsp 0000007fbfffc318 error 4

Put the netgroup configuration back to nis as the only entry
in nsswitch.conf and restarted rpc.mountd, and NFS mounts
are working again.

Am I starting the debug version of rpc.mountd incorrectly?
I'm puzzled why I'm not getting any useful traceback
information for you.
Comment 10 Steve Dickson 2006-10-09 09:53:00 EDT
Try using the -f flag when you start rpc.mountd from the debugger.
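
In the gdb sessions above, gdb prints "Detaching after fork from child process" followed by "Program exited normally", which suggests that rpc.mountd forked into the background and gdb stayed with the parent, so the later crash happened in a process gdb was no longer watching. The sketch below illustrates that pattern in isolation, under that assumption; `fork_worker` and `wait_term_signal` are hypothetical names, and the child raising a signal is only a stand-in for the crash.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork the way a daemonizing server does. The child stands in for the
 * background service and dies from signal `sig`; the original process
 * is unaffected and could exit normally straight away.
 * Returns the child's pid in the parent, or -1 if fork() failed.
 */
pid_t fork_worker(int sig)
{
    pid_t pid = fork();
    if (pid == 0) {
        raise(sig);   /* child: stand-in for the crash in the service loop */
        _exit(0);     /* reached only if the signal did not terminate us */
    }
    return pid;       /* parent: a daemonizing server would now exit(0) */
}

/* Wait for `pid`; return the signal that killed it, 0 if it exited, -1 on error. */
int wait_term_signal(pid_t pid)
{
    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

Calling `wait_term_signal(fork_worker(SIGSEGV))` from a small test program shows the child dying while the caller carries on, which is why keeping the server in the foreground (the core in comment 17 was generated by `/usr/sbin/rpc.mountd --foreground`) or collecting a core file is needed to see the failing process.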
Comment 11 Steve Dickson 2006-10-09 09:55:56 EDT
Also try getting a core dump by setting the core file size
to unlimited (i.e. ulimit -c unlimited). Once you
have the core file, you should be able to
get a backtrace from it using gdb...
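
For reference, the shell's `ulimit -c` adjusts the RLIMIT_CORE resource limit. A minimal sketch of the same adjustment from C, assuming a POSIX system; `raise_core_limit` is a hypothetical helper and only raises the soft limit as far as the hard limit allows, so it matches `ulimit -c unlimited` only when the hard limit is already unlimited.

```c
#include <sys/resource.h>

/*
 * Raise the soft core-file size limit to the hard limit.
 * Returns 0 on success, -1 on failure (errno is set by the failing call).
 */
int raise_core_limit(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) != 0)
        return -1;
    rl.rlim_cur = rl.rlim_max;   /* the soft limit may go up to the hard limit */
    return setrlimit(RLIMIT_CORE, &rl);
}
```

The limit is inherited by child processes, so it has to be raised in the shell or launcher that starts rpc.mountd, before the daemon is started.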
Comment 12 sa@tmt.ca.boeing.com 2006-10-09 10:59:31 EDT
Created attachment 138041 [details]
Core file for segfault of rpc.mountd
Comment 13 sa@tmt.ca.boeing.com 2006-10-09 11:01:38 EDT
I'm still unable to get a stack trace while running gdb /usr/sbin/rpc.mountd -f,
but I did get a core dump, which I've attached.
Comment 14 Steve Dickson 2006-10-09 15:35:14 EDT
Unfortunately, for some strange reason, I can't read that core,
so please try:

gdb /usr/sbin/rpc.mountd core.10895
(gdb) bt

to see if you can get a backtrace...
Comment 15 sa@tmt.ca.boeing.com 2006-10-09 18:36:41 EDT
Sorry, I deleted the core file, so I've generated a new one (which I can send
if you're interested).  Here is the bt info from this latest core file.

Core was generated by `/usr/sbin/rpc.mountd'.
Program terminated with signal 11, Segmentation fault.
Loaded symbols for /usr/sbin/rpc.mountd
Reading symbols from /usr/lib64/libwrap.so.0...done.
Loaded symbols for /usr/lib64/libwrap.so.0
Reading symbols from /lib64/libnsl.so.1...done.
Loaded symbols for /lib64/libnsl.so.1
Reading symbols from /lib64/tls/libc.so.6...done.
Loaded symbols for /lib64/tls/libc.so.6
Reading symbols from /lib64/ld-linux-x86-64.so.2...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Reading symbols from /lib64/libnss_files.so.2...done.
Loaded symbols for /lib64/libnss_files.so.2
Reading symbols from /lib64/libnss_nisplus.so.2...done.
Loaded symbols for /lib64/libnss_nisplus.so.2
Reading symbols from /lib64/libnss_nis.so.2...done.
Loaded symbols for /lib64/libnss_nis.so.2
#0  0x0000002a958fd560 in strlen () from /lib64/tls/libc.so.6
(gdb) bt
#0  0x0000002a958fd560 in strlen () from /lib64/tls/libc.so.6
#1  0x0000002a958fd2a6 in strdup () from /lib64/tls/libc.so.6
#2  0x0000002a9578111c in nis_list () from /lib64/libnsl.so.1
#3  0x0000002a95bd7ddb in _nss_nisplus_setnetgrent () from /lib64/libnss_nisplus.so.2
#4  0x0000002a9596d67c in innetgr () from /lib64/tls/libc.so.6
#5  0x000000552aab1bc2 in client_check (clp=Variable "clp" is not available.
) at client.c:368
#6  0x000000552aab1d69 in client_compose (addr=Variable "addr" is not available.
) at client.c:255
#7  0x000000552aaaf523 in auth_authenticate (what=0x552aab7c81 "mount",
caller=0x552abc45b4, path=0x7fbfffdd50 "/home") at auth.c:83
#8  0x000000552aaae4a7 in get_rootfh (rqstp=Variable "rqstp" is not available.
) at mountd.c:302
#9  0x000000552aaae7f8 in mount_mnt_3_svc (rqstp=0x7fbffff030,
path=0x7fbfffef18, res=0x7fbfffef20) at mountd.c:267
#10 0x000000552aab5b8f in rpc_dispatch (rqstp=0x7fbffff030, transp=0x552abc45a0,
dtable=Variable "dtable" is not available.
) at rpcdispatch.c:53
#11 0x000000552aaaf331 in mount_dispatch (rqstp=0x7fbffff030,
transp=0x552abc45a0) at mount_dispatch.c:81
#12 0x0000002a95976a36 in svc_getreq_common_internal () from /lib64/tls/libc.so.6
#13 0x0000002a959766cd in svc_getreqset_internal () from /lib64/tls/libc.so.6
#14 0x000000552aab0f1b in my_svc_run () at svc_run.c:86
#15 0x000000552aaaf06a in main (argc=Variable "argc" is not available.
) at mountd.c:636
Comment 16 Jakub Jelinek 2006-10-10 09:08:21 EDT
Can you please also install glibc-debuginfo-2.3.4-2.25.x86_64.rpm and get
the backtrace once again, so that we can see the exact arguments and source
locations in the backtrace?  Thanks.
Comment 17 sa@tmt.ca.boeing.com 2006-10-10 12:45:22 EDT
After installing glibc-debuginfo and rebooting, here's the latest backtrace.

gdb /usr/sbin/rpc.mountd core.6245
GNU gdb Red Hat Linux (6.3.0.0-1.132.EL4rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu"...Using host libthread_db
library "/lib64/tls/libthread_db.so.1".

Core was generated by `/usr/sbin/rpc.mountd --foreground'.
Program terminated with signal 11, Segmentation fault.
Loaded symbols for /usr/sbin/rpc.mountd
Reading symbols from /usr/lib64/libwrap.so.0...done.
Loaded symbols for /usr/lib64/libwrap.so.0
Reading symbols from /lib64/libnsl.so.1...Reading symbols from
/usr/lib/debug/lib64/libnsl-2.3.4.so.debug...done.
done.
Loaded symbols for /lib64/libnsl.so.1
Reading symbols from /lib64/tls/libc.so.6...Reading symbols from
/usr/lib/debug/lib64/tls/libc-2.3.4.so.debug...done.
done.
Loaded symbols for /lib64/tls/libc.so.6
Reading symbols from /lib64/ld-linux-x86-64.so.2...Reading symbols from
/usr/lib/debug/lib64/ld-2.3.4.so.debug...done.
done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Reading symbols from /lib64/libnss_files.so.2...Reading symbols from
/usr/lib/debug/lib64/libnss_files-2.3.4.so.debug...done.
done.
Loaded symbols for /lib64/libnss_files.so.2
Reading symbols from /lib64/libnss_nisplus.so.2...Reading symbols from
/usr/lib/debug/lib64/libnss_nisplus-2.3.4.so.debug...done.
done.
Loaded symbols for /lib64/libnss_nisplus.so.2
Reading symbols from /lib64/libnss_nis.so.2...Reading symbols from
/usr/lib/debug/lib64/libnss_nis-2.3.4.so.debug...done.
done.
Loaded symbols for /lib64/libnss_nis.so.2
#0  0x0000002a958fd560 in strlen () from /lib64/tls/libc.so.6
(gdb) bt
#0  0x0000002a958fd560 in strlen () from /lib64/tls/libc.so.6
#1  0x0000002a958fd2a6 in strdup () from /lib64/tls/libc.so.6
#2  0x0000002a9578111c in *__GI_nis_list (name=Variable "name" is not available.
) at nis_table.c:250
#3  0x0000002a95bd7ddb in _nss_nisplus_setnetgrent (group=0x552abc8ff9 "SGI",
netgrp=0x7fbfffcb50) at nss_nisplus/nisplus-netgrp.c:162
#4  0x0000002a9596d67c in *__GI_innetgr (netgroup=0x552abc8ff9 "SGI",
host=0x552abccba0 "go.ca.boeing.com", user=0x0, domain=0x0) at getnetgrent_r.c:354
#5  0x000000552aab1bc2 in client_check (clp=Variable "clp" is not available.
) at client.c:368
#6  0x000000552aab1d69 in client_compose (addr=Variable "addr" is not available.
) at client.c:255
#7  0x000000552aaaf523 in auth_authenticate (what=0x552aab7c81 "mount",
caller=0x552abc45b4, path=0x7fbfffdd10 "/home") at auth.c:83
#8  0x000000552aaae4a7 in get_rootfh (rqstp=Variable "rqstp" is not available.
) at mountd.c:302
#9  0x000000552aaae7f8 in mount_mnt_3_svc (rqstp=0x7fbfffeff0,
path=0x7fbfffeed8, res=0x7fbfffeee0) at mountd.c:267
#10 0x000000552aab5b8f in rpc_dispatch (rqstp=0x7fbfffeff0, transp=0x552abc45a0,
dtable=Variable "dtable" is not available.
) at rpcdispatch.c:53
#11 0x000000552aaaf331 in mount_dispatch (rqstp=0x7fbfffeff0,
transp=0x552abc45a0) at mount_dispatch.c:81
#12 0x0000002a95976a36 in svc_getreq_common (fd=Variable "fd" is not available.
) at svc.c:465
#13 0x0000002a959766cd in svc_getreqset (readfds=Variable "readfds" is not available.
) at svc.c:376
#14 0x000000552aab0f1b in my_svc_run () at svc_run.c:86
#15 0x000000552aaaf06a in main (argc=Variable "argc" is not available.
) at mountd.c:636
Comment 18 Jakub Jelinek 2006-10-10 15:17:50 EDT
Thanks, I managed to reproduce this with a simple:
#include <netdb.h>

int
main (void)
{
  innetgr ("baz", "foo.bar.com", 0, 0);
  return 0;
}

with
netgroup:   nisplus
in /etc/nsswitch.conf and no NIS+ configured at all.  Not sure yet whether this is
nis_list's fault (it should check if nis_getnames (ibreq->ibr_name)[0] == NULL)
or nis_getnames' fault.
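
The kind of check suggested above can be illustrated in isolation. This is a minimal sketch and not the actual glibc fix: `dup_first_name` and `copy_string` are hypothetical helpers, and the array merely stands in for the NULL-terminated list that nis_getnames returns. The backtrace (strlen called from strdup called from nis_list, faulting at address 0) is consistent with a NULL first entry being duplicated.

```c
#include <stdlib.h>
#include <string.h>

/* Duplicate a string; like strdup(), it must never be handed NULL. */
static char *copy_string(const char *s)
{
    size_t n = strlen(s) + 1;   /* strlen(NULL) would fault, as in frame #0 */
    char *copy = malloc(n);
    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}

/*
 * Hypothetical guard: `names` is a NULL-terminated array of candidate
 * names. Return a copy of the first one, or NULL when the array is
 * missing or empty, instead of passing a NULL entry on to be copied.
 */
char *dup_first_name(char **names)
{
    if (names == NULL || names[0] == NULL)
        return NULL;
    return copy_string(names[0]);
}
```

A caller then has to treat a NULL result as "no usable name" and return an error rather than continue with the lookup.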
Comment 20 Jakub Jelinek 2006-10-10 15:48:45 EDT
If you don't have NIS+ configured, the best workaround would be to remove
nisplus from the netgroup entry in /etc/nsswitch.conf.
Comment 23 RHEL Product and Program Management 2006-10-11 02:16:31 EDT
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.
Comment 26 Jakub Jelinek 2006-10-23 03:09:00 EDT
*** Bug 208718 has been marked as a duplicate of this bug. ***
Comment 31 Red Hat Bugzilla 2007-05-01 19:07:03 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2007-0210.html
