Bug 505642 - su killed with SIGPIPE with nss_ldap and no nscd
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: coreutils
Version: 5.2
Hardware: All
OS: Linux
Priority: low
Severity: low
Target Milestone: rc
Assignee: Ondrej Vasik
QA Contact: BaseOS QE
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2009-06-12 18:01 UTC by Bowe Strickland
Modified: 2009-10-20 19:05 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-10-20 19:05:58 UTC
Target Upstream Version:
Embargoed:



Description Bowe Strickland 2009-06-12 18:01:35 UTC
When su'ing to an LDAP-defined user, if nscd is not running, the child is quickly killed by SIGPIPE.

--------------------------------------------------------------------

[root@server110 ~]# su - superman
[root@server110 ~]# id superman
uid=2050(superman) gid=2050(superman) groups=2050(superman) context=root:system_r:unconfined_t:SystemLow-SystemHigh
[root@server110 ~]# getenforce 
Permissive
[root@server110 ~]# strace -f 2> /tmp/trace su - superman

----------------------------------------------------------------------------

Examining the strace, the relevant lines reveal that the child is writing to an unconnected socket (file descriptor 6 duped to 3):

clone(Process 2629 attached
child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb7fe9918) = 2629
[pid  2628] rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1],  <unfinished ...>
[pid  2629] getsockname(3,  <unfinished ...>
[pid  2628] <... rt_sigprocmask resumed> NULL, 8) = 0
[pid  2629] <... getsockname resumed> {sa_family=AF_INET, sin_port=htons(45065), sin_addr=inet_addr("192.168.0.110")}, [16]) = 0
[pid  2628] rt_sigaction(SIGTERM, {0x53d6c0, [], 0}, NULL, 8) = 0
[pid  2628] rt_sigprocmask(SIG_UNBLOCK, [ALRM TERM], NULL, 8) = 0
[pid  2628] waitpid(-1, Process 2628 suspended
 <unfinished ...>
[pid  2629] getpeername(3, {sa_family=AF_INET, sin_port=htons(389), sin_addr=inet_addr("192.168.0.110")}, [16]) = 0
[pid  2629] fcntl64(3, F_GETFD)         = 0x1 (flags FD_CLOEXEC)
[pid  2629] dup(3)                      = 5
[pid  2629] fcntl64(5, F_SETFD, FD_CLOEXEC) = 0
[pid  2629] socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 6
[pid  2629] close(3)                    = 0
[pid  2629] fcntl64(6, F_GETFD)         = 0
[pid  2629] dup2(6, 3)                  = 3
[pid  2629] fcntl64(3, F_SETFD, 0)      = 0
[pid  2629] close(6)                    = 0
[pid  2629] write(3, "\25\3\1\0 \245\321\266\322f1A\2524\352\22\253\377\347b\315\346\271\233\303\261HB\'\4$M"..., 37) = -1 EPIPE (Broken pipe)
[pid  2629] --- SIGPIPE (Broken pipe) @ 0 (0) ---
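
The failing call sequence can be replayed outside of su. The following is only a minimal sketch, assuming Linux socket semantics and Python 3; the function names `replay_fd_shuffle` and `child_exit_status` are made up for this illustration, and none of it is taken from su or nss_ldap. It repeats the dup / socket / close / dup2 / close calls seen in the trace, then writes to the resulting never-connected TCP socket.

```python
import errno
import os
import signal
import socket
import subprocess
import sys
import textwrap


def replay_fd_shuffle():
    """Replay the traced fd sequence and return the errno of the final write."""
    # Stand-in for the inherited descriptor (fd 3 in the trace). In the trace
    # it was a connected LDAP socket; here it is unconnected, which does not
    # matter because it gets replaced before the write.
    target = socket.socket(socket.AF_INET, socket.SOCK_STREAM).detach()
    saved = os.dup(target)        # dup(3) = 5
    fresh = socket.socket(socket.AF_INET, socket.SOCK_STREAM).detach()  # socket() = 6
    os.close(target)              # close(3)
    os.dup2(fresh, target)        # dup2(6, 3)
    os.close(fresh)               # close(6)
    try:
        # write(3, ..., 37) on a TCP socket that was never connected.
        os.write(target, b"x" * 37)
        return 0
    except OSError as exc:
        # Python ignores SIGPIPE by default, so the error surfaces as errno.
        return exc.errno
    finally:
        os.close(target)
        os.close(saved)


def child_exit_status():
    """Run the same kind of write in a child with the default SIGPIPE action."""
    code = textwrap.dedent("""
        import os, signal, socket
        signal.signal(signal.SIGPIPE, signal.SIG_DFL)
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        os.write(s.fileno(), b"x")
    """)
    return subprocess.run([sys.executable, "-c", code]).returncode


if __name__ == "__main__":
    print(errno.errorcode.get(replay_fd_shuffle()))
    print(child_exit_status())
```

On Linux the first function reports EPIPE, and the child in the second is terminated by SIGPIPE because nothing ignores or handles the signal, which matches the last two lines of the trace.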

Comparing to a similar configuration that works, the socket is apparently supposed to be connected to the nscd UNIX socket:

...
clone(Process 2663 attached
child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb7efe918) = 2663
[pid  2662] rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1], NULL, 8) = 0
[pid  2662] rt_sigaction(SIGTERM, {0x6546c0, [], 0}, NULL, 8) = 0
[pid  2662] rt_sigprocmask(SIG_UNBLOCK, [ALRM TERM], NULL, 8) = 0
[pid  2662] waitpid(-1, Process 2662 suspended
 <unfinished ...>
[pid  2663] open("/proc/sys/kernel/ngroups_max", O_RDONLY) = 4
[pid  2663] read(4, "65536\n", 31)      = 6
[pid  2663] close(4)                    = 0
[pid  2663] socket(PF_FILE, SOCK_STREAM, 0) = 4
[pid  2663] fcntl64(4, F_GETFL)         = 0x2 (flags O_RDWR)
[pid  2663] fcntl64(4, F_SETFL, O_RDWR|O_NONBLOCK) = 0
[pid  2663] connect(4, {sa_family=AF_FILE, path="/var/run/nscd/socket"}, 110) = 0
[pid  2663] poll([{fd=4, events=POLLOUT|POLLERR|POLLHUP, revents=POLLOUT}], 1, 5000) = 1
[pid  2663] send(4, "\2\0\0\0\f\0\0\0\6\0\0\0group\0", 18, MSG_NOSIGNAL) = 18
[pid  2663] poll([{fd=4, events=POLLIN|POLLERR|POLLHUP, revents=POLLIN|POLLHUP}], 1, 5000) = 1
...
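
The payloads in the two traces can be decoded for comparison. This is a rough sketch, not a confirmed diagnosis: the helper names are hypothetical, the nscd field layout (three native 32-bit integers: version, request type, key length) is assumed from glibc's nscd client header, and reading the failing write as a TLS record header (1-byte content type, 2-byte version, 2-byte big-endian length) is my interpretation of the bytes. Nothing in this report confirms that TLS was in use.

```python
import struct

# First five bytes of the failing write(), copied from the trace
# (strace prints them as octal escapes; the fifth byte is a space, 0x20).
FAILED_WRITE_PREFIX = b"\25\3\1\0 "

# Complete payload of the working send() to /var/run/nscd/socket.
NSCD_SEND = b"\2\0\0\0\f\0\0\0\6\0\0\0group\0"


def parse_tls_record_header(data):
    """Interpret 5 bytes as a TLS record header (assumed layout)."""
    content_type, major, minor, length = struct.unpack(">BBBH", data[:5])
    return content_type, (major, minor), length


def parse_nscd_request(data):
    """Interpret the payload as an nscd request (assumed layout, little-endian i386)."""
    version, request_type, key_len = struct.unpack("<iii", data[:12])
    key = data[12:12 + key_len]
    return version, request_type, key


if __name__ == "__main__":
    print(parse_tls_record_header(FAILED_WRITE_PREFIX))
    print(parse_nscd_request(NSCD_SEND))
```

Under these assumptions the failing write parses as content type 21 (alert in TLS), version 3.1, and a 32-byte body, and 5 + 32 equals the 37 bytes the trace shows being written. The working send parses as version 2, request type 12, and the 6-byte key "group\0", for 18 bytes in total. The two payloads therefore do not look like the same kind of message, which may be worth checking against a full strace.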

------------------------------------------------------------------------

When I start nscd on the failing system, su works as expected.

[root@server110 ~]# /etc/rc.d/init.d/nscd start
Starting nscd:                                             [  OK  ]
[root@server110 ~]# su - superman
-bash-3.1$
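
Since the behaviour depends on whether nscd is running, it may help to record that state alongside any future strace. A small sketch, assuming the socket path shown in the working trace; the function name `nscd_socket_reachable` is hypothetical and this is not how su or glibc performs the check.

```python
import socket


def nscd_socket_reachable(path="/var/run/nscd/socket", timeout=1.0):
    """Return True if something accepts connections on the given UNIX socket."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.connect(path)
            return True
        except OSError:
            # Missing socket file, refused connection, timeout, and so on.
            return False


if __name__ == "__main__":
    print("nscd reachable:", nscd_socket_reachable())
```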

Comment 1 Bowe Strickland 2009-06-12 18:02:31 UTC
-bash-3.1$ cat /etc/redhat-release 
Red Hat Enterprise Linux Server release 5.1 (Tikanga)
-bash-3.1$ uname -r
2.6.18-53.el5xen
-bash-3.1$ rpm -qf /bin/su
coreutils-5.97-12.1.el5

Comment 2 Ondrej Vasik 2009-06-16 14:24:16 UTC
Thanks for the report. Could you please attach the full strace of the failure? It would make it easier for me to find out where the failure actually occurs.

Comment 3 Bowe Strickland 2009-06-16 16:28:58 UTC
ack... have shut down that classroom, so don't have full trace, and in new classroom w/ similar release/kernel/coreutils as comment 1, cannot reproduce :(.

We hit this consistently (every student station, every time) last week.  Must be a subtle difference in network environment...

Comment 4 Ondrej Vasik 2009-10-20 19:05:58 UTC
Cleaning up bugzillas: since it is not possible to reproduce the issue, and it was probably something strange in the network environment, I'll close this bugzilla as INSUFFICIENT_DATA.

Feel free to reopen it if you are able to reproduce it in the future. Please provide full straces in that case.

