Bug 505642 - su killed with SIGPIPE with nss_ldap and no nscd
Status: CLOSED INSUFFICIENT_DATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: coreutils
Version: 5.2
Hardware: All Linux
Priority: low
Severity: low
Target Milestone: rc
Assigned To: Ondrej Vasik
QA Contact: BaseOS QE
Reported: 2009-06-12 14:01 EDT by Bowe Strickland
Modified: 2009-10-20 15:05 EDT
Doc Type: Bug Fix
Last Closed: 2009-10-20 15:05:58 EDT
Attachments: None

Description Bowe Strickland 2009-06-12 14:01:35 EDT
When su'ing to an LDAP-defined user while nscd is not running, the child is quickly killed by SIGPIPE.

--------------------------------------------------------------------

[root@server110 ~]# su - superman
[root@server110 ~]# id superman
uid=2050(superman) gid=2050(superman) groups=2050(superman) context=root:system_r:unconfined_t:SystemLow-SystemHigh
[root@server110 ~]# getenforce 
Permissive
[root@server110 ~]# strace -f 2> /tmp/trace su - superman

----------------------------------------------------------------------------

Examining the strace, the relevant lines show the child writing to an unconnected socket (file descriptor 6 duped to 3):

clone(Process 2629 attached
child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb7fe9918) = 2629
[pid  2628] rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1],  <unfinished ...>
[pid  2629] getsockname(3,  <unfinished ...>
[pid  2628] <... rt_sigprocmask resumed> NULL, 8) = 0
[pid  2629] <... getsockname resumed> {sa_family=AF_INET, sin_port=htons(45065), sin_addr=inet_addr("192.168.0.110")}, [16]) = 0
[pid  2628] rt_sigaction(SIGTERM, {0x53d6c0, [], 0}, NULL, 8) = 0
[pid  2628] rt_sigprocmask(SIG_UNBLOCK, [ALRM TERM], NULL, 8) = 0
[pid  2628] waitpid(-1, Process 2628 suspended
 <unfinished ...>
[pid  2629] getpeername(3, {sa_family=AF_INET, sin_port=htons(389), sin_addr=inet_addr("192.168.0.110")}, [16]) = 0
[pid  2629] fcntl64(3, F_GETFD)         = 0x1 (flags FD_CLOEXEC)
[pid  2629] dup(3)                      = 5
[pid  2629] fcntl64(5, F_SETFD, FD_CLOEXEC) = 0
[pid  2629] socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 6
[pid  2629] close(3)                    = 0
[pid  2629] fcntl64(6, F_GETFD)         = 0
[pid  2629] dup2(6, 3)                  = 3
[pid  2629] fcntl64(3, F_SETFD, 0)      = 0
[pid  2629] close(6)                    = 0
[pid  2629] write(3, "\25\3\1\0 \245\321\266\322f1A\2524\352\22\253\377\347b\315\346\271\233\303\261HB\'\4$M"..., 37) = -1 EPIPE (Broken pipe)
[pid  2629] --- SIGPIPE (Broken pipe) @ 0 (0) ---
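
The last two trace lines can be reproduced in isolation. The following is a minimal sketch, not code from su or nss_ldap. It assumes Linux, where a write() to a TCP socket that was never connected fails with EPIPE and raises SIGPIPE, which is the state fd 3 is in after the dup2() above. The function name write_to_unconnected_socket is made up for illustration.

```python
import os
import signal
import socket


def write_to_unconnected_socket():
    """Fork a child that write()s to a fresh, never-connected TCP socket.

    Returns the number of the signal that killed the child, or None if the
    child exited on its own.
    """
    pid = os.fork()
    if pid == 0:
        # Child. Python ignores SIGPIPE at startup, so restore the default
        # action to behave like an ordinary C program.
        try:
            signal.signal(signal.SIGPIPE, signal.SIG_DFL)
            # Same state as fd 3 in the trace: socket() was called,
            # connect() never was.
            sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            os.write(sock.fileno(), b"x" * 37)  # 37 bytes, as in the trace
        finally:
            os._exit(1)  # only reached if the write did not kill the child
    _, status = os.waitpid(pid, 0)
    return os.WTERMSIG(status) if os.WIFSIGNALED(status) else None


if __name__ == "__main__":
    killed_by = write_to_unconnected_socket()
    print("child killed by SIGPIPE:", killed_by == signal.SIGPIPE)
```

The child is forked so that the fatal signal is observed from the parent through waitpid(), the same way su's parent process would see it.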

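The first bytes of the failing write() are consistent with a TLS record header, which would suggest the child was sending a TLS alert on what used to be the LDAP connection when it was killed. That reading is an assumption: the report does not say whether LDAP over TLS was configured. A small sketch that decodes the five leading bytes exactly as strace printed them (the variable names are illustrative):

```python
import struct

# Leading bytes of the failing write(), copied from the strace octal escapes.
header = b"\25\3\1\0 "

# TLS record header layout: 1-byte content type, 2-byte version, 2-byte length.
content_type, major, minor, length = struct.unpack("!BBBH", header)

print(content_type)          # 21 is the TLS "alert" content type
print((major, minor))        # (3, 1) is the wire version of TLS 1.0
print(len(header) + length)  # header size plus declared payload length
```

The declared payload length plus the 5-byte header comes to 37, which matches the byte count passed to write() in the trace.
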
Compared with a similar configuration that works, the socket is apparently supposed to be connected to the nscd UNIX socket:

...
clone(Process 2663 attached
child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0xb7efe918) = 2663
[pid  2662] rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1], NULL, 8) = 0
[pid  2662] rt_sigaction(SIGTERM, {0x6546c0, [], 0}, NULL, 8) = 0
[pid  2662] rt_sigprocmask(SIG_UNBLOCK, [ALRM TERM], NULL, 8) = 0
[pid  2662] waitpid(-1, Process 2662 suspended
 <unfinished ...>
[pid  2663] open("/proc/sys/kernel/ngroups_max", O_RDONLY) = 4
[pid  2663] read(4, "65536\n", 31)      = 6
[pid  2663] close(4)                    = 0
[pid  2663] socket(PF_FILE, SOCK_STREAM, 0) = 4
[pid  2663] fcntl64(4, F_GETFL)         = 0x2 (flags O_RDWR)
[pid  2663] fcntl64(4, F_SETFL, O_RDWR|O_NONBLOCK) = 0
[pid  2663] connect(4, {sa_family=AF_FILE, path="/var/run/nscd/socket"}, 110) = 0
[pid  2663] poll([{fd=4, events=POLLOUT|POLLERR|POLLHUP, revents=POLLOUT}], 1, 5000) = 1
[pid  2663] send(4, "\2\0\0\0\f\0\0\0\6\0\0\0group\0", 18, MSG_NOSIGNAL) = 18
[pid  2663] poll([{fd=4, events=POLLIN|POLLERR|POLLHUP, revents=POLLIN|POLLHUP}], 1, 5000) = 1
...
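
One difference between the two traces is that the nscd request is sent with send(..., MSG_NOSIGNAL), while the failing write() carries no such flag. The sketch below shows what that flag changes, assuming Linux: with MSG_NOSIGNAL, a send on a never-connected TCP socket fails with EPIPE instead of delivering a fatal SIGPIPE. It illustrates the flag only and is not a fix taken from this bug; send_with_nosignal is a hypothetical name.

```python
import errno
import os
import signal
import socket


def send_with_nosignal():
    """Fork a child that send()s with MSG_NOSIGNAL on a never-connected socket.

    Returns "signal" if the child was killed by a signal, "epipe" if send()
    failed with EPIPE, and "other" for any other outcome.
    """
    pid = os.fork()
    if pid == 0:
        code = 2  # reported as "other" unless EPIPE is seen
        try:
            # Default SIGPIPE action, as in a C program.
            signal.signal(signal.SIGPIPE, signal.SIG_DFL)
            sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            sock.send(b"x" * 37, socket.MSG_NOSIGNAL)
        except OSError as exc:
            if exc.errno == errno.EPIPE:
                code = 0
        finally:
            os._exit(code)
    _, status = os.waitpid(pid, 0)
    if os.WIFSIGNALED(status):
        return "signal"
    return "epipe" if os.WEXITSTATUS(status) == 0 else "other"


if __name__ == "__main__":
    print(send_with_nosignal())
```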

------------------------------------------------------------------------

When I start nscd on the failing system, su works as expected.

[root@server110 ~]# /etc/rc.d/init.d/nscd start
Starting nscd:                                             [  OK  ]
[root@server110 ~]# su - superman
-bash-3.1$
Comment 1 Bowe Strickland 2009-06-12 14:02:31 EDT
-bash-3.1$ cat /etc/redhat-release 
Red Hat Enterprise Linux Server release 5.1 (Tikanga)
-bash-3.1$ uname -r
2.6.18-53.el5xen
-bash-3.1$ rpm -qf /bin/su
coreutils-5.97-12.1.el5
Comment 2 Ondrej Vasik 2009-06-16 10:24:16 EDT
Thanks for the report. Could you please attach a full strace of the failure? It would make it easier for me to find out where the failure actually occurs.
Comment 3 Bowe Strickland 2009-06-16 12:28:58 EDT
ack... have shut down that classroom, so don't have full trace, and in new classroom w/ similar release/kernel/coreutils as comment 1, cannot reproduce :(.

We hit this consistently (every student station, every time) last week.  Must be a subtle difference in network environment...
Comment 4 Ondrej Vasik 2009-10-20 15:05:58 EDT
Cleaning bugzillas. As it is not possible to reproduce the issue and it was probably something strange in the network environment, I'll close this bugzilla as INSUFFICIENT_DATA.

Feel free to reopen it if you are able to reproduce it in the future. Please provide full straces in that case.
