Bug 839022

Summary: Fedora17 - rcp connection is failing between two fedora17 nodes.
Product: [Fedora] Fedora Reporter: IBM Bug Proxy <bugproxy>
Component: rshAssignee: Michal Sekletar <msekleta>
Status: CLOSED WONTFIX QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high Docs Contact:
Priority: unspecified    
Version: 17CC: helmut.schlattl, j.faithw, jkachuck, johannbg, lnykryn, metherid, mschmidt, msekleta, notting, ovasik, plautrba, systemd-maint, vpavlin, wgomerin
Target Milestone: ---   
Target Release: ---   
Hardware: All   
OS: All   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2013-08-01 17:52:10 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
rsh client strace
none
rsh client strace
none
rsh server strace
none
rcp client strace
none
rcp server strace none

Description IBM Bug Proxy 2012-07-10 17:30:47 UTC
-- Problem Description --
I have 2 F17 systems c57f1ju0204.ppd.pok.ibm.com and c57f1ju0203.ppd.pok.ibm.com. Whereas we tried to copy certain files from c57f1ju0203 to c57f1ju0204.

 Steps we followed in c57f1ju0204

1. We installed rsh and rsh-server.

2. We mentioned the name the remote host "c57f1ju0203.ppd.pok.ibm.com" in /etc/hosts.equiv of c57f1ju0204.
3. We made disable equals to no in /etc/xinet.d/rsh.
4. We edited the ~/.rhosts file to update the remote client 
5. We commented the below line in /etc/pam.d/remote file 
auth       required     pam_securetty.so
6. We edited /etc/securrety by adding a line  "rsh"
7. Then we restarted xinetd service.

We verified rsh in c57f1ju0203 to c57f1ju0204, which is successful

But is we do rcp, it always says "rcp: lost connection"
[root@c57f1ju0203 ~]# rcp abc.test root@c57f1ju0204://home/
rcp: lost connection

We took a call trace of that command
execve("/bin/rcp", ["rcp", "abc.test", "c57f1ju0204://home/"], [/* 28 vars */]) = 0
brk(0)                                  = 0x1001b500000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=64318, ...}) = 0
mmap(NULL, 64318, PROT_READ, MAP_PRIVATE, 3, 0) = 0xfffb1680000
close(3)                                = 0
open("/lib64/libcrypt.so.1", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\2\1\0\0\0\0\0\0\0\0\0\0\3\0\25\0\0\0\1\0\0\0\0\0\2\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=75568, ...}) = 0
mmap(NULL, 321008, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xfffb1630000
mmap(0xfffb1640000, 131072, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0) = 0xfffb1640000
mmap(0xfffb1660000, 124400, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0xfffb1660000
close(3)                                = 0
open("/lib64/libutil.so.1", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\2\1\0\0\0\0\0\0\0\0\0\0\3\0\25\0\0\0\1\0\0\0\0\0\2\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=74224, ...}) = 0
mmap(NULL, 132216, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xfffb1600000
mmap(0xfffb1610000, 131072, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0) = 0xfffb1610000
close(3)                                = 0
open("/lib64/power6/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\2\1\0\0\0\0\0\0\0\0\0\0\3\0\25\0\0\0\1\0\0\0\0\0\36\26\20"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=2238616, ...}) = 0
mmap(NULL, 2048432, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xfffb1400000
mmap(0xfffb15d0000, 196608, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1c0000) = 0xfffb15d0000
close(3)                                = 0
open("/lib64/libfreebl3.so", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\2\1\0\0\0\0\0\0\0\0\0\0\3\0\25\0\0\0\1\0\0\0\0\0\10\0 "..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=468768, ...}) = 0
mmap(NULL, 551312, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xfffb1370000
mmap(0xfffb13e0000, 131072, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x60000) = 0xfffb13e0000
close(3)                                = 0
open("/lib64/libdl.so.2", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\2\1\0\0\0\0\0\0\0\0\0\0\3\0\25\0\0\0\1\0\0\0\0\0\2\0\0"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=74720, ...}) = 0
mmap(NULL, 132504, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xfffb1340000
mmap(0xfffb1350000, 131072, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0) = 0xfffb1350000
close(3)                                = 0
mprotect(0xfffb15d0000, 65536, PROT_READ) = 0
mprotect(0xfffb1350000, 65536, PROT_READ) = 0
mprotect(0xfffb13e0000, 65536, PROT_READ) = 0
mprotect(0xfffb1610000, 65536, PROT_READ) = 0
mprotect(0xfffb1640000, 65536, PROT_READ) = 0
mprotect(0xfffb16e0000, 65536, PROT_READ) = 0
munmap(0xfffb1680000, 64318)            = 0
brk(0)                                  = 0x1001b500000
brk(0x1001b530000)                      = 0x1001b530000
brk(0)                                  = 0x1001b530000
socket(PF_FILE, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 3
connect(3, {sa_family=AF_FILE, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
close(3)                                = 0
socket(PF_FILE, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 3
connect(3, {sa_family=AF_FILE, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
close(3)                                = 0
open("/etc/nsswitch.conf", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=1706, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xfffb1680000
read(3, "#\n# /etc/nsswitch.conf\n#\n# An ex"..., 4096) = 1706
read(3, "", 4096)                       = 0
close(3)                                = 0
munmap(0xfffb1680000, 4096)             = 0
open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=64318, ...}) = 0
mmap(NULL, 64318, PROT_READ, MAP_PRIVATE, 3, 0) = 0xfffb1680000
close(3)                                = 0
open("/lib64/libnss_files.so.2", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\2\1\0\0\0\0\0\0\0\0\0\0\3\0\25\0\0\0\1\0\0\0\0\0\2\1H"..., 832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=82632, ...}) = 0
mmap(NULL, 135024, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xfffb1310000
mmap(0xfffb1320000, 131072, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0) = 0xfffb1320000
close(3)                                = 0
mprotect(0xfffb1320000, 65536, PROT_READ) = 0
munmap(0xfffb1680000, 64318)            = 0
open("/etc/services", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=659878, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xfffb1680000
read(3, "# /etc/services:\n# $Id: services"..., 4096) = 4096
read(3, "-4        74/udp                "..., 4096) = 4096
read(3, "              # Quick Mail Trans"..., 4096) = 4096
read(3, "ps           636/udp            "..., 4096) = 4096
close(3)                                = 0
munmap(0xfffb1680000, 4096)             = 0
getuid()                                = 0
socket(PF_FILE, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 3
connect(3, {sa_family=AF_FILE, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
close(3)                                = 0
socket(PF_FILE, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 3
connect(3, {sa_family=AF_FILE, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
close(3)                                = 0
open("/etc/passwd", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=1697, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xfffb1680000
read(3, "root:x:0:0:root:/root:/bin/bash\n"..., 4096) = 1697
close(3)                                = 0
munmap(0xfffb1680000, 4096)             = 0
rt_sigaction(SIGPIPE, {0x295751b0, [PIPE], SA_RESTART}, {SIG_DFL, [], 0}, 8) = 0
getpid()                                = 25968
socket(PF_FILE, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 3
connect(3, {sa_family=AF_FILE, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
close(3)                                = 0
socket(PF_FILE, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 3
connect(3, {sa_family=AF_FILE, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
close(3)                                = 0
open("/etc/host.conf", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=9, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xfffb1680000
read(3, "multi on\n", 4096)             = 9
read(3, "", 4096)                       = 0
close(3)                                = 0
munmap(0xfffb1680000, 4096)             = 0
open("/etc/resolv.conf", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=272, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xfffb1680000
read(3, "# Generated by NetworkManager\n\n\n"..., 4096) = 272
read(3, "", 4096)                       = 0
close(3)                                = 0
munmap(0xfffb1680000, 4096)             = 0
open("/etc/hosts", O_RDONLY|O_CLOEXEC)  = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=211, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xfffb1680000
read(3, "127.0.0.1   localhost localhost."..., 4096) = 211
read(3, "", 4096)                       = 0
close(3)                                = 0
munmap(0xfffb1680000, 4096)             = 0
rt_sigprocmask(SIG_BLOCK, [URG], [], 8) = 0
socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 3
bind(3, {sa_family=AF_INET, sin_port=htons(1023), sin_addr=inet_addr("0.0.0.0")}, 16) = 0
fcntl(3, F_SETOWN, 25968)               = 0
connect(3, {sa_family=AF_INET, sin_port=htons(514), sin_addr=inet_addr("9.114.32.145")}, 16) = 0
write(3, "\0", 1)                       = 1
writev(3, [{"root\0", 5}, {"root\0", 5}, {"rcp -t //home/\0", 15}], 3) = 25
read(3, "\0", 1)                        = 1
rt_sigprocmask(SIG_SETMASK, [], [URG], 8) = 0
setsockopt(3, SOL_IP, IP_TOS, [8], 4)   = 0
read(3, "", 1)                          = 0
write(2, "rcp: lost connection\n", 21rcp: lost connection
)  = 21
exit_group(1)                           = ?
+++ exited with 1 +++


We found this file is missing
/var/run/nscd/socket

Just verified the corresponding rpm of this file
[root@c57f1ju0204 home]# yum provides /var/run/nscd/socket
nscd-2.15-37.fc17.ppc64 : A Name Service Caching Daemon (nscd).
Repo        : fedora
Matched from:
Filename    : /var/run/nscd/socket

After installing nscd also, we did not find /var/run/nscd/socket file. So because of that reason "rcp lost connection issue" arrives.

Comment 1 IBM Bug Proxy 2012-07-10 17:31:05 UTC
Created attachment 597393 [details]
rsh client strace

Comment 2 IBM Bug Proxy 2012-07-10 17:31:21 UTC
Created attachment 597394 [details]
rsh client strace

Comment 3 IBM Bug Proxy 2012-07-10 17:31:32 UTC
Created attachment 597395 [details]
rsh server strace

Comment 4 IBM Bug Proxy 2012-07-10 17:31:45 UTC
Created attachment 597396 [details]
rcp  client strace

Comment 5 IBM Bug Proxy 2012-07-10 17:31:59 UTC
Created attachment 597397 [details]
rcp server strace

Comment 6 Jeremy Faith 2012-07-13 17:05:32 UTC
I have the same problem with Fedora 17 on i386.
It seems to be the server side(/usr/sbin/in.rshd) that is the problem as
  rcp on a Fedora 17 machine to a Fedora 13 machine works fine
  rcp on a Feodra 13 machine to a Fedora 17 machine fails


I don't think it is anything to do with the nscd/socket lines in the strace output as 
  rsh localhost 'while true; do echo hi;sleep 1;done'
works but still has the nscd/socket ENOENT lines.

towards the end of the in.rshd trace file for the rcp command:-
8652  setuid32(1000)                    = 0
8652  chdir("/home/cem")                = 0
8652  getrlimit(RLIMIT_NOFILE, {rlim_cur=4*1024, rlim_max=4*1024}) = 0
8652  close(4095)                       = -1 EBADF (Bad file descriptor)
8652  close(4094)                       = -1 EBADF (Bad file descriptor)
...

8652  close(8)                          = -1 EBADF (Bad file descriptor)
8652  close(7)                          = -1 EBADF (Bad file descriptor)
8652  close(6)                          = 0
8652  close(5)                          = -1 EBADF (Bad file descriptor)
8652  close(4)                          = -1 EBADF (Bad file descriptor)
8652  close(3)                          = 0
8652  --- {si_signo=SIGHUP, si_code=SI_USER, si_pid=533, si_uid=0} (Hangup) ---
8652  +++ killed by SIGHUP +++


But in the in.rshd trace file I get:-
8625  setuid32(1000)                    = 0
8625  chdir("/home/cem")                = 0
8625  getrlimit(RLIMIT_NOFILE, {rlim_cur=4*1024, rlim_max=4*1024}) = 0
8625  close(4095)                       = -1 EBADF (Bad file descriptor)
8625  close(4094)                       = -1 EBADF (Bad file descriptor)
...
8625  close(8)                          = -1 EBADF (Bad file descriptor)
8625  close(7)                          = 0
8625  close(6)                          = -1 EBADF (Bad file descriptor)
8625  close(5)                          = -1 EBADF (Bad file descriptor)
8625  close(4)                          = 0
8625  close(3)                          = -1 EBADF (Bad file descriptor)
8625  execve("/bin/bash", ["bash", "-c", "while true; do echo hi;sleep 1;d"...], [/* 8 vars */]) = 0
8625  brk(0)                            = 0x9571000
8625  mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7734000


So it looks like whatever in.rshd is supposed to do after the close(3) it is failing.

The Fedora 13 in.rshd trace for rcp looks like:-
5441  setuid32(500)                     = 0
5441  chdir("/home/cem")                = 0
5441  getrlimit(RLIMIT_NOFILE, {rlim_cur=1024, rlim_max=1024}) = 0
5441  close(1023)                       = -1 EBADF (Bad file descriptor)
5441  close(1022)                       = -1 EBADF (Bad file descriptor)
...
5441  close(8)                          = -1 EBADF (Bad file descriptor)
5441  close(7)                          = -1 EBADF (Bad file descriptor)
5441  close(6)                          = -1 EBADF (Bad file descriptor)
5441  close(5)                          = -1 EBADF (Bad file descriptor)
5441  close(4)                          = -1 EBADF (Bad file descriptor)
5441  close(3)                          = 0
5441  execve("/bin/bash", ["bash", "-c", "rcp -t /tmp/xxx"], [/* 6 vars */]) = 0
5441  brk(0)                            = 0x9b98000

Comment 7 Jeremy Faith 2012-07-16 16:19:21 UTC
It seems the execl call in rshd.c is malfunctioning.
If the pam_open_session call is commented out of rshd.c then rcp works ok.
Also if the pam_open_session_call is left in then commenting out the line
  -session     optional      pam_systemd.so
in /etc/pam.d/password-auth allows the rcp to work fine.

So the problem may be in the systemd pam_systemd.so module

Comment 8 Adam Tkac 2012-07-23 13:13:52 UTC
(In reply to comment #7)
> It seems the execl call in rshd.c is malfunctioning.
> If the pam_open_session call is commented out of rshd.c then rcp works ok.
> Also if the pam_open_session_call is left in then commenting out the line
>   -session     optional      pam_systemd.so
> in /etc/pam.d/password-auth allows the rcp to work fine.
> 
> So the problem may be in the systemd pam_systemd.so module

Good catch, the pam_systemd.so module looks like the root cause of the rcp failures. I'm inspecting why...

Comment 9 Adam Tkac 2012-07-23 13:59:15 UTC
After further inspection rcp apparently doesn't work due to systemd intervention.

Strace log on server machine contains following:
...
1920  close(15)                         = -1 EBADF (Bad file descriptor)
1920  close(14)                         = -1 EBADF (Bad file descriptor)
1920  close(13)                         = -1 EBADF (Bad file descriptor)
1920  close(12)                         = -1 EBADF (Bad file descriptor)
1920  close(11)                         = -1 EBADF (Bad file descriptor)
1920  close(10)                         = -1 EBADF (Bad file descriptor)
1920  close(9)                          = -1 EBADF (Bad file descriptor)
1920  close(8)                          = -1 EBADF (Bad file descriptor)
1920  close(7)                          = -1 EBADF (Bad file descriptor)
1920  close(6)                          = 0
1920  --- SIGHUP {si_signo=SIGHUP, si_code=SI_USER, si_pid=288, si_uid=0} ---
1920  +++ killed by SIGHUP +++
...

where 1920 is the main in.rshd process and 288 is the systemd-logind process. I'm really not sure why systemd-logind sends SIGHUP to in.rshd, maybe because pam_systemd opens some arbitrary file descriptors inside in.rshd process and thinks that in.rshd exits when it closes descriptor 6? If yes then this is really not reliable way to detect if process "exits".

Reassigning to systemd for further inspection, especially why SIGHUP is delivered to running in.rshd session.

Comment 10 Michal Schmidt 2012-07-23 18:13:57 UTC
Suppose that the clients connects and sends "0" as the port number to be used for stderr. Then in the server in the function doit() the following happens:
1. doauth() is called and sets up the PAM session (pam_open_session())
2. No forking is done (because !port).
3. File descriptors are closed.
4. The given command is exec()'d.

Notice that the pam_open_session() is not balanced by a pam_close_session().
This is incorrect usage of PAM by rshd.

To fix this, rshd should fork always (even if !port). The parent process should wait for the child to exit and then call pam_close_session().

As to why it causes a problem with systemd: pam_systemd does indeed keep an open file descriptor. It's a FIFO connected to logind. Its closing is interpreted by logind as an abrupt end of the session. The SIGHUP is a consequence of that.

Reassigning to rsh to fix the use of PAM.

Comment 11 Fedora Admin XMLRPC Client 2013-04-02 13:25:12 UTC
This package has changed ownership in the Fedora Package Database.  Reassigning to the new owner of this component.

Comment 12 Fedora End Of Life 2013-07-04 06:20:58 UTC
This message is a reminder that Fedora 17 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 17. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora 
'version' of '17'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version prior to Fedora 17's end of life.

Bug Reporter:  Thank you for reporting this issue and we are sorry that 
we may not be able to fix it before Fedora 17 is end of life. If you 
would still like  to see this bug fixed and are able to reproduce it 
against a later version  of Fedora, you are encouraged  change the 
'version' to a later Fedora version prior to Fedora 17's end of life.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 13 Fedora End Of Life 2013-08-01 17:52:18 UTC
Fedora 17 changed to end-of-life (EOL) status on 2013-07-30. Fedora 17 is 
no longer maintained, which means that it will not receive any further 
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of 
Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.