Bug 839022
Summary: Fedora17 - rcp connection is failing between two fedora17 nodes.
Product: [Fedora] Fedora
Component: rsh
Version: 17
Hardware: All
OS: All
Status: CLOSED WONTFIX
Severity: high
Priority: unspecified
Reporter: IBM Bug Proxy <bugproxy>
Assignee: Michal Sekletar <msekleta>
QA Contact: Fedora Extras Quality Assurance <extras-qa>
CC: helmut.schlattl, j.faithw, jkachuck, johannbg, lnykryn, metherid, mschmidt, msekleta, notting, ovasik, plautrba, systemd-maint, vpavlin, wgomerin
Doc Type: Bug Fix
Last Closed: 2013-08-01 17:52:10 UTC

Description
IBM Bug Proxy
2012-07-10 17:30:47 UTC
Created attachment 597393: rsh client strace
Created attachment 597394: rsh client strace
Created attachment 597395: rsh server strace
Created attachment 597396: rcp client strace
Created attachment 597397: rcp server strace

I have the same problem with Fedora 17 on i386. It seems to be the server side (/usr/sbin/in.rshd) that is the problem, as

    rcp on a Fedora 17 machine to a Fedora 13 machine works fine
    rcp on a Fedora 13 machine to a Fedora 17 machine fails

I don't think it is anything to do with the nscd/socket lines in the strace output, as

    rsh localhost 'while true; do echo hi;sleep 1;done'

works but still has the nscd/socket ENOENT lines.

Towards the end of the in.rshd trace file for the rcp command:-

    8652 setuid32(1000) = 0
    8652 chdir("/home/cem") = 0
    8652 getrlimit(RLIMIT_NOFILE, {rlim_cur=4*1024, rlim_max=4*1024}) = 0
    8652 close(4095) = -1 EBADF (Bad file descriptor)
    8652 close(4094) = -1 EBADF (Bad file descriptor)
    ...
    8652 close(8) = -1 EBADF (Bad file descriptor)
    8652 close(7) = -1 EBADF (Bad file descriptor)
    8652 close(6) = 0
    8652 close(5) = -1 EBADF (Bad file descriptor)
    8652 close(4) = -1 EBADF (Bad file descriptor)
    8652 close(3) = 0
    8652 --- {si_signo=SIGHUP, si_code=SI_USER, si_pid=533, si_uid=0} (Hangup) ---
    8652 +++ killed by SIGHUP +++

But in the in.rshd trace file for the rsh command I get:-

    8625 setuid32(1000) = 0
    8625 chdir("/home/cem") = 0
    8625 getrlimit(RLIMIT_NOFILE, {rlim_cur=4*1024, rlim_max=4*1024}) = 0
    8625 close(4095) = -1 EBADF (Bad file descriptor)
    8625 close(4094) = -1 EBADF (Bad file descriptor)
    ...
    8625 close(8) = -1 EBADF (Bad file descriptor)
    8625 close(7) = 0
    8625 close(6) = -1 EBADF (Bad file descriptor)
    8625 close(5) = -1 EBADF (Bad file descriptor)
    8625 close(4) = 0
    8625 close(3) = -1 EBADF (Bad file descriptor)
    8625 execve("/bin/bash", ["bash", "-c", "while true; do echo hi;sleep 1;d"...], [/* 8 vars */]) = 0
    8625 brk(0) = 0x9571000
    8625 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7734000

So it looks like whatever in.rshd is supposed to do after the close(3) is failing.

The Fedora 13 in.rshd trace for rcp looks like:-

    5441 setuid32(500) = 0
    5441 chdir("/home/cem") = 0
    5441 getrlimit(RLIMIT_NOFILE, {rlim_cur=1024, rlim_max=1024}) = 0
    5441 close(1023) = -1 EBADF (Bad file descriptor)
    5441 close(1022) = -1 EBADF (Bad file descriptor)
    ...
    5441 close(8) = -1 EBADF (Bad file descriptor)
    5441 close(7) = -1 EBADF (Bad file descriptor)
    5441 close(6) = -1 EBADF (Bad file descriptor)
    5441 close(5) = -1 EBADF (Bad file descriptor)
    5441 close(4) = -1 EBADF (Bad file descriptor)
    5441 close(3) = 0
    5441 execve("/bin/bash", ["bash", "-c", "rcp -t /tmp/xxx"], [/* 6 vars */]) = 0
    5441 brk(0) = 0x9b98000

It seems the execl call in rshd.c is malfunctioning.
If the pam_open_session call is commented out of rshd.c then rcp works ok.
Also, if the pam_open_session call is left in, then commenting out the line

    -session optional pam_systemd.so

in /etc/pam.d/password-auth allows the rcp to work fine.

So the problem may be in the systemd pam_systemd.so module.

(In reply to comment #7)
> It seems the execl call in rshd.c is malfunctioning.
> If the pam_open_session call is commented out of rshd.c then rcp works ok.
> Also if the pam_open_session call is left in then commenting out the line
> -session optional pam_systemd.so
> in /etc/pam.d/password-auth allows the rcp to work fine.
>
> So the problem may be in the systemd pam_systemd.so module

Good catch, the pam_systemd.so module looks like the root cause of the rcp failures. I'm inspecting why...

After further inspection, rcp apparently doesn't work due to systemd intervention. The strace log on the server machine contains the following:

    ...
    1920 close(15) = -1 EBADF (Bad file descriptor)
    1920 close(14) = -1 EBADF (Bad file descriptor)
    1920 close(13) = -1 EBADF (Bad file descriptor)
    1920 close(12) = -1 EBADF (Bad file descriptor)
    1920 close(11) = -1 EBADF (Bad file descriptor)
    1920 close(10) = -1 EBADF (Bad file descriptor)
    1920 close(9) = -1 EBADF (Bad file descriptor)
    1920 close(8) = -1 EBADF (Bad file descriptor)
    1920 close(7) = -1 EBADF (Bad file descriptor)
    1920 close(6) = 0
    1920 --- SIGHUP {si_signo=SIGHUP, si_code=SI_USER, si_pid=288, si_uid=0} ---
    1920 +++ killed by SIGHUP +++
    ...

where 1920 is the main in.rshd process and 288 is the systemd-logind process. I'm really not sure why systemd-logind sends SIGHUP to in.rshd. Maybe pam_systemd opens some arbitrary file descriptors inside the in.rshd process and thinks that in.rshd exits when it closes descriptor 6? If so, this is really not a reliable way to detect that a process "exits".

Reassigning to systemd for further inspection, especially of why SIGHUP is delivered to a running in.rshd session.

Suppose that the client connects and sends "0" as the port number to be used for stderr. Then in the server, in the function doit(), the following happens:

1. doauth() is called and sets up the PAM session (pam_open_session()).
2. No forking is done (because !port).
3. File descriptors are closed.
4. The given command is exec()'d.

Notice that the pam_open_session() is not balanced by a pam_close_session(). This is incorrect usage of PAM by rshd. To fix this, rshd should always fork (even if !port). The parent process should wait for the child to exit and then call pam_close_session().

As to why it causes a problem with systemd: pam_systemd does indeed keep an open file descriptor. It's a FIFO connected to logind. Its closing is interpreted by logind as an abrupt end of the session. The SIGHUP is a consequence of that.

Reassigning to rsh to fix the use of PAM.

This package has changed ownership in the Fedora Package Database.
Reassigning to the new owner of this component.

This message is a reminder that Fedora 17 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 17. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '17'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 17's end of life.

Bug Reporter: Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 17 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version prior to Fedora 17's end of life.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.

Fedora 17 changed to end-of-life (EOL) status on 2013-07-30. Fedora 17 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version.

Thank you for reporting this bug and we are sorry it could not be fixed.
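
For reference, the fix suggested earlier in this thread (always fork, let the parent wait for the child, then close the PAM session) could look roughly like the C sketch below. This is only an illustration, not the actual rshd code or patch: open_session() and close_session() are hypothetical stand-ins for pam_open_session() and pam_close_session(), run_in_session() is an invented name, and /bin/sh is assumed in place of the user's login shell.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-ins for pam_open_session()/pam_close_session().
 * They only count calls so the open/close balance can be checked. */
static int session_open_count = 0;
static int session_close_count = 0;

static int open_session(void)  { session_open_count++;  return 0; }
static int close_session(void) { session_close_count++; return 0; }

/* Run cmd in a child process while the parent keeps the session open.
 * Returns the child's exit status, or -1 on error or abnormal exit. */
static int run_in_session(const char *cmd)
{
    assert(cmd != NULL);

    if (open_session() != 0)
        return -1;

    pid_t pid = fork();              /* always fork, even without a stderr port */
    if (pid < 0) {
        close_session();
        return -1;
    }

    if (pid == 0) {
        /* Child: this is where rshd would close inherited descriptors
         * before exec'ing the command. The parent keeps its own copies,
         * so the session is not torn down by the child closing them. */
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127);                  /* only reached if exec failed */
    }

    /* Parent: wait for the child, retrying if interrupted by a signal. */
    int status = 0;
    pid_t w;
    do {
        w = waitpid(pid, &status, 0);
    } while (w < 0 && errno == EINTR);

    close_session();                 /* balances open_session() */

    if (w < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

With this shape, a call such as run_in_session("rcp -t /tmp/xxx") would leave the session descriptor open in the parent for the whole lifetime of the command, so a session module watching that descriptor should only see it close after the child has exited.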