Description of Problem: After upgrading from kernel-2.4.9-21 to kernel-2.4.9-31,
ssh returns wrong exit code under some circumstances (seriously munging our
Version-Release number of selected component (if applicable): 2.9p2-12
"ssh localhost rpm -qa </dev/null; echo $?"
should report exit code 0, but on most upgraded systems it reports exit code 255.
Removing "ssh localhost" gives exit code 0, which is correct.
Adding "strace" before either "ssh" or "rpm" appears to show that the ssh/sshd
system is changing the 0 to -1 (which the shell reports as 255). However,
"ssh localhost exit 0" works correctly.
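Since the failure shows up only on some runs, it can help to repeat the command and count the bad exit codes. The helper below is only a sketch; the name count_failures and the run count are made up for illustration and are not part of ssh or of the original report.

```shell
# count_failures N CMD...: run CMD N times with stdin from /dev/null
# and print how many of the runs ended with a non-zero exit status.
# (Hypothetical helper, for illustration only.)
count_failures() {
    runs=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        "$@" </dev/null >/dev/null 2>&1 || failures=$((failures + 1))
        i=$((i + 1))
    done
    echo "$failures"
}

# Usage on a machine under test (not run here):
# count_failures 20 ssh localhost rpm -qa
```

A result of 0 would suggest the machine is not affected, though with a race a small number of runs may not be conclusive.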
The problem is a race in sshd which was exposed by small changes in the kernel.
Here's a simpler test case:
ssh localhost perl -e
Because the fd's are closed before the process exits, the cleanup code in
session.c is used instead of the cleanup code in serverloop.c. The session.c
cleanup code ignores the process exit code, contrary to spec which says that
sshd waits for process exit AND all fd's closed, then passes the exit code to
The effect of the current code is to wait for process exit OR all fd's closed,
which results in the observed race as well as other incorrect behavior.
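The ordering described above (descriptors closed first, process exit later) can be imitated locally without ssh. The sketch below is an assumption-laden stand-in, not the original test case: late_exit is a made-up name, and the one-second delay is arbitrary. It only shows that a parent which waits for process exit still gets the real status even though the child closed its descriptors earlier.

```shell
# late_exit STATUS: start a child that closes stdin, stdout and stderr,
# lingers for a second, and only then exits with STATUS.
# (Hypothetical stand-in for a command that closes its fds before exiting.)
late_exit() {
    sh -c 'exec 0<&- 1>&- 2>&-; sleep 1; exit "$1"' sh "$1"
}

# The shell waits for the child to exit, so the status survives
# even though the descriptors were closed first.
status=0
late_exit 7 || status=$?
echo "status: $status"   # prints "status: 7"
```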
Until this is fixed, the following workaround seems to work in most cases.
Instead of this:
ssh localhost foo
use this:
ssh localhost 'foo; exit $?'
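For scripts that call ssh in many places, the suffix can be added in one spot. This is a minimal sketch; remote_cmd is a made-up helper name, and it assumes the command is passed as a single, already quoted string.

```shell
# remote_cmd CMD: print CMD with the "; exit $?" suffix from the workaround.
# (Hypothetical helper, for illustration only.)
remote_cmd() {
    printf '%s; exit $?\n' "$1"
}

# Usage (not run here):
# ssh localhost "$(remote_cmd 'rpm -qa')" </dev/null; echo $?
```

The single quotes in the printf format keep $? from being expanded locally, so it is evaluated by the remote shell as in the workaround.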
The bug does not exist in openssh-3.1p1-2, which was released today to fix a serious security bug.