The ssh client in openssh 2.9.1p1-3gss crashes repeatedly when connecting
to "Remote protocol version 1.99, remote software version OpenSSH_2.9p1"
(openssh-server-2.9p1-1 package). Downgrading to openssh-2.9p1-2 appears
to make the problem go away (if it recurs, I'll let you know).
The crashes may be related to window resizes, although I'm not certain
about that. Alternatively, they may be triggered when a large burst of data
arrives all at once.
I found the problem. There's code in clientloop.c that's zeroing out an fd_set
structure using memset and assuming that the fd_set has one byte per file
descriptor rather than one *bit* per file descriptor. In fact, it should just
use FD_ZERO to zero out the fd_set. I will attach a patch (which you will of
course forward back to the maintainers of openssh :-).
Created attachment 23349
Fix memory overrun in clientloop.c
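To illustrate the byte-versus-bit point for a standard fd_set, here is a minimal
sketch in C. It is not the code from clientloop.c or from the attachment; the
helper names bitmask_bytes, clear_fdset and fdset_is_empty are hypothetical, and
it assumes a POSIX system that provides <sys/select.h>.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <sys/select.h>

/* Bytes needed to hold one bit for each descriptor in 0..nfds-1
 * (hypothetical helper, for illustration only). */
static size_t bitmask_bytes(size_t nfds)
{
    return (nfds + CHAR_BIT - 1) / CHAR_BIT;
}

/* Clear a genuine, fixed-size fd_set the portable way. FD_ZERO knows the
 * real size of the structure, so no byte count has to be guessed. */
static void clear_fdset(fd_set *set)
{
    assert(set != NULL);
    FD_ZERO(set);
}

/* Return 1 if no descriptor in 0..FD_SETSIZE-1 is set, 0 otherwise. */
static int fdset_is_empty(fd_set *set)
{
    int fd;

    for (fd = 0; fd < FD_SETSIZE; fd++) {
        if (FD_ISSET(fd, set))
            return 0;
    }
    return 1;
}
```

A memset sized by descriptor count writes CHAR_BIT times as many bytes as the
bits actually occupy. For example, 1024 descriptors need only
bitmask_bytes(1024) == 128 bytes when CHAR_BIT is 8.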
The patch I submitted last night was wrong. How was I to know that the things
openssh calls "fd_set"s aren't really "fd_set"s, but are instead arrays
of dynamic length? :-)
I'll attach a new patch.
Created attachment 23388
Another memory overrun fix
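For a dynamically sized descriptor mask of the kind described above, the clear
has to be sized by the allocated byte count rather than by the highest
descriptor number. The following is a rough sketch under assumptions, not the
contents of the attachment: the type mask_word and the fdmask_* helpers are
hypothetical, and it assumes the mask holds one bit per descriptor and is
allocated in whole words.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

typedef unsigned long mask_word;            /* hypothetical word type */
#define WORD_BITS (sizeof(mask_word) * CHAR_BIT)

/* Bytes needed for a mask covering descriptors 0..max_fd,
 * rounded up to a whole number of words. */
static size_t fdmask_bytes(int max_fd)
{
    size_t words;

    assert(max_fd >= 0);
    words = ((size_t)max_fd + 1 + WORD_BITS - 1) / WORD_BITS;
    return words * sizeof(mask_word);
}

/* Allocate a zeroed mask large enough for descriptors 0..max_fd. */
static mask_word *fdmask_alloc(int max_fd)
{
    return calloc(1, fdmask_bytes(max_fd));
}

/* Clear the mask using its allocated size in bytes. Passing max_fd itself
 * as the length would treat each descriptor as a byte and could write
 * past the end of the allocation. */
static void fdmask_clear(mask_word *mask, int max_fd)
{
    memset(mask, 0, fdmask_bytes(max_fd));
}

/* Set the bit for descriptor fd. */
static void fdmask_set(mask_word *mask, int fd)
{
    mask[(size_t)fd / WORD_BITS] |= (mask_word)1 << ((size_t)fd % WORD_BITS);
}

/* Return 1 if the bit for descriptor fd is set, 0 otherwise. */
static int fdmask_isset(const mask_word *mask, int fd)
{
    return (int)((mask[(size_t)fd / WORD_BITS] >> ((size_t)fd % WORD_BITS)) & 1);
}
```

With 32-bit or 64-bit words, a mask for descriptors 0..100 occupies 16 bytes, so
a memset of 100 bytes on it would run 84 bytes past the end of the buffer.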
I bet this is a bug introduced by gss patches.
The code I patched is clearly buggy, and my patch applies cleanly even
without the GSS-API patch applied first.
I explained specifically what the bug is, and if you read my
explanation and patch, it is clear that the code is wrong and needs to
be fixed.
I'm *sure* that if I used a version of SSH without the GSS-API patch
and without my fix to this bug, SSH would continue to crash on me.
Since I built a new version of ssh with my patch, it hasn't crashed on
me once, even though I've been running it the entire time with
Uhh, sorry for my uneducated guess. :-) One just has to wonder why this has never happened to anyone else.
gss patches, in one way or another, could have been a common term...
I can post the patch upstream and see what they think.
I suspect that other people *have* run into this; they probably just chalked it
up to flakies and restarted ssh, as I did for a long time before I finally
decided to track down the problem.
It's also possible that the version of ssh in which this bug was introduced is
not yet widely deployed.
It's also possible that the particular usage paradigm which tickles the bug is
not all that common. I frequently do port forwarding, X forwarding, agent
forwarding, etc. I suspect you need to be using a good number of file
descriptors before this bug kicks in.
The patch looks right to me (the old code cleared one byte for each FD, whereas
the fd_set is packed and needs only one bit cleared per FD); it will be
integrated into 2.9p2-7 and later. Thanks!