Bug 127902
Summary: | sshd cannot open pty for shell session | ||
---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | G.Wolfe Woodbury <redwolfe> |
Component: | openssh | Assignee: | Nalin Dahyabhai <nalin> |
Status: | CLOSED DUPLICATE | QA Contact: | |
Severity: | medium | Docs Contact: | |
Priority: | medium | ||
Version: | rawhide | CC: | barryn, geoff+fedora |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | i386 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2006-02-21 19:04:27 UTC | Type: | --- |
Description
G.Wolfe Woodbury
2004-07-15 04:31:43 UTC
rawhide update of 2004-07-15 (kernel -1.488) seems to have fixed the problem.

I'm still seeing this with kernel -1.494. However, it's actually a more general problem -- after a few hours of uptime or so, *all* processes (NOT just sshd!) are unable to obtain ptys...

I'm getting this as well (I'm also running 2.6.7-1.494). I'm surprised that the problem is not more widespread (i.e. I haven't seen anything about it on the mailing lists). I think it's a kernel bug. I don't remember the last time this was working right, so it's hard to say when the problem was introduced.

I wrote a little test program and strace'd sshd, and the attempt to open /dev/ptmx is returning -EIO. As best I can tell from the kernel source, the code that tries to get a number for the pty to use is either getting a pty id that is bigger than the max, or the function that gets the id (idr_get_new) is returning an error. Since the biggest pty number I got to was maybe 12, I think it is probably the latter. I know there have been a lot of changes in the pty allocation code recently; is there perhaps a missing free/unlock/release or something of that nature?

A patch just went into 2.6.8-rc2-mm1 for a pty leak or something like that, FWIW...

FWIW, I've seen this problem on 2.6.x kernels but not 2.4.x kernels (and I don't think I saw it on 2.5.x either). That's my best recollection as to when this showed up.

This bug at least depends on fixing bug 128154 first -- if it's not a dupe of bug 128154, in fact.

That patch is in 2.6.8-rc2-bk6, which, if the changelog is true, is what the linux-2.6.7-1.499 rpm is based on. Hopefully that update will fix it. In any case I'm pretty sure it's a dupe, since when I have the problem with ssh I also have the problem with opening up new terminal windows.

Actually, no, the patch didn't make it into mainline until 2.6.8-rc2-bk8, so it's not in 2.6.7-1.499... Right now I'm compiling RPMS of 2.6.7-1.499 + the patch.

Oops, misread the 6 as an 8.
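For anyone who wants to check their own system, a test along the lines of the one mentioned above (open /dev/ptmx and look at the error) could be sketched as follows. This is not the program from the comment, which was never posted; it is a minimal illustration, and the function name `try_open_ptmx` is made up here. It only reports whether the open succeeds and, if not, which errno came back.

```python
import errno
import os


def try_open_ptmx(path="/dev/ptmx"):
    """Try to allocate a pty master by opening `path`.

    Returns (True, None) on success, or (False, errno_name) on failure,
    where errno_name is the symbolic name such as "EIO" or "ENOENT".
    `path` is a parameter only so the failure path can be exercised
    without a broken kernel; it is not part of any real API.
    """
    try:
        fd = os.open(path, os.O_RDWR | os.O_NOCTTY)
    except OSError as exc:
        # Map the numeric errno to its name; fall back to the number.
        return False, errno.errorcode.get(exc.errno, str(exc.errno))
    os.close(fd)  # release the pty again so the probe itself leaks nothing
    return True, None


if __name__ == "__main__":
    ok, err = try_open_ptmx()
    if ok:
        print("pty master opened successfully")
    else:
        print(f"open(/dev/ptmx) failed: {err}")
```

On a system showing the problem described in this bug, the failure reported would presumably be EIO, matching what strace showed for sshd.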
I've already been running 1.494 with the patch and the thing just happened again. On the other bugs people say it happens after the system has been running for a certain amount of time (usually they reckon after a day or so). Mine did it after just 11 hours. And I did less pty allocation than ever before, since most of those 11 hours I was at work. I only had 3 terminals open, pts/0 through pts/2. Sometimes I've seen it happen after only 2 hours, although it does seem to happen more often as the system uptime increases.

Somewhere in bug 128154 there's a procedure for reproducing this bug. Hopefully I'll get a chance to set up a test system and try the reproduction procedure on that. It might be interesting to try a mainline kernel, in case this is being caused by one of Red Hat's patches (not likely IMO, but who knows).

Okay, I've thrown some printks around the tty code, and the source of the EIO is in the fast_track of init_dev in tty_io.c. Since we are trying to get a new pty, not open an existing one, something must be causing the fast track to be taken in error. The code consults the devpts driver to see if the id returned from the idr functions corresponds to an existing pty. As far as I can tell, that means either the idr functions are returning an in-use id, devpts is messed up, or memory is not being initialized properly. I'm throwing in some more printks to see if I can find anything else out.

*** This bug has been marked as a duplicate of 128154 ***

Changed to 'CLOSED' state since 'RESOLVED' has been deprecated.