Bug 54741
Summary: | telnet malfunction after upgrade to util-linux-2.11f-11.7.1
---|---
Product: | [Retired] Red Hat Linux
Component: | util-linux
Version: | 7.1
Hardware: | i686
OS: | Linux
Status: | CLOSED ERRATA
Severity: | medium
Priority: | medium
Reporter: | Frank Bures <fbures>
Assignee: | Elliot Lee <sopwith>
QA Contact: | Ben Levenson <benl>
CC: | a.gormanly, alfredo.maria.ferrari, brett_schwarz, dag, dwmalone, gbailey, hjl, itai.nahshon, j.k.vanamerongen, jwdeve, leob, mgb, mphillips, ncb, pirronem, ralston, rm.riches, shishz, simon1, stelian, vik.heyndrickx
Doc Type: | Bug Fix
Last Closed: | 2002-03-05 17:59:04 UTC

Description
Frank Bures
2001-10-17 16:54:35 UTC
Created attachment 34301
Another appearance - virtual terminals

What kernel version are you using? I ask because I had gotten the same error as you, but it seems to have gone away after booting into a different/newer kernel.

We're running 2.4.3-12.

I take it back. telnet access for me still seems broken. Virtual terminal login seems to be OK only for root (and that's what I had tested for earlier), not regular users.

With the following packages installed, I cannot reproduce any problems with telnet: glibc-2.2.4-19, util-linux-2.11f-11.7.1, kernel-2.4.3-12, telnet-0.17-18.1, telnet-server-0.17-18.1.

OK, upgrading to telnet-0.17-8/telnet-server-0.17-8 fixed telnet, but virtual terminal logins are still biffed for non-root (for me). Can you verify this case?

Behaves the same way for us as root and non-root. Both result in the same error message. I can't speak for the telnet issues; we have telnet disabled on our system. But the virtual terminals definitely stopped working correctly immediately after applying the util-linux-2.11f-11.7.1 patch.

We've discovered that when we use sh, everything works fine. The problem seems to occur only when we use tcsh. We've tried using the most recent (Rawhide) version of tcsh, but that showed no improvement.

I can use gnome-terminal (gnome-core-1.2.4-16) and xterm (XFree86-4.0.3-5) just fine, and thus assume that the problem isn't here. The libraries that are in use by xterm belong to the following packages: freetype-2.0.1-4, glibc-2.2.4-19, ncurses-5.2-8, XFree86-libs-4.0.3-5, utempter-0.5.2-4. This is a clean test system that has only 7.1+errata on it. Is there anyone who can reproduce the problem on a similar system and is willing to do some strace'ing to diagnose it?

What shell are you using? If you're not using tcsh, change your /etc/passwd file to use /bin/tcsh for your account, and see if you get this error message when opening a terminal __outside of X__.

This sounds like a tcsh bug. Can you try the tcsh-6.10-6 from rawhide and see if that helps?

We installed tcsh-6.10-6 on one machine, and it exhibits the same behavior as tcsh-6.10-5.

I can verify that the problem exists only with tcsh and is not fixed with tcsh-6.10-6.

tcsh problem with util-linux errata.

I can't reproduce this by running 'xterm -e tcsh -l' on a rawhide system, so it sounds like some 7.1-specific interaction between tcsh and the settings that /bin/login gives the tty.</wild-guess>

*** Bug 54747 has been marked as a duplicate of this bug. ***

I submitted this as bug#54748 under tcsh. 'xterm -e tcsh -l' works for me too. The problem occurs only when logging in to the CONSOLE (if in X press ALT-F1, ALT-F2... ALT-F6).

*** Bug 54751 has been marked as a duplicate of this bug. ***

notting says that a newer kernel (for which an errata is expected shortly) fixes this problem, in case you are anxious to implement your own solution right away.

*** Bug 54746 has been marked as a duplicate of this bug. ***

This bug appears for me in 2.4.7 also, but only on virtual terminals 2, 4 and 5 (Alt-F2, Alt-F4,...) and not on VTs 1, 3 or 6!

I spoke too soon. After logging out of all VTs, I now get the error when I log in on any VT. Switching to bash (in passwd) solves this one, but now I can't lock the console with vlock:

    $ vlock -a
    vlock: could not open /dev/tty: No such device or address

I have logged a bug against openssh which appears to be this problem as well (54770), together with a patch to login.c that backs out one of the changes between 2.10s and 2.11f and fixes the openssh problem.

*** Bug 54770 has been marked as a duplicate of this bug. ***

I have also just discovered that Ctrl-C (and possibly Ctrl-Z) don't appear to work correctly (the C key, Z key and Ctrl key all work individually) when logged into a VT other than as root. (I didn't spot this yesterday, but downgrading to util-linux-2.10s has fixed this.)

Another update (sorry :-) I have now upgraded two of my machines to kernel 2.4.9 and this has fixed my ssh issue.
I don't have telnet installed so I can't check that.

I can confirm that also on my system, after upgrading to kernel-2.4.9-6, the problem disappeared both in virtual consoles and with ftp and ssh.

I've just booted 2.4.9-6 and still see the same problems on the virtual console as in 2.4.3-12 and my homebrew 2.4.7 - this machine has all the 7.1 updates applied.

It seems that after running 'exec setsid $SHELL' on a virtual console (this returns you to the login prompt), that particular console starts behaving normally, at least for one session.

I have now upgraded my home machines. Of the three that aren't headless, two work, one doesn't. My patch to login.c (attached below) to back out a change fixes this. Note that this seems to be the same patch as util-linux-2.11f-logingrp-revert.patch. Is there something wrong with the order that the patches are applied? (I have also corrected an obvious bug in my patch. I will submit this as a separate bug if it hasn't already been submitted.)

Created attachment 34461
Backs out a change to login.c and also corrects an obvious bug

The util-linux errata broke my ability to run ssh on a 7.1 box running 2.4.13-pre5, 2.4.12-ac3 and 2.4.10-ac11. Reverting to an earlier util-linux fixed this. I am using bash as my shell.

I found that downgrading the kernel from 2.4.3-12 to 2.4.2-2 fixed the problem. I installed util-linux-2.11f-11.7.1 and kernel-2.4.3-12 at the same time.

Another observation: we have a number of RH7.1+updates boxes and they behave correctly, BUT we also have one box showing this wrong behavior that initially had RH7.0 installed on it and was later upgraded to 7.1+updates (a few other things have been changed as well, so I am not 100% sure that the 7.0->7.1 upgrade is the reason).

I've also been able to reproduce this using /usr/X11R6/bin/resize. The gist is that open("/dev/tty", O_RDWR) fails with errno = ENXIO. This seems odd since /dev/tty has permissions:

    crw-rw-rw- 1 root root 5, 0 Oct 20 23:37 /dev/tty

I've attached the strace output.

Created attachment 34521
strace of /usr/X11R6/bin/resize
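
The ENXIO result reported above is what Linux returns when a process with no controlling terminal opens /dev/tty; the permissions on the device node do not come into it. A minimal Python sketch of that behaviour, assuming a Linux system. The helper name `open_dev_tty_in_new_session` is made up for illustration and is not taken from util-linux or any attached patch:

```python
import errno
import os


def open_dev_tty_in_new_session():
    """Fork a child, detach it from any controlling terminal with setsid(),
    and return the errno the child got when opening /dev/tty (0 on success)."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: a freshly forked child is not a process-group leader,
        # so setsid() succeeds and leaves it without a controlling terminal.
        os.close(read_fd)
        code = 0
        try:
            os.setsid()
            fd = os.open("/dev/tty", os.O_RDWR)
            os.close(fd)
        except OSError as exc:
            code = exc.errno
        os.write(write_fd, str(code).encode())
        os._exit(0)
    # Parent: read the child's result and reap it.
    os.close(write_fd)
    data = os.read(read_fd, 32)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return int(data.decode())


if __name__ == "__main__":
    code = open_dev_tty_in_new_session()
    # On a typical Linux system this is expected to name ENXIO.
    print(errno.errorcode.get(code, "success"))
```

This only illustrates the symptom seen in the strace. It does not show how the affected login sessions ended up without a controlling terminal.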

RHSA-2001:129-05 (kernel-2.4.9-6 and friends) seems to have fixed things. No errors via telnet, rlogin, login via VC, xterm/gnome-terminal, or the /usr/X11R6/bin/resize case I mentioned in a previous post. Just to be explicit, I've installed the following rpms: util-linux-2.11f-11.7.1, glibc-2.2.4-19, kernel-2.4.9-6, telnet-0.17-18, telnet-server-0.17-18, rsh-0.17-2.5, rsh-server-0.17-2.5, gnome-core-1.2.4-16, XFree86-4.0.3-5, tcsh-6.10-5.

*** Bug 54846 has been marked as a duplicate of this bug. ***

*** Bug 54960 has been marked as a duplicate of this bug. ***

Created attachment 34802
A patch

I uploaded a patch which seems to work for me.

*** Bug 54748 has been marked as a duplicate of this bug. ***

HJ's patch works (although I changed it to not remove the opentty call)... There should be a util-linux-2.11f-13 in rawhide soon that just might solve the problem :)

The same malfunction occurs in RH7.2, with both the 2.4.7-10 and 2.4.9-7 kernels. util-linux-2.11f-12 does NOT solve the problem. Only a downgrade to util-linux-2.10s-13.7 from RH7.1 does.

> There should be a util-linux-2.11f-13 in rawhide soon that just might solve the problem :)
What happened to this? I can't find it and would like to fix this problem.

I don't think the whole login issue is resolved. Even with the patched login, I can log in as root at run level 3 and then do

    # telinit 1

to change to run level 1. The shell I get at that run level doesn't work at all. It complains that the shell has no job control, and I cannot type in any command. I think it may have something to do with login now calling

    ioctl(0, TIOCNOTTY, NULL);

I don't know why the change was made. If it was made for a good reason, you should modify SysVinit to handle it. One way to do it may be to call

    ioctl(0, TIOCSCTTY, (char *) 1);

before handing the tty to the shell.

Could someone please change the component to util-linux? This bug has nothing to do with tcsh.

I have a RH7.1 Alpha system (clean 7.1 install) plus up2date updates. On Oct. 29, I used up2date to capture everything except the newest kernel, and this problem showed up on the text-mode virtual consoles and my serial terminals. I have users on serial terminals, so the (alleged) lack of job control is significant. Is there a reasonable solution to this for Alpha 7.1? (Having messed up a 7.0 system some months ago by installing stuff from rawhide, I'm a bit leery of rawhide.) Since this appears to be caused by a defective update package, how about making a fixed package available to the mainstream? The following is a superset of the packages I updated on Oct. 29 that caused the problem to show up: (I now plan to do an rpm -qa and -Va before and after using up2date.)
anonftp-4.0-9, cpml_ev6-5.1.0-4, diffutils-2.7-23, e2fsprogs-1.23-1.7.1, e2fsprogs-devel-1.23-1.7.1, filesystem-2.1.0-2.1, ghostscript-5.50-19.rh7.1, glibc-2.2.4-19, glibc-common-2.2.4-19, glibc-devel-2.2.4-19, glibc-profile-2.2.4-19, gpgp-0.4-7, initscripts-5.84.1-1, jdk-1.3.1-1, libots-2.2.7-2, mkinitrd-3.2.6-1, modutils-2.4.6-4, nscd-2.2.4-19, openmotif-2.1.30-8, openmotif-devel-2.1.30-8, openssh-2.9p2-8.7, openssh-askpass-2.9p2-8.7, openssh-askpass-gnome-2.9p2-8.7, openssh-clients-2.9p2-8.7, openssh-server-2.9p2-8.7, pine-4.33-8.71, printconf-0.2.15-2, printconf-gui-0.2.15-2, squid-2.3.STABLE4-10.7.1, util-linux-2.11f-11.7.1

We're running 7.1 with a 2.4.12 kernel. One of the recent updates has caused us to see this problem, but strangely we don't see it all the time - it happens sporadically. I'm almost certain that it was an update where I updated: util-linux-2.11f-11.7.1.i386, glibc-2.2.4-19.i686, glibc-common-2.2.4-19.i386, glibc-devel-2.2.4-19.i386, glibc-profile-2.2.4-19.i386, but as the problem is intermittent it is hard to tell. We also saw the problem with a 2.4.10 kernel. Below is a script of a session where I logged in and got the "Inappropriate ioctl for device" message, then immediately logged in again (getting the same tty) and everything worked fine.

    20:40:walton 47% rsh stokes
    Last login: Tue Oct 30 20:04:36 from walton
    Warning: no access to tty (Inappropriate ioctl for device).
    Thus no job control in this shell.
    20:40:stokes 1% tty
    /dev/pts/0
    20:41:stokes 2% logout
    rlogin: connection closed.
    20:41:walton 48% rsh stokes
    Last login: Tue Oct 30 20:40:59 from walton
    20:41:stokes 1% tty
    /dev/pts/0
    20:41:stokes 2% uname -a
    Linux stokes 2.4.12 #2 SMP Mon Oct 22 20:09:41 IST 2001 i686 unknown
    20:41:stokes 3% rpm -q util-linux
    util-linux-2.11f-11.7.1
    20:41:stokes 4%

I think I know why "telinit 1" no longer works. The problem is that login now calls setsid() to create a new process group for shells.
But that kills "telinit", since init now won't send SIGTERM/SIGKILL to any shells started by login because they are in different process groups. When you do

    # telinit 1

at the console, you get 2 shells. One is your login shell and the other is the shell started by init. It is really a mess. May I ask why setsid is called by login now?

FYI, after applying http://bugzilla.redhat.com/bugzilla/showattachment.cgi?attach_id=34461 "telinit 1" works now.

The upgrade also broke my SSH logins: bash says that process control doesn't work. w no longer lists what the user does, so this upgrade apparently broke /bin/login.

Upgrading my system from util-linux-2.10s-12.i386.rpm to util-linux-2.11f-11.7.1.i386.rpm resulted in the following symptom: when logging in, no ^C is sent to a program that is run on the console (i.e. I cannot terminate a "ping" by pressing ^C, quite annoying ;-) ). Apparently the login process and all of its children including "bash" have '?' as controlling terminal in the "ps -ax" list. How reproducible: Frequently, but not always.

Hmm, I applied HJ's second patch against util-linux-2.11f-15 to try to fix the 'telinit 1' part of the problem, but it did not seem to have any effect. Is that patch meant to be applied _instead of_ my util-linux-2.11f-loginctty.patch? What a mess... :)

Don't apply my patch. It doesn't solve "telinit 1". See my previous comments.

This is getting so confusing - patches on top of patches on top of patches. I just removed all the patches related to this bug, because it was turning into too many layers of crud... So now I will try to work towards a solution from the plain util-linux-2.11f code.

Created attachment 37101
A new patch
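
The '?' that "ps -ax" shows for login and its children corresponds to a tty_nr of 0 in /proc/<pid>/stat, meaning the process has no controlling terminal. A minimal Python sketch for checking this directly on Linux; the function names `controlling_tty_nr` and `has_controlling_tty` are made up for illustration:

```python
def controlling_tty_nr(stat_line):
    """Return the tty_nr field (7th field) of a /proc/<pid>/stat line.

    A value of 0 means the process has no controlling terminal."""
    # The comm field (2nd) is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the last ')' before tokenising the rest.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0]=state, rest[1]=ppid, rest[2]=pgrp, rest[3]=session, rest[4]=tty_nr
    return int(rest[4])


def has_controlling_tty(pid="self"):
    """Read /proc/<pid>/stat and report whether tty_nr is non-zero."""
    with open(f"/proc/{pid}/stat") as stat_file:
        return controlling_tty_nr(stat_file.read()) != 0
```

Calling `has_controlling_tty()` from a shell affected by this bug would be expected to return False, matching the '?' column in ps.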

I post a new patch based on http://bugzilla.redhat.com/bugzilla/showattachment.cgi?attach_id=34461. It does fix "telinit 1". However, now I have a new problem. I cannot log in from a serial console under kernel 2.4.9-12. An older kernel, 2.4.6, works fine.

The main problem with this patch is that it appears to break the whole idea of having the parent process hanging around to handle PAM_END session cleanup. The parent _has_ to stay around and not get killed... I am going to try using your patch, but adding a signal(SIGHUP, SIG_IGN); just before the parent calls wait().

OK, I have something that fixes both the tcsh problem and the telinit 1 problem, but I don't know whether it continues to do PAM session termination 100% of the time (although so far it does). I have put the packages at http://people.redhat.com/~sopwith/2.11f-16/. Basically this merges the util-linux-2.11f-{pwent,loginpgrp-revert,loginctty,loginctty2} patches into a util-linux-2.11f-pwent2.patch, which results in greatly reduced confusion. I want to turn this package into an errata sometime soon, so if all those interested could test the package out, I would appreciate it.

This seems to fix it for me on all the servers where we had the problem. Thanks (although it took a while).

It seems to work for me, although I can't try it on the machines at work until Monday. Thanks! util-linux-2.11f-16 also includes the fix for bug 55455 I reported, so that bug can be closed when/if this is released.

Seems to work for me too - thanks!

*** Bug 56062 has been marked as a duplicate of this bug. ***

util-linux-2.11f-16 works fine. Thanks!!

The test util-linux-2.11f-16 rpm at http://people.redhat.com/~sopwith/2.11f-16/ seems to fix the problems I noticed, which were:

1. Logins on a vc ignored the intr character (^C).
2. If you logged in on a vc, and did an su to root, bash complained that the shell had no job control.
3. If one started an X session by typing "startx & vlock", shutting down the X session would automatically log one out of the vc that launched it.

If I notice any of these problems reoccur with util-linux-2.11f-16, I'll report back, but so far, everything looks good.

Created attachment 37943
Should fix this problem

*** Bug 56551 has been marked as a duplicate of this bug. ***

*** Bug 56121 has been marked as a duplicate of this bug. ***

*** Bug 55741 has been marked as a duplicate of this bug. ***

*** Bug 55181 has been marked as a duplicate of this bug. ***

In case anyone cares... After upgrading to util-linux-2.11f-11.7.1.i386.rpm, I was having similar problems with rsh'ing into a system with 'bash' as the shell: neither ctrl-c nor ctrl-z would work for me. When I did 'su -' to root, it would come up with the 'no job control in shell' message. Applying util-linux-2.11f-16.i386.rpm solved the problem nicely. Thank you.

With the apparent fix of this problem in rawhide, can anyone speak to the likelihood of this being released as an official errata? I'm holding off applying a bunch of other updates (and even upgrading to RH 7.2) because it seems like having broken CTRL-C functionality would provoke a lot of questions from users... How long does something like this typically sit in rawhide before it's pushed out as a supported fix?

*** Bug 57124 has been marked as a duplicate of this bug. ***

Applying this rpm makes the shell usable after shutdown now. I encourage an early release of this rpm as an errata.

Is the util-linux rpm that is available safe to install? When will it be released as an official errata/available through RHN?

RHBN:153-06 (util-linux-2.11f-17) was released on Friday. Enjoy...

*** Bug 55768 has been marked as a duplicate of this bug. ***

*** Bug 57405 has been marked as a duplicate of this bug. ***

*** Bug 57067 has been marked as a duplicate of this bug. ***

*** Bug 57221 has been marked as a duplicate of this bug. ***

*** Bug 57288 has been marked as a duplicate of this bug. ***

*** Bug 54865 has been marked as a duplicate of this bug. ***

Created attachment 47460
replacement for ctty and ctty2 patch 38 and 39

The above patch does not fix the telinit 1 problem... the parent login process really should catch SIGTERM and propagate it to childPid. The patch also fixes 59029.

util-linux-2.11n-8 should fix all the issues for good...

*** Bug 56463 has been marked as a duplicate of this bug. ***

An errata has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2003-369.html
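
Two ideas for the parent login process come up in this thread: ignore SIGHUP before waiting so the parent survives to do session cleanup, and catch SIGTERM and pass it on to the child. A minimal Python sketch of that pattern follows. It is not the code from any of the attached patches, and the function name `run_and_wait` is made up for illustration:

```python
import os
import signal


def run_and_wait(argv):
    """Run argv in a child; the parent ignores SIGHUP, forwards SIGTERM
    to the child, waits for it, and returns the raw wait status."""
    child_pid = None

    def forward_sigterm(signum, frame):
        # Pass SIGTERM on to the child once its pid is known.
        if child_pid:
            try:
                os.kill(child_pid, signal.SIGTERM)
            except ProcessLookupError:
                pass  # child has already exited

    # Install handlers before forking so no signal arrives unhandled.
    old_hup = signal.signal(signal.SIGHUP, signal.SIG_IGN)
    old_term = signal.signal(signal.SIGTERM, forward_sigterm)
    try:
        pid = os.fork()
        if pid == 0:
            # Child: an ignored SIGHUP would survive exec, so restore the
            # default dispositions before starting the program.
            signal.signal(signal.SIGHUP, signal.SIG_DFL)
            signal.signal(signal.SIGTERM, signal.SIG_DFL)
            try:
                os.execvp(argv[0], argv)
            finally:
                os._exit(127)  # only reached if exec failed
        child_pid = pid
        # waitpid is retried automatically after the handler has run.
        _, status = os.waitpid(pid, 0)
    finally:
        signal.signal(signal.SIGHUP, old_hup)
        signal.signal(signal.SIGTERM, old_term)
    # Session cleanup (for login, closing the PAM session) would go here.
    return status
```

Because the parent stays alive until the child has been reaped, any cleanup placed after the wait still runs when init sends SIGTERM during a run-level change.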