This bug may carry over into Red Hat 7.0, I don't know.

Quick summary:

If you telnet to a Red Hat 6.2 box that is NOT running X-Windows, and then use the "su" command (to become root, or anyone for that matter), there will be a several-second long delay (maybe 30 seconds...not sure) after entering the password. There will be another long delay after typing "exit" to get back to the original user shell.

Detailed info:

When you first login (via telnet), this error message appears on /var/log/messages:

    Nov 14 14:20:51 archer pam_console[11662]: can't find device or X11 socket to examine for 4

Now, when I try to "su", there is a very long delay of several seconds. There is another long delay when I log out. I did

    strace su eic

(where eic is a username on my computer)

And here is where it froze up:

    [...some stuff...]
    [ Up to this point, there is NO "open" call which returns the integer 4. I think these calls are being made by PAM...]
    open("/etc/passwd", O_RDONLY) = 3
    fcntl(3, F_GETFD) = 0
    fcntl(3, F_SETFD, FD_CLOEXEC) = 0
    fstat(3, {st_mode=S_IFREG|0644, st_size=1796, ...}) = 0
    old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0$
    read(3, "root:ioog0Wb1DBAq6:0:0:root:/roo"..., 4096) = 1796
    close(3) = 0
    munmap(0x40019000, 4096) = 0
    pipe([3, 4]) = 0
    fork() = 11595
    close(4) = 0
    read(3,
    [ Here there is a very long read delay. FIXME! ]
    "", 256) = 0
    --- SIGCHLD (Child exited) ---
    wait4(-1, [WIFEXITED(s) && WEXITSTATUS(s) == 1], 0, NULL) = 11595
    close(3) = 0
    close(4) = -1 EBADF (Bad file descriptor)
    [...more...]

Notice the EBADF on the close(4). I think that "4" is the same "4" from the /var/log/messages entry that pam_console reports.
Here is /etc/pam.d/su, it is whatever the default config is:

    #%PAM-1.0
    auth       required   /lib/security/pam_pwdb.so shadow nullok
    account    required   /lib/security/pam_pwdb.so
    password   required   /lib/security/pam_cracklib.so
    password   required   /lib/security/pam_pwdb.so shadow use_authtok nullok
    session    required   /lib/security/pam_pwdb.so
    session    optional   /lib/security/pam_xauth.so

Notice that the delay (i.e., blocked read :) goes away if I change the /etc/pam.d/su entry to read like this:

    #%PAM-1.0
    auth       required   /lib/security/pam_pwdb.so shadow nullok
    account    required   /lib/security/pam_pwdb.so
    password   required   /lib/security/pam_cracklib.so
    password   required   /lib/security/pam_pwdb.so shadow use_authtok nullok
    session    required   /lib/security/pam_pwdb.so
    #session   optional   /lib/security/pam_xauth.so

...i.e., comment out pam_xauth.so.

Is this problem caused by the fact that X is not running? I think so. If it is, bitch-slap whoever wrote pam_xauth.so for thinking that everybody runs X.

Cheers,
Derek Simkowiak
dsimkowiak

P.S.> Google could not find any information about this problem, except for a couple of email archives reporting the error message in /var/log/messages.
The "4" in the pam_console message refers to the terminal device, not to a file descriptor. FD 4 in the strace output is returned by the pipe() system call, which creates a one-way pipe: here FD 3 is the read end and FD 4 is the write end. This is probably the pipe being used by pam_xauth to control /usr/X11R6/bin/xauth. The part after the fork(), which runs in the child process, is the more important part for determining where the pause occurs.

I can't reproduce the large delay here on my test machine (7 to 7, or 7 to 6.2). In both cases the DISPLAY variable is inherited over the session, but the time elapsed between my entering the root password and getting a shell is still only a fraction of a second.
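The pipe()/fork()/close()/read() sequence in the strace output can be reproduced in isolation. The following is only a minimal sketch of that pattern, not pam_xauth's actual code: the function name blocked_read_demo and the sleeping child are hypothetical stand-ins for whatever the real child process does. It shows why the parent's read() cannot return until the child lets go of the write end, and why a second close() of the write end reports EBADF.

```python
import errno
import os
import time


def blocked_read_demo(child_delay):
    """Fork a child that holds the pipe's write end open for child_delay seconds.

    Hypothetical illustration only; the child here just sleeps.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: stands in for the helper process. It writes nothing,
        # stalls for a while, then exits with status 1.
        os.close(read_fd)
        time.sleep(child_delay)
        os._exit(1)

    # Parent: close our copy of the write end, then read. The read cannot
    # see end-of-file until the child's copy of the write end is closed too,
    # which here happens only when the child exits.
    os.close(write_fd)
    start = time.monotonic()
    data = os.read(read_fd, 256)
    elapsed = time.monotonic() - start

    _, status = os.waitpid(pid, 0)
    os.close(read_fd)

    # Closing the already-closed write end again fails with EBADF,
    # matching the last close() in the strace output.
    try:
        os.close(write_fd)
        second_close = "ok"
    except OSError as exc:
        second_close = errno.errorcode[exc.errno]

    return data, elapsed, os.WEXITSTATUS(status), second_close


if __name__ == "__main__":
    print(blocked_read_demo(2.0))
```

On this reading, the EBADF on close(4) is harmless: the parent had already closed its copy of the write end before the read, so the interesting question is what keeps the child busy.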
I have found the source of the problem. The delay is caused by a very slow DNS lookup.

I am telnetting in to my Red Hat 6.2 server from a Mandrake box that is behind a masquerading proxy. The $DISPLAY environment variable is being inherited, but the DNS hostname in $DISPLAY, while valid on our internal network, does not resolve from the outside world. Not only does it fail to resolve, but trying to resolve it from the outside world produces a very long delay. The DNS timeout is probably being reached.

Apparently, PAM is trying to resolve the name in the $DISPLAY variable whenever I call "su", and again when I "exit" from that su session.

A new workaround for the problem is to put whatever host is in the $DISPLAY variable into /etc/hosts (thus eliminating the delay).

To simulate my environment, try putting a non-existent DNS server in /etc/resolv.conf. For example, put

    nameserver 192.168.123.123

...as the only "nameserver" line in your /etc/resolv.conf, and then make sure that the host identified in your $DISPLAY variable is not in your /etc/hosts. This configuration will cause DNS lookups to be very slow (actually, they will eventually time out). Then you will get the very long delay when you type "su", and again when you "exit" from that su session.

I'm not sure what the proper behaviour here is. Anyone who has really slow DNS lookups could experience an annoying delay when running "su". Furthermore, this problem probably carries over into any application where pam_xauth.so is called.

Perhaps a knowledge base article with the listed workarounds is enough? Or can the DNS lookups be eliminated from pam_xauth.so altogether?

Thanks for the help.

--Derek Simkowiak
dsimkowiak
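To check whether a given login is affected, it may help to time the lookup of the host named in $DISPLAY directly. This is a rough sketch under the assumption that the delay is an ordinary hostname lookup of the part of $DISPLAY before the last colon; the helper names display_host and time_lookup are made up for this example and are not part of PAM or xauth.

```python
import os
import socket
import time


def display_host(display):
    """Return the host part of an X11 DISPLAY value, or None if it is local.

    Hypothetical helper; handles the common host:display.screen form only.
    """
    if not display or ":" not in display:
        return None
    host = display.rpartition(":")[0]
    if host in ("", "unix"):
        # ":0" and "unix:0" name a local display, so no lookup is needed.
        return None
    return host


def time_lookup(host, resolver=socket.gethostbyname):
    """Return (seconds, resolved_ok) for one lookup of host."""
    start = time.monotonic()
    try:
        resolver(host)
        ok = True
    except OSError:
        # socket.gaierror and socket.herror are both subclasses of OSError.
        ok = False
    return time.monotonic() - start, ok


if __name__ == "__main__":
    host = display_host(os.environ.get("DISPLAY"))
    if host is None:
        print("DISPLAY is unset or local; no remote lookup to time")
    else:
        seconds, ok = time_lookup(host)
        print(f"{host}: resolved={ok} after {seconds:.1f}s")
```

If the reported time is close to the delay seen with "su", adding that host to /etc/hosts should make both numbers drop.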
Using ssh will probably fix a lot of this. The built-in X11 forwarding uses an encrypted stream, and because it port-forwards from a DISPLAY set to localhost:10.0 or similar, you won't see this problem.