The program below causes the standard RedHat 6.2 kernel (2.2.14-5.0) and other recent versions (2.2.14-6.1.1, 2.2.15pre19) to hang. The machine is completely dead and has to be power-cycled.

Run the program "client" below as follows:

----
$ ./client localhost 12345
----

Here, "12345" can be any port on the local host on which no server is listening. The output will be something like this:

----
Wait, server not yet up
Wait, server not yet up
Wait, server not yet up
Wait, server not yet up
Wait, server not yet up
Wait, server not yet up
----

at which point the machine hangs.

We have tried this on several Pentium Pro and Pentium III machines, all with the same effect. The number of "Wait, server not yet up" messages before the hang is not predictable, but is usually somewhere between four and seven.

Regards,
Kees Verstoep

---------- client.c --------------
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <stdlib.h>
#include <sys/time.h>
#include <unistd.h>
#include <errno.h>

int
main(int argc, char *argv[])
{
    int sock;
    struct sockaddr_in server;
    struct hostent *hp;
    unsigned short port;
    char *hostname;

    if (argc != 3) {
        fprintf(stderr, "Usage: %s <server-host> <port>\n", argv[0]);
        exit(33);
    }
    hostname = argv[1];
    port = htons(atoi(argv[2]));

    /* Create socket. */
    sock = socket(AF_INET, SOCK_STREAM, 0);
    if (sock == -1) {
        perror("opening stream socket");
        exit(1);
    }

    /* Connect socket using name specified by command line. */
    server.sin_family = AF_INET;

    /* gethostbyname returns a structure including the network address
     * of the specified host.
     */
    hp = gethostbyname(hostname);
    if (hp == (struct hostent *) 0) {
        fprintf(stderr, "%s: unknown host\n", argv[1]);
        exit(2);
    }
    memcpy((char *) &server.sin_addr, (char *) hp->h_addr, hp->h_length);
    server.sin_port = port;

    /* Retry connect() on the same socket until it succeeds. */
    while (connect(sock, (struct sockaddr *) &server, sizeof server) == -1) {
        if (errno == ECONNREFUSED) {
            fprintf(stderr, "Wait, server not yet up\n");
            sleep(1);
        } else {
            perror("connecting stream socket");
            exit(1);
        }
    }

    close(sock);
    return 0;
}
-------- client.c ----------
Has this been tested as an unprivileged user? Does the kernel actually give an OOPS or does it just lock up? -Stan Bubrouski
Yes, the kernel can be hung by a regular user this way; no oops. See also bug 11320. More people seem to have trouble with this one.
I reverted to an old 2.2.14 kernel RPM I had and tried it, and it does indeed lock up. An old 2.2.14 kernel I got from kernel.org long ago did not exhibit the behaviour, nor do the 2.3.99-pre5 or 2.3.99-pre9-2 kernels from kernel.org that I use now. Neither the program posted here nor the one in the other report has any effect at all on kernels from kernel.org, which is fine by me since those are the ones I always use anyway ;) Oh, and you should upgrade to the kernel-2.2.14-12 package if you haven't already, to get some bug fixes (including security fixes, apparently). I'd love to see where exactly this problem is ;) -Stan Bubrouski
Is there a way a user-level program can work around this bug, or do we have to force upgrades? Many people install our software on stock RedHat 6.2 installs and won't or don't install upgraded kernels. Has this problem been fixed and just not recorded as being fixed anywhere? Thank you. -Peter Keller
It was fixed in the 2.2.16 errata kernel. As a workaround, close any socket that fails to connect and allocate a new one before retrying.
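For anyone who needs the workaround in code, here is a rough sketch of what "close and allocate a new one" could look like. The helper name connect_with_retry and its max_tries and delay_sec parameters are made up for illustration and are not part of the original client.c. The original program retried forever, so the retry limit is an added assumption. This has not been verified against the affected kernels.

```c
#include <errno.h>
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of the original client.c.
 * Tries to connect to *server up to max_tries times, using a fresh
 * socket for every attempt. Returns a connected descriptor, or -1
 * with errno set.
 */
int connect_with_retry(const struct sockaddr_in *server,
                       int max_tries, unsigned int delay_sec)
{
    int attempt;

    for (attempt = 0; attempt < max_tries; attempt++) {
        int saved_errno;
        int sock = socket(AF_INET, SOCK_STREAM, 0);

        if (sock == -1)
            return -1;              /* errno was set by socket() */

        if (connect(sock, (const struct sockaddr *) server,
                    sizeof *server) == 0)
            return sock;            /* connected */

        /* Never call connect() again on the failed descriptor. */
        saved_errno = errno;
        close(sock);

        if (saved_errno != ECONNREFUSED) {
            errno = saved_errno;    /* some other error: give up */
            return -1;
        }

        fprintf(stderr, "Wait, server not yet up\n");
        if (attempt + 1 < max_tries)
            sleep(delay_sec);
    }

    /* Out of attempts (or max_tries <= 0): report connection refused. */
    errno = ECONNREFUSED;
    return -1;
}
```

In the program above, the socket() call and the while loop around connect() would then collapse into something like sock = connect_with_retry(&server, 30, 1); followed by a check for -1 (the 30 is an arbitrary choice). Independent of this kernel bug, POSIX leaves the state of a socket unspecified after a failed connect() and recommends closing it and creating a new one before reconnecting, so this pattern is the portable one anyway.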