I have the following problem. I am running a database server at a customer's site on an HP machine running Linux 2.0.36. Every 3 days or so the server hangs. If I try to kill it with "kill -11" to get a core and find out where it is, the server is killed but no core is produced. If I kill it the same way while it is running normally, I do get a core.

I set up a script to monitor what the server was doing and what was happening on the network, and I found that the server is hanging in the system call "sys_ip_options_" or something like this. Here is the output from "ps l" showing the state of the server after it hangs:

    100140 0 214 1 0 0 36068 33508 ip_options_ S ? 87:56 ./server

At the same time I monitored what was happening on the network with netstat. So that you can understand the output, here is the script that I used to do the monitoring:

    #!/bin/sh
    # Once a minute, log the state of the ticker and the server (PIDs 213
    # and 214) plus whatever changed in the netstat and ps output.
    while [ 1 ]
    do
        echo "--------------------" >> watchit.log
        date >> watchit.log
        netstat > netstat.log2
        ps x | grep -v "ps" | grep -v "./server -b" > ps.log2
        # Long listing without header; the WCHAN column shows where each sleeps
        ps hlp 214 213 >> watchit.log
        # Log only the differences from the previous sample
        diff netstat.log netstat.log2 >> watchit.log
        mv netstat.log2 netstat.log
        diff ps.log ps.log2 >> watchit.log
        mv ps.log2 ps.log
        sleep 60
    done

Below is the section of the log that is of interest. The server is published on TCP port 50375. Notice that the send queue on the connection to 10.249.158.177:1063 keeps growing (128, 132, 136, 140) until the server gets hung in the system call. I am not sure what the server is trying to do at this point, since I don't get a core, but it may be trying to close the connection.

Do any Linux internals experts out there have an idea of what has happened? It looks like a Linux bug to me.

    --------------------
    Sat May 22 00:45:05 MEST 1999
    100 0 213 1 0 0 1456 388 schedule S ? 0:03 ./ticker
    100140 0 214 1 0 0 36068 33508 select S ? 87:56 ./server
    24c24
    < tcp 0 128 susi.mosler.de:50375 10.249.158.177:1063 ESTABLISHED
    ---
    > tcp 0 132 susi.mosler.de:50375 10.249.158.177:1063 ESTABLISHED
    --------------------
    Sat May 22 00:46:05 MEST 1999
    100 0 213 1 0 0 1456 388 schedule S ? 0:03 ./ticker
    100140 0 214 1 1 0 36068 33508 select S ? 87:56 ./server
    24c24
    < tcp 0 132 susi.mosler.de:50375 10.249.158.177:1063 ESTABLISHED
    ---
    > tcp 0 136 susi.mosler.de:50375 10.249.158.177:1063 ESTABLISHED
    --------------------
    Sat May 22 00:47:05 MEST 1999
    100 0 213 1 0 0 1456 388 schedule S ? 0:03 ./ticker
    100140 0 214 1 1 0 36068 33508 select S ? 87:56 ./server
    24c24
    < tcp 0 136 susi.mosler.de:50375 10.249.158.177:1063 ESTABLISHED
    ---
    > tcp 0 140 susi.mosler.de:50375 10.249.158.177:1063 ESTABLISHED
    40a41
    > 3049 ? R 0:00 -bash
    --------------------
    Sat May 22 00:48:06 MEST 1999
    100 0 213 1 0 0 1456 388 schedule S ? 0:03 ./ticker
    100140 0 214 1 0 0 36068 33508 ip_options_ S ? 87:56 ./server
    7c7
    < tcp 0 0 susi.mosler.de:50375 susi.mosler.de:1111 ESTABLISHED
    ---
    > tcp 76 0 susi.mosler.de:50375 susi.mosler.de:1111 ESTABLISHED
    13c13
    < tcp 0 0 susi.mosler:netbios-ssn 10.249.158.214:1171 ESTABLISHED
    ---
    > tcp 0 4 susi.mosler:netbios-ssn 10.249.158.214:1171 ESTABLISHED
    41d40
    < 3049 ? R 0:00 -bash
    --------------------
    Sat May 22 00:49:06 MEST 1999
    100 0 213 1 0 0 1456 388 schedule S ? 0:03 ./ticker
    100140 0 214 1 0 0 36068 33508 ip_options_ S ? 87:56 ./server
    7c7
    < tcp 76 0 susi.mosler.de:50375 susi.mosler.de:1111 ESTABLISHED
    ---
    > tcp 84 0 susi.mosler.de:50375 susi.mosler.de:1111 ESTABLISHED
    13c13
    < tcp 0 4 susi.mosler:netbios-ssn 10.249.158.214:1171 ESTABLISHED
    ---
    > tcp 0 0 susi.mosler:netbios-ssn 10.249.158.214:1171 ESTABLISHED
    --------------------
    Sat May 22 00:50:06 MEST 1999
    100 0 213 1 0 0 1456 388 schedule S ? 0:03 ./ticker
    100140 0 214 1 0 0 36068 33508 ip_options_ S ? 87:56 ./server
    7c7
    < tcp 84 0 susi.mosler.de:50375 susi.mosler.de:1111 ESTABLISHED
    ---
    > tcp 92 0 susi.mosler.de:50375 susi.mosler.de:1111 ESTABLISHED
    12c12
    < tcp 0 0 susi.mosler:netbios-ssn 10.249.158.179:1074 ESTABLISHED
    ---
    > tcp 0 4 susi.mosler:netbios-ssn 10.249.158.179:1074 ESTABLISHED
    40a41
    > 3084 ? R 0:00 -bash
This is a kernel question.
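
For anyone who wants to reproduce the monitoring, the queue growth is easier to follow if the queue sizes for the server port are logged directly instead of being read out of the netstat diffs. The following is only a minimal sketch, not part of the script I actually ran: the function name queues_for_port and the log file queues.log are made up for illustration, and it assumes the netstat column order seen in the log (protocol, Recv-Q, Send-Q, local address, foreign address, state).

```shell
#!/bin/sh
# Hypothetical helper (not part of the original monitoring script).
# Reads netstat-style output on stdin and prints
#   Recv-Q Send-Q foreign-address
# for every TCP connection whose local address ends in ":PORT".
# Assumed column order: proto recv-q send-q local foreign state
queues_for_port() {
    awk -v port="$1" '
        $1 == "tcp" && $4 ~ (":" port "$") { print $2, $3, $5 }
    '
}

# Possible use inside the monitoring loop (left commented out here):
#   netstat | queues_for_port 50375 >> queues.log
```

Run once a minute, this would give one line per connection to port 50375, so a steadily growing Send-Q or Recv-Q shows up without having to compare diff hunks by hand.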