Description of problem:
The system seems to go into some sort of write sequence to the hard drive and starts showing a high load average at various times on Sunday. On Monday morning I have to hit the reset button to reboot the system and regain access. I cannot even log in at the console or SSH into it.

[1024010] watson1.medna.com.cpu red Sun Jun 15 12:37:16 CDT 2003 up: 6 days, 0 users, 69 procs, load=312
LOAD AVG on watson1,medna,com is 312

Jun 15 12:55:37 watson1 sendmail[1686]: rejecting connections on daemon MTA: load average: 12
Jun 15 12:55:51 watson1 sendmail[1686]: rejecting connections on daemon MTA: load average: 12
Jun 15 12:56:09 watson1 sendmail[1686]: rejecting connections on daemon MTA: load average: 12
Jun 15 12:56:28 watson1 sendmail[1686]: rejecting connections on daemon MTA: load average: 12
Jun 15 12:57:37 watson1 sendmail[1686]: rejecting connections on daemon MTA: load average: 12
Jun 15 12:57:54 watson1 sendmail[1686]: rejecting connections on daemon MTA: load average: 14
Jun 15 12:58:08 watson1 sendmail[1686]: rejecting connections on daemon MTA: load average: 15

Monitor Report:
Sun Jun 15 12:37:17 2003 red 19:17:12
Sun Jun 15 04:06:20 2003 red 0:09:37
Sun Jun 8 06:34:07 2003 red 1 day 01:20:22

Version-Release number of selected component (if applicable):

How reproducible:
Every Sunday - various times

Steps to Reproduce:
1. RH 9.0
2. up2date
3.
Snort 2.0, Syslog, Daemon, Big Brother, MRTG

Actual results:

Expected results:

Additional info:

/etc/cron.daily/date:

15 Jun 04:02:43 ntpdate[26515]: step time server 192.43.244.18 offset -1.639247 sec
04:02:45 up 5 days, 20:10, 0 users, load average: 1.57, 0.76, 0.51
65 processes: 64 sleeping, 1 running, 0 zombie, 0 stopped
CPU states: 7.0% user 3.8% system 0.0% nice 0.0% iowait 89.0% idle
Mem: 125984k av, 123820k used, 2164k free, 0k shrd, 3920k buff
     9192k actv, 2320k in_d, 1716k in_c
Swap: 514032k av, 59640k used, 454392k free 6776k cached

PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME CPU COMMAND
26517 bb 24 0 1036 1036 848 R 1.9 0.8 0:00 0 top
1 root 15 0 80 52 24 S 0.0 0.0 0:04 0 init
2 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 keventd
3 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 kapmd
4 root 34 19 0 0 0 SWN 0.0 0.0 0:00 0 ksoftirqd_CPU
9 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 bdflush
5 root 15 0 0 0 0 DW 0.0 0.0 0:07 0 kswapd
6 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 kscand/DMA
7 root 15 0 0 0 0 SW 0.0 0.0 0:03 0 kscand/Normal
8 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 kscand/HighMe
10 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 kupdated
11 root 25 0 0 0 0 SW 0.0 0.0 0:00 0 mdrecoveryd
17 root 25 0 0 0 0 SW 0.0 0.0 0:00 0 scsi_eh_0
20 root 15 0 0 0 0 SW 0.0 0.0 0:32 0 kjournald
78 root 25 0 0 0 0 SW 0.0 0.0 0:00 0 khubd
1165 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 kjournald
1166 root 15 0 0 0 0 SW 0.0 0.0 0:00 0 kjournald
1507 root 15 0 348 320 252 S 0.0 0.2 0:36 0 syslogd
1511 root 15 0 56 4 0 S 0.0 0.0 0:00 0 klogd
1548 rpcuser 25 0 76 4 0 S 0.0 0.0 0:00 0 rpc.statd
1616 root 24 0 52 4 0 S 0.0 0.0 0:00 0 apmd
1653 root 15 0 244 4 0 S 0.0 0.0 0:02 0 sshd
1667 root 23 0 152 4 0 S 0.0 0.0 0:00 0 xinetd
1686 root 15 0 804 320 152 S 0.0 0.2 0:01 0 sendmail
1695 smmsp 15 0 532 4 0 S 0.0 0.0 0:00 0 sendmail
1705 root 15 0 56 4 0 S 0.0 0.0 0:00 0 gpm
1714 root 15 0 72 4 0 S 0.0 0.0 0:00 0 crond
1806 xfs 15 0 2300 4 0 S 0.0 0.0 0:00 0 xfs
1826 daemon 15 0 60 4 0 S 0.0 0.0 0:00 0 atd
1836 root 15 0 48 4 0 S 0.0 0.0 0:00 0 rhnsd
1849 root 15 0 68 20 4 S 0.0 0.0 0:38 0 portsentry
1851 root 15 0 84 32 16 S 0.0 0.0 0:02 0 portsentry
1951 bb 16 0 232 68 20 S 0.0 0.0 1:38 0 bbd
1979 bb 15 0 244 4 0 S 0.0 0.0 0:00 0 runbb.sh
1981 bb 15 0 244 4 0 S 0.0 0.0 0:00 0 runbb.sh
1985 bb 24 0 244 4 0 S 0.0 0.0 0:00 0 runbb.sh
1986 bb 20 0 180 4 0 S 0.0 0.0 0:06 0 bbrun
2255 root 15 0 2540 80 48 S 0.0 0.0 0:13 0 httpd
2273 root 18 0 5960 4 0 S 0.0 0.0 2:21 0 mrtg
2276 root 21 0 5992 4 0 S 0.0 0.0 8:27 0 mrtg
2277 root 21 0 52 4 0 S 0.0 0.0 0:00 0 mingetty
2278 root 21 0 52 4 0 S 0.0 0.0 0:00 0 mingetty
2279 root 21 0 52 4 0 S 0.0 0.0 0:00 0 mingetty
2280 root 21 0 52 4 0 S 0.0 0.0 0:00 0 mingetty
2281 root 21 0 52 4 0 S 0.0 0.0 0:00 0 mingetty
2282 root 21 0 52 4 0 S 0.0 0.0 0:00 0 mingetty
2368 bb 19 0 176 4 0 S 0.0 0.0 0:04 0 bbrun
4784 bb 19 0 176 4 0 S 0.0 0.0 0:04 0 bbrun
26910 root 15 0 39208 2404 168 S 0.0 1.9 54:04 0 snort
23963 bb 20 0 76 4 0 S 0.0 0.0 0:00 0 sleep
25790 bb 23 0 80 4 0 S 0.0 0.0 0:00 0 sleep
26250 bb 19 0 76 4 0 S 0.0 0.0 0:00 0 sleep
26255 root 20 0 72 4 0 S 0.0 0.0 0:00 0 crond
26256 root 18 0 392 372 268 S 0.0 0.2 0:00 0 run-parts
26513 root 22 0 696 696 580 S 0.0 0.5 0:00 0 date
26514 root 21 0 476 472 380 S 0.0 0.3 0:00 0 awk
26516 root 23 0 548 548 388 S 0.0 0.4 0:00 0 su

root pts/0 ghostsrvr.medna. Fri Jun 13 08:12 - 17:20 (09:08)
root pts/0 ghostsrvr.medna. Wed Jun 11 08:44 - 17:00 (08:16)
root pts/0 ghostsrvr.medna. Tue Jun 10 07:56 - 11:25 (03:29)
root pts/0 ghostsrvr.medna. Mon Jun 9 15:59 - 16:52 (00:53)
root pts/0 ghostsrvr.medna. Mon Jun 9 08:01 - 13:57 (05:55)
reboot system boot 2.4.20-13.9 Mon Jun 9 07:53 (5+20:08)
root pts/1 ghostsrvr.medna. Fri Jun 6 09:16 - 09:30 (00:13)
root pts/0 ghostsrvr.medna. Thu Jun 5 08:33 - 13:32 (04:58)
root pts/0 ghostsrvr.medna. Wed Jun 4 08:38 - 11:37 (02:58)
root pts/0 ghostsrvr.medna. Wed Jun 4 08:32 - 08:38 (00:05)
root pts/0 ghostsrvr.medna. Mon Jun 2 08:30 - 16:32 (08:01)
reboot system boot 2.4.20-13.9 Mon Jun 2 08:27 (12+19:35)

wtmp begins Mon Jun 2 08:23:00 2003

btmp begins Thu Jun 5 09:09:38 2003

Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 *:1984 *:* LISTEN
tcp 0 0 *:1024 *:* LISTEN
tcp 0 0 watson1.medna.com:1025 *:* LISTEN
tcp 0 0 *:http *:* LISTEN
tcp 0 0 *:ssh *:* LISTEN
tcp 0 0 *:smtp *:* LISTEN
tcp 0 0 *:https *:* LISTEN
tcp 0 0 watson1.medna.com:http monitor03.medna.co:4075 TIME_WAIT
tcp 0 0 watson1.medna.com:http monitor03.medna.co:4074 TIME_WAIT
tcp 0 0 watson1.medna.com:2689 mail.medna.com:smtp TIME_WAIT
tcp 0 0 watson1.medna.com:2688 watson1.medna.com:smtp TIME_WAIT
tcp 0 0 watson1.medna.com:http monitor03.medna.co:4079 TIME_WAIT
tcp 0 0 watson1.medna.com:http monitor03.medna.co:4078 TIME_WAIT
tcp 0 0 watson1.medna.com:http monitor03.medna.co:4076 TIME_WAIT
tcp 0 0 watson1.medna.com:http monitor03.medna.co:4083 TIME_WAIT
tcp 0 0 watson1.medna.com:http monitor03.medna.co:4082 TIME_WAIT
tcp 0 0 watson1.medna.com:http monitor03.medna.co:4081 TIME_WAIT
tcp 0 0 watson1.medna.com:http monitor03.medna.co:4080 TIME_WAIT
tcp 0 0 watson1.medna.com:http monitor03.medna.co:4070 TIME_WAIT
tcp 0 0 watson1.medna.com:http monitor03.medna.co:4084 TIME_WAIT
udp 0 0 *:1024 *:*
udp 0 0 *:1025 *:*
udp 0 0 *:syslog *:*
udp 0 0 *:1027 *:*
udp 0 0 *:876 *:*
raw 0 0 *:tcp *:* 7
raw 0 0 *:udp *:* 7
Active UNIX domain sockets (servers and established)
Proto RefCnt Flags Type State I-Node Path
unix 2 [ ACC ] STREAM LISTENING 2244 /dev/gpmctl
unix 12 [ ] DGRAM 1668 /dev/log
unix 2 [ ACC ] STREAM LISTENING 2435 /tmp/.font-unix/fs7100
unix 2 [ ] DGRAM 4121232
unix 2 [ ] DGRAM 2510
unix 2 [ ] DGRAM 2495
unix 2 [ ] DGRAM 2243
unix 2 [ ] DGRAM 2210
unix 2 [ ] DGRAM 2196
unix 2 [ ] DGRAM 2144
unix 2 [ ] DGRAM 1878
unix 2 [ ] DGRAM 1730
unix 2 [ ] DGRAM 1680

Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sda6 11803992 2342608 8861760 21% /
/dev/sda5 101089 14438 81432 16% /boot
/dev/sda2 5036316 1764232 3016252 37% /old
none 62992 0 62992 0% /dev/shm

PID TTY TIME CMD
1 ? 00:00:04 init
2 ? 00:00:00 keventd
3 ? 00:00:00 kapmd
4 ? 00:00:00 ksoftirqd_CPU0
9 ? 00:00:00 bdflush
5 ? 00:00:07 kswapd
6 ? 00:00:00 kscand/DMA
7 ? 00:00:03 kscand/Normal
8 ? 00:00:00 kscand/HighMem
10 ? 00:00:00 kupdated
11 ? 00:00:00 mdrecoveryd
17 ? 00:00:00 scsi_eh_0
20 ? 00:00:32 kjournald
78 ? 00:00:00 khubd
1165 ? 00:00:00 kjournald
1166 ? 00:00:00 kjournald
1507 ? 00:00:36 syslogd
1511 ? 00:00:00 klogd
1548 ? 00:00:00 rpc.statd
1616 ? 00:00:00 apmd
1653 ? 00:00:02 sshd
1667 ? 00:00:00 xinetd
1686 ? 00:00:01 sendmail
1695 ? 00:00:00 sendmail
1705 ? 00:00:00 gpm
1714 ? 00:00:00 crond
1806 ? 00:00:00 xfs
1826 ? 00:00:00 atd
1836 ? 00:00:00 rhnsd
1849 ? 00:00:38 portsentry
1851 ? 00:00:02 portsentry
1951 ? 00:01:38 bbd
1979 ? 00:00:00 runbb.sh
1981 ? 00:00:00 runbb.sh
1985 ? 00:00:00 runbb.sh
1986 ? 00:00:06 bbrun
2255 ? 00:00:04 httpd
2273 ? 00:02:21 mrtg
2276 ? 00:08:27 mrtg
2277 tty1 00:00:00 mingetty
2278 tty2 00:00:00 mingetty
2279 tty3 00:00:00 mingetty
2280 tty4 00:00:00 mingetty
2281 tty5 00:00:00 mingetty
2282 tty6 00:00:00 mingetty
2368 ? 00:00:04 bbrun
4784 ? 00:00:04 bbrun
26910 ? 00:54:04 snort
17214 ? 00:00:01 httpd
31933 ? 00:00:01 httpd
15763 ? 00:00:01 httpd
18350 ? 00:00:01 httpd
23081 ? 00:00:01 httpd
12286 ? 00:00:01 httpd
5197 ? 00:00:00 httpd
32330 ? 00:00:00 httpd
23963 ? 00:00:00 sleep
25790 ? 00:00:00 sleep
26250 ? 00:00:00 sleep
26255 ? 00:00:00 crond
26256 ? 00:00:00 run-parts
26513 ? 00:00:00 date
26514 ? 00:00:00 awk
26547 ? 00:00:00 sendmail
26549 ? 00:00:00 ps

/etc/cron.daily/logrotate:

Null message body; hope that's ok
Null message body; hope that's ok
CRONTAB-----------------------
SHELL=/bin/bash
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=support
HOME=/

# run-parts
01 * * * * root run-parts /etc/cron.hourly
16 * * * * root run-parts /etc/cron.minute
32 * * * * root run-parts /etc/cron.minute
47 * * * * root run-parts /etc/cron.minute
02 4 * * * root run-parts /etc/cron.daily
22 4 * * 0 root run-parts /etc/cron.weekly
42 4 1 * * root run-parts /etc/cron.monthly
-----------------------------------
[root@watson1 cron.weekly]# ls -al
total 76
drwxr-xr-x 2 root root 4096 May 2 18:16 .
drwxr-xr-x 61 root root 4096 Jun 16 08:46 ..
-rwxr-xr-x 1 root root 277 Jan 24 15:26 0anacron
-rwxr-xr-x 1 root root 414 Feb 10 09:20 makewhatis.cron
-rwxr-xr-x 1 root root 455 May 21 2002 snortlogs
-rw-r--r-- 1 root root 50908 Jun 15 04:24 test
---------------------------------------------
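Since the hang is reported every Sunday, it may help to list which crontab entries are restricted to Sundays. The following is only a rough sketch: the function name `sunday_only_entries` is made up for illustration, it assumes the system crontab layout shown above (five time fields, a user, then the command), and it only recognises the numeric day-of-week values 0 and 7, not names such as "sun". It says nothing about whether such a job actually causes the load.

```python
def sunday_only_entries(crontab_text):
    """Return (hour, minute, command) for entries whose day-of-week is Sunday."""
    entries = []
    for line in crontab_text.splitlines():
        fields = line.split()
        # Skip comments, variable assignments, and anything too short to be
        # "minute hour day-of-month month day-of-week user command...".
        if len(fields) < 7 or fields[0].startswith("#") or "=" in fields[0]:
            continue
        minute, hour, _dom, _month, dow, _user = fields[:6]
        if dow in ("0", "7"):  # cron accepts both 0 and 7 for Sunday
            entries.append((hour, minute, " ".join(fields[6:])))
    return entries


crontab = """\
SHELL=/bin/bash
MAILTO=support
# run-parts
01 * * * * root run-parts /etc/cron.hourly
02 4 * * * root run-parts /etc/cron.daily
22 4 * * 0 root run-parts /etc/cron.weekly
42 4 1 * * root run-parts /etc/cron.monthly
"""
print(sunday_only_entries(crontab))
# [('4', '22', 'run-parts /etc/cron.weekly')]
```

For the crontab above, only the cron.weekly run at 04:22 is tied to Sunday; the daily and hourly jobs run every day.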
sendmail stops processing email if the load is too high. This threshold can be adjusted via a configuration parameter if the machine should still accept new email under very high load. Overall, this looks fine from a sendmail perspective.

Thanks for your bug report,

Florian La Roche
I have changed the settings in sendmail.mc, but I am still getting rejections at load averages in the teens. I don't think the settings are transferring from sendmail.mc to sendmail.cf. Yes, I ran the make command.

---------------from sendmail.mc
dnl define(`confQUEUE_LA', `20')dnl
dnl define(`confREFUSE_LA', `26')dnl
-------------------
---------------from sendmail.cf
# load average at which we just queue messages
#O QueueLA=8

# load average at which we refuse connections
#O RefuseLA=12
-----------------
Jun 30 15:44:49 watson1 sendmail[1666]: rejecting connections on daemon MTA: load average: 16
Jun 30 15:45:09 watson1 sendmail[1666]: rejecting connections on daemon MTA: load average: 17
Jun 30 15:45:23 watson1 sendmail[1666]: rejecting connections on daemon MTA: load average: 18
Jun 30 15:45:42 watson1 sendmail[1666]: rejecting connections on daemon MTA: load average: 17
Jun 30 15:46:02 watson1 sendmail[1666]: rejecting connections on daemon MTA: load average: 19
Jun 30 15:46:20 watson1 sendmail[1666]: rejecting connections on daemon MTA: load average: 18
Jun 30 15:46:38 watson1 sendmail[1666]: rejecting connections on daemon MTA: load average: 18
Jun 30 15:46:56 watson1 sendmail[1666]: rejecting connections on daemon MTA: load average: 17
Jun 30 15:47:15 watson1 sendmail[1666]: rejecting connections on daemon MTA: load average: 17
Jun 30 15:47:30 watson1 sendmail[1666]: rejecting connections on daemon MTA: load average: 18
Jun 30 15:47:50 watson1 sendmail[1666]: rejecting connections on daemon MTA: load average: 18
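One way to confirm from the logs alone which threshold is in effect is to pull the load figures out of the rejection lines and compare them with a candidate RefuseLA value. This is a minimal sketch, not a sendmail tool: the helper names are invented here, and it assumes sendmail refuses connections when the load is at or above RefuseLA.

```python
import re

# Captures the load figure from "rejecting connections" log lines.
REJECT_RE = re.compile(r"rejecting connections on daemon \S+: load average: (\d+)")


def rejected_loads(log_text):
    """Return the load averages reported in sendmail rejection lines."""
    return [int(m.group(1)) for m in REJECT_RE.finditer(log_text)]


def consistent_with_threshold(loads, refuse_la):
    """True if every logged rejection happened at or above refuse_la."""
    return all(load >= refuse_la for load in loads)


sample = """\
Jun 30 15:44:49 watson1 sendmail[1666]: rejecting connections on daemon MTA: load average: 16
Jun 30 15:46:02 watson1 sendmail[1666]: rejecting connections on daemon MTA: load average: 19
"""
loads = rejected_loads(sample)
print(loads)                                 # [16, 19]
print(consistent_with_threshold(loads, 12))  # True: matches the commented default
print(consistent_with_threshold(loads, 26))  # False: the intended value is not active
```

Rejections at 16 to 19 fit a threshold of 12 but not one of 26, which agrees with the suspicion that the new values never reached sendmail.cf.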
Oops - my bad.

---------------from sendmail.mc
define(`confQUEUE_LA', `20')dnl
define(`confREFUSE_LA', `26')dnl
-------------------

Removed the leading "dnl" so it could compile. The settings now show up in sendmail.cf.
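In m4, "dnl" discards everything up to the end of the line, so a define that follows a leading "dnl" is effectively commented out, while a trailing "dnl" only swallows the newline. The sketch below illustrates that distinction for simple one-line defines; `classify_defines` is a hypothetical helper, and the regular expression is an assumption that covers only the quoting style shown in this report, not general m4 syntax.

```python
import re

# Matches simple one-line entries such as: define(`confREFUSE_LA', `26')
DEFINE_RE = re.compile(r"define\(`(\w+)',\s*`([^']*)'\)")


def classify_defines(mc_text):
    """Split define() settings into (active, disabled-by-leading-dnl)."""
    active, disabled = {}, {}
    for line in mc_text.splitlines():
        stripped = line.strip()
        match = DEFINE_RE.search(stripped)
        if not match:
            continue
        name, value = match.groups()
        # A leading "dnl" makes m4 drop the rest of the line, define included.
        target = disabled if re.match(r"dnl\b", stripped) else active
        target[name] = value
    return active, disabled


before = "dnl define(`confQUEUE_LA', `20')dnl\ndnl define(`confREFUSE_LA', `26')dnl\n"
after = "define(`confQUEUE_LA', `20')dnl\ndefine(`confREFUSE_LA', `26')dnl\n"
print(classify_defines(before))
# ({}, {'confQUEUE_LA': '20', 'confREFUSE_LA': '26'})
print(classify_defines(after))
# ({'confQUEUE_LA': '20', 'confREFUSE_LA': '26'}, {})
```

Running a check like this against sendmail.mc before rebuilding would have flagged both load-average settings as disabled.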