From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows 98; Win 9x 4.90)

Description of problem:
Every 38 days and 16 hours the system locks up tight. No services are running, the console screen saver does not turn off. Only fix is to reboot.

uname -a
Linux m5.do-box.net 2.4.2-2smp #1 SMP Sun Apr 8 20:21:34 EDT 2001 i686 unknown
Running on dual-processor Micronics Pentium Pro 200

A friend is experiencing the same crash...

uname -a
Linux Linux233 2.4.2-2 #1 Sun Apr 8 19:37:14 EDT 2001 i586 unknown
Running on AMD233MMX K6

How reproducible:
Always

Steps to Reproduce:
1. Simply keep the server running for 38 days, 16 hours

Actual Results: System locked

Expected Results: Indefinite uptime ;-)

Additional info:
There is nothing reported in /var/log/messages. It only shows a gap in time between the crash and the restart.
That's utterly, utterly bizarre. 38 days 24 isn't any clock or timer we use that I know of. No idea right now.
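The remark that the interval matches no known clock or timer can be checked with some quick arithmetic. The sketch below converts the reported 38 days 16 hours to seconds and compares it with a few common counter wraparound periods. It assumes HZ = 100 ticks per second, the usual i386 timer frequency for 2.4 kernels; the report itself does not state HZ, and the helper names are made up for illustration.

```python
# Sanity check: does 38 days 16 hours line up with a simple counter overflow?
# Assumption (not stated in the report): HZ = 100 ticks per second.

HZ = 100
SECONDS_PER_DAY = 24 * 60 * 60


def uptime_seconds(days, hours):
    """Convert an uptime given in days and hours to seconds."""
    return days * SECONDS_PER_DAY + hours * 3600


def wrap_days(bits, ticks_per_second):
    """Days until a counter of the given bit width overflows."""
    return (2 ** bits) / ticks_per_second / SECONDS_PER_DAY


reported = uptime_seconds(38, 16)
print(f"reported interval: {reported} s = {reported * HZ} ticks at HZ={HZ}")

for bits in (31, 32):
    print(f"2^{bits} ticks at HZ={HZ}: {wrap_days(bits, HZ):.1f} days")
    print(f"2^{bits} milliseconds:   {wrap_days(bits, 1000):.1f} days")
```

Under these assumptions none of the listed wraparound periods falls at 38 days 16 hours, which is consistent with the comment above. This only covers the counters listed here and does not rule out other causes.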
Some additional information from my friend who experienced the same crash at the same time as I...

I am in Pompey, England. My above box locked up Sunday night (just froze, ala M$, lost all processes, gateway etc.) without any intervention. Although I do not keep a running total on uptime, netcraft reports the last record as 38.17 days.

My mate is in R.I., USA. He is also running R7.1. Yesterday, his box locked up exactly the same as mine. Uptime 38 days 16 hours - this is the second time his has crashed at 38 days (this was the first time my box had hit 38 days).

Both machines have two NICs (mine is an ADSL gateway, his cable). Neither machine has any information in the logs... just the last proper entries prior to the lock-up.

We both use the Red Hat 'up2date' programme to keep our systems current. Neither of us has had Linux (any flavour) lock up like this without one of us messing with and breaking something. Neither of us runs any untoward processes... just the usual services.

Nick Warne
nw
Here is a list of services running on my server...

Linux m5.do-box.net 2.4.2-2smp #1 SMP Sun Apr 8 20:21:34 EDT 2001 i686 unknown

amd is stopped
anacron dead but subsys locked
arpwatch is stopped
atalkd (pid 1079) is running...
atd is stopped
Configured Mount Points:
------------------------
/usr/sbin/automount --timeout 60 /misc file /etc/auto.misc
Active Mount Points:
--------------------
/usr/sbin/automount --timeout 60 /misc file /etc/auto.misc
crond (pid 6770 1002) is running...
Not starting gated: [ OK ]
gpm (pid 974) is running...
httpd (pid 7265 7261 7250 7243 6613 5648 990) is running...
identd (pid 857 855 854 853 852) is running...
Chain input (policy ACCEPT):
<SNIP>
Chain forward (policy DENY):
<SNIP>
Chain output (policy ACCEPT):
<SNIP>
ircd (pid 1133) is running...
No status available for this package
lpd is stopped
nwserv is stopped
nwbind is stopped
ncpserv is stopped
rndc: connect: connection refused
Configured devices:
lo eth0 eth1 ppp0
Devices that are down:
Devices with modified configuration:
rpc.mountd (pid 916) is running...
nfsd (pid 928 927 926 925 924 923 922 921) is running...
rpc.rquotad (pid 911) is running...
rpc.statd (pid 718) is running...
nscd is stopped
ntpd is stopped
Port Manger (portmgr) is running
Power Alert Server (paserver) is running
portmap (pid 703) is running...
The random data source exists
rhnsd (pid 1148) is running...
rpc.rstatd (pid 942) is running...
rpc.rusersd is stopped
rpc.rwalld is stopped
rwhod is stopped
sendmail (pid 961) is running...
smbd (pid 1247 1103) is running...
nmbd (pid 1108) is running...
snmpd is stopped
squid (pid 1029 1028) is running...
sshd (pid 2269 868) is running...
syslogd (pid 684) is running...
klogd (pid 689) is running...
tux is stopped
xfs (pid 1063) is running...
xinetd (pid 888) is running...
ypbind is stopped
rpc.yppasswdd is stopped
ypserv is stopped
Are both boxes running AppleTalk?
No, just mine is running atalk. I experienced the 38-day lockup before I had netatalk, ircd, mgetty and Poweralert installed and running, and before I installed the second eth interface. I will post a list of Nick's services shortly.
Could you also try to find out which modules (see lsmod) you guys have in common? That might narrow the suspects down a lot.
Here is the lsmod output on my machine...

Linux m5.do-box.net 2.4.2-2smp #1 SMP Sun Apr 8 20:21:34 EDT 2001 i686 unknown

Module                  Size  Used by
appletalk              23792  12
nfsd                   70976   8 (autoclean)
lockd                  53232   1 (autoclean) [nfsd]
sunrpc                 66352   1 (autoclean) [nfsd lockd]
autofs                 11808   1 (autoclean)
tulip                  39152   2 (autoclean)
ipchains               41632   0 (unused)
aic7xxx               136336   3
sd_mod                 11744   3
scsi_mod               98624   2 [aic7xxx sd_mod]
OK, here is my stats/conf.

Linux Linux233 2.4.2-2 #1 Sun Apr 8 19:37:14 EDT 2001 i586 unknown

lsmod
=====
Module                  Size  Used by
nfs                    76800   2 (autoclean)
lockd                  52336   1 (autoclean) [nfs]
sunrpc                 62448   1 (autoclean) [nfs lockd]
autofs                 11136   1 (autoclean)
8139too                16480   2 (autoclean)
ipchains               38944   0 (unused)
mousedev                4160   1
hid                    11808   0 (unused)
input                   3456   0 [mousedev hid]
usb-uhci               20848   0 (unused)
usbcore                49632   1 [hid usb-uhci]

mounts
======
/dev/hda2 on / type ext2 (rw)
none on /proc type proc (rw)
usbdevfs on /proc/bus/usb type usbdevfs (rw)
/dev/hda3 on /home type ext2 (rw)
none on /dev/pts type devpts (rw,gid=5,mode=620)
automount(pid802) on /misc type autofs (rw,fd=5,pgrp=802,minproto=2,maxproto=3)
486Linux:/home/httpd/ on /var/www type nfs
486Linux_2:/home/nick/hdb3/mp3 on /var/www/html/noxster type nfs

Service status
==============
anacron dead but subsys locked
apmd (pid 755) is running...
atd (pid 817) is running...
Configured Mount Points:
------------------------
/usr/sbin/automount --timeout 60 /misc file /etc/auto.misc
Active Mount Points:
--------------------
/usr/sbin/automount --timeout 60 /misc file /etc/auto.misc
crond (pid 2125 940) is running...
gpm (pid 912) is running...
httpd (pid 1839 1545 1444 1369 1039 1038 1034 1033 1032 928) is running...
identd is stopped
Chain input (policy ACCEPT):
**
Chain forward (policy DENY):
**
Chain output (policy ACCEPT):
**
ircd (pid 1018) is running...
No status available for this package
lpd (pid 867) is running...
mysqld is stopped
Active NFS mountpoints:
/var/www
/var/www/html/noxster
/var/www
/var/www/html/noxster
Configured devices:
lo eth0 eth1
Devices that are down:
Devices with modified configuration:
rpc.statd (pid 671) is running...
nscd is stopped
ntpd is stopped
portmap (pid 656) is running...
The random data source exists
rhnsd (pid 1037) is running...
rwhod is stopped
sendmail (pid 899) is running...
smbd (pid 1954 988) is running...
nmbd (pid 993) is running...
sshd (pid 1957 829) is running...
syslogd (pid 637) is running...
klogd (pid 642) is running...
tux is stopped
xfs (pid 976) is running...
xinetd (pid 849) is running...
ypbind is stopped
==================================================

That's it really. Everything just appears to run normal...

Nick
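The earlier request to find which modules the two machines have in common can be answered mechanically from the two lsmod listings above. This is a minimal sketch: the module names are copied by hand from those listings, and the variable and function names are made up for illustration.

```python
# Module names copied from the two lsmod listings in this report.
m5_modules = {
    "appletalk", "nfsd", "lockd", "sunrpc", "autofs",
    "tulip", "ipchains", "aic7xxx", "sd_mod", "scsi_mod",
}
linux233_modules = {
    "nfs", "lockd", "sunrpc", "autofs", "8139too", "ipchains",
    "mousedev", "hid", "input", "usb-uhci", "usbcore",
}


def common_modules(a, b):
    """Return the module names present in both sets, sorted by name."""
    return sorted(a & b)


print(common_modules(m5_modules, linux233_modules))
# prints ['autofs', 'ipchains', 'lockd', 'sunrpc']
```

The overlap only shows which loadable modules both machines share. It says nothing about code built into the kernel image itself, so it narrows the list of suspects rather than identifying a cause.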
The 38-day crash did not recur on my machine or Nick's. We have been updating our systems as patches become available on Red Hat Network. I'm closing this bug out.

Regards,
Clarence