When running rsync between two identical machines (machine 1 is the primary, machine 2 the secondary), machine 1 hard locks without any warning or messages and requires a hard reset. The rsync command used is:

rsync -azrcpogte ssh --delete --force --stats dns1:/var/spool/mail /var/spool

These machines are IBM Netfinity 5600s with 1 GB of RAM and a PII 400. APM is disabled on the machines. The first machine carries the brunt of the load for the network, handling all authentication, web, DNS, and mail. The second machine is a backup in case server 1 has problems. On a side note, the machine recently reported a 'Too many open files' error. The last two crashes occurred within 5 minutes of the machine booting. Both are stock 6.1 machines plus errata, with no changes to the core system. No logs or errors are reported when the machine locks up. Below is the output of ps ax, ksyms, vmstat, free, df, cat /proc/cpuinfo, and uname -a. If further information is needed I will be happy to get it for you.

  PID TTY      STAT   TIME COMMAND
    1 ?        S      0:03 init [3]
    2 ?        SW     0:00 [kflushd]
    3 ?        SW     0:00 [kupdate]
    4 ?        SW     0:00 [kpiod]
    5 ?        SW     0:00 [kswapd]
    6 ?        SW<    0:00 [mdrecoveryd]
  303 ?        S      0:00 portmap
  356 ?        S      0:00 syslogd -m 0
  367 ?        S      0:00 klogd
  383 ?        S      0:00 /usr/sbin/atd
  399 ?        S      0:00 crond
  410 ?        S      0:00 /usr/sbin/radiusd -a /var/log/radacct -A services
  412 ?        S      0:00 /usr/sbin/radiusd -a /var/log/radacct -A services
  427 ?        S      0:00 inetd
  443 ?        S      0:00 named
  452 ?        S      0:00 /usr/sbin/sshd
  468 ?        SL     0:00 xntpd -A
  507 ?        S      0:00 sendmail: accepting connections on port 25
  511 ?        S      0:00 sendmail: MAA07129 mailserver0.webstakes.com.: user o
  524 ?        S      0:00 gpm -t ps/2
  540 ?        S      0:00 httpd
  547 ?        S      0:00 bash /etc/rc.d/rc3.d/S86rmserver start
  562 ?        S      0:00 initlog -q -c rmserver /usr/lib/Real/rmserver.cfg
  563 ?        S      0:00 rmserver /usr/lib/Real/rmserver.cfg
  570 ?        S      0:00 httpd
  571 ?        S      0:00 httpd
  572 ?        S      0:00 httpd
  573 ?        S      0:00 httpd
  574 ?        S      0:00 httpd
  575 ?        S      0:00 httpd
  576 ?        S      0:00 httpd
  577 ?        S      0:00 httpd
  578 ?        S      0:00 httpd
  579 ?        S      0:00 httpd
  582 ?        S      0:00 rmserver /usr/lib/Real/rmserver.cfg
  583 ?        S      0:00 xfs -droppriv -daemon -port -1
  622 tty1     S      0:00 /sbin/mingetty tty1
  623 tty2     S      0:00 /sbin/mingetty tty2
  624 tty3     S      0:00 /sbin/mingetty tty3
  625 tty4     S      0:00 /sbin/mingetty tty4
  626 tty5     S      0:00 /sbin/mingetty tty5
  627 tty6     S      0:00 /sbin/mingetty tty6
  630 ?        S      0:00 rmserver /usr/lib/Real/rmserver.cfg
  631 ?        S      0:00 rmserver /usr/lib/Real/rmserver.cfg
  632 ?        S      0:00 rmserver /usr/lib/Real/rmserver.cfg
  633 ?        S      0:00 rmserver /usr/lib/Real/rmserver.cfg
  634 ?        S      0:00 rmserver /usr/lib/Real/rmserver.cfg
  635 ?        S      0:00 rmserver /usr/lib/Real/rmserver.cfg
  636 ?        S      0:00 rmserver /usr/lib/Real/rmserver.cfg
  637 ?        S      0:00 rmserver /usr/lib/Real/rmserver.cfg
  638 ?        S      0:00 rmserver /usr/lib/Real/rmserver.cfg
  645 ?        S      0:00 /usr/sbin/sshd
  647 pts/0    S      0:00 -bash
  664 ?        S      0:00 /usr/sbin/sshd
  665 ?        S      0:00 /usr/sbin/sshd
  666 ?        S      0:00 httpd
  687 ?        S      0:00 sendmail: server mail2.lig.bellsouth.net [205.152.0.5
  705 ?        S      0:00 httpd
  716 ?        S      0:00 sendmail: server imo15.mx.aol.com [152.163.225.5] cmd
  724 ?        S      0:00 sendmail: LAA02260 mailserver5.webstakes.com.: user o
  729 ?        S      0:00 httpd
  730 ?        S      0:00 httpd
  731 ?        S      0:00 httpd
  749 ?        S      0:00 httpd
  756 ?        S      0:00 httpd
  758 ?        S      0:00 httpd
  759 ?        S      0:00 httpd
  781 ?        S      0:00 sendmail: startup with imo14.mx.aol.com
  783 ?        S      0:00 sendmail: server dial-041.floweb.com [206.30.32.236]
  784 ?        S      0:00 sendmail: TAA00784 dial-041.floweb.com [206.30.32.236
  785 pts/0    R      0:00 ps ax
  786 pts/0    R      0:00 -bash

[root@dns1 /root]# free
             total       used       free     shared    buffers     cached
Mem:        971556     962152       9404     100712     817436      37700
-/+ buffers/cache:     107016     864540
Swap:       265032          0     265032

[root@dns1 /root]# cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 5
model name      : Pentium II (Deschutes)
stepping        : 3
cpu MHz         : 400.023926
cache size      : 512 KB
fdiv_bug        : no
hlt_bug         : no
sep_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx osfxsr
bogomips        : 398.95

[root@dns1 /root]# ksyms
Address   Symbol                            Defined by
fc85a1f4  pcnet32_probe                     [pcnet32]
fc858060  st_template                       [st]
fc825560  sequencer_patches                 [aic7xxx]
fc81e304  aic7xxx_release                   [aic7xxx]
fc81e24c  aic7xxx_biosparam                 [aic7xxx]
fc81e804  aic7xxx_set_info                  [aic7xxx]
fc817d08  aic7xxx_chip_reset                [aic7xxx]
fc81e80c  aic7xxx_proc_info                 [aic7xxx]
fc80b04c  aic7xxx_setup                     [aic7xxx]
fc826380  driver_template                   [aic7xxx]
fc819408  aic7xxx_detect                    [aic7xxx]
fc8247e0  proc_scsi_aic7xxx                 [aic7xxx]
fc81c0a4  aic7xxx_abort                     [aic7xxx]
fc81d480  aic7xxx_reset                     [aic7xxx]
fc80bd4c  aic7xxx_info                      [aic7xxx]
fc81b6bc  aic7xxx_queue                     [aic7xxx]
fc80407c  DAC960_KernelIOCTL_R53011449      [DAC960]

Filesystem           1k-blocks      Used Available Use% Mounted on
/dev/rd/c0d0p6        16752177   2563972  13315878  16% /
/dev/rd/c0d0p1           54410      5925     45676  11% /boot

Module                  Size  Used by
pcnet32                 9628   1  (autoclean)
st                     24864   0  (unused)
aic7xxx               112208   0  (unused)
DAC960                 29796   3

Linux dns1 2.2.12-20 #1 Mon Sep 27 10:40:35 EDT 1999 i686 unknown

   procs                      memory    swap          io     system         cpu
 r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
 0  0  0      0   6360 817436  39324   0   0   744   164  164   138   3   2  95
The problem is possibly a memory leak. If rsync is not run, the machine uses far less RAM than it does when an rsync session is run. It also seems that once rsync has run, memory stays in use as though rsync were still running. Perhaps it is an interaction between rsync and sshd 1.2.27+RSA. If further testing is needed, please let me know.
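As a rough cross-check on the memory-leak theory, the arithmetic behind the "-/+ buffers/cache" line of the free output above can be reproduced as follows. This is only a sketch using figures copied from that output. The variable names are illustrative, and a single snapshot cannot confirm or rule out a leak.

```shell
# Values in kB, copied from the "Mem:" line of the free output in the report.
used=962152
free_kb=9404
buffers=817436
cached=37700

# Memory held by processes: "used" minus buffers and page cache.
app_used=$((used - buffers - cached))

# Memory the kernel can hand back: free plus buffers and page cache.
reclaimable=$((free_kb + buffers + cached))

echo "used by processes:        ${app_used} kB"
echo "free plus buffers/cache:  ${reclaimable} kB"
```

In that snapshot most of the "used" memory is buffers and page cache rather than process memory, so comparing the "-/+ buffers/cache" figures before and after an rsync run is probably more telling than the Mem: line.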
If you kill klogd and run dmesg -n8 (to turn console logging up), are there any more messages before the crash? Roughly how many files should rsync be transferring?
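The two requests above could be carried out roughly as follows. This is a minimal sketch: count_files is a hypothetical helper, /var/spool/mail is taken from the rsync command in the report, and the klogd and dmesg steps are left as comments because they need root on the affected machine (killall being available there is an assumption).

```shell
# Hypothetical helper: count regular files under a directory, which is
# roughly the number of files rsync has to examine for that tree.
count_files() {
    find "$1" -type f | wc -l | tr -d ' '
}

# On the machine that locks up, as root (not executed by this sketch):
#   killall klogd                 # stop klogd so kernel messages stay on the console
#   dmesg -n8                     # raise the console log level so all messages are shown
#   count_files /var/spool/mail   # approximate number of files in the synced tree
```

Running the count on both machines gives a quick idea of how much work each rsync pass involves.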
Does this still happen with the rsync currently in rawhide (2.4.3-1)?
The rsync version (2.4.1) is broken. See http://rsync.samba.org/rsync/index.html