From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7b) Gecko/20040316

Description of problem:
Amanda runs on a regular cron job. The backup server is also a file server. The backup server will happily receive remote backup jobs from other servers; however frequently a backup will hang. If I run ps -A I will see dumper and taper processes running. These can hang for a couple of weeks until they are manually killed. This only seems to have started since I upgraded to Fedora from RH9.

Version-Release number of selected component (if applicable):
amanda-client-2.4.4p1-1

How reproducible:
Sometimes

Steps to Reproduce:
1. run amdump on local drives
2. amanda will cease to complete within normal time frame
3. killall dumper and taper

Actual Results:
Amanda will not complete its normal backup run

Expected Results:
Amanda should complete its backup run and send a summary notice.

Additional info:
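As a rough illustration of spotting the hung dumper and taper processes described above, here is a minimal Python sketch. It assumes the text comes from `ps -eo pid,etimes,comm` (the `etimes` column, elapsed seconds, exists in current procps-ng but may not be available on older releases). The function name `find_stale` and the one-day threshold are hypothetical choices for this example, not part of Amanda.

```python
def find_stale(ps_output, names=("dumper", "taper"), max_age_s=24 * 3600):
    """Return (pid, name, elapsed_seconds) for named processes older than max_age_s.

    ps_output is assumed to be the text of `ps -eo pid,etimes,comm`.
    """
    stale = []
    for line in ps_output.splitlines():
        fields = line.split(None, 2)
        # Skip the header line and anything that is not "pid seconds command".
        if len(fields) != 3 or not fields[0].isdigit() or not fields[1].isdigit():
            continue
        pid, elapsed, comm = int(fields[0]), int(fields[1]), fields[2].strip()
        if comm in names and elapsed > max_age_s:
            stale.append((pid, comm, elapsed))
    return stale


# Made-up sample output for illustration only (not taken from the reporter's system).
sample = """\
  PID ELAPSED COMMAND
 1201 1209600 dumper
 1202 1209600 taper
 1300     120 dumper
 1400  999999 smbd
"""

for pid, name, elapsed in find_stale(sample):
    print(f"{name} (pid {pid}) has been running for {elapsed // 86400} days")
```

On a live system the input could come from `subprocess.run(["ps", "-eo", "pid,etimes,comm"], capture_output=True, text=True).stdout`. The sketch only reports candidates and leaves the decision to kill them to the operator.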
I haven't had any such trouble with my test Amanda installation here, so I'm going to need a lot more information to try and debug it. Can you start by attaching to this bug:

1: Your /etc/amanda/*/amanda.conf
2: Your /etc/amanda/*/disklist
3: All useful information about your tape changer (if you have one).
4: A description of the relevant hardware in your server (tape drive, SCSI controller it's attached to, type and number of CPUs, memory, etc).
5: The output of dmesg | tail and tail /var/log/messages after Amanda has hung.

I'm afraid that the problem is unlikely to be in Amanda itself. But until we can get a better description of where the problem is, we won't know who to reassign the bug to.
Additional hardware and configuration info

lspci
00:00.0 Host bridge: ALi Corporation M1541 (rev 04)
00:01.0 PCI bridge: ALi Corporation M1541 PCI to AGP Controller (rev 04)
00:02.0 USB Controller: ALi Corporation USB 1.1 Controller (rev 03)
00:07.0 ISA bridge: ALi Corporation M1533 PCI to ISA Bridge [Aladdin IV] (rev c3)
00:08.0 SCSI storage controller: Adaptec AHA-2940U/UW/D / AIC-7881U
00:09.0 Ethernet controller: Intel Corp. 82545EM Gigabit Ethernet Controller (Copper) (rev 01)
00:0b.0 RAID bus controller: CMD Technology Inc PCI0649 (rev 02)
00:0f.0 IDE interface: ALi Corporation M5229 IDE (rev c1)
01:00.0 VGA compatible controller: nVidia Corporation NV6 [Vanta/Vanta LT] (rev 15)

disklist:
fileserver.harkness.co.uk /etc comp-root-tar
fileserver.harkness.co.uk /raid/xander comp-root-tar
fileserver.harkness.co.uk /raid/stuart comp-root-tar
fileserver.harkness.co.uk /raid/torkystuff comp-root-tar
fileserver.harkness.co.uk /raid/configs comp-root-tar
fileserver.harkness.co.uk /raid/backup comp-root-tar
fileserver.harkness.co.uk /raid/documentation comp-root-tar
fileserver.harkness.co.uk /raid/books comp-root-tar
wks.harkness.co.uk /etc comp-root-tar
wks.harkness.co.uk hda1 comp-high
wks.harkness.co.uk hda2 comp-high
wks.harkness.co.uk hda5 comp-high
wks.harkness.co.uk /home/xander comp-root-tar
www2-int hda6 comp-root
www2-int hda7 comp-root
www2-int hda5 comp-root
mail2-int.harkness.co.uk /dev/md1 comp-high
mail2-int.harkness.co.uk /dev/md5 comp-high
mail2-int.harkness.co.uk /dev/md0 comp-high
relay1-int.harkness.co.uk /dev/hda1 comp-high
relay1-int.harkness.co.uk /dev/hda2 comp-high
relay2-int.harkness.co.uk /dev/hda2 comp-high
relay2-int.harkness.co.uk /dev/hda1 comp-high
access.harkness.co.uk /dev/hda1 comp-high
access.harkness.co.uk /dev/hda3 comp-high
access.harkness.co.uk /dev/hda6 comp-high
fileserver.harkness.co.uk /dev/hde1 nocomp-high
fileserver.harkness.co.uk /dev/hde3 nocomp-high -1 local
www.cpicountrywide.co.uk /dev/hda6 comp-high
fileserver.harkness.co.uk /home/xander/all /home/xander { high-tar exclude "./[a-u]*" } 1
fileserver.harkness.co.uk /home/xander/ag /home/xander { high-tar include "./[a-g]*" } 1
fileserver.harkness.co.uk /home/xander/gm /home/xander { high-tar include "./[h-m]*" } 1
fileserver.harkness.co.uk /home/xander/nu /home/xander { high-tar include "./[n-u]*" } 1

amanda.conf:
org "happy" # your organization name for reports
mailto "amanda.uk" # space separated list of operators at your site
dumpuser "amanda" # the user to run dumps under
inparallel 4 # maximum dumpers that will run in parallel
netusage 600 Kbps # maximum net bandwidth for Amanda, in KB per sec
dumpcycle 4 weeks # the number of days in the normal dump cycle
runspercycle 5 # the number of amdump runs in dumpcycle days
tapecycle 30 tapes # the number of tapes in rotation
bumpsize 20 Mb # minimum savings (threshold) to bump level 1 -> 2
bumpdays 1 # minimum days at each level
bumpmult 4 # threshold = bumpsize * bumpmult^(level-1)
etimeout 300 # number of seconds per filesystem for estimates.
runtapes 1 # number of tapes to be used in a single run of amdump
tapedev "/dev/nst0" # the no-rewind tape device to be used
tapetype xander-compaq
labelstr "^DailySet[0-9][0-9]*$" # label constraint regex: all tapes must match
holdingdisk hd2 {
    directory "/home/tmp"
    use 4 Gb
}
infofile "/var/lib/amanda/happy/curinfo" # database filename
logdir "/var/lib/amanda/happy" # log directory
indexdir "/var/lib/amanda/happy/index" # index directory

define tapetype xander-compaq {
    comment "just produced by tapetype program"
    length 9822 mbytes
    filemark 0 kbytes
    speed 973 kps
}

define dumptype global {
    comment "Global definitions"
}
define dumptype always-full {
    global
    comment "Full dump of this filesystem always"
    compress none
    priority high
    dumpcycle 0
}
define dumptype root-tar {
    global
    program "GNUTAR"
    comment "root partitions dumped with tar"
    compress none
    index
    exclude list "/usr/local/lib/amanda/exclude.gtar"
    priority low
}
define dumptype user-tar {
    root-tar
    comment "user partitions dumped with tar"
    priority medium
}
define dumptype low-tar {
    root-tar
    comment "partitions dumped with tar"
    priority low
    index
    compress client fast
}
define dumptype high-tar {
    root-tar
    comment "partitions dumped with tar"
    priority high
    index
    compress client fast
}
define dumptype comp-root-tar {
    root-tar
    comment "Root partitions with compression"
    compress client fast
    index
}
define dumptype comp-user-tar {
    user-tar
    compress client fast
}
define dumptype holding-disk {
    global
    comment "The master-host holding disk itself"
    holdingdisk no # do not use the holding disk
    priority medium
}
define dumptype comp-user {
    global
    comment "Non-root partitions on reasonably fast machines"
    compress client fast
    priority medium
}
define dumptype nocomp-user {
    comp-user
    comment "Non-root partitions on slow machines"
    compress none
}
define dumptype comp-root {
    global
    comment "Root partitions with compression"
    compress client fast
    priority low
}
define dumptype nocomp-root {
    comp-root
    comment "Root partitions without compression"
    compress none
}
define dumptype comp-high {
    global
    comment "very important partitions on fast machines"
    compress client best
    priority high
    index
}
define dumptype nocomp-high {
    comp-high
    comment "very important partitions on slow machines"
    compress none
}
define dumptype nocomp-test {
    global
    comment "test dump without compression, no /etc/dumpdates recording"
    compress none
    record no
    priority medium
}
define dumptype comp-test {
    nocomp-test
    comment "test dump with compression, no /etc/dumpdates recording"
    compress client fast
}

define interface local {
    comment "a local disk"
    use 1000 kbps
}
define interface eth1 {
    comment "100 Mbps ethernet"
    use 2000 kbps
}
define interface eth0 {
    comment "100 Mbps ethernet"
    use 2000 kbps
}

cat /proc/scsi/scsi
Attached devices:
Host: scsi0 Channel: 00 Id: 06 Lun: 00
  Vendor: COMPAQ Model: SDT-9000 Rev: 4.20
  Type: Sequential-Access ANSI SCSI revision: 02

The tape drive is a Compaq DDS3.

I am running amanda again and will post details of the logs when I can find an entry. I have searched through /var/log/messages and have not found anything that does not relate to dhcp and samba.
I think that you are correct in your assessment that it may not be amanda. There are errors on a degraded RAID set; however it seems strange that this should interfere with AMANDA's operation.

Apr 13 19:19:27 file kernel: end_request: I/O error, dev 16:00 (hdc), sector 0
Apr 13 19:19:27 file kernel: end_request: I/O error, dev 16:00 (hdc), sector 2
Apr 13 19:19:27 file kernel: end_request: I/O error, dev 16:00 (hdc), sector 4
Apr 13 19:19:27 file kernel: end_request: I/O error, dev 16:00 (hdc), sector 6

The array is now rebuilding and I shall test again when complete.
It seems that this happens when AMANDA runs out of space in its holding disk location. This is repeatable for me. Yes, I know it looks kind of stupid; however, a couple of daemons use that disk. It stopped when the disk became full, and when I checked on it there was disk space available.
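Since the hang appears tied to the holding disk filling up, a pre-flight check before amdump might help catch the condition early. The following is only a sketch in Python: the helper name `holding_disk_has_room` is hypothetical, and the 4 GiB figure is an assumption taken from the `use 4 Gb` setting for `/home/tmp` in the posted amanda.conf. It does not reflect how Amanda itself accounts for holding disk space.

```python
import shutil


def holding_disk_has_room(path, required_bytes):
    """Return True if the filesystem holding `path` has at least required_bytes free."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (in bytes)
    return usage.free >= required_bytes


# Assumption for illustration: Amanda was told it may use 4 Gb of the holding disk.
REQUIRED_BYTES = 4 * 1024 ** 3
```

A cron wrapper could call `holding_disk_has_room("/home/tmp", REQUIRED_BYTES)` and skip or postpone the run, with a mail to the operator, when it returns False. Because other daemons share that disk, the check can only narrow the window; space may still run out while the dump is in progress.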
Fedora Core 1 is maintained by the Fedora Legacy project for security updates only. If this problem is a security issue, please reopen and reassign to the Fedora Legacy product. If it is not a security issue and hasn't been resolved in the current FC5 updates or in the FC6 test release, reopen and change the version to match. Thanks!

NOTE: Fedora Core 1 is reaching the final end of support even by the Legacy project. After Fedora Core 6 Test 2 is released (currently scheduled for July 26th), there will be no more security updates for FC1. Please use these next two weeks to upgrade any remaining FC1 systems to a current release.
Note that FC1 and FC2 are no longer supported even by Fedora Legacy. Many changes have occurred since these older releases. Please install a supported version of Fedora Core and retest. If this still occurs on FC3 or FC4, please assign to that version and Fedora Legacy. If it still occurs on FC5 or FC6, please reopen and assign to the correct version. Thanks!