Bug 120741 - Amanda hangs during backup
Summary: Amanda hangs during backup
Keywords:
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Fedora
Classification: Fedora
Component: amanda
Version: 1
Hardware: i686
OS: Linux
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Jay Fenlason
QA Contact:
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2004-04-13 17:24 UTC by Xander D Harkness
Modified: 2016-06-07 22:44 UTC
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2006-10-28 17:15:56 UTC
Type: ---
Embargoed:


Description Xander D Harkness 2004-04-13 17:24:07 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7b) Gecko/20040316

Description of problem:
Amanda runs on a regular cron job.  The backup server is also a file
server.  The backup server will happily receive remote backup jobs
from other servers; however frequently a backup will hang.  

If I run ps -A I will see dumper and taper processes running.  These
can hang for a couple of weeks until they are manually killed.  This
only seems to have started since I upgraded to Fedora from RH9.
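
A minimal sketch of how the hung processes described above could be spotted automatically instead of by eye with ps -A. It assumes a ps that supports the etimes (elapsed seconds) column, which may not exist on older systems; the 24-hour threshold and the function names are hypothetical and not part of Amanda.

```python
import subprocess

# Hypothetical threshold: flag anything that has run for more than 24 hours.
MAX_AGE_SECONDS = 24 * 60 * 60
WATCHED = {"dumper", "taper"}


def list_processes():
    """Return headerless `pid etimes comm` lines for all processes."""
    result = subprocess.run(
        ["ps", "-e", "-o", "pid=", "-o", "etimes=", "-o", "comm="],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def find_stale(ps_output, max_age=MAX_AGE_SECONDS, names=WATCHED):
    """Return (pid, age_seconds, command) for watched processes older than max_age."""
    stale = []
    for line in ps_output.splitlines():
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        pid, etimes, comm = parts
        comm = comm.strip()
        if comm in names and int(etimes) > max_age:
            stale.append((int(pid), int(etimes), comm))
    return stale
```

Calling find_stale(list_processes()) from a daily cron job would report dumper or taper processes that have outlived a normal run, rather than leaving them to sit for weeks.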

Version-Release number of selected component (if applicable):
amanda-client-2.4.4p1-1

How reproducible:
Sometimes

Steps to Reproduce:
1. Run amdump on local drives
2. Amanda fails to complete within its normal time frame
3. killall dumper and taper
    

Actual Results:  Amanda will not complete its normal backup run

Expected Results:  Amanda should complete its backup run and send a
summary notice.

Additional info:

Comment 1 Jay Fenlason 2004-04-13 17:48:09 UTC
I haven't had any such trouble with my test Amanda installation here,
so I'm going to need a lot more information to try and debug it.  Can
you start by attaching to this bug
1: Your /etc/amanda/*/amanda.conf
2: Your /etc/amanda/*/disklist
3: All useful information about your tape changer (if you have one).
4: A description of the relevant hardware in your server (Tape drive,
SCSI controller it's attached to, type and number of CPUs, memory, etc).
5: the output of dmesg | tail and tail /var/log/messages after Amanda
has hung.

I'm afraid that the problem is unlikely to be in Amanda itself.  But
until we can get a better description of where the problem is, we
won't know who to reassign the bug to.

Comment 2 Xander D Harkness 2004-04-13 18:27:12 UTC
Additional hardware and configuration info
lspci
00:00.0 Host bridge: ALi Corporation M1541 (rev 04)
00:01.0 PCI bridge: ALi Corporation M1541 PCI to AGP Controller (rev 04)
00:02.0 USB Controller: ALi Corporation USB 1.1 Controller (rev 03)
00:07.0 ISA bridge: ALi Corporation M1533 PCI to ISA Bridge [Aladdin IV] (rev c3)
00:08.0 SCSI storage controller: Adaptec AHA-2940U/UW/D / AIC-7881U
00:09.0 Ethernet controller: Intel Corp. 82545EM Gigabit Ethernet Controller (Copper) (rev 01)
00:0b.0 RAID bus controller: CMD Technology Inc PCI0649 (rev 02)
00:0f.0 IDE interface: ALi Corporation M5229 IDE (rev c1)
01:00.0 VGA compatible controller: nVidia Corporation NV6 [Vanta/Vanta LT] (rev 15)

disklist:
fileserver.harkness.co.uk /etc comp-root-tar
fileserver.harkness.co.uk /raid/xander comp-root-tar
fileserver.harkness.co.uk /raid/stuart comp-root-tar
fileserver.harkness.co.uk /raid/torkystuff comp-root-tar
fileserver.harkness.co.uk /raid/configs comp-root-tar
fileserver.harkness.co.uk /raid/backup comp-root-tar
fileserver.harkness.co.uk /raid/documentation comp-root-tar
fileserver.harkness.co.uk /raid/books comp-root-tar
wks.harkness.co.uk /etc comp-root-tar
wks.harkness.co.uk hda1 comp-high
wks.harkness.co.uk hda2 comp-high 
wks.harkness.co.uk hda5 comp-high 
wks.harkness.co.uk /home/xander comp-root-tar 
www2-int	hda6 comp-root
www2-int	hda7 comp-root
www2-int	hda5 comp-root
mail2-int.harkness.co.uk /dev/md1 comp-high
mail2-int.harkness.co.uk /dev/md5 comp-high
mail2-int.harkness.co.uk /dev/md0 comp-high
relay1-int.harkness.co.uk /dev/hda1 comp-high
relay1-int.harkness.co.uk /dev/hda2 comp-high
relay2-int.harkness.co.uk /dev/hda2 comp-high
relay2-int.harkness.co.uk /dev/hda1 comp-high
access.harkness.co.uk /dev/hda1 comp-high
access.harkness.co.uk /dev/hda3 comp-high
access.harkness.co.uk /dev/hda6 comp-high
fileserver.harkness.co.uk /dev/hde1 nocomp-high
fileserver.harkness.co.uk /dev/hde3 nocomp-high -1 local
www.cpicountrywide.co.uk /dev/hda6 comp-high
fileserver.harkness.co.uk /home/xander/all /home/xander {
        high-tar
        exclude "./[a-u]*"
        } 1
fileserver.harkness.co.uk /home/xander/ag /home/xander {
        high-tar
        include "./[a-g]*"
        } 1
fileserver.harkness.co.uk /home/xander/gm /home/xander {
        high-tar
        include "./[h-m]*"
        } 1
fileserver.harkness.co.uk /home/xander/nu /home/xander {
        high-tar
        include "./[n-u]*"
        } 1

amanda.conf:
org "happy"		# your organization name for reports
mailto "amanda.uk"		# space separated list of operators at your site
dumpuser "amanda"	# the user to run dumps under
inparallel 4		# maximum dumpers that will run in parallel
netusage  600 Kbps	# maximum net bandwidth for Amanda, in KB per sec

dumpcycle 4 weeks	# the number of days in the normal dump cycle
runspercycle 5 		# the number of amdump runs in dumpcycle days
tapecycle 30 tapes	# the number of tapes in rotation
bumpsize 20 Mb		# minimum savings (threshold) to bump level 1 -> 2
bumpdays 1		# minimum days at each level
bumpmult 4		# threshold = bumpsize * bumpmult^(level-1)

etimeout 300		# number of seconds per filesystem for estimates.
runtapes 1		# number of tapes to be used in a single run of amdump
tapedev "/dev/nst0"	# the no-rewind tape device to be used
tapetype xander-compaq
labelstr "^DailySet[0-9][0-9]*$"	# label constraint regex: all tapes must match
holdingdisk hd2 {
    directory "/home/tmp"
    use 4 Gb
    }
infofile "/var/lib/amanda/happy/curinfo"	# database filename
logdir   "/var/lib/amanda/happy"		# log directory
indexdir "/var/lib/amanda/happy/index"	# index directory
define tapetype xander-compaq {
    comment "just produced by tapetype program"
    length 9822 mbytes
    filemark 0 kbytes
    speed 973 kps
}
define dumptype global {
    comment "Global definitions"
}

define dumptype always-full {
    global
    comment "Full dump of this filesystem always"
    compress none
    priority high
    dumpcycle 0
}

define dumptype root-tar {
    global
    program "GNUTAR"
    comment "root partitions dumped with tar"
    compress none
    index
    exclude list "/usr/local/lib/amanda/exclude.gtar"
    priority low
}

define dumptype user-tar {
    root-tar
    comment "user partitions dumped with tar"
    priority medium
}

define dumptype low-tar {
    root-tar
    comment "partitions dumped with tar"
    priority low
    index
    compress client fast

}

define dumptype high-tar {
    root-tar
    comment "partitions dumped with tar"
    priority high
    index
    compress client fast

}

define dumptype comp-root-tar {
    root-tar
    comment "Root partitions with compression"
    compress client fast
    index
}

define dumptype comp-user-tar {
    user-tar
    compress client fast
}

define dumptype holding-disk {
    global
    comment "The master-host holding disk itself"
    holdingdisk no # do not use the holding disk
    priority medium
}

define dumptype comp-user {
    global
    comment "Non-root partitions on reasonably fast machines"
    compress client fast
    priority medium
}

define dumptype nocomp-user {
    comp-user
    comment "Non-root partitions on slow machines"
    compress none
}

define dumptype comp-root {
    global
    comment "Root partitions with compression"
    compress client fast
    priority low
}

define dumptype nocomp-root {
    comp-root
    comment "Root partitions without compression"
    compress none
}

define dumptype comp-high {
    global
    comment "very important partitions on fast machines"
    compress client best
    priority high
    index
}

define dumptype nocomp-high {
    comp-high
    comment "very important partitions on slow machines"
    compress none
}

define dumptype nocomp-test {
    global
    comment "test dump without compression, no /etc/dumpdates recording"
    compress none
    record no
    priority medium
}

define dumptype comp-test {
    nocomp-test
    comment "test dump with compression, no /etc/dumpdates recording"
    compress client fast
}
define interface local {
    comment "a local disk"
    use 1000 kbps
}

define interface eth1 {
    comment "100 Mbps ethernet"
    use 2000 kbps
}
define interface eth0 {
    comment "100 Mbps ethernet"
    use 2000 kbps
}

cat /proc/scsi/scsi
Attached devices:
Host: scsi0 Channel: 00 Id: 06 Lun: 00
  Vendor: COMPAQ   Model: SDT-9000         Rev: 4.20
  Type:   Sequential-Access                ANSI SCSI revision: 02
The Tape drive is a Compaq DDS3

I am running amanda again and will post details of the logs when I can
find an entry.  I have searched through /var/log/messages and have not
found anything that does not relate to dhcp and samba.
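
For readers checking the posted amanda.conf, the bump comment there gives the formula threshold = bumpsize * bumpmult^(level-1). A small worked sketch of that formula with the values from this config (bumpsize 20 Mb, bumpmult 4) follows. The function name is hypothetical, and this only restates the config comment; it is not a claim about how Amanda's planner is implemented.

```python
def bump_threshold_mb(level, bumpsize_mb=20, bumpmult=4):
    """Savings (in MB) needed to bump from `level` to `level + 1`.

    Implements the formula from the amanda.conf comment:
    threshold = bumpsize * bumpmult^(level-1)
    """
    return bumpsize_mb * bumpmult ** (level - 1)


for level in range(1, 4):
    print(f"level {level} -> {level + 1}: {bump_threshold_mb(level)} MB")
```

With these settings the thresholds grow quickly (20, 80, then 320 MB), so incrementals on small filesystems would rarely move past level 2.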

Comment 3 Xander D Harkness 2004-04-13 23:12:35 UTC
I think that you are correct in your assessment that it may not be
amanda.  There are errors on a degraded RAID set; however it seems
strange that this should interfere with AMANDA's operation.

Apr 13 19:19:27 file kernel: end_request: I/O error, dev 16:00 (hdc), sector 0
Apr 13 19:19:27 file kernel: end_request: I/O error, dev 16:00 (hdc), sector 2
Apr 13 19:19:27 file kernel: end_request: I/O error, dev 16:00 (hdc), sector 4
Apr 13 19:19:27 file kernel: end_request: I/O error, dev 16:00 (hdc), sector 6

The array is now rebuilding and I shall test again when complete.
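
A rough sketch of how kernel messages like the ones quoted above could be tallied per device, so a failing disk stands out among the dhcp and samba noise in /var/log/messages. The regular expression is written against the exact message format quoted in this comment and is an assumption for any other kernel version; the function name is hypothetical.

```python
import re
from collections import Counter

# Matches e.g. "end_request: I/O error, dev 16:00 (hdc), sector 0".
# \s+ before "sector" tolerates a line break inserted by mail or web wrapping.
IO_ERROR = re.compile(
    r"end_request: I/O error, dev \S+ \((\w+)\),\s+sector (\d+)"
)


def count_io_errors(log_text):
    """Return a Counter mapping device name to number of I/O error records."""
    return Counter(device for device, _sector in IO_ERROR.findall(log_text))
```

Feeding it the contents of /var/log/messages gives one count per device; any non-zero count for a RAID member is worth investigating before blaming the backup software.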

Comment 4 Xander D Harkness 2004-06-01 18:47:08 UTC
It seems that this happens when AMANDA runs out of space in its
holding disk location.  This is repeatable for me.

Yes, I know it looks kind of stupid; however, there are a couple of
daemons that use that disk. The backup stopped when the disk became
full, and by the time I checked on it there was disk space available.
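
A minimal pre-flight sketch for the condition described above: check that the filesystem holding the holding disk has at least as much free space as Amanda is configured to use. The path and size come from the amanda.conf in comment 2 (directory "/home/tmp", use 4 Gb); the function names are hypothetical, and this only detects the shortage before a run, it does not address the hang itself.

```python
import shutil

# Values taken from the amanda.conf posted in comment 2.
HOLDING_DIR = "/home/tmp"
HOLDING_USE_BYTES = 4 * 1024**3  # "use 4 Gb"


def free_bytes(path=HOLDING_DIR):
    """Free space, in bytes, on the filesystem that contains path."""
    return shutil.disk_usage(path).free


def shortfall(free, wanted=HOLDING_USE_BYTES):
    """Bytes missing from the holding area; 0 means there is enough room."""
    return max(0, wanted - free)
```

Running shortfall(free_bytes()) from cron shortly before amdump, and mailing the operator when the result is non-zero, would catch the case where other daemons have filled the disk.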

Comment 5 Matthew Miller 2006-07-11 17:36:54 UTC
Fedora Core 1 is maintained by the Fedora Legacy project for security updates
only. If this problem is a security issue, please reopen and reassign to the
Fedora Legacy product. If it is not a security issue and hasn't been resolved in
the current FC5 updates or in the FC6 test release, reopen and change the
version to match.

Thanks!

NOTE: Fedora Core 1 is reaching the final end of support even by the Legacy
project. After Fedora Core 6 Test 2 is released (currently scheduled for July
26th), there will be no more security updates for FC1. Please use these next two
weeks to upgrade any remaining FC1 systems to a current release.



Comment 6 John Thacker 2006-10-28 17:15:56 UTC
Note that FC1 and FC2 are no longer supported even by Fedora Legacy.  Many
changes have occurred since these older releases.  Please install a supported
version of Fedora Core and retest.  If this still occurs on FC3 or FC4, please
assign to that version and Fedora Legacy.  If it still occurs on FC5 or FC6,
please reopen and assign to the correct version.  Thanks!

