444762 – find diskspace: not enough diskspace appears to be incorrectly fatal

Keywords:
Status:	CLOSED WONTFIX
Alias:	None
Product:	Red Hat Enterprise Linux 5
Classification:	Red Hat
Component:	amanda
Sub Component:
Version:	5.3
Hardware:	x86_64
OS:	Linux
Priority:	low
Severity:	medium
Target Milestone:	rc
Target Release:	---
Assignee:	Petr Hracek
QA Contact:	qe-baseos-daemons

Reported:	2008-04-30 14:35 UTC by Noah Sheppard
Modified:	2013-03-07 12:15 UTC
CC List:	0 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2013-03-07 12:15:11 UTC
Target Upstream Version:
Embargoed:
Dependent Products:

Description Noah Sheppard 2008-04-30 14:35:27 UTC

Description of problem:

Intermittently, our daily backups fail on the server side.  The amdump log states:
find diskspace: not enough diskspace. Left with 201536 K
find diskspace: not enough diskspace. Left with 201536 K
driver: Don't know how to send ABORT command to chunker
taper: DONE [idle wait: 2730.174 secs]
chunker: error [bad command after RQ-MORE-DISK: "QUIT"]
chunker: time 1375.165: error [bad command after RQ-MORE-DISK: "QUIT"]
chunker: time 1375.165: pid 9916 finish time Wed Apr 30 00:46:07 2008
taper: writing end marker. [tape13 OK kb 3445216 fm 6]
dumper: kill index command
amdump: end at Wed Apr 30 00:46:07 EDT 2008

It is my understanding that the find diskspace error should not actually be a
fatal error, but it appears that sometimes it is treated as fatal and the server
quits the backup.
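
To check how often this sequence occurs across saved amdump logs, something like the following could be used. It is a minimal sketch and not part of amanda: the function name `scan_amdump_log` and the `SAMPLE` text are made up for illustration, and the patterns are copied only from the log excerpt in this report, so other amanda versions may word these messages differently.

```python
import re

# Patterns copied from the amdump excerpt in this report (assumption:
# other amanda versions may phrase these messages differently).
NO_SPACE = re.compile(r"find diskspace: not enough diskspace\. Left with (\d+) K")
ABORT_ERR = re.compile(r"driver: Don't know how to send ABORT command to chunker")
CHUNKER_ERR = re.compile(r"chunker: .*error \[bad command after RQ-MORE-DISK")


def scan_amdump_log(lines):
    """Report whether a disk-space warning was followed by abort/chunker errors."""
    left_kb = None
    abort_failed = False
    chunker_errors = 0
    for line in lines:
        match = NO_SPACE.search(line)
        if match:
            left_kb = int(match.group(1))
            continue
        if left_kb is None:
            # Only count errors that appear after the disk-space warning.
            continue
        if ABORT_ERR.search(line):
            abort_failed = True
        elif CHUNKER_ERR.search(line):
            chunker_errors += 1
    return {
        "left_kb": left_kb,
        "abort_failed": abort_failed,
        "chunker_errors": chunker_errors,
        "fatal_suspected": left_kb is not None
        and (abort_failed or chunker_errors > 0),
    }


# Hypothetical sample: the first lines of the excerpt quoted above.
SAMPLE = """find diskspace: not enough diskspace. Left with 201536 K
find diskspace: not enough diskspace. Left with 201536 K
driver: Don't know how to send ABORT command to chunker
taper: DONE [idle wait: 2730.174 secs]
chunker: error [bad command after RQ-MORE-DISK: "QUIT"]
chunker: time 1375.165: error [bad command after RQ-MORE-DISK: "QUIT"]
"""

result = scan_amdump_log(SAMPLE.splitlines())
print(result)
```

A log where the warning appears without the later driver or chunker errors would come back with `fatal_suspected` set to `False`, which helps separate the harmless case from the failing one.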

The client's sendbackup log reflects this:
sendbackup: time 1230.584:  52:  normal(|): gtar: ./var/spool/postfix/public/flush: socket ignored
sendbackup: time 1230.584:  52:  normal(|): gtar: ./var/spool/postfix/public/showq: socket ignored
sendbackup: time 1375.125: index tee cannot write [Broken pipe]
sendbackup: time 1375.139: pid 11355 finish time Wed Apr 30 00:46:07 2008
sendbackup: time 1375.125: 119: strange(?):
sendbackup: time 1375.139: 119: strange(?): gzip: stdout: Broken pipe
sendbackup: time 1375.139: 119: strange(?): sendbackup: index tee cannot write [Broken pipe]
sendbackup: time 1375.155:  46:    size(|): Total bytes written: 14438103040 (14GiB, ?/s)
sendbackup: time 1375.155: 119: strange(?): gtar: -: Wrote only 8192 of 10240 bytes
sendbackup: time 1375.181: 119: strange(?): gtar: Error is not recoverable: exiting now
sendbackup: time 1375.181: 119: strange(?): sed: couldn't flush stdout: Broken pipe
sendbackup: time 1375.181: error [compress returned 1, /bin/tar returned 2]
sendbackup: time 1375.181: pid 11351 finish time Wed Apr 30 00:46:07 2008

as does the client's amandad log:


amandad: time 0.151: stream_accept: connection from 10.120.1.15.34223
amandad: time 0.151: stream_accept: connection from 10.120.1.15.42220
amandad: time 0.151: stream_accept: connection from 10.120.1.15.58129
amandad: time 1375.179: sending NAK pkt:
<<<<<
ERROR write error on stream 52037: write error on stream 52037: Connection reset by peer
>>>>>

Version-Release number of selected component (if applicable): 2.5.0p4

A search of the amanda-users list suggests this may be a bug in the
version of amanda that RHEL5 ships (see
http://readlist.com/lists/amanda.org/amanda-users/2/10137.html).

How reproducible: Intermittent. Set up a backup job over the network and wait
about a month; it starts failing sporadically for a couple of weeks, then works
fine for a few months before it starts failing sporadically again.
  
Actual results: Every few months, for a week or two, the dumps fail for one or
two servers with the above errors.  Then everything starts working again for a
few more months, then the errors again.

Expected results: Backup should succeed without a hitch.

Comment 2 Daniel Novotny 2010-04-14 12:36:54 UTC

Hello Noah,
Did you contact http://www.redhat.com/support about this issue? It is unlikely that we will rebase amanda in RHEL 5, and they may be able to help you work around the problem.

Comment 4 RHEL Program Management 2010-08-09 18:26:49 UTC

This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated in the
current release, Red Hat is unfortunately unable to address this
request at this time. Red Hat invites you to ask your support
representative to propose this request, if appropriate and relevant,
in the next release of Red Hat Enterprise Linux.

Comment 5 RHEL Program Management 2011-01-11 20:39:49 UTC

This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated in the
current release, Red Hat is unfortunately unable to address this
request at this time. Red Hat invites you to ask your support
representative to propose this request, if appropriate and relevant,
in the next release of Red Hat Enterprise Linux.

Comment 6 RHEL Program Management 2011-01-11 22:15:59 UTC

This request was erroneously denied for the current release of
Red Hat Enterprise Linux.  The error has been fixed and this
request has been re-proposed for the current release.

Comment 7 RHEL Program Management 2011-05-31 13:28:51 UTC

This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated in the
current release, Red Hat is unfortunately unable to address this
request at this time. Red Hat invites you to ask your support
representative to propose this request, if appropriate and relevant,
in the next release of Red Hat Enterprise Linux.

Comment 8 RHEL Program Management 2012-06-12 01:04:11 UTC

This request was evaluated by Red Hat Product Management for
inclusion in the current release of Red Hat Enterprise Linux.
Because the affected component is not scheduled to be updated in the
current release, Red Hat is unfortunately unable to address this
request at this time. Red Hat invites you to ask your support
representative to propose this request, if appropriate and relevant,
in the next release of Red Hat Enterprise Linux.

Comment 9 Petr Hracek 2013-03-07 12:15:11 UTC

I am sorry, but it is too late in the RHEL-5 release cycle [1].  At the moment we are addressing only critical and security related issues in RHEL-5.  This one is fixed in RHEL-6.  I am closing the bug as WONTFIX.

[1] https://access.redhat.com/support/policy/updates/errata/
