Red Hat Bugzilla – Bug 480240
Need better handling of no space left rpm errors
Last modified: 2015-04-10 05:20:32 EDT
Description of problem:
A couple times recently during updates I've run out of space:
Error unpacking rpm package 1:openoffice.org-core-3.0.1-15.1.fc10.i386
error: unpacking of archive failed on file /usr/lib/openoffice.org/basis3.0/program/libbf_svxli.so;496f2290: cpio: write failed - No space left on device
Error unpacking rpm package wine-core-1.1.12-1.fc10.i386
error: unpacking of archive failed on file /usr/lib/wine/credui.dll.so;496f2290: cpio: write failed - No space left on device
and so on.
Now, it appears that yum (and perhaps rpm?) believes that nothing went wrong:
There were non-fatal errors in the transaction
But many (most?) of the packages for which there are errors are then no longer in the rpm database.
Version-Release number of selected component (if applicable):
If we run out of space in the download, then yum can do something about it ... and that should work, although I'm not sure anyone regularly tests it.
If we run out of space when installing, then it's all on rpm, and that has worked, but again I'm not sure anyone regularly tests it.
Rpm also checks for disk space at the beginning of the transaction, and a quick test shows it basically working (and no, I don't think anybody tests it regularly).
The problem is that any number of things can use up more disk space while the transaction is in progress: package scriptlets can generate an arbitrary amount of content, which falls outside any disk-space checks we do, and then there's the rest of the system: logs get written to, users doing things...
We do check to see if there is sufficient disk space before downloading a package. It may occasionally report a false-positive "insufficient disk space" when we're restarting a partially finished download, but in general it should be correct.
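For illustration only, a pre-download check of this kind can be sketched as below. This is not yum's code: `free_bytes`, `has_room_for` and the `already_downloaded` parameter are hypothetical names, and crediting the bytes already on disk is just one way a restarted download could avoid a false "insufficient space" result.

```python
import os


def free_bytes(path):
    """Bytes available to unprivileged users on the filesystem holding path."""
    st = os.statvfs(path)
    # f_bavail excludes blocks reserved for root; f_frsize is the block unit.
    return st.f_bavail * st.f_frsize


def has_room_for(path, needed_bytes, already_downloaded=0):
    """True if the filesystem at path can hold the rest of a download.

    already_downloaded (an assumption of this sketch) is the size of a
    partial file that is already on disk and will be resumed.
    """
    remaining = max(needed_bytes - already_downloaded, 0)
    return free_bytes(path) >= remaining
```

A check like this only describes the moment it runs; anything else writing to the same filesystem afterwards can still invalidate it.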
I don't think you are quite understanding the issue. I'm not (particularly) interested in pre-transaction space checks. (Although I would be somewhat interested to know how much space on /u rpm *thought* it needed before running the transaction - can I get yum to print that out?) In this case /var is a separate partition and it had plenty of space to download the rpms. The out of space condition happened while rpm was unpacking the files. I'm also partly to blame by setting my reserved block percentage on / to % so that there is no slop for rpm if it gets the calculation wrong (though perhaps it needs to be more conservative?)
What I am concerned about is that rpm apparently did not detect the out of space condition, it seems like it could have handled it better, and it did not report any errors back to yum. Instead of aborting the transaction (if possible?) after the first cpio write error, it kept plowing on. Also for unknown reasons packages that yum/rpm thought it upgraded were not in the rpm database after the transaction - which seems doubly strange since the rpm database was on a separate /var partition and unaffected by the out of space condition.
Created attachment 329217
Kind of hard to read as stdout/stderr get interleaved oddly.
If you want full-stop transaction termination when ENOSPC is detected,
then that decision has to be made by the yum application, not rpmlib.
There should be a callback with the ENOSPC error when it is
encountered, and it's up to the yum application, not the rpmlib library,
to decide how to handle an ENOSPC condition.
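The split of responsibilities described here can be sketched roughly as follows. Nothing in it is rpmlib's or yum's actual API: `unpack_all`, `unpack_one`, `on_error` and `stop_on_enospc` are hypothetical names, and the "library" side is reduced to a loop that reports each failure to a callback and obeys the answer.

```python
import errno

ABORT, CONTINUE = "abort", "continue"


def unpack_all(packages, unpack_one, on_error):
    """Library side: unpack each package, asking on_error what to do on failure.

    unpack_one(pkg) is expected to raise OSError when unpacking fails.
    on_error(pkg, exc) is supplied by the application and returns ABORT
    or CONTINUE. Returns three lists: (done, failed, skipped).
    """
    packages = list(packages)
    done, failed = [], []
    for i, pkg in enumerate(packages):
        try:
            unpack_one(pkg)
            done.append(pkg)
        except OSError as exc:
            failed.append(pkg)
            if on_error(pkg, exc) == ABORT:
                # Everything after the failing package is left untouched.
                return done, failed, packages[i + 1:]
    return done, failed, []


def stop_on_enospc(pkg, exc):
    """Application side: a full disk ends the transaction, other errors do not."""
    return ABORT if exc.errno == errno.ENOSPC else CONTINUE
```

The policy lives entirely in the callback, so an application that prefers to keep going can pass a different function without the unpacking loop changing.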
Orion: there are several issues/bugs here really, yum included (in no particular order):
a) Packages getting removed although the install of the newer version fails (bug 454903; this is certainly an rpm bug, still present in rpm 4.6.0; some steps towards fixing it have been taken upstream, but it is far from resolved)
b) Rpm disk-space estimate at the start of the transaction: it doesn't matter how conservative rpm's estimate is made. As long as the rest of the system is running and others can fill up the disk while rpm is running, there's no way to "fix" that.
c) Rpm transaction continuing despite out-of-disk errors: in many cases it's not clear whether it's better to try to continue the transaction despite some errors (aborting at a bad spot can be just as dangerous). But in the case of running out of disk space, no good is going to come of blindly continuing, and the bug in a) makes this behavior even worse. A callback, be it ENOSPC-specific or a more general "aborting now might be a good idea", could of course be added.
d) Yum reporting success even though something clearly failed, see bug 477849.
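Points a) and d) together suggest a cross-check after the transaction: compare what was supposed to be installed with what the package database actually contains, and refuse to report success if either that comparison or the unpack step turned up a problem. The sketch below is an assumption about how such a check could look, not code from yum or rpm; `check_transaction` and its arguments are hypothetical.

```python
def check_transaction(expected, installed, unpack_errors):
    """Return (exit_code, missing) for a finished transaction.

    expected:      package names the transaction was meant to leave installed
    installed:     package names found in the database afterwards
    unpack_errors: error messages collected while unpacking
    The exit code is non-zero if any package is missing or any error occurred.
    """
    missing = sorted(set(expected) - set(installed))
    exit_code = 1 if (missing or unpack_errors) else 0
    return exit_code, missing
```

With the packages from this report, a call such as `check_transaction(["openoffice.org-core", "wine-core"], installed_names, errors)` would flag the run as failed whenever either package is absent from `installed_names` or `errors` is non-empty.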
okay, I just tested the following on a kvm img in rawhide:
if there is not enough disk space to install the pkgs in question then the transaction test catches this.
So that part is correct.
The issue above is entirely about transaction error reporting.
I just wanted to clear that up.
This message is a reminder that Fedora 10 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 10. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora
'version' of '10'.
Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version prior to Fedora 10's end of life.
Bug Reporter: Thank you for reporting this issue and we are sorry that
we may not be able to fix it before Fedora 10 is end of life. If you
would still like to see this bug fixed and are able to reproduce it
against a later version of Fedora please change the 'version' of this
bug to the applicable version. If you are unable to change the version,
please add a comment here and someone will do it for you.
Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.
The process we are following is described here:
This package has changed ownership in the Fedora Package Database. Reassigning to the new owner of this component.
This is pretty old now. There have been some improvements to disk space handling since then. As the issue is very broad and touches many different places, I am closing this for now. If there are still issues, feel free to either reopen this bug or open a new, more specific one.