Bug 1102123
| Summary: | ose-upgrade rpms fails with no error message when /opt is out of space. | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Timothy Williams <tiwillia> |
| Component: | Cluster Version Operator | Assignee: | John W. Lamb <jolamb> |
| Status: | CLOSED WONTFIX | QA Contact: | libra bugs <libra-bugs> |
| Severity: | high | Docs Contact: | |
| Priority: | medium | | |
| Version: | 2.1.0 | CC: | jokerman, libra-onpremise-devel, lmeyer, mmccomas, nicholas_schuetz, tiwillia |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2015-02-04 21:35:39 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1102359 | | |
| Bug Blocks: | | | |
Description
Timothy Williams
2014-05-28 14:21:33 UTC
I thought yum gave a useful error on this condition. Is it getting swallowed? Is the output literally a bunch of dots there?

Created attachment 900024
Full ose-upgrade failure output
The idea behind this RFE is to have ose-upgrade check before a failure condition can occur. How much free space does the upgrade need? Does /opt have enough to satisfy it? If not, warn the user. If so, continue with the upgrade.

Thanks - so there *are* a bunch of errors from yum in there, but they're not particularly helpful in figuring out what happened. I guess yum only checks for full file systems in the usual suspect locations, and /opt doesn't rate. Not sure how to address this. Given that /opt is now a standard destination for SCL packages, perhaps yum should be modified. On our side I suppose we could explicitly test if /opt is full, especially after a failure.

(In reply to Nicholas Schuetz from comment #4)
> The idea behind this RFE is to have ose-upgrade check before a failure
> condition can occur. How much free space does the upgrade need? Does /opt
> have enough to satisfy it? If not, warn the user. If so, continue with the
> upgrade.

Having ose-upgrade pre-emptively check for space in /opt may be indicated too, but the problem is that we really don't have any way of knowing how much space in /opt it will need - that's info that only yum has, and it will change over time as the packages in the subscription change. If these things were installed under the root FS, yum is smart enough to refuse the entire transaction, but I guess not under /opt and possibly other locations.

I also ran into a condition where /opt had no more inodes left. That error was identified and reported, but no solution was provided to remedy the issue. The pre-emptive message could be just a warning. Also, putting something in the deployment guide identifying the need for a certain amount of space in /opt would be useful as well. An educated guess/ballpark figure perhaps?

(In reply to Nicholas Schuetz from comment #8)
> the deployment guide identifying the need for a certain amount of space in
> /opt would be useful as well. An educated guess/ballpark figure perhaps?

I agree the manual ought to mention that installs require a fair amount of space (how much depends on cartridges installed...) in /opt, since this is an unusual place for RPMs to install content for anything aside from the RH SCL, which is rather new, and users may expect to use /opt entirely for their own uses. Indeed it may be an NFS mount, which would cause big problems because NFS doesn't store SELinux context.

I'm wondering if the attached example was output when running out of space or when running out of inodes? As far as I can tell, yum *does* check for /opt running out of space if it is a separate partition, and gives a useful error message like:

    Error Summary
    -------------
    Disk Requirements:
      At least 95MB more space needed on the /opt filesystem.

That's pretty clear. So I'm guessing in the example you were running out of inodes, which I can recreate as looking pretty much like what you attached. I think that's worth a yum bug to have it check inodes in addition to space. Or *at least* give decent feedback on why the failure occurred after the fact.

No, that one was running out of disk, I believe. When running out of inodes it told me so explicitly. Here's what that output looks like:
--- SNIP ---
Running rpm_check_debug
Running Transaction Test
Transaction Test Succeeded
Running Transaction
Transaction couldn't start:
installing package ruby193-mcollective-2.4.1-5.el6op.noarch needs 1 inodes on the /opt filesystem
[('installing package ruby193-mcollective-2.4.1-5.el6op.noarch needs 1 inodes on the /opt filesystem', (10, '/opt', 1L))]
--END /usr/lib/ruby/site_ruby/1.8/ose-upgrade/node/upgrades/3/rpms/04-both-yum-update OUTPUT--
INFO: Setting node step 'rpms' status to FAILED
--- SNIP ---
Bizarre; that is the exact opposite of what I observed. Would you mind indicating what version of yum and rpm were in use?

    # rpm -q yum rpm
    yum-3.2.29-43.el6_5.noarch
    rpm-4.8.0-37.el6.x86_64

Sure, they look to be the same:

    yum-3.2.29-43.el6_5.noarch
    rpm-4.8.0-37.el6.x86_64

So according to bug 1102359, which I filed, RPM does check for inodes and disk space before a transaction, but its accounting is fairly flawed. And this is unlikely to change any time soon. If the tools intended for this sort of problem can't handle it, I'm not sure what we can do from the installer, but since we have much more specific scope than yum or rpm, perhaps we can fail more helpfully. At the very least we could probably check a set of filesystems for "fullness" after a failed package install to tip off the user what might have gone wrong. For the upgrade use case specifically, we might be able to guess what kind of increases in the same filesystems are likely and give a warning beforehand if they're likely to fail. We really, really don't want upgrades (in particular) to fail for reasons we can reasonably predict.

Check if /opt has < 3G free and emit a warning if so?

My concern is that emitting a warning is likely to get lost in the noise. It might be wise to have an actual preflight check suite at the beginning of the first step, and actually prompt the user if any of the checks come back with concerns, to see if they're aware and want to continue anyway. To begin with, we could just do this check and run oo-diagnostics.

Sorry for allowing this ticket to languish for so long. I don't think the upgrader is necessarily the place for this kind of check - we can't really do any better than yum+rpm. As far as the "educated guess" check is concerned, it would be easy enough to check for sufficient space on /opt, but that only handles one case out of many, many possible configurations.

Trying to detect which mountpoints are targeted by the package upgrades and then judge how much free space is needed on each associated partition is going to add enough complexity that I doubt it would be worthwhile. I'm not certain how to make even a rough guess about how many free inodes will be needed. I think it's better to instruct the user to check that their hosts have enough free space on any impacted partitions - say 3 to 5 gigs free on mountpoints at or under /opt, /var, and /usr. If they do see yum fail due to insufficient inodes or space, they should expand any partitions mentioned in the error output and then "yum reinstall" the packages listed under "Failed:" in the output.

If this doesn't seem reasonable, please feel free to re-open the bug.
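For anyone who wants to script the manual check recommended in the closing comment, here is a minimal sketch. It is not part of ose-upgrade and nobody in this thread wrote it. It uses Python's standard `os.statvfs` call to look at free space and free inodes on /opt, /var, and /usr. The 3 GiB floor is the low end of the "3 to 5 gigs" suggestion in the thread. The inode floor of 10000 and the names `check_path` and `preflight` are hypothetical placeholders, since the thread gives no figure for inodes.

```python
import os

GIB = 1024 ** 3

# Mount points and the 3 GiB floor follow the suggestion in this thread.
# The inode floor is an arbitrary placeholder (assumption): the thread
# offers no estimate of how many inodes an upgrade needs.
DEFAULT_PATHS = ("/opt", "/var", "/usr")
MIN_FREE_BYTES = 3 * GIB
MIN_FREE_INODES = 10000


def check_path(path, min_free_bytes=MIN_FREE_BYTES,
               min_free_inodes=MIN_FREE_INODES):
    """Return a list of concerns for the filesystem that holds `path`."""
    st = os.statvfs(path)
    concerns = []

    # f_bavail counts blocks available to unprivileged users.
    free_bytes = st.f_bavail * st.f_frsize
    if free_bytes < min_free_bytes:
        concerns.append("%s: %.1f GiB free, below the %.1f GiB floor"
                        % (path, free_bytes / float(GIB),
                           min_free_bytes / float(GIB)))

    # Some filesystems report f_files == 0 because they have no fixed
    # inode table; skip the inode check in that case.
    if st.f_files > 0 and st.f_favail < min_free_inodes:
        concerns.append("%s: %d inodes free, below the floor of %d"
                        % (path, st.f_favail, min_free_inodes))
    return concerns


def preflight(paths=DEFAULT_PATHS, **limits):
    """Map each existing path to its concerns; omit paths with none."""
    report = {}
    for path in paths:
        if not os.path.exists(path):
            continue
        concerns = check_path(path, **limits)
        if concerns:
            report[path] = concerns
    return report


if __name__ == "__main__":
    for path, concerns in sorted(preflight().items()):
        for concern in concerns:
            print("WARNING: %s" % concern)
```

Running the script before an upgrade, or again after a failed "rpms" step, prints one warning per filesystem that falls under either floor. As noted in the thread, this covers only the common layout and cannot predict what a given yum transaction will actually need.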