Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1102123

Summary:

ose-upgrade rpms fails with no error message when /opt is out of space.

Product:

OpenShift Container Platform

Reporter:

Timothy Williams <tiwillia>

Component:

Cluster Version Operator

Assignee:

John W. Lamb <jolamb>

Status:

CLOSED WONTFIX

QA Contact:

libra bugs <libra-bugs>

Severity:

high

Docs Contact:

Priority:

medium

Version:

2.1.0

CC:

jokerman, libra-onpremise-devel, lmeyer, mmccomas, nicholas_schuetz, tiwillia

Target Milestone:

---

Target Release:

---

Hardware:

Unspecified

OS:

Unspecified

Whiteboard:

Fixed In Version:

Doc Type:

Bug Fix

Doc Text:

Story Points:

---

Clone Of:

Environment:

Last Closed:

2015-02-04 21:35:39 UTC

Type:

Bug

Regression:

---

Mount Type:

---

Documentation:

---

CRM:

Verified Versions:

Category:

---

oVirt Team:

---

RHEL 7.3 requirements from Atomic Host:

Cloudforms Team:

---

Target Upstream Version:

Embargoed:

Bug Depends On:

1102359

Bug Blocks:

Attachments:

Description	Flags
Full ose-upgrade failure output	none

Description Timothy Williams 2014-05-28 14:21:33 UTC

Description of problem:
The ose-upgrade step 'rpms' can fail if there is not enough space or inodes available in /opt. The upgrade script does not check if there is enough space and the error reported contains no useful information:

# ose-upgrade rpms
INFO: OpenShift node installed.
INFO: Setting node step 'rpms' status to UPGRADING
WARN: run_upgrade_step_rpms
This may take a while.
INFO: Running upgrade scripts in /usr/lib/ruby/site_ruby/1.8/ose-upgrade/node/upgrades/3/rpms
INFO: running /usr/lib/ruby/site_ruby/1.8/ose-upgrade/node/upgrades/3/rpms/04-both-yum-update
ERROR: run_script
/usr/lib/ruby/site_ruby/1.8/ose-upgrade/node/upgrades/3/rpms/04-both-yum-update had errors:
…
…
…
INFO: Setting node step 'rpms' status to FAILED

Version-Release number of selected component (if applicable):
2.1

How reproducible:
Always

Steps to Reproduce:
1. Have very little or no space available in /opt on an OSE 2.0 system
2. Run through the upgrade process

Actual results:
/usr/lib/ruby/site_ruby/1.8/ose-upgrade/node/upgrades/3/rpms/04-both-yum-update had errors:
…
…
…
INFO: Setting node step 'rpms' status to FAILED

Expected results:
--END /usr/lib/ruby/site_ruby/1.8/ose-upgrade/node/upgrades/3/rpms/05-install-recommended-metapackages OUTPUT--
INFO: Setting node step 'rpms' status to COMPLETE
INFO: Next step is 'conf'

Additional info:
If possible, the ose-upgrade script should check if there is enough space available before upgrading.

Comment 2 Luke Meyer 2014-05-28 14:30:38 UTC

I thought yum gave a useful error on this condition. Is it getting swallowed? Is the output literally a bunch of dots there?

Comment 3 nicholas_schuetz 2014-05-28 14:37:49 UTC

Created attachment 900024 [details]
Full ose-upgrade failure output

Comment 4 nicholas_schuetz 2014-05-28 14:46:08 UTC

The idea behind this RFE is to have ose-upgrade check before a failure condition can occur. How much free space does the upgrade need? Does /opt have enough to satisfy it? If not, warn the user. If so, continue with the upgrade.

Comment 5 Luke Meyer 2014-05-28 14:50:15 UTC

Thanks - so there *are* a bunch of errors from yum in there, but they're not particularly helpful in figuring out what happened. I guess yum only checks for full file systems in the usual suspect locations, and /opt doesn't rate.

Not sure how to address this. Given that /opt is now a standard destination for SCL packages, perhaps yum should be modified. On our side I suppose we could explicitly test if /opt is full, especially after a failure.

Comment 6 Luke Meyer 2014-05-28 14:54:39 UTC

(In reply to Nicholas Schuetz from comment #4)
> The idea behind this RFE is to have ose-upgrade check before a failure
> condition can occur. How much free space does the upgrade need? Does /opt
> have enough to satisfy it? If not, warn the user. If so, continue with the
> upgrade.

Having ose-upgrade pre-emptively check for space in /opt may be indicated too, but the problem is that we really don't have any way of knowing how much space in /opt it will need - that's info that only yum has, and it will change over time as the packages in the subscription change. If these things were installed under the root FS yum is smart enough to refuse the entire transaction, but I guess not under /opt and possibly other locations.

Comment 8 nicholas_schuetz 2014-05-28 15:07:31 UTC

I also ran into a condition where /opt had no inodes left. That error was identified and reported, but no remedy was suggested.

The pre-emptive message could be just a warning. Putting something in the deployment guide identifying the need for a certain amount of space in /opt would be useful as well. An educated guess/ ballpark figure perhaps?

Comment 9 Luke Meyer 2014-05-28 19:58:52 UTC

(In reply to Nicholas Schuetz from comment #8)
> the deployment guide identifying the need for a certain amount of space in
> /opt would be useful as well. An educated guess/ ballpark figure perhaps?

I agree the manual ought to mention that installs require a fair amount of space (how much depends on cartridges installed...) in /opt, since this is an unusual place for RPMs to install content for anything aside from the RH SCL, which is rather new, and users may expect to use /opt entirely for their own uses. Indeed it may be an NFS mount, which would cause big problems because NFS doesn't store SELinux context.

I'm wondering if the attached example was output when running out of space or when running out of inodes? As far as I can tell, yum *does* check for /opt running out of space if it is a separate partition, and gives a useful error message like:

Error Summary
-------------
Disk Requirements:
  At least 95MB more space needed on the /opt filesystem.

That's pretty clear. So I'm guessing in the example you were running out of inodes, which I can recreate as looking pretty much like what you attached. I think that's worth a yum bug to have it check inodes in addition to space. Or *at least* give decent feedback on why the failure occurred after the fact.

Comment 10 nicholas_schuetz 2014-05-28 20:12:54 UTC

No, that one was running out of disk, I believe. When running out of inodes it told me so explicitly. Here's what that output looks like:

--- SNIP ---

Running rpm_check_debug
Running Transaction Test
Transaction Test Succeeded
Running Transaction
Transaction couldn't start:
installing package ruby193-mcollective-2.4.1-5.el6op.noarch needs 1 inodes on the /opt filesystem


[('installing package ruby193-mcollective-2.4.1-5.el6op.noarch needs 1 inodes on the /opt filesystem', (10, '/opt', 1L))]

--END /usr/lib/ruby/site_ruby/1.8/ose-upgrade/node/upgrades/3/rpms/04-both-yum-update OUTPUT--INFO: Setting node step 'rpms' status to FAILED

--- SNIP ---
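
For illustration only (this is not part of ose-upgrade, which is written in Ruby): a minimal Python sketch of how a wrapper could pull the affected filesystem out of the two message shapes quoted in this bug, "At least 95MB more space needed on the /opt filesystem." and "... needs 1 inodes on the /opt filesystem". The function name `find_shortages` and both patterns are assumptions modelled only on those two quoted lines; other yum or rpm versions may word the messages differently.

```python
import re

# Patterns modelled only on the two messages quoted in this bug report.
# They are assumptions, not a documented yum/rpm output format.
SPACE_RE = re.compile(r"At least (\d+)MB more space needed on the (\S+) filesystem")
INODE_RE = re.compile(r"needs (\d+) inodes on the (\S+) filesystem")


def find_shortages(output):
    """Return a sorted list of (filesystem, kind, amount) tuples found in output.

    kind is "space_mb" or "inodes". Repeated messages are reported once.
    """
    found = set()
    for line in output.splitlines():
        match = SPACE_RE.search(line)
        if match:
            found.add((match.group(2), "space_mb", int(match.group(1))))
        match = INODE_RE.search(line)
        if match:
            found.add((match.group(2), "inodes", int(match.group(1))))
    return sorted(found)
```

A wrapper could run this over the captured script output after a failed step and print one line per filesystem, so the user sees which partition to expand without reading the whole yum log.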

Comment 11 Luke Meyer 2014-05-28 20:17:54 UTC

Bizarre; that is the exact opposite of what I observed. Would you mind indicating what version of yum and rpm were in use?

# rpm -q yum rpm
yum-3.2.29-43.el6_5.noarch
rpm-4.8.0-37.el6.x86_64

Comment 12 nicholas_schuetz 2014-05-28 20:27:14 UTC

Sure, they look to be the same:

yum-3.2.29-43.el6_5.noarch
rpm-4.8.0-37.el6.x86_64

Comment 13 Luke Meyer 2014-06-02 13:03:22 UTC

So according to bug 1102359 which I filed, RPM does check for inodes and disk space before a transaction, but its accounting is fairly flawed. And this is unlikely to change any time soon.

If the tools intended for this sort of problem can't handle it, I'm not sure what we can do from the installer, but since we have much more specific scope than yum or rpm, perhaps we can fail more helpfully. At the very least we could probably check a set of filesystems for "fullness" after a failed package install to tip off the user what might have gone wrong. For the upgrade use case specifically, we might be able to guess what kind of increases in the same filesystems are likely and give a warning beforehand if they're likely to fail. We really, really don't want upgrades (in particular) to fail for reasons we can reasonably predict.
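
A rough sketch of the "check a set of filesystems for fullness after a failed package install" idea, in Python for illustration (the upgrader itself is Ruby). The names `fullness` and `suspects`, the 95% threshold, and the choice of paths are all assumptions; the only real API used is `os.statvfs`.

```python
import os


def fullness(path):
    """Return (space_used_fraction, inodes_used_fraction) for path's filesystem."""
    st = os.statvfs(path)
    # f_bavail / f_favail count what an unprivileged user may still allocate,
    # so this is a conservative view when the install runs as root.
    space = 1.0 - (st.f_bavail / st.f_blocks) if st.f_blocks else 0.0
    # Some filesystems report zero total inodes; treat that as "not full".
    inodes = 1.0 - (st.f_favail / st.f_files) if st.f_files else 0.0
    return space, inodes


def suspects(paths, threshold=0.95, probe=fullness):
    """Return hint strings for filesystems at or above the fullness threshold."""
    hints = []
    for path in paths:
        try:
            space, inodes = probe(path)
        except OSError:
            continue  # path does not exist on this host; nothing to report
        if space >= threshold:
            hints.append("%s: %.0f%% of space used" % (path, space * 100))
        if inodes >= threshold:
            hints.append("%s: %.0f%% of inodes used" % (path, inodes * 100))
    return hints
```

After a failed yum step, something like `suspects(["/", "/opt", "/var", "/usr"])` would give the user a short list of likely culprits. It cannot say how much space the transaction needed, only which filesystems look nearly exhausted.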

Comment 14 John W. Lamb 2014-09-05 14:09:47 UTC

Check if /opt has < 3G free and emit a warning if so?
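
As a sketch of what that check could look like (Python for illustration only; the function name and message wording are made up, and the 3 GiB default is simply the figure proposed above):

```python
import shutil

THRESHOLD_BYTES = 3 * 1024 ** 3  # the "3G" figure suggested in this comment


def low_space_warning(path="/opt", threshold=THRESHOLD_BYTES, usage=shutil.disk_usage):
    """Return a warning string if path has less than threshold bytes free, else None."""
    free = usage(path).free
    if free < threshold:
        return "WARN: %s has only %.1f GiB free; at least %.1f GiB is suggested" % (
            path, free / 1024 ** 3, threshold / 1024 ** 3)
    return None
```

This only looks at free bytes on one path. It says nothing about inodes or about other filesystems that the package set may write to.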

Comment 15 Luke Meyer 2014-09-05 15:44:17 UTC

My concern is that emitting a warning is likely to get lost in the noise.

It might be wise to have an actual preflight check suite at the beginning of the first step, and to actually prompt the user if any of the checks come back with concerns, to see if they're aware and want to continue anyway.

To begin with, could just do this check and run oo-diagnostics.
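
One possible shape for such a preflight suite, sketched in Python for illustration (ose-upgrade is Ruby, and nothing here is existing upgrader code). `run_preflight`, the check names, and the prompt text are hypothetical. A check that wraps oo-diagnostics is not shown, since its invocation and output handling are not described in this bug.

```python
def run_preflight(checks, ask=input, out=print):
    """Run each check and, if any reports concerns, ask whether to continue.

    checks: iterable of (name, callable) pairs. Each callable returns a list
    of concern strings, empty when everything looks fine.
    Returns True if the upgrade should proceed.
    """
    concerns = []
    for name, check in checks:
        for message in check():
            concerns.append("%s: %s" % (name, message))
    if not concerns:
        return True  # nothing to ask about
    for line in concerns:
        out("CONCERN: " + line)
    answer = ask("Continue with the upgrade anyway? [y/N] ")
    # Anything other than an explicit yes stops the upgrade.
    return answer.strip().lower() in ("y", "yes")
```

Prompting rather than logging a warning addresses the "lost in the noise" concern: the user has to acknowledge each concern before the first step touches any packages.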

Comment 16 John W. Lamb 2015-02-04 21:35:39 UTC

Sorry for allowing this ticket to languish for so long. I don't think the upgrader is necessarily the place for this kind of check - we can't really do any better than yum+rpm. As far as the "educated guess" check is concerned, it would be easy enough to check for sufficient space on /opt, but that only handles one case out of many possible configurations. Trying to detect which mountpoints are targeted by the package upgrades and then judge how much free space is needed on each associated partition is going to add enough complexity that I doubt it would be worthwhile.

I'm not certain how to make even a rough guess about how many free inodes will be needed.

I think it's better to instruct the user to check that their hosts have enough free space on any impacted partitions - say 3 to 5 gigs free on mountpoints at or under /opt, /var, and /usr. If they do see yum fail due to insufficient inodes or space, they should expand any partitions mentioned in the error output and then "yum reinstall" the packages listed under "Failed:" in the output.
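
For anyone doing that manual check, a small Python sketch of how to list the filesystems worth inspecting: every mountpoint at or under /opt, /var, and /usr, plus whichever filesystem holds each of those directories when it is not a mountpoint itself. The function names are made up for illustration, and reading /proc/mounts assumes a Linux host.

```python
import os


def impacted_mountpoints(mountpoints, roots=("/opt", "/var", "/usr")):
    """Return the sorted mountpoints that hold, or sit under, any of roots."""
    mounts = sorted(set(os.path.normpath(m) for m in mountpoints))
    selected = set()
    for root in roots:
        root = os.path.normpath(root)
        # Mountpoints at or below the root directory.
        for mount in mounts:
            if mount == root or mount.startswith(root.rstrip("/") + "/"):
                selected.add(mount)
        # The longest mountpoint that is a path prefix of root holds root itself.
        holders = [m for m in mounts
                   if root == m or root.startswith(m.rstrip("/") + "/")]
        if holders:
            selected.add(max(holders, key=len))
    return sorted(selected)


def read_mountpoints(path="/proc/mounts"):
    """Return the mountpoint column of a /proc/mounts style file.

    Mountpoints containing spaces appear octal-escaped there and are
    returned in that escaped form.
    """
    with open(path) as handle:
        return [line.split()[1] for line in handle if len(line.split()) > 1]
```

On a host where only /opt and /var/lib/openshift are separate partitions, `impacted_mountpoints(["/", "/opt", "/var/lib/openshift", "/home"])` selects `/`, `/opt`, and `/var/lib/openshift`. Those are the ones to compare against the 3 to 5 gigs guideline.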

If this doesn't seem reasonable, please feel free to re-open the bug.