Bug 1533871 - the /boot partition grows after each update until it's at 100% causing boot loop.
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-node
Classification: oVirt
Component: Installation & Update
Version: 4.1
Hardware: All
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ovirt-4.2.1
Target Release: ---
Assignee: Ryan Barry
QA Contact: Huijuan Zhao
URL:
Whiteboard:
Depends On:
Blocks: 1533931
Reported: 2018-01-12 12:48 UTC by Federico Sun
Modified: 2019-04-28 13:51 UTC (History)

Fixed In Version: imgbased-1.0.6
Doc Type: Bug Fix
Doc Text:
Cause: To mitigate limitations in some platform utilities, RHVH copies the kernel and initrd from a subdirectory into /boot, but these files were not cleaned up when RHVH layers were removed.
Consequence: After a large number of updates, /boot could fill up, leaving the system in an unbootable state.
Fix: RHVH now cleans extraneous boot files on layer removal.
Result: /boot will no longer fill.
Clone Of:
Clones: 1533931
Environment:
Last Closed: 2018-02-12 11:51:06 UTC
oVirt Team: Node
Embargoed:
rule-engine: ovirt-4.2+
huzhao: testing_plan_complete+
rbarry: devel_ack+
huzhao: testing_ack+


Attachments
full output from each update. (25.64 KB, text/plain)
2018-01-12 12:48 UTC, Federico Sun


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 86283 0 master MERGED osupdater: clean up the files from removed boot dirs 2018-01-15 12:35:10 UTC
oVirt gerrit 86350 0 ovirt-4.2 MERGED osupdater: clean up the files from removed boot dirs 2018-01-15 12:35:40 UTC
oVirt gerrit 86351 0 ovirt-4.1 MERGED osupdater: clean up the files from removed boot dirs 2018-01-15 12:36:57 UTC

Description Federico Sun 2018-01-12 12:48:57 UTC
Created attachment 1380417
full output from each update.

Description of problem:

Starting with RHVH 4.1, updating to a new version removes the oldest /boot/rhvh-4.1-xxxx directory but fails to clean up the corresponding kernel, System.map, and initramfs files in /boot.

This eventually leads to /boot reaching 100% usage, after which new upgrades can no longer be applied.


Version-Release number of selected component (if applicable):

all RHVH 4.1 images

How reproducible:

100%

Steps to Reproduce:
1. Install from scratch with RHVH-4.1.0-20170417.0.iso. Take note of the contents of /boot and its size.

2. Install the update to 4.1-20170616.0.el7_3. The newer vmlinuz, System.map, and initramfs are placed under /boot, and its usage grows.


3. Install the next update, 4.1-20170706.0.el7_3. Same behavior.
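
One way to carry out the "take note" part of these steps is to snapshot /boot before and after each update and compare the results. The following is a minimal Python sketch and is not part of imgbased; the function names snapshot_boot and new_files are hypothetical, chosen only for illustration.

```python
import os
import shutil


def snapshot_boot(boot_dir="/boot"):
    """Return (percent_used, {filename: size_in_bytes}) for top-level files."""
    usage = shutil.disk_usage(boot_dir)
    percent_used = 100.0 * usage.used / usage.total
    files = {
        entry.name: entry.stat().st_size
        for entry in os.scandir(boot_dir)
        if entry.is_file(follow_symlinks=False)
    }
    return percent_used, files


def new_files(before, after):
    """Names present in the later snapshot but not in the earlier one."""
    return sorted(set(after) - set(before))
```

Calling snapshot_boot() before an update and again after it, then passing the two file dictionaries to new_files(), would show which files the update added at the top level of /boot.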


Actual results:

/boot partition usage is at 100%. The update itself does not fail to apply, but upon reboot the host enters a boot loop.


Expected results:

Each update should clean up the older kernels under /boot, or not place them there at all.


Additional info:

4.0 is not affected, because it does not place vmlinuz/System.map/initramfs under /boot.

See attached rhvh41_boot_partition_leaking.txt for full output.
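
To spot leftover files on an affected host, one could compare the kernel versions of the top-level files in /boot against those still present in the remaining layer subdirectories. A rough Python sketch follows. It assumes that each layer subdirectory under /boot holds a vmlinuz-<version> file carrying the same version string as the top-level copies, and that versions end in .x86_64; neither assumption is confirmed by this report, and the function names are hypothetical. The sketch only lists candidates and deletes nothing.

```python
import os
import re

# Prefixes of the per-kernel files that show up at the top level of /boot.
PREFIXES = ("config-", "initramfs-", "symvers-", "System.map-", "vmlinuz-")
# Assumed version format, e.g. 3.10.0-514.16.1.el7.x86_64
VERSION_RE = re.compile(r"\d+\.\d+\.\d+-[\w.]+?\.x86_64")


def kernel_version(filename):
    """Return the kernel version embedded in a boot file name, or None."""
    if not filename.startswith(PREFIXES):
        return None
    match = VERSION_RE.search(filename)
    return match.group(0) if match else None


def find_orphaned_boot_files(boot_dir):
    """List top-level files whose kernel version no subdirectory still uses."""
    in_use = set()
    top_level = []
    for entry in sorted(os.listdir(boot_dir)):
        path = os.path.join(boot_dir, entry)
        if os.path.isdir(path):
            # Assumption: a layer directory keeps its own kernel files.
            for name in os.listdir(path):
                version = kernel_version(name)
                if version:
                    in_use.add(version)
        elif os.path.isfile(path):
            top_level.append(entry)
    return [name for name in top_level
            if kernel_version(name) and kernel_version(name) not in in_use]
```

Running find_orphaned_boot_files("/boot") returns the names of top-level kernel files with no matching layer directory. Any result should be reviewed by hand before anything is removed.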

Comment 2 Huijuan Zhao 2018-01-15 07:57:39 UTC
QE can reproduce this issue according to comment 0 and https://bugzilla.redhat.com/attachment.cgi?id=1380417

Comment 3 Huijuan Zhao 2018-01-17 07:02:02 UTC
Test version:
rhvh-4.2.1.1-0.20180115.0
imgbased-1.0.6-0.1.el7ev.noarch


1. The upgrade versions and test steps are the same as in https://bugzilla.redhat.com/show_bug.cgi?id=1533931#c4


2. With only two upgrades, the results are correct.
Test version:
Version 1: rhvh-4.1-0.20170417.0
Version 2: rhvh-4.1-0.20170616.0
Version 3: rhvh-4.2.1.1-0.20180115.0

Test results: 
After upgrading to rhvh-4.2.1.1-0.20180115.0, the files related to rhvh-4.1-0.20170417.0 were deleted from /boot. The deleted files are:
(1) config-3.10.0-514.16.1.el7.x86_64
(2) initramfs-3.10.0-514.16.1.el7.x86_64.img
(3) initramfs-3.10.0-514.16.1.el7.x86_64kdump.img
(4) symvers-3.10.0-514.16.1.el7.x86_64.gz
(5) System.map-3.10.0-514.16.1.el7.x86_64
(6) vmlinuz-3.10.0-514.16.1.el7.x86_64


So I will verify this bug once its status is ON_QA and Bug 1533931 is verified.

Comment 4 Huijuan Zhao 2018-01-26 01:44:42 UTC
Changed the status to VERIFIED according to comment 3.

Comment 5 Sandro Bonazzola 2018-02-12 11:51:06 UTC
This bugzilla is included in oVirt 4.2.1 release, published on Feb 12th 2018.

Since the problem described in this bug report should be
resolved in oVirt 4.2.1 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

