Bug 1533871

Summary: the /boot partition grows after each update until it's at 100% causing boot loop.
Product: [oVirt] ovirt-node Reporter: Federico Sun <fsun>
Component: Installation & Update Assignee: Ryan Barry <rbarry>
Status: CLOSED CURRENTRELEASE QA Contact: Huijuan Zhao <huzhao>
Severity: high Docs Contact:
Priority: unspecified    
Version: 4.1 CC: bugs, cshao, dfediuck, dguo, dossow, huzhao, lveyde, qiyuan, rbarry, sbonazzo, weiwang, yaniwang, ycui, yisong, yturgema, yzhao
Target Milestone: ovirt-4.2.1 Flags: rule-engine: ovirt-4.2+
huzhao: testing_plan_complete+
rbarry: devel_ack+
huzhao: testing_ack+
Target Release: ---   
Hardware: All   
OS: Unspecified   
Whiteboard:
Fixed In Version: imgbased-1.0.6 Doc Type: Bug Fix
Doc Text:
Cause: To mitigate limitations in some platform utilities, RHVH copies the kernel and initrd from a subdirectory into /boot, but these files were not cleaned up when RHVH layers were removed.
Consequence: After a large number of updates, /boot could fill up, leaving the system in an unbootable state.
Fix: RHVH now cleans extraneous boot files on layer removal.
Result: /boot will no longer fill.
Story Points: ---
Clone Of:
: 1533931 (view as bug list) Environment:
Last Closed: 2018-02-12 11:51:06 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Node RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1533931    
Attachments:
Description Flags
full output from each update. none

Description Federico Sun 2018-01-12 12:48:57 UTC
Created attachment 1380417
full output from each update.

Description of problem:

Starting with RHVH 4.1, updating to a new version removes the oldest /boot/rhvh-4.1-xxxx directory but fails to clean up the corresponding kernel/System.map/initramfs files in /boot.

This eventually leads to /boot reaching 100% usage, and new upgrades fail to apply.


Version-Release number of selected component (if applicable):

all RHVH 4.1 images

How reproducible:

100%

Steps to Reproduce:
1. Install from scratch with RHVH-4.1.0-20170417.0.iso. Take note of the content of /boot and its size.

2. Install the update from 4.1-20170616.0.el7_3. The newer vmlinuz/System.map/initramfs files are placed under /boot, and its usage grows.


3. Install the next update, 4.1-20170706.0.el7_3. Same behavior.
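
The size comparison in the steps above can be scripted. The following is a minimal sketch, not part of the original report: the function names `snapshot_boot` and `new_entries` are hypothetical, and it assumes Python 3 is available on the host. It records the usage of the filesystem holding /boot and its top-level entries, so that snapshots taken before and after an update can be compared.

```python
import os
import shutil


def snapshot_boot(boot_dir="/boot"):
    """Return the usage percentage and the sorted top-level entries of boot_dir."""
    usage = shutil.disk_usage(boot_dir)  # usage of the filesystem holding boot_dir
    percent_used = 100.0 * usage.used / usage.total
    entries = sorted(os.listdir(boot_dir))
    return {"percent_used": percent_used, "entries": entries}


def new_entries(before, after):
    """List entries present in the later snapshot but not in the earlier one."""
    return sorted(set(after["entries"]) - set(before["entries"]))


if __name__ == "__main__":
    snap = snapshot_boot()
    print("%.1f%% used, %d entries" % (snap["percent_used"], len(snap["entries"])))
```

Taking one snapshot after step 1 and one after each later step, then passing consecutive pairs to `new_entries`, should show which files each update left behind in /boot.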


Actual results:

The /boot partition usage reaches 100%. The updates do not fail to apply, but upon reboot the host enters a boot loop.


Expected results:

Each update should clean up the older kernels under /boot, or not place them there at all.
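
To illustrate the expected cleanup, here is a hedged sketch that only reports candidate leftovers and deletes nothing. It is not the imgbased implementation. It assumes that each remaining layer has a /boot subdirectory whose name starts with `rhvh-` and which contains a `vmlinuz-<version>` file, and that rescue images carry `-rescue-` in their names. The function names and the prefix list are hypothetical.

```python
import os

KERNEL_PREFIX = "vmlinuz-"
# Assumed set of per-kernel file prefixes found at the top level of /boot.
BOOT_FILE_PREFIXES = ("config-", "initramfs-", "symvers-", "System.map-", "vmlinuz-")


def kernel_versions_in_layers(boot_dir, layer_prefix="rhvh-"):
    """Collect kernel versions found inside the remaining layer subdirectories."""
    versions = set()
    for entry in os.scandir(boot_dir):
        if entry.is_dir() and entry.name.startswith(layer_prefix):
            for name in os.listdir(entry.path):
                if name.startswith(KERNEL_PREFIX):
                    versions.add(name[len(KERNEL_PREFIX):])
    return versions


def orphaned_boot_files(boot_dir="/boot", layer_prefix="rhvh-"):
    """List top-level boot files whose kernel version no remaining layer uses."""
    in_use = kernel_versions_in_layers(boot_dir, layer_prefix)
    orphans = []
    for entry in os.scandir(boot_dir):
        if not entry.is_file() or not entry.name.startswith(BOOT_FILE_PREFIXES):
            continue
        if "-rescue-" in entry.name:
            continue  # assumption: rescue images are never tied to a layer
        if not any(version in entry.name for version in in_use):
            orphans.append(entry.name)
    return sorted(orphans)
```

If no layer subdirectory is found, every matching file is reported, so the output is meant for manual review rather than as input to an automatic delete.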


Additional info:

RHVH 4.0 is not affected, because it does not place vmlinuz/System.map/initramfs under /boot.

See attached rhvh41_boot_partition_leaking.txt for full output.

Comment 2 Huijuan Zhao 2018-01-15 07:57:39 UTC
QE can reproduce this issue according to comment 0 and https://bugzilla.redhat.com/attachment.cgi?id=1380417

Comment 3 Huijuan Zhao 2018-01-17 07:02:02 UTC
Test version:
rhvh-4.2.1.1-0.20180115.0
imgbased-1.0.6-0.1.el7ev.noarch


1. Upgrade versions and test steps are the same as https://bugzilla.redhat.com/show_bug.cgi?id=1533931#c4


2. When upgrading only twice, the results are correct.
Test version:
Version 1: rhvh-4.1-0.20170417.0
Version 2: rhvh-4.1-0.20170616.0
Version 3: rhvh-4.2.1.1-0.20180115.0

Test results: 
After upgrading to rhvh-4.2.1.1-0.20180115.0, the files related to rhvh-4.1-0.20170417.0 were deleted from /boot. The deleted files are:
(1)config-3.10.0-514.16.1.el7.x86_64
(2)initramfs-3.10.0-514.16.1.el7.x86_64.img
(3)initramfs-3.10.0-514.16.1.el7.x86_64kdump.img
(4)symvers-3.10.0-514.16.1.el7.x86_64.gz
(5)System.map-3.10.0-514.16.1.el7.x86_64
(6)vmlinuz-3.10.0-514.16.1.el7.x86_64


I will therefore verify this bug once its status is ON_QA and Bug 1533931 is verified.
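
The check on the six deleted files can be expressed as a small script. This is a sketch only: the name patterns are taken from the list above, while the function name `leftover_boot_files` and the `boot_dir` parameter are hypothetical. It builds the six file names for a given kernel version and returns those that still exist.

```python
import os

# Name patterns taken from the six deleted files listed in this comment.
TEMPLATES = (
    "config-{v}",
    "initramfs-{v}.img",
    "initramfs-{v}kdump.img",
    "symvers-{v}.gz",
    "System.map-{v}",
    "vmlinuz-{v}",
)


def leftover_boot_files(kernel_version, boot_dir="/boot"):
    """Return the per-kernel files for kernel_version that still exist in boot_dir."""
    names = [template.format(v=kernel_version) for template in TEMPLATES]
    return [name for name in names if os.path.exists(os.path.join(boot_dir, name))]


if __name__ == "__main__":
    print(leftover_boot_files("3.10.0-514.16.1.el7.x86_64"))
```

An empty list for the kernel version of the removed layer would match the result reported here.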

Comment 4 Huijuan Zhao 2018-01-26 01:44:42 UTC
Changing the status to VERIFIED according to Comment 3.

Comment 5 Sandro Bonazzola 2018-02-12 11:51:06 UTC
This bugzilla is included in oVirt 4.2.1 release, published on Feb 12th 2018.

Since the problem described in this bug report should be
resolved in oVirt 4.2.1 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.