Bug 1724373 - Build 420.8.20190626.0 does not complete booting within libvirt/qemu environment
Summary: Build 420.8.20190626.0 does not complete booting within libvirt/qemu environment
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: RHCOS
Version: 4.2.0
Hardware: x86_64
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 4.2.0
Assignee: Ben Howard
QA Contact: Micah Abbott
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-06-26 23:05 UTC by Ryan Phillips
Modified: 2019-07-01 21:54 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-07-01 21:54:38 UTC
Target Upstream Version:
Embargoed:



Description Ryan Phillips 2019-06-26 23:05:02 UTC
Description of problem:

Booting RHCOS build 420.8.20190626.0 from https://releases-art-jenkins.cloud.paas.upshift.redhat.com/ fails: the image never finishes booting.

Boot logs: https://gist.githubusercontent.com/rphillips/0b58b383651866899b8977d72a1d66ce/raw/0cc34b01485719cbb2cc9fb62eae800d92729ab9/gistfile1.txt

Version-Release number of selected component (if applicable):


How reproducible:

Every time

Steps to Reproduce:
1. Update installer to use the new image:
   hack/update-rhcos-bootimage.py https://releases-art-jenkins.cloud.paas.upshift.redhat.com/storage/releases/rhcos-4.2/420.8.20190626.0/meta.json  
2. Create a libvirt installer instance.
3. The bootstrap machine does not complete booting.
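
For anyone scripting this reproduction, the stream and build ID can be read straight off the meta.json URL used in step 1. The following is a minimal sketch, not part of the installer: the helper name `build_from_meta_url` is hypothetical, and it assumes the URL always ends in `<stream>/<build id>/meta.json`, as the one above does.

```python
from typing import Tuple
from urllib.parse import urlparse


def build_from_meta_url(url: str) -> Tuple[str, str]:
    """Return (stream, build_id) from a .../<stream>/<build_id>/meta.json URL.

    Hypothetical helper: it assumes the path layout seen in step 1.
    """
    # Split the URL path into its non-empty components.
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) < 3 or parts[-1] != "meta.json":
        raise ValueError("expected a URL ending in <stream>/<build_id>/meta.json")
    return parts[-3], parts[-2]


if __name__ == "__main__":
    url = (
        "https://releases-art-jenkins.cloud.paas.upshift.redhat.com"
        "/storage/releases/rhcos-4.2/420.8.20190626.0/meta.json"
    )
    stream, build_id = build_from_meta_url(url)
    print(stream, build_id)
```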

Actual results:


Expected results:


Additional info:

Andrew Jeddeloh in #forum-coreos mentioned:

"""
https://github.com/coreos/coreos-assembler/blob/master/src/gf-anaconda-cleanup#L85 there's the problem
cosa master just switched off anaconda
looks like rhcos was relying on some anaconda behavior we no longer do
https://github.com/coreos/coreos-assembler/commit/f7b22b62ac656c406e19921ffb24e34eb25c826d
"""

Comment 1 Andrew Jeddeloh 2019-06-26 23:12:14 UTC
The hacks we carried in gf-anaconda-cleanup now need to move somewhere else. They might be able to go in the rpm-ostree treefile postprocess section, though I'm not sure if that can make changes to /var that will stick when doing ostree admin deploy.

Comment 2 Colin Walters 2019-06-27 12:55:09 UTC
We need to make those dirs in the image, not the ostree, since as you hint the ostree cannot contain files that go in /var.

Bigger picture, I think we *can* fix this, but I'm hesitant about the risk/reward right now. One option is to create an `rhcos-4.2` branch just before the anaconda rework and let this evolve in FCOS for a bit. But I'm going to spend a bit of time today trying to make it work before punting and saying we should branch.
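
To illustrate the point about /var: on an ostree-based system, /var is backed by a per-stateroot directory on the disk (`/ostree/deploy/<stateroot>/var`) and is not part of the commit, so any directories needed there have to be created in the image itself. The following is a rough sketch of that idea only; it is not the code from gf-anaconda-cleanup or the eventual fix. The function name, the `rhcos` stateroot name, and the directory list are all assumptions made for illustration.

```python
import os
import tempfile


def ensure_var_dirs(image_root, stateroot, subdirs):
    """Create directories under the stateroot's var in a mounted image root.

    Hypothetical helper: image_root is wherever the disk image's root
    filesystem is mounted; stateroot and subdirs are supplied by the caller.
    """
    # /var lives here on disk, outside the ostree commit.
    var_root = os.path.join(image_root, "ostree", "deploy", stateroot, "var")
    created = []
    for sub in subdirs:
        path = os.path.join(var_root, sub.lstrip("/"))
        os.makedirs(path, exist_ok=True)  # no error if it already exists
        created.append(path)
    return created


if __name__ == "__main__":
    # Demonstrate against a throwaway directory instead of a real image.
    with tempfile.TemporaryDirectory() as root:
        for p in ensure_var_dirs(root, "rhcos", ["home", "log/journal"]):
            print(os.path.relpath(p, root))
```

Doing this at image-build time keeps the ostree commit itself free of /var content, which matches the constraint described above.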

Comment 3 Colin Walters 2019-07-01 13:29:06 UTC
OK, I pushed https://github.com/coreos/coreos-assembler/tree/rhcos-4.2

Comment 4 Steve Milner 2019-07-01 13:47:40 UTC
I opened https://github.com/coreos/coreos-assembler/issues/581. We recently finished a decent amount of work to un-fork and get back on master and, while I agree we will want to branch, I'm really hoping we don't need to fork once more for stability.

In the meantime we'll work on switching DEV and ART over to the branch Walters put together.

Leaving this bug open, since what we are doing here is a workaround to un-break teams, not a fix.

Comment 5 Colin Walters 2019-07-01 21:54:38 UTC
Fixed in Build 420devel.8.20190701.0

