Bug 1700253 - Diskimage-builder fails to generate Octavia Amphora image using RHEL8 cloud image
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: diskimage-builder
Version: 15.0 (Stein)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: beta
Target Release: 15.0 (Stein)
Assignee: Nir Magnezi
QA Contact: Ofer Blaut
URL:
Whiteboard:
Depends On:
Blocks: 1623857 1708783
Reported: 2019-04-16 07:51 UTC by Nir Magnezi
Modified: 2019-09-26 10:49 UTC

Fixed In Version: diskimage-builder-2.23.1-0.20190531100400.0a44028.el8ost.noarch
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-09-21 11:21:11 UTC
Target Upstream Version:
Embargoed:


Attachments
dib.log (53.58 KB, text/plain), 2019-04-16 07:51 UTC, Nir Magnezi
dib-tracking.log (151.27 KB, text/plain), 2019-04-16 10:18 UTC, Nir Magnezi
diskimage-builder.log (545.68 KB, text/plain), 2019-05-23 12:29 UTC, Nir Magnezi


Links
System | ID | Private | Priority | Status | Summary | Last Updated
OpenStack gerrit | 643731 | 0 | 'None' | MERGED | Add version-less RHEL element for RHEL7 and RHEL8 | 2020-10-22 13:21:06 UTC
OpenStack gerrit | 647710 | 0 | 'None' | MERGED | Deprecate rhel7 in favor of rhel | 2020-10-22 13:21:19 UTC
Red Hat Product Errata | RHEA-2019:2811 | 0 | None | None | None | 2019-09-21 11:21:35 UTC

Description Nir Magnezi 2019-04-16 07:51:28 UTC
Created attachment 1555400 [details]
dib.log

Description of problem:
================
One of the main OSP15 RFEs (bug 1623857) is to have an Octavia Amphora image based on RHEL8.

For that, we proposed the following changes:
Octavia: https://review.openstack.org/#/c/600381/
Diskimage-builder: https://review.openstack.org/#/c/600890/

During development we test two main flows:
1. Minimal image - tested in collaboration with RelDel.
2. Using an existing RHEL cloud image that gets placed on the file system (similar to what we do in Octavia upstream CI for CentOS7 testing).

The issue I'm noticing is with the latter.

Version-Release number of selected component (if applicable):
==========================================
OSP15

How reproducible:
=============
Always

Steps to Reproduce:
==============
1. Fedora 28 machine (meaning the build runs under Python 3)
2. RHEL8 cloud image with rhos-release and OSP15 repositories.
3. export DIB_LOCAL_IMAGE=/path/to/rhel-guest-image-8.0-1854.x86_64.qcow2

Actual results:
====

INFO diskimage_builder.block_device.blockdevice [-] State already cleaned - no way to do anything here

It looks like it is coming from: https://github.com/openstack/diskimage-builder/blob/36b4bc87f940efff8eae9925f7864d776758fa5f/diskimage_builder/block_device/blockdevice.py#L427-L435 

See attached log

Comment 2 Nir Magnezi 2019-04-16 07:55:20 UTC
Correction to comment 0: the current patches are:

Octavia: https://review.openstack.org/#/c/638581/
Diskimage-builder: https://review.openstack.org/#/c/643731/

Comment 3 Carlos Goncalves 2019-04-16 08:53:40 UTC
Enable tracing for diskimage-builder with option -x on the diskimage-create.sh script.

Comment 4 Nir Magnezi 2019-04-16 10:18:21 UTC
Created attachment 1555454 [details]
dib-tracking.log

(In reply to Carlos Goncalves from comment #3)
> Enable tracing for diskimage-builder with option -x on the
> diskimage-create.sh script.

Done:

2019-04-16 10:15:37.095 | + diskimage_builder/lib/common-functions:cleanup_image_dir:228                     :   timeout 120 sh -c 'while ! sudo umount -f /tmp/dib_image.tYZYGdD1; do sleep 1; done'
2019-04-16 10:15:37.122 | + diskimage_builder/lib/common-functions:cleanup_image_dir:233                     :   rm -rf --one-file-system /tmp/dib_image.tYZYGdD1
2019-04-16 10:15:37.125 | + diskimage_builder/lib/img-functions:trap_cleanup:38                              :   exit 1

Comment 5 Carlos Goncalves 2019-04-16 10:21:51 UTC
Scroll up.

2019-04-16 10:15:35.108 | ++ command -v python3
2019-04-16 10:15:35.109 | + python_path=
2019-04-16 10:15:35.113 | ++ diskimage_builder/lib/img-functions:run_in_target:59                             :   check_break after-error run_in_target bash

Comment 6 Ian Wienand 2019-04-17 08:25:50 UTC
(In reply to Carlos Goncalves from comment #5)
> Scroll up.
> 
> 2019-04-16 10:15:35.108 | ++ command -v python3
> 2019-04-16 10:15:35.109 | + python_path=
> 2019-04-16 10:15:35.113 | ++ diskimage_builder/lib/img-functions:run_in_target:59                             :   check_break after-error run_in_target bash

Right, so the ultimate problem here, I think, is that python3 (the user package) is not installed.  The system python3 version is.  Note this is handled in the yum-minimal path at [1] but not in this version that is using the upstream cloud image.  It's tricky: we need to get it installed very early, as we're essentially depending on it.  I'm uploading the qcow2 to my build environment to do some more testing; it's taking a while, but I will see what I can come up with.

[1] https://review.openstack.org/#/c/643731/17/diskimage_builder/elements/yum-minimal/root.d/08-yum-chroot
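
The empty `python_path=` in the trace can be reproduced outside the build with a small lookup. The following is a diagnostic sketch only, not the change that was merged: `find_python3` is a hypothetical helper name, and `/usr/libexec/platform-python` is the RHEL 8 system interpreter, which is not on `PATH`.

```shell
#!/bin/bash
# Hypothetical helper: print the path of a usable Python 3 interpreter,
# or return non-zero if none is found.
find_python3() {
    local candidate
    # Candidates are tried in order. The last one is the RHEL 8 system
    # interpreter, which "command -v python3" does not find.
    for candidate in python3 /usr/libexec/platform-python; do
        if command -v "$candidate" >/dev/null 2>&1; then
            command -v "$candidate"
            return 0
        fi
    done
    return 1
}

if python_path=$(find_python3); then
    echo "using interpreter: $python_path"
else
    echo "no Python 3 interpreter found" >&2
fi
```

Run inside the chroot of the cloud image, a result pointing at `platform-python` would match the diagnosis above: the system interpreter is present while the user-facing `python3` package is not.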

Comment 8 Ian Wienand 2019-05-14 08:33:02 UTC
I believe the changes in https://review.opendev.org/#/c/653646/ are sufficient to get things basically working

The problem is that it creates an image with no package repositories at all.  

Now that rhel8 is released, it seems like

https://opendev.org/openstack/diskimage-builder/src/branch/master/diskimage_builder/elements/rhel-common

can be updated so that it can actually subscribe the system to the right bits and pieces.  I can't say I have any experience in this, but this is what I think is needed on top of 653646.

Comment 10 Jean Paul Gatt 2019-05-17 07:52:01 UTC
These lines in "rhel-common/pre-install.d/00-rhel-registration" cause the build process to fail on RHEL8 when using satellite as the registration source.

repos="repos --enable rhel-7-server-rpms"
satellite_repo="rhel-7-server-rh-common-rpms"

I further patched on top of https://review.opendev.org/#/c/653646/ by adding 

if [ "${DIB_RELEASE:-7}" == "7" ]; then
    repos="repos --enable rhel-7-server-rpms"
    satellite_repo="rhel-7-server-rh-common-rpms"
elif [ "${DIB_RELEASE}" == "8" ]; then
    repos="repos --enable rhel-8-for-x86_64-appstream-rpms --enable rhel-8-for-x86_64-baseos-rpms"
    satellite_repo="satellite-tools-6.5-for-rhel-8-x86_64-rpms"
fi

I am also using the os-collect-config element. I had to add the following to tripleo-image-elements/os-collect-config/pkg-map, in the family section:

    "redhat": {
      "python-dev": "python2-devel",
      "python3-dev": "python3-devel"
    }
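
The release check earlier in this comment could also be written as a `case` statement that fails loudly on an unexpected `DIB_RELEASE` instead of leaving `repos` and `satellite_repo` unset. This is a minimal sketch, assuming a hypothetical function name `select_rhel_repos` that is not part of rhel-common; the repository names are copied from the snippet above.

```shell
#!/bin/bash
# Hypothetical wrapper around the repository selection shown above.
# Sets the globals "repos" and "satellite_repo"; returns 1 for any
# release other than 7 or 8.
select_rhel_repos() {
    local release="${1:-7}"
    case "$release" in
        7)
            repos="repos --enable rhel-7-server-rpms"
            satellite_repo="rhel-7-server-rh-common-rpms"
            ;;
        8)
            repos="repos --enable rhel-8-for-x86_64-appstream-rpms --enable rhel-8-for-x86_64-baseos-rpms"
            satellite_repo="satellite-tools-6.5-for-rhel-8-x86_64-rpms"
            ;;
        *)
            echo "unsupported DIB_RELEASE: $release" >&2
            return 1
            ;;
    esac
}

# Default to release 7 when DIB_RELEASE is unset, as the original check does.
select_rhel_repos "${DIB_RELEASE:-7}" || exit 1
echo "$repos"
```

Returning an error for unknown releases makes a future RHEL version fail at registration time rather than later with an empty repository list.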

Comment 11 Nir Magnezi 2019-05-23 12:28:34 UTC
Tested this again, against a rhel8 cloud image: rhel-guest-image-8.0-1854.x86_64.qcow2   
Currently what I get is a failure to boot with: "Failed to mount /sysroot."
log: http://paste.openstack.org/show/751983/

Ian, could you please take a look? I'll follow up in the next comment and attach the diskimage-builder build log.

Comment 12 Nir Magnezi 2019-05-23 12:29:29 UTC
Created attachment 1572467 [details]
diskimage-builder.log

Comment 13 Ian Wienand 2019-05-24 07:06:38 UTC
I've made a few comments in https://review.opendev.org/#/c/643731/ ... I've run out of time for right now but will do some local builds and take a look early next week

Comment 14 Carlos Goncalves 2019-05-27 21:38:32 UTC
I tried building one myself using [7] and [8] but hit the same boot problem. The cause appears to be that diskimage-builder creates an ext4 file system image by default. Instructing diskimage-builder to create an XFS file system (the default in RHEL) makes the built image bootable.
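
One way to express that choice before starting a build is sketched below. It rests on assumptions: `FS_TYPE` is taken to be the environment variable that diskimage-builder reads for the root file system type, and both `default_fs_type` and `BASE_OS` are hypothetical names used only for illustration.

```shell
#!/bin/bash
# Hypothetical helper: choose a root file system type for a base OS.
# The mapping reflects only what this report states: XFS is the RHEL
# default, ext4 is the diskimage-builder default.
default_fs_type() {
    case "$1" in
        rhel*) echo "xfs" ;;
        *)     echo "ext4" ;;
    esac
}

# FS_TYPE is assumed to be the variable the build reads. A value the
# caller already exported is kept; BASE_OS is a hypothetical input.
export FS_TYPE="${FS_TYPE:-$(default_fs_type "${BASE_OS:-rhel}")}"
echo "building with FS_TYPE=$FS_TYPE"
```

Keeping any value that is already exported lets a caller override the mapping without editing the script.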

Comment 15 Carlos Goncalves 2019-05-27 21:39:45 UTC
Missed the references in my previous comment:

https://review.opendev.org/#/c/638581/
https://review.opendev.org/#/c/643731/

Comment 16 Nir Magnezi 2019-06-05 13:19:05 UTC
RelDel synced the OSP15 branch with upstream master, and since both [1] and [2] got merged we now have them as a part of diskimage-builder-2.23.1-0.20190531100400.0a44028.el8ost.noarch  

[1] https://review.opendev.org/#/c/643731/
[2] https://review.opendev.org/#/c/647710/

Comment 24 errata-xmlrpc 2019-09-21 11:21:11 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:2811

