Created attachment 1555400 [details]
dib.log

Description of problem:
================
One of the main OSP15 RFEs (bug 1623857) is to have an Octavia Amphora image based on RHEL8. For that, we proposed the following changes:

Octavia: https://review.openstack.org/#/c/600381/
Diskimage-builder: https://review.openstack.org/#/c/600890/

During development we test two main flows:
1. Minimal image - tested in collaboration with RelDel.
2. Using an existing RHEL cloud image that gets placed on the file system (similar to what we do in Octavia upstream CI for CentOS7 testing).

The issue I'm noticing is with the latter.

Version-Release number of selected component (if applicable):
==========================================
OSP15

How reproducible:
=============
Always

Steps to Reproduce:
==============
1. Fedora28 machine (meaning using Python3)
2. RHEL8 cloud image with rhos-release and OSP15 repositories.
3. export DIB_LOCAL_IMAGE=/path/to/rhel-guest-image-8.0-1854.x86_64.qcow2

Actual results:
====
INFO diskimage_builder.block_device.blockdevice [-] State already cleaned - no way to do anything here

It looks like it is coming from:
https://github.com/openstack/diskimage-builder/blob/36b4bc87f940efff8eae9925f7864d776758fa5f/diskimage_builder/block_device/blockdevice.py#L427-L435

See attached log
Correction to comment 0, the current patches are:

Octavia: https://review.openstack.org/#/c/638581/
Diskimage-builder: https://review.openstack.org/#/c/643731/
Enable tracing for diskimage-builder with option -x on the diskimage-create.sh script.
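For anyone reproducing this, a minimal sketch of capturing the trace to a file. The helper name run_with_tracing and the log file name are hypothetical; only the -x option comes from the comment above, and any other options the build needs are left out here.

```shell
#!/bin/sh
# Hypothetical helper: run an image-build script with tracing enabled and
# keep a copy of everything it prints.
#   $1: path to the script (for example diskimage-create.sh)
#   $2: log file to write
run_with_tracing() {
    # -x is the tracing option mentioned above; stderr is merged into stdout
    # so the trace lines end up in the log as well.
    "$1" -x 2>&1 | tee "$2"
}

# Example (paths are assumptions):
#   run_with_tracing ./diskimage-create.sh dib-trace.log
```

The resulting log can then be attached to the bug as-is.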
Created attachment 1555454 [details]
dib-tracking.log

(In reply to Carlos Goncalves from comment #3)
> Enable tracing for diskimage-builder with option -x on the
> diskimage-create.sh script.

Done:

2019-04-16 10:15:37.095 | + diskimage_builder/lib/common-functions:cleanup_image_dir:228 : timeout 120 sh -c 'while ! sudo umount -f /tmp/dib_image.tYZYGdD1; do sleep 1; done'
2019-04-16 10:15:37.122 | + diskimage_builder/lib/common-functions:cleanup_image_dir:233 : rm -rf --one-file-system /tmp/dib_image.tYZYGdD1
2019-04-16 10:15:37.125 | + diskimage_builder/lib/img-functions:trap_cleanup:38 : exit 1
Scroll up.

2019-04-16 10:15:35.108 | ++ command -v python3
2019-04-16 10:15:35.109 | + python_path=
2019-04-16 10:15:35.113 | ++ diskimage_builder/lib/img-functions:run_in_target:59 : check_break after-error run_in_target bash
(In reply to Carlos Goncalves from comment #5)
> Scroll up.
>
> 2019-04-16 10:15:35.108 | ++ command -v python3
> 2019-04-16 10:15:35.109 | + python_path=
> 2019-04-16 10:15:35.113 | ++
> diskimage_builder/lib/img-functions:run_in_target:59
> : check_break after-error run_in_target bash

Right, so I think the ultimate problem here is that python3 (the user package) is not installed. The system python3 version is. Note this is handled in the yum-minimal path at [1], but not in this path, which uses the upstream cloud image.

It's tricky: we need to get it installed very early, as we're essentially depending on it. I'm uploading the qcow2 to my build environment to do some more testing; it's taking a while, but I will see what I can come up with.

[1] https://review.openstack.org/#/c/643731/17/diskimage_builder/elements/yum-minimal/root.d/08-yum-chroot
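To illustrate the empty python_path in the trace, here is a rough sketch of an interpreter lookup with a fallback. It is not the diskimage-builder code and not the fix under review. The function names are hypothetical, and the fallback path /usr/libexec/platform-python is where RHEL 8 keeps its system interpreter, outside PATH.

```shell
#!/bin/sh
# Print the first candidate that resolves to an executable; return 1 if none
# of them do. Candidates may be command names (looked up on PATH) or
# absolute paths.
first_available() {
    for candidate in "$@"; do
        if resolved=$(command -v "$candidate" 2>/dev/null) && [ -n "$resolved" ]; then
            printf '%s\n' "$resolved"
            return 0
        fi
    done
    return 1
}

# Assumed candidate order: the user-facing python3 first, then the RHEL 8
# system interpreter.
find_python() {
    first_available python3 /usr/libexec/platform-python
}

# Example:
#   python_path=$(find_python) || echo "no python interpreter found" >&2
```

On an image where only the system interpreter exists, a plain "command -v python3" prints nothing, which matches the empty assignment seen in the log.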
I believe the changes in https://review.opendev.org/#/c/653646/ are sufficient to get things basically working. The problem is that it creates an image with no package repositories at all.

Now that rhel8 is released, it seems like https://opendev.org/openstack/diskimage-builder/src/branch/master/diskimage_builder/elements/rhel-common can be updated so that it can actually subscribe the system to the right bits and pieces. I can't say I have any experience in this, but this is what I think is needed on top of 653646.
These lines in "rhel-common/pre-install.d/00-rhel-registration" cause the build process to fail on RHEL8 when using satellite as the registration source.

    repos="repos --enable rhel-7-server-rpms"
    satellite_repo="rhel-7-server-rh-common-rpms"

I further patched on top of https://review.opendev.org/#/c/653646/ by adding

    if [ "${DIB_RELEASE:-7}" == "7" ]; then
        repos="repos --enable rhel-7-server-rpms"
        satellite_repo="rhel-7-server-rh-common-rpms"
    elif [ "${DIB_RELEASE}" == "8" ]; then
        repos="repos --enable rhel-8-for-x86_64-appstream-rpms --enable rhel-8-for-x86_64-baseos-rpms"
        satellite_repo="satellite-tools-6.5-for-rhel-8-x86_64-rpms"
    fi

I am also using the os-collect-config element. I had to add the following to the file tripleo-image-elements/os-collect-config/pkg-map, in the family section:

    "redhat": {
        "python-dev": "python2-devel",
        "python3-dev": "python3-devel"
    }
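The same selection can be sketched as a standalone function, which makes it easy to check outside a build. This is only an illustration: the function name rhel_repos_for_release is hypothetical and not part of rhel-common, and failing on an unknown release is an added assumption (the patch above leaves both variables unset in that case). The repository names are the ones quoted above.

```shell
#!/bin/sh
# Print two lines for a RHEL major release: the repo arguments, then the
# Satellite tools repository. Defaults to release 7 when no argument is
# given; returns 1 for any release not listed.
rhel_repos_for_release() {
    case "${1:-7}" in
        7)
            echo "repos --enable rhel-7-server-rpms"
            echo "rhel-7-server-rh-common-rpms"
            ;;
        8)
            echo "repos --enable rhel-8-for-x86_64-appstream-rpms --enable rhel-8-for-x86_64-baseos-rpms"
            echo "satellite-tools-6.5-for-rhel-8-x86_64-rpms"
            ;;
        *)
            echo "unsupported DIB_RELEASE: $1" >&2
            return 1
            ;;
    esac
}

# Example:
#   rhel_repos_for_release "${DIB_RELEASE:-7}"
```

Using a case statement also avoids the "==" comparison inside "[ ]", which only works in bash and some other shells.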
Tested this again, against a rhel8 cloud image: rhel-guest-image-8.0-1854.x86_64.qcow2

Currently what I get is a failure to boot with:
"Failed to mount /sysroot."

log: http://paste.openstack.org/show/751983/

Ian, could you please take a look?

I'll follow up in the next comment and attach the diskimage-builder image building log.
Created attachment 1572467 [details] diskimage-builder.log
I've made a few comments in https://review.opendev.org/#/c/643731/ ... I've run out of time for right now, but I will do some local builds and take a look early next week.
I tried building one myself using [7] and [8] but I faced the same boot problem.

The problem appears to be that diskimage-builder creates an ext4 file system image by default. Instructing diskimage-builder to create an XFS file system (the default in RHEL) makes the built image bootable.
Missed the references in my previous comment: https://review.opendev.org/#/c/638581/ https://review.opendev.org/#/c/643731/
RelDel synced the OSP15 branch with upstream master, and since both [1] and [2] got merged we now have them as a part of diskimage-builder-2.23.1-0.20190531100400.0a44028.el8ost.noarch

[1] https://review.opendev.org/#/c/643731/
[2] https://review.opendev.org/#/c/647710/
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2019:2811