Description of problem:
Building an image using the toml file (and Ansible roles) from https://github.com/myllynen/rhel-image works with composer-cli-28.14.62-1.el8.x86_64 but fails using weldr-client-35.3-2.el8.x86_64. Tried both as root and as the dedicated user.

There are several warnings in the logs. Ideally we'd have clean logs for successful builds, as the warnings might cause confusion or false alarms among users. I'm not sure whether these are package-specific issues or something else. For example:

Output:
[/usr/lib/tmpfiles.d/grafana.conf:1] Unknown user 'grafana'.
[/usr/lib/tmpfiles.d/journal-nocow.conf:26] Failed to resolve specifier: uninitialized /etc detected, skipping
[/usr/lib/tmpfiles.d/rpcbind.conf:2] Unknown user 'rpc'.

The real issue seems to be:

Failed to create file /sys/fs/selinux/checkreqprot: Read-only file system
truncate: Invalid number: ‘18446744073709549623’: Value too large for defined data type
Traceback (most recent call last):
  File "/run/osbuild/bin/org.osbuild.truncate", line 54, in <module>
    ret = main(args["tree"], args["options"])
  File "/run/osbuild/bin/org.osbuild.truncate", line 47, in main
    subprocess.run(["truncate", "--size", size, dest], check=True)
  File "/usr/lib64/python3.6/subprocess.py", line 438, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['truncate', '--size', '18446744073709549623', '/run/osbuild/tree/disk.img']' returned non-zero exit status 1.

This should be easy to reproduce with the toml file in the Git repo. Thanks.

Version-Release number of selected component (if applicable):
composer-cli-28.14.62-1.el8.x86_64
weldr-client-35.3-2.el8.x86_64
This isn't related to composer-cli; all it does is pass the blueprint to the server. What versions of osbuild-composer and osbuild were you using when it worked, and what versions when it failed? That error traceback is Python, from osbuild's org.osbuild.truncate module.
Right, I had updated only weldr-client and then started seeing the issue. Now tested again with:

osbuild-43-1.el8.noarch
osbuild-composer-40-1.el8.x86_64
osbuild-composer-core-40-1.el8.x86_64
osbuild-composer-dnf-json-40-1.el8.x86_64
osbuild-composer-worker-40-1.el8.x86_64
osbuild-ostree-43-1.el8.noarch
osbuild-selinux-43-1.el8.noarch
python3-osbuild-43-1.el8.noarch
selinux-policy-3.14.3-85.el8.noarch
selinux-policy-targeted-3.14.3-85.el8.noarch
weldr-client-35.3-2.el8.x86_64

And see this:

Stage org.osbuild.sfdisk
Output:
[/usr/lib/tmpfiles.d/journal-nocow.conf:26] Failed to resolve specifier: uninitialized /etc detected, skipping
All rules containing unresolvable specifiers will be skipped.
Failed to create file /sys/fs/selinux/checkreqprot: Read-only file system
label: gpt
label-id: D209C89E-EA5E-4FBD-B161-B461CCE297E0
start="2048", size="2048", type="21686148-6449-6E6F-744E-656564454649", uuid="FAC7F1FB-3E8D-4137-A512-961DE09A5549", bootable
start="4096", size="204800", type="C12A7328-F81F-11D2-BA4B-00A0C93EC93B", uuid="68B2905B-DF3E-4FB3-80FA-49D1E773AA33"
start="208896", size="18446744073709343024", type="0FC63DAF-8483-4772-8E79-3D69D8477DE4", uuid="6264D520-3FB9-423F-8AB8-7A0A8E3D3562"
Sector 2048 already used.
Failed to add #1 partition: Numerical result out of range
Traceback (most recent call last):
  File "/run/osbuild/bin/org.osbuild.sfdisk", line 203, in <module>
    ret = main(args["devices"], args["options"])
  File "/run/osbuild/bin/org.osbuild.sfdisk", line 194, in main
    pt.write_to(device)
  File "/run/osbuild/bin/org.osbuild.sfdisk", line 151, in write_to
    check=True)
  File "/usr/lib64/python3.6/subprocess.py", line 438, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['sfdisk', '-q', '--no-tell-kernel', '/dev/loop0']' returned non-zero exit status 1.

If this is a hiccup with the toml file, then it's a bit hard to tell from the above message, especially since it works on RHEL 8.5. Thanks.
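A side note on the numbers in these logs: both oversized values (the truncate size from the original report and the size of the third partition above) sit just below 2^64. One possible reading, which is an inference from the arithmetic and not something the logs state, is that a negative size wrapped around as an unsigned 64-bit integer. The sketch below only reinterprets the two logged values; the helper name as_signed64 is made up for illustration and says nothing about how osbuild computes sizes internally.

```python
# Sketch: reinterpret the huge sizes from the logs as signed 64-bit values.
# Assumption: the values are negative numbers that wrapped around as uint64.
U64 = 2**64


def as_signed64(value: int) -> int:
    """Reinterpret an unsigned 64-bit integer as a signed 64-bit integer."""
    return value - U64 if value >= 2**63 else value


truncate_size = 18446744073709549623  # from the org.osbuild.truncate error
sfdisk_size = 18446744073709343024    # third partition size, from sfdisk

print(as_signed64(truncate_size))  # -1993
print(as_signed64(sfdisk_size))    # -208592
```

If that reading is right, the requested image was smaller than the space already taken by the preceding partitions, which would point at the size passed to the build more than at the blueprint contents.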
Hello Marko, can you post the blueprint that doesn't work for you as an attachment? I'm not sure if this bug is up-to-date with the repository you linked.
Thanks for looking into this. Here's a permalink to the blueprint:

https://github.com/myllynen/rhel-image/blob/d551286c8a7a2c42c3bb5c44f063abed6708d6a8/base-image.toml

The build is OK with RHEL 8.5 packages but fails with these:

# rpm -qa | grep -i -e osbuild -e weldr | sort
osbuild-52-1.el8eng.noarch
osbuild-composer-46-1.el8.x86_64
osbuild-composer-core-46-1.el8.x86_64
osbuild-composer-dnf-json-46-1.el8.x86_64
osbuild-composer-worker-46-1.el8.x86_64
osbuild-luks2-52-1.el8eng.noarch
osbuild-lvm2-52-1.el8eng.noarch
osbuild-ostree-52-1.el8eng.noarch
osbuild-selinux-52-1.el8eng.noarch
osbuild-tools-52-1.el8eng.noarch
python3-osbuild-52-1.el8eng.noarch
weldr-client-35.5-1.el8.x86_64

Thanks.
Hi Marko, sorry, I completely forgot about this. Can you add selinux-policy-targeted into the list of packages in the blueprint? We are currently observing a bug that causes osbuild-composer to not be able to depsolve packages with conditional dependencies. Thanks, Ondřej
Hi Ondřej, I added selinux-policy-targeted into the list of packages in the blueprint but still see the same partition creation related issue. I'm attaching a slightly more complete output from my test. I used the blueprint I linked above so this should be straightforward for you to reproduce. Thanks.
Created attachment 1874995 [details] test output
Hi @myllynen, I tried your blueprint on a fresh RHEL 8.6 install, and it built alright:

$ rpm -qa | grep -i -e osbuild -e weldr | sort
osbuild-53-2.el8.noarch
osbuild-composer-46.3-1.el8_6.x86_64
osbuild-composer-core-46.3-1.el8_6.x86_64
osbuild-composer-dnf-json-46.3-1.el8_6.x86_64
osbuild-composer-worker-46.3-1.el8_6.x86_64
osbuild-luks2-53-2.el8.noarch
osbuild-lvm2-53-2.el8.noarch
osbuild-ostree-53-2.el8.noarch
osbuild-selinux-53-2.el8.noarch
python3-osbuild-53-2.el8.noarch
weldr-client-35.5-1.el8.x86_64

Can you retest? I'm not entirely sure what else to do here because you used packages that weren't shipped to customers. We only shipped the following ones:

dnf list --showduplicates weldr-client composer-cli osbuild osbuild-composer
Installed Packages
weldr-client.x86_64       35.5-1.el8         @rhel-8-for-x86_64-appstream-rpms
Available Packages
composer-cli.x86_64       28.14.23-1.el8     rhel-8-for-x86_64-appstream-rpms
composer-cli.x86_64       28.14.23-5.el8_0   rhel-8-for-x86_64-appstream-rpms
composer-cli.x86_64       28.14.23-7.el8_0   rhel-8-for-x86_64-appstream-rpms
composer-cli.x86_64       28.14.30-1.el8     rhel-8-for-x86_64-appstream-rpms
composer-cli.x86_64       28.14.42-1.el8     rhel-8-for-x86_64-appstream-rpms
composer-cli.x86_64       28.14.42-2.el8_2   rhel-8-for-x86_64-appstream-rpms
composer-cli.x86_64       28.14.55-2.el8     rhel-8-for-x86_64-appstream-rpms
composer-cli.x86_64       28.14.58-1.el8     rhel-8-for-x86_64-appstream-rpms
composer-cli.x86_64       28.14.62-1.el8     rhel-8-for-x86_64-appstream-rpms
composer-cli.x86_64       28.14.68-1.el8     rhel-8-for-x86_64-appstream-rpms
osbuild.noarch            18-3.el8           rhel-8-for-x86_64-appstream-rpms
osbuild.noarch            27.2-1.el8         rhel-8-for-x86_64-appstream-rpms
osbuild.noarch            27.3-2.el8_4       rhel-8-for-x86_64-appstream-rpms
osbuild.noarch            35-3.el8           rhel-8-for-x86_64-appstream-rpms
osbuild.noarch            53-2.el8           rhel-8-for-x86_64-appstream-rpms
osbuild-composer.x86_64   20.1-1.el8         rhel-8-for-x86_64-appstream-rpms
osbuild-composer.x86_64   28.4-1.el8         rhel-8-for-x86_64-appstream-rpms
osbuild-composer.x86_64   28.6-1.el8_4       rhel-8-for-x86_64-appstream-rpms
osbuild-composer.x86_64   28.7-1.el8_4       rhel-8-for-x86_64-appstream-rpms
osbuild-composer.x86_64   33.2-1.el8         rhel-8-for-x86_64-appstream-rpms
osbuild-composer.x86_64   46.1-1.el8         rhel-8-for-x86_64-appstream-rpms
osbuild-composer.x86_64   46.3-1.el8_6       rhel-8-for-x86_64-appstream-rpms
weldr-client.x86_64       35.5-1.el8         rhel-8-for-x86_64-appstream-rpms
I retried and still see the failure. Here are the steps I took this time:

1) Installed a completely new RHEL 8.6 Server test VM using the RHEL 8.6 DVD ISO and Minimal install
2) Subscribed the system to RHSM
3) yum update and reboot
4) yum install composer-cli git-core osbuild-composer tar wget; systemctl enable osbuild-composer.socket and reboot
5) Used wget to fetch the reproducer toml file

Then used the following commands as root:

# composer-cli blueprints push base-image.toml
# composer-cli blueprints depsolve base-image
# composer-cli compose start --size 20480 base-image qcow2
# composer-cli compose info 82e3f27f-9ca9-4914-a3c6-25cf1a7fabd8 | grep FAILED
82e3f27f-9ca9-4914-a3c6-25cf1a7fabd8 FAILED base-image 2022.01.24 qcow2 20480

Downloading the build logs again shows the same XFS-related message, but nothing really helpful. At this point, as an end user, I can't see a way to investigate this further. Is it something to do with the VM, or disk/partitioning/storage, or the blueprint, or the commands, or SELinux, or permissions, or a missing dependency, or something else? There are no hints whatsoever in the logs.

Would it be possible for you to ping me over chat? I would then give you full access to this test VM so that you could investigate this further hands-on.

Thanks.
Thanks, Marko, for giving me access. I found two issues.

Firstly, let's look at the documentation for the --size parameter of composer-cli compose start:

--size uint Size of image in MiB

The thing is that the Weldr API accepts the size parameter in bytes instead of MiB. The old composer-cli converted the units from MiB to bytes before sending the compose request, but weldr-client doesn't do it. @bcl, can you fix that?

@elpereir, this would be valuable in the 8.7 known issues:

The --size parameter of composer-cli compose start is documented to be in MiB, but this is currently broken and composer-cli treats it as bytes instead. The workaround is to multiply the size by 1048576. The better workaround is to use customizations.filesystem, which allows more granular control over filesystems and accepts units like MiB or GiB.

Marko, the workaround for you is the same: either use --size 21474836480 when using the buggy weldr-client version, or switch to [customizations.filesystem] (preferred), see https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/composing_a_customized_rhel_system_image/creating-system-images-with-composer-command-line-interface_composing-a-customized-rhel-system-image#image-customizations_creating-system-images-with-composer-command-line-interface.

----

The next issue is that osbuild-composer allowed building an image so small that it couldn't fit an empty XFS root partition. The reason is that the 8.5 image definitions shipped in the 8.6 version of osbuild-composer don't specify a minimum size for the / partition, which is a bug. Note that building 8.6 on 8.6 is fine because the minimum size is there. I considered whether we want to fix this in 8.6, but since building 8.6 on 8.6 isn't affected and 8.5 is already EOL, I don't find it very important. Note that this is fixed in 8.7/9.1 because that version uses unified image definitions (and thus partition tables) for all minor versions of both RHEL 8 and 9.
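For anyone scripting around the affected weldr-client, the first workaround can be sketched roughly as follows. This is only an illustration: the helper names size_arg_for_affected_client and compose_start_command are made up here, and the command line is just the one used earlier in this bug with the size converted from MiB to bytes.

```python
from typing import List

MIB = 1024 * 1024  # 1048576 bytes, the factor named in the workaround


def size_arg_for_affected_client(size_mib: int) -> str:
    """Return the --size value in bytes for a client that skips the MiB conversion."""
    if size_mib <= 0:
        raise ValueError("image size must be a positive number of MiB")
    return str(size_mib * MIB)


def compose_start_command(blueprint: str, image_type: str, size_mib: int) -> List[str]:
    """Build the argument list only; running it is left to the caller."""
    return [
        "composer-cli", "compose", "start",
        "--size", size_arg_for_affected_client(size_mib),
        blueprint, image_type,
    ]


print(" ".join(compose_start_command("base-image", "qcow2", 20480)))
# composer-cli compose start --size 21474836480 base-image qcow2
```

Once the client is fixed, the multiplication has to be dropped again, which is one more reason to prefer [customizations.filesystem].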
Hello @obudai, is this Known Issue still valid for inclusion in the RHEL 8.8 release? Thank you so much.
@bcl Seems like the fix landed in weldr-client 35.7. Do you have plans for rebasing the client in 8.8 and 9.2?
Yeah, that should be safe. The new functions (eg. diff) fail gracefully when the server doesn't support them so a rebase should be ok.
# rpm -q weldr-client
weldr-client-35.9-2.el8.x86_64
# composer-cli compose start --size 20480 base-image qcow2
# composer-cli compose status
ID                                     Status     Time                       Blueprint    Version      Type    Size
8014a5a8-9f45-4c4d-bd90-52cee9ffa050   FINISHED   Mon Feb 20 11:22:38 2023   base-image   2022.01.24   qcow2   21474836480

Moving to verified.
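The check behind this verification is that the Size column, which is in bytes, equals the requested 20480 MiB times 1048576. A rough sketch of that comparison is below; it assumes the status row has the whitespace-separated layout shown above (status in the second field, size in the last), and the function name size_matches is made up for illustration.

```python
MIB = 1024 * 1024  # bytes per MiB


def size_matches(status_row: str, requested_mib: int) -> bool:
    """Check that a 'compose status' row is FINISHED with the expected byte size.

    Assumes the row layout seen in this bug: ID, Status, ..., Size (bytes).
    """
    fields = status_row.split()
    status, size_bytes = fields[1], int(fields[-1])
    return status == "FINISHED" and size_bytes == requested_mib * MIB


row = (
    "8014a5a8-9f45-4c4d-bd90-52cee9ffa050 FINISHED Mon Feb 20 11:22:38 2023 "
    "base-image 2022.01.24 qcow2 21474836480"
)
print(size_matches(row, 20480))  # True
```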