Bug 2048789 - broken toolbox in OCP 4.10 with non-default image
Summary: broken toolbox in OCP 4.10 with non-default image
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: RHCOS
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.12.0
Assignee: Timothée Ravier
QA Contact: HuijingHei
URL:
Whiteboard:
Depends On: 2093040
Blocks: 2105456
Reported: 2022-01-31 20:11 UTC by Andreas Karis
Modified: 2023-01-17 19:47 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
* Previously, updating to Podman 4.0 prevented users from using custom images with toolbox containers on {op-system}. This fix updates the toolbox library code to account for the new Podman behavior, so users can now use custom images with toolbox on {op-system} as expected. (link:https://bugzilla.redhat.com/show_bug.cgi?id=2048789[*BZ#2048789*])
Clone Of:
: 2105456 (view as bug list)
Environment:
Last Closed: 2023-01-17 19:47:08 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Github coreos toolbox pull 73 0 None open rhcos-toolbox: check for an empty RUN label 2022-02-02 13:43:43 UTC
Red Hat Product Errata RHSA-2022:7399 0 None None None 2023-01-17 19:47:32 UTC

Description Andreas Karis 2022-01-31 20:11:23 UTC
Hi,

In 4.10, it is not possible to change the toolbox image:
https://docs.openshift.com/container-platform/4.9/support/gathering-cluster-data.html#starting-an-alternative-image-with-toolbox_gathering-cluster-data

sh-4.4# cat ~/.toolboxrc 
REGISTRY=registry.fedoraproject.org
IMAGE=f33/fedora-toolbox
sh-4.4# toolbox
.toolboxrc file detected, overriding defaults...
Spawning a container 'toolbox-root' with image 'registry.fedoraproject.org/f33/fedora-toolbox'
Detected RUN label in the container image. Using that as the default...
Error: cannot find the value of label: RUN in image: registry.fedoraproject.org/f33/fedora-toolbox
/usr/bin/toolbox: failed to runlabel on image 'registry.fedoraproject.org/f33/fedora-toolbox'
sh-4.4# 


The problem is that `podman image inspect` yields "<no value>" for the missing label, so this if condition in /usr/bin/toolbox misbehaves:

~~~
if ! container_exists; then
    echo "Spawning a container '$TOOLBOX_NAME' with image '$TOOLBOX_IMAGE'"
    if [[ -z "$runlabel" ]]; then
        container_run
        return
    else
        echo "Detected RUN label in the container image. Using that as the default..."
        container_runlabel
        return
    fi
~~~
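
To illustrate why the `-z` test picks the wrong branch, here is a minimal sketch that needs no podman. `choose_branch` is a hypothetical helper, not part of /usr/bin/toolbox; it only prints the name of the function the script would call for a given `$runlabel` value. It uses the POSIX `[ -z ... ]` form, which I assume behaves like the script's `[[ -z ... ]]` for this check:

```shell
# Hypothetical helper for illustration only (not in /usr/bin/toolbox).
# $1 stands in for the value the script stores in $runlabel.
choose_branch() {
    if [ -z "$1" ]; then
        # Empty string: the script would start the container normally.
        printf '%s\n' "container_run"
    else
        # Any non-empty string is treated as a RUN label.
        printf '%s\n' "container_runlabel"
    fi
}

choose_branch ""             # image without a RUN label, empty value
choose_branch "<no value>"   # the placeholder string seen in this report
```

The second call goes down the `container_runlabel` path, because the placeholder is a non-empty string even though the image has no RUN label.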



[akaris@linux ipi-us-east-1]$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2022-01-30-073053   True        False         115m    Cluster version is 4.10.0-0.nightly-2022-01-30-073053





Comment 1 Andreas Karis 2022-01-31 20:12:25 UTC
The problem is that `podman image inspect` yields "<no value>", so /usr/bin/toolbox goes into the wrong branch of this if/else:

~~~
if ! container_exists; then
    echo "Spawning a container '$TOOLBOX_NAME' with image '$TOOLBOX_IMAGE'"
    if [[ -z "$runlabel" ]]; then
        container_run
        return
    else
        echo "Detected RUN label in the container image. Using that as the default..."
        container_runlabel
        return
    fi
~~~

Comment 2 Andreas Karis 2022-01-31 20:13:36 UTC
~~~
image_runlabel() {
    sudo podman image inspect "$TOOLBOX_IMAGE" --format "{{.Labels.run}}"
}
~~~

~~~
sh-4.4# podman image inspect "$TOOLBOX_IMAGE" --format "{{.Labels.run}}"
<no value>
~~~
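
One possible way to guard against this, sketched under assumptions: map the "<no value>" placeholder to an empty string before the emptiness test runs. `normalize_runlabel` is a hypothetical name, and this is not necessarily how the upstream change is implemented:

```shell
# Hypothetical helper, not part of /usr/bin/toolbox: turns podman's
# "<no value>" placeholder into an empty string and passes every
# other value through unchanged.
normalize_runlabel() {
    if [ "$1" = "<no value>" ]; then
        printf '%s\n' ""
    else
        printf '%s\n' "$1"
    fi
}

# Assumed usage inside the script (needs podman and $TOOLBOX_IMAGE):
#   runlabel=$(normalize_runlabel "$(image_runlabel)")

normalize_runlabel "<no value>"          # prints an empty line
normalize_runlabel "some run command"    # prints the value unchanged
```

With the value normalized this way, a later `-z "$runlabel"` test would be true for images that carry no RUN label.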

Comment 3 Micah Abbott 2022-02-02 13:43:43 UTC
Thanks for the report!

I'm surprised we have got this far along before hitting this problem; I suspect there was a change to `podman` which altered the behavior in this case.

I have a proposed fix upstream - https://github.com/coreos/toolbox/pull/73

Comment 4 Micah Abbott 2022-07-07 13:38:49 UTC
This should be fixed as part of `toolbox-0.0.9-1.rhaos4.11.el8` included in RHCOS/OCP 4.11

Comment 6 HuijingHei 2022-07-08 06:22:07 UTC
Tested with 4.11.0-0.nightly-2022-07-06-145812; toolbox still does not work with a custom image.
Also tested with RHCOS-411.86.202207062100-0 and got the same error.

sh-4.4# chroot /host
sh-4.4# rpm -q toolbox
toolbox-0.0.9-1.rhaos4.11.el8.noarch

sh-4.4# vi ~/.toolboxrc
REGISTRY=quay.io                
IMAGE=fedora/fedora:36-x86_64   
TOOLBOX_NAME=toolbox-fedora-36

sh-4.4# toolbox 
.toolboxrc file detected, overriding defaults...
Trying to pull quay.io/fedora/fedora:36-x86_64...
Getting image source signatures
Copying blob 75f075168a24 done  
Copying config 3a66698e60 done  
Writing manifest to image destination
Storing signatures
3a66698e604003f7822a0c73e9da50e090fda9a99fe1f2e1e2e7fe796cc803d5
Spawning a container 'toolbox-fedora-36' with image 'quay.io/fedora/fedora:36-x86_64'
52567064981e4f9426db4ef4ddc0501fadc4e722fb03d8c8bd395a83d5ebe5d8
Container 'toolbox-fedora-36' in unknown state: 'created'

Comment 7 HuijingHei 2022-07-08 14:00:35 UTC
Changing status to ASSIGNED based on Comment 6. The `unknown state` issue is tracked in https://bugzilla.redhat.com/show_bug.cgi?id=2093040

Comment 8 Micah Abbott 2022-07-08 20:15:44 UTC
We won't be able to deliver this in time for 4.11, so re-targeting for 4.12 and setting a dependency on 2093040

Comment 10 HuijingHei 2022-09-05 03:45:42 UTC
Verification passed with build 412.86.202209030446-0: after changing the toolbox image, toolbox works.

[core@cosa-devsh ~]$ rpm -q podman toolbox
podman-4.2.0-1.rhaos4.12.el8.x86_64
toolbox-0.1.0-1.rhaos4.12.el8.noarch

$ vi ~/.toolboxrc
REGISTRY=quay.io                
IMAGE=fedora/fedora:36-x86_64   
TOOLBOX_NAME=toolbox-fedora-36

[core@cosa-devsh ~]$ toolbox
.toolboxrc file detected, overriding defaults...
Trying to pull quay.io/fedora/fedora:36-x86_64...
Getting image source signatures
Copying blob 62946078034b done  
Copying config 2ecb6df959 done  
Writing manifest to image destination
Storing signatures
2ecb6df959942dd2fdeb65606ca2e42a54f8c06af10eeb594fdfc3e2656c53d1
Spawning a container 'toolbox-fedora-36' with image 'quay.io/fedora/fedora:36-x86_64'
0b9801f6ce382019cdd0ca711c66f6c917bf36f12922cc4a3a5b87de9bd5a276
toolbox-fedora-36
Container started successfully. To exit, type 'exit'.
[root@toolbox /]# cat /etc/os-release 
NAME="Fedora Linux"
VERSION="36 (Container Image)"
ID=fedora
VERSION_ID=36
VERSION_CODENAME=""
PLATFORM_ID="platform:f36"
PRETTY_NAME="Fedora Linux 36 (Container Image)"
ANSI_COLOR="0;38;2;60;110;180"
LOGO=fedora-logo-icon
CPE_NAME="cpe:/o:fedoraproject:fedora:36"
HOME_URL="https://fedoraproject.org/"
DOCUMENTATION_URL="https://docs.fedoraproject.org/en-US/fedora/f36/system-administrators-guide/"
SUPPORT_URL="https://ask.fedoraproject.org/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="Fedora"
REDHAT_BUGZILLA_PRODUCT_VERSION=36
REDHAT_SUPPORT_PRODUCT="Fedora"
REDHAT_SUPPORT_PRODUCT_VERSION=36
PRIVACY_POLICY_URL="https://fedoraproject.org/wiki/Legal:PrivacyPolicy"
VARIANT="Container Image"
VARIANT_ID=container

Comment 13 errata-xmlrpc 2023-01-17 19:47:08 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.12.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:7399

