Description of problem:
The platform is IPI BareMetal. It is not possible to interact with the toolbox container: the session is stuck and accepts no input. The only way to access the container is to open another session and connect from there.

Version-Release number of selected component (if applicable):
Cluster version is 4.7.0-0.nightly-2021-01-10-070949
toolbox-0.0.8-1.rhaos4.7.el8.noarch

How reproducible:
Always

Steps to Reproduce:
1. oc debug node/<node-name>
2. chroot /host
3. toolbox

Actual results:
Stuck on:

sh-4.4# toolbox
Trying to pull registry.redhat.io/rhel8/support-tools...
Getting image source signatures
Copying blob d9e72d058dc5 skipped: already exists
Copying blob cca21acb641a skipped: already exists
Copying blob 5ee83610639d done
Copying config be1f7079a9 done
Writing manifest to image destination
Storing signatures
be1f7079a938a4ab5c1f8b4c7d2dc82b8c60598bb1e248438ced576829f9638

Expected results:
sh-4.4# toolbox
[root@toolbox /]#

Additional info:
On the first attempt the session stays stuck. Toolbox only becomes usable after a new oc debug session is opened and toolbox is run again:

[kni@provisionhost-0-0 ~]$ oc debug node/master-0-1
Starting pod/master-0-1-debug ...
To use host binaries, run `chroot /host`
Pod IP: 192.168.123.148
If you don't see a command prompt, try pressing enter.
sh-4.4# chroot /host
sh-4.4# rpm -q toolbox
toolbox-0.0.8-1.rhaos4.7.el8.noarch
sh-4.4# toolbox
Error: error creating container storage: the container name "support-tools" is already in use by "e3801dcb314833f3ff7d0db68585b9f9be5d9c9f2bb097d23d4269f75a0bbf3a". You have to remove that container to be able to reuse that name.: that name is already in use
Error: `/proc/self/exe run -it --name support-tools --privileged --ipc=host --net=host --pid=host -e HOST=/host -e NAME=support-tools -e IMAGE=registry.redhat.io/rhel8/support-tools:latest -v /run:/run -v /var/log:/var/log -v /etc/machine-id:/etc/machine-id -v /etc/localtime:/etc/localtime -v /:/host registry.redhat.io/rhel8/support-tools:latest` failed: exit status 125
Spawning a container 'toolbox-' with image 'registry.redhat.io/rhel8/support-tools'
[root@toolbox /]#
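
The error above says the leftover "support-tools" container has to be removed before the name can be reused. As a possible manual cleanup, and not something taken from the toolbox script itself, a small shell helper could look like the sketch below. The function name `remove_stale_container` is hypothetical. It relies only on `podman container exists` and `podman rm`, and it assumes the leftover container is not running (a container that is still running would have to be stopped first).

```shell
# Hypothetical helper, not part of toolbox: remove a leftover container by
# name so that a later `toolbox` run can reuse that name.
remove_stale_container() {
    name="$1"
    # `podman container exists` exits 0 only if a container with this name exists.
    if podman container exists "$name"; then
        podman rm "$name" && echo "removed $name"
    else
        echo "no container named $name"
    fi
}

# Example, on the node after `chroot /host`:
#   remove_stale_container support-tools
#   toolbox
```

This only clears the name conflict. It does not address why the first toolbox session hangs.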
The toolbox package is provided by the container-tools module, which RHCOS consumes as part of our OS manifest. Moving to the container-tools component for triage
sure
(In reply to Micah Abbott from comment #2)
> The toolbox package is provided by the container-tools module, which RHCOS
> consumes as part of our OS manifest.
>
> Moving to the container-tools component for triage

Toolbox is actually maintained and packaged separately for OCP. OCP 4.7 looks like it will ship with toolbox-0.0.8-1.rhaos4.7.el8. RHEL 8 currently ships toolbox-0.0.4-1.module+el8.1.1+4407+ac444e5d as part of the container-tools module.

I tried this on an OCP 4.6 cluster with toolbox-0.0.8-1.rhaos4.6.el8, which should be the same toolbox code that 4.7 has. I wasn't able to reproduce the issue:

# ./oc debug node/worker0
Starting pod/worker0-debug ...
To use host binaries, run `chroot /host`
Pod IP: 192.168.130.20
If you don't see a command prompt, try pressing enter.
sh-4.4# chroot /host
sh-4.4# toolbox
Trying to pull registry.redhat.io/rhel8/support-tools...
Getting image source signatures
Copying blob 5ee83610639d done
Copying blob cca21acb641a done
Copying blob d9e72d058dc5 done
Copying config be1f7079a9 done
Writing manifest to image destination
Storing signatures
be1f7079a938a4ab5c1f8b4c7d2dc82b8c60598bb1e248438ced576829f96389
Spawning a container 'toolbox-' with image 'registry.redhat.io/rhel8/support-tools'
Detected RUN label in the container image. Using that as the default...
command: podman run -it --name toolbox- --privileged --ipc=host --net=host --pid=host -e HOST=/host -e NAME=toolbox- -e IMAGE=registry.redhat.io/rhel8/support-tools:latest -v /run:/run -v /var/log:/var/log -v /etc/machine-id:/etc/machine-id -v /etc/localtime:/etc/localtime -v /:/host registry.redhat.io/rhel8/support-tools:latest
[root@worker0 /]#
[root@worker0 /]# exit
exit
sh-4.4# toolbox
Container 'toolbox-' already exists. Trying to start...
(To remove the container and start with a fresh toolbox, run: sudo podman rm 'toolbox-')
toolbox-
Container started successfully. To exit, type 'exit'.
[root@worker0 /]#
[root@worker0 /]# rpm -q sosreport
package sosreport is not installed
[root@worker0 /]# exit
exit

So I suspect that it may be a difference between podman-1.9.3-3.rhaos4.6.el8 included in 4.6 and podman-2.0.5-5.module+el8.3.0+8221+97165c3f included in 4.7.
After discussing with Debarshi and other members of the Desktop team, we are going to move RHCOS related `toolbox` BZs back to the RHCOS component.
This one should be fixed with https://github.com/coreos/toolbox/pull/67
Will likely be fixed in upcoming sprint (needs code review & packaging).
See also https://bugzilla.redhat.com/show_bug.cgi?id=1877186
Verified on RHCOS 47.83.202101251242-0, which is part of 4.7.0-0.nightly-2021-01-25-160335

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.7.0-0.nightly-2021-01-25-160335   True        False         35m     Cluster version is 4.7.0-0.nightly-2021-01-25-160335

$ oc debug node/ip-10-0-154-51.us-west-2.compute.internal
Starting pod/ip-10-0-154-51us-west-2computeinternal-debug ...
To use host binaries, run `chroot /host`
If you don't see a command prompt, try pressing enter.
sh-4.2# chroot /host
sh-4.4# toolbox
Spawning a container 'toolbox-root' with image 'registry.redhat.io/rhel8/support-tools'
Detected RUN label in the container image. Using that as the default...
[root@ip-10-0-154-51 /]# exit
exit
sh-4.4# toolbox
Container 'toolbox-root' already exists. Trying to start...
(To remove the container and start with a fresh toolbox, run: sudo podman rm 'toolbox-root')
toolbox-root
Container started successfully. To exit, type 'exit'.
bash-4.2# exit
exit
sh-4.4# toolbox
Container 'toolbox-root' already exists. Trying to start...
(To remove the container and start with a fresh toolbox, run: sudo podman rm 'toolbox-root')
toolbox-root
Container started successfully. To exit, type 'exit'.
bash-4.2# exit
exit
sh-4.4# toolbox
Container 'toolbox-root' already exists. Trying to start...
(To remove the container and start with a fresh toolbox, run: sudo podman rm 'toolbox-root')
toolbox-root
Container started successfully. To exit, type 'exit'.
bash-4.2# exit
exit
sh-4.4# exit
exit
sh-4.2# exit
exit

Removing debug pod ...
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633