I've asked for a new bug in https://bugzilla.redhat.com/show_bug.cgi?id=2021431#c8 to investigate the new set of deadlocks on pod stop we're seeing. I have also created a scratch build of cri-o, and I'm interested in seeing whether it still triggers the problem: http://brew-task-repos.usersys.redhat.com/repos/scratch/pehunt/cri-o/1.21.4/5.rhaos4.8.git84fa55d.el8/
Oh, and if the issue reproduces even with the scratch build, I'll want the info provided in https://bugzilla.redhat.com/show_bug.cgi?id=2021431#c6 again.
Chris, the must-gather tells me that they run CRI-O 1.21.4-5.rhaos4.8.git84fa55d.el8, which is not the one Peter provided (1.21.4-6.rhaos4.8.gitc845cf4.el8). Can you ask them to override the rpm via rpm-ostree?
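To make the version mismatch above easy to check on the node, here is a minimal sketch. The expected version-release string is the one quoted in this comment. The helper name `crio_build_matches` and the RPM file name in the commented usage are assumptions for illustration only, not the actual artifact name.

```shell
#!/usr/bin/env bash
# Sketch: compare the CRI-O build installed on a node with the expected scratch build.
# EXPECTED comes from the comment above; everything else is illustrative.

EXPECTED="1.21.4-6.rhaos4.8.gitc845cf4.el8"

# Returns 0 when the installed version-release string equals the expected one.
crio_build_matches() {
    local installed="$1" expected="$2"
    [ "$installed" = "$expected" ]
}

# Possible usage on the node (after `oc debug node/<node>` and `chroot /host`):
#   installed="$(rpm -q --queryformat '%{VERSION}-%{RELEASE}' cri-o)"
#   if ! crio_build_matches "$installed" "$EXPECTED"; then
#       # Hypothetical RPM file name; use the file actually downloaded from the scratch repo.
#       rpm-ostree override replace "./cri-o-${EXPECTED}.x86_64.rpm"
#       systemctl reboot   # the override only takes effect in the next deployment
#   fi
```

After the reboot, `rpm-ostree status` should list the replaced package in the booted deployment.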
Hey Chris, I uploaded two modified test binaries for 4.8 and 4.9. Please request another test from the customer. Then I will sync with Peter on Monday on how to approach this issue.
(In reply to Sascha Grunert from comment #11)
> Hey Chris, I uploaded two modified test binaries for 4.8 and 4.9. Please
> request another test from the customer. Then I will sync with Peter on
> Monday how to approach this issue.

Hi Sascha,

the binaries and instructions have been delivered to the customer. I will link a set of logs once their testing is complete.

Instructions delivered:

########################################################

Create a node debug container:

```
oc debug node/ci-ln-2myl9xb-f76d1-ck27t-master-0
```

Copy the tarball to the container:

```
kubectl cp crio.tar.gz ci-ln-2myl9xb-f76d1-ck27t-master-0-debug:/tmp/crio.tar.gz
```

In the container, move the tarball to the destination and verify that the executable works:

```
mv /tmp/crio.tar.gz /host/tmp
chroot /host
tar xf /tmp/crio.tar.gz -C /usr/local/bin/
/usr/local/bin/crio version
```

```
Version: 1.21.3
GitCommit: 51409e1b2dc9ccfbb7d7f4fd543a094097627ae2
GitTreeState: dirty
BuildDate: 1980-01-01T00:00:00Z
GoVersion: go1.15.7
Compiler: gc
Platform: linux/amd64
Linkmode: static
```

Edit the crio unit file:

```
systemctl edit crio
```

Add the following override:

```
[Service]
ExecStart=
ExecStart=-/usr/local/bin/crio
```

Restart crio:

```
systemctl daemon-reload
systemctl restart crio
```

############################################################

Br,
Chris
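As a possible follow-up to the instructions above, this sketch extracts a single field from `crio version` output, so the test binary can be identified by its GitCommit. The helper name `crio_version_field` is an assumption for illustration. The commented commands show one way to confirm the override is active and to remove it after testing.

```shell
#!/usr/bin/env bash
# Sketch: read one field (e.g. Version or GitCommit) from `crio version` output on stdin.

crio_version_field() {
    local field="$1"
    # Lines look like "GitCommit: <value>"; print the value of the first match.
    awk -v key="${field}:" '$1 == key { print $2; exit }'
}

# Possible usage on the node (inside `chroot /host`):
#   /usr/local/bin/crio version | crio_version_field GitCommit
#   systemctl show crio --property=ExecStart   # check which binary the unit starts
#
# To drop the override once testing is complete:
#   systemctl revert crio
#   systemctl restart crio
```

`systemctl revert` removes the drop-in created by `systemctl edit`, so the unit returns to the ExecStart shipped with the OS.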
*** Bug 2014083 has been marked as a duplicate of this bug. ***
Tested on 4.10.0-0.nightly-2022-01-17-223655

$ oc get clusterversion
NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.10.0-0.nightly-2022-01-17-223655   True        False         136m    Cluster version is 4.10.0-0.nightly-2022-01-17-223655

$ oc get nodes -o wide
NAME                                                STATUS   ROLES           AGE    VERSION           INTERNAL-IP     EXTERNAL-IP   OS-IMAGE                                                        KERNEL-VERSION                 CONTAINER-RUNTIME
master-00.sunilc410bm.qe.devcluster.openshift.com   Ready    master,worker   156m   v1.23.0+60f5a1c   147.75.80.115   <none>        Red Hat Enterprise Linux CoreOS 410.84.202201171746-0 (Ootpa)   4.18.0-305.30.1.el8_4.x86_64   cri-o://1.23.0-102.rhaos4.10.git9c23ef3.el8
*** Bug 2040485 has been marked as a duplicate of this bug. ***
*** Bug 2015412 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:0056