Bug 1993373 - crio: invalid conmon path /usr/libexec/crio/conmon: no such file or directory
Summary: crio: invalid conmon path /usr/libexec/crio/conmon: no such file or directory
Keywords:
Status: CLOSED DUPLICATE of bug 1993386
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: RHCOS
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: 4.7.z
Assignee: RHCOS Bug Triage
QA Contact: Michael Nguyen
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-08-12 21:14 UTC by Ben Parees
Modified: 2021-08-16 07:47 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-08-16 07:47:16 UTC
Target Upstream Version:
Embargoed:



Description Ben Parees 2021-08-12 21:14:11 UTC
As seen here, all payloads are failing to be accepted; the jobs show that the payloads fail to bootstrap:

https://amd64.ocp.releases.ci.openshift.org/#4.7.0-0.nightly

sample job:
https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.7-e2e-aws/1425793109945487360

level=info msg=Bootstrap gather logs captured here "/tmp/installer/log-bundle-20210812125452.tar.gz"
level=fatal msg=Bootstrap failed to complete: failed to wait for bootstrapping to complete: timed out waiting for the condition
Setup phase finished, prepare env for next steps
Copying log bundle... 


Looking at the first payload that started failing:
https://amd64.ocp.releases.ci.openshift.org/releasestream/4.7.0-0.nightly/release/4.7.0-0.nightly-2021-08-11-003951

The only diff appears to be the RHCOS bump:
Red Hat Enterprise Linux CoreOS upgraded from 47.84.202108052031-0 to 47.84.202108101631-0

https://releases-rhcos-art.cloud.privileged.psi.redhat.com/diff.html?arch=x86_64&first_release=47.84.202108052031-0&first_stream=releases%2Frhcos-4.7&second_release=47.84.202108101631-0&second_stream=releases%2Frhcos-4.7

Comment 3 Luca BRUNO 2021-08-16 07:47:16 UTC
In the control-plane logs I'm seeing the following:

```
Aug 12 12:55:01 ip-10-0-147-9 crio[7292]: time="2021-08-12T12:55:01Z" level=info msg="Starting CRI-O, version: 1.20.4-9.rhaos4.7.git74c6592.el8, git: ()"
Aug 12 12:55:01 ip-10-0-147-9 crio[7292]: time="2021-08-12 12:55:01.091615078Z" level=info msg="Node configuration value for hugetlb cgroup is true"
Aug 12 12:55:01 ip-10-0-147-9 crio[7292]: time="2021-08-12 12:55:01.091636033Z" level=info msg="Node configuration value for pid cgroup is true"
Aug 12 12:55:01 ip-10-0-147-9 crio[7292]: time="2021-08-12 12:55:01.091649129Z" level=info msg="Node configuration value for memoryswap cgroup is true"
Aug 12 12:55:01 ip-10-0-147-9 crio[7292]: time="2021-08-12 12:55:01.098568432Z" level=info msg="Node configuration value for systemd CollectMode is true"
Aug 12 12:55:01 ip-10-0-147-9 crio[7292]: time="2021-08-12 12:55:01.099210948Z" level=info msg="Using default capabilities: CAP_CHOWN, CAP_DAC_OVERRIDE, CAP_FSETID, CAP_FOWNER, CAP_SETGID, CAP_SETUID, CAP_SETPCAP, CAP_NET_BIND_SERVICE, CAP_KILL"
Aug 12 12:55:01 ip-10-0-147-9 crio[7292]: time="2021-08-12 12:55:01.134220226Z" level=fatal msg="Validating runtime config: conmon validation: invalid conmon path: stat /usr/libexec/crio/conmon: no such file or directory"
Aug 12 12:55:01 ip-10-0-147-9 systemd[1]: crio.service: Main process exited, code=exited, status=1/FAILURE
Aug 12 12:55:01 ip-10-0-147-9 systemd[1]: crio.service: Failed with result 'exit-code'.
```
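
For anyone triaging similar bootstrap failures from a log bundle, the fatal line can be pulled out of the journal output mechanically. The following is only a minimal sketch, assuming the `level=... msg="..."` layout seen in the excerpt above; the helper name `fatal_messages` and the regular expressions are illustrative and not part of CRI-O or any installer tooling.

```python
import re

# Assumption: CRI-O entries carry a key=value payload after the journald
# prefix, e.g.  time="..." level=fatal msg="..."  (as in the excerpt above).
_LEVEL_RE = re.compile(r'level=(?P<level>\w+)')
_MSG_RE = re.compile(r'msg="(?P<msg>(?:[^"\\]|\\.)*)"')


def fatal_messages(lines):
    """Return the msg text of every line logged at level=fatal.

    Hypothetical helper for illustration; lines without a level/msg
    payload (such as plain systemd messages) are skipped.
    """
    found = []
    for line in lines:
        level = _LEVEL_RE.search(line)
        msg = _MSG_RE.search(line)
        if level and msg and level.group("level") == "fatal":
            found.append(msg.group("msg"))
    return found


# Three lines copied from the journal excerpt in this comment.
sample = [
    'Aug 12 12:55:01 ip-10-0-147-9 crio[7292]: time="2021-08-12 12:55:01.091615078Z" '
    'level=info msg="Node configuration value for hugetlb cgroup is true"',
    'Aug 12 12:55:01 ip-10-0-147-9 crio[7292]: time="2021-08-12 12:55:01.134220226Z" '
    'level=fatal msg="Validating runtime config: conmon validation: invalid conmon path: '
    'stat /usr/libexec/crio/conmon: no such file or directory"',
    "Aug 12 12:55:01 ip-10-0-147-9 systemd[1]: crio.service: Main process exited, "
    "code=exited, status=1/FAILURE",
]

for message in fatal_messages(sample):
    print(message)
```

Run against the three sample lines, this should print only the conmon validation message; the same function could be fed the lines of a journal file extracted from the log bundle.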

So I believe this has the same root cause as https://bugzilla.redhat.com/show_bug.cgi?id=1992557; I've retitled accordingly.

For 4.7 specifically we already have https://bugzilla.redhat.com/show_bug.cgi?id=1993386, so I'm closing this as a duplicate.

*** This bug has been marked as a duplicate of bug 1993386 ***

