oci-systemd-hook should either be required by cri-o, or cri-o should not require the oci hook directory.

+++ This bug was initially created as a clone of Bug #1796537 +++

Description of problem:
When running the kola test crio.base on RHCOS 4.4 (a private build, 44.81.202001301530.0), the test fails with:

Jan 30 15:37:49.224043 crio[1520]: time="2020-01-30 15:37:49.223914700Z" level=fatal msg="runtime config: invalid hooks_dir: stat /usr/share/containers/oci/hooks.d: no such file or directory: stat /usr/share/containers/oci/hooks.d: no such file or directory"

Version-Release number of selected component (if applicable):
RHCOS 4.4

How reproducible:
All the time

Steps to Reproduce:
1. Get a 4.4 RHCOS build
2. Boot into it any way you like (I used the qemu image and did a `cosa run`)
3. Run `sudo systemctl start crio`
4. Check `systemctl status crio`

Actual results:
The crio service fails to start with the above error.

Expected results:
The crio service should start successfully.

Additional info:
After conversations on Slack with the CoreOS and CRI-O teams, the short-term fix is to pull in the oci-systemd-hook package, which provides the missing directory.
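For anyone reproducing this, a minimal check to confirm the failure mode on an affected node (assuming the stock config file at /etc/crio/crio.conf; the hooks_dir option the fatal error refers to lives there in default builds):

$ stat /usr/share/containers/oci/hooks.d
stat: cannot stat '/usr/share/containers/oci/hooks.d': No such file or directory

$ grep -n hooks_dir /etc/crio/crio.conf    # shows which hooks directories crio is configured to scan

Per the log above, crio treats a missing hooks_dir entry as a fatal configuration error rather than as an empty hook set, which is why the service fails to start at all.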
What really should change here is how cri-o is packaged. cri-o's default hooks dir (/usr/share/containers/oci/hooks.d) used to be created by the oci-systemd-hook package. However, oci-systemd-hook should be dropped, as we don't want to inject the actual hook into containers anymore. Thus, we need another RPM to create this directory so it exists at runtime. It should probably be cri-o's own RPM, because cri-o fails on startup in the default case without the directory. The cri-o binary can't be responsible for creating it, because RHCOS mounts /usr read-only for programs that don't speak ostree.

All in all, I'm passing this off to Jindrich. Jindrich, can you have the /usr/share/containers/oci/hooks.d directory created at rpm install time for cri-o?
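For reference, a minimal sketch of what that packaging change could look like in the cri-o spec file. This is illustrative only; the real spec's layout will differ, and %{_datadir} expands to /usr/share:

%install
# ... existing install steps ...
# Create the default hooks directory in the buildroot so the RPM payload carries it
mkdir -p %{buildroot}%{_datadir}/containers/oci/hooks.d

%files
# ... existing file entries ...
# Own the (empty) directory so rpm creates it at install time, including in ostree
# composes where /usr is read-only at runtime
%dir %{_datadir}/containers/oci/hooks.d

Owning the directory with %dir (rather than creating it in a %post scriptlet) matters for RHCOS, since ostree composes lay down package content at build time and scriptlets cannot write to /usr on a running host.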
jnovy@localhost .../rhaos-4.4-rhel-8/x86_64 (rhaos-4.4-rhel-8 *%)$ rpm -qpl cri-o-1.17.0-0.4.rc1.rhaos4.4.git5842752.el8.x86_64.rpm | grep hook
/usr/share/containers/oci/hooks.d

This means the rhaos4.4 version of cri-o already installs and owns the /usr/share/containers/oci/hooks.d directory. Can you check whether your automation removes that dir at any stage?
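On a booted node, standard rpm queries can answer that ownership question directly (nothing crio-specific here; output will name whatever cri-o build is actually installed):

$ rpm -qf /usr/share/containers/oci/hooks.d    # which package owns the directory, if any
$ rpm -V cri-o | grep hooks.d                  # no output means verification found the directory present and unmodified; a "missing" line means something deleted it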
RHCOS 4.4 is currently shipping cri-o-1.16.2-6.dev.rhaos4.3.git9e3db66.el8.x86_64 because of a problem in 1.17. See:

https://github.com/cri-o/cri-o/pull/3138
https://bugzilla.redhat.com/show_bug.cgi?id=1792749

The RHCOS team needs to get in touch with the Node team to see if we can pull 1.17 back into RHCOS. I'll take that action today.
Checked with 4.4.0-0.nightly-2020-02-25-193901, and the issue is fixed.

[core@wjiospy2261-b62zg-master-0 ~]$ rpm-ostree status
State: idle
AutomaticUpdates: disabled
Deployments:
● pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:9e970239bfa663cfcea888e863d01c494cf27270d115a589c73e484fc58f6c8a
              CustomOrigin: Managed by machine-config-operator
                   Version: 44.81.202002251730-0 (2020-02-25T17:35:50Z)

  ostree://f61524fda480c611dcd25629fd15eb6de27a306689261c211dbc8e88c19a5219
                   Version: 44.81.202001241431.0 (2020-01-24T14:36:48Z)

[core@wjiospy2261-b62zg-master-0 ~]$ rpm -qa |grep -i cri-o
cri-o-1.17.0-4.dev.rhaos4.4.gitc3436cc.el8.x86_64

[core@wjiospy2261-b62zg-master-0 ~]$ rpm -ql cri-o-1.17.0-4.dev.rhaos4.4.gitc3436cc.el8.x86_64|grep hook
/usr/share/containers/oci/hooks.d

[core@wjiospy2261-b62zg-master-0 ~]$ systemctl status crio
● crio.service - Open Container Initiative Daemon
   Loaded: loaded (/usr/lib/systemd/system/crio.service; disabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/crio.service.d
           └─10-default-env.conf
   Active: active (running) since Wed 2020-02-26 04:36:50 UTC; 3h 56min ago
     Docs: https://github.com/cri-o/cri-o
 Main PID: 2069 (crio)
    Tasks: 45
   Memory: 3.7G
      CPU: 16min 46.985s
   CGroup: /system.slice/crio.service
           ├─  2069 /usr/bin/crio --enable-metrics=true --metrics-port=9537
           └─655156 /usr/libexec/crio/conmon -c 493cd140734cc3e6604cb7188d715215f07c5b2201ca60b985fddcfdcf0c005c -n k8s_openvswitch_ovs-phlbw_openshift-sdn_dcd0f89c-7935-48cb-8792-644856fd4408_0 -r /usr/bin/run>

Feb 26 05:54:43 wjiospy2261-b62zg-master-0 crio[2069]: 2020-02-26T05:54:43Z [verbose] Del: openshift-etcd:revision-pruner-3-wjiospy2261-b62zg-master-0:openshift-sdn:eth0 {"cniVersion":"0.3.1","name":"openshift->
Feb 26 05:54:53 wjiospy2261-b62zg-master-0 crio[2069]: 2020-02-26T05:54:53Z [verbose] Add: openshift-kube-apiserver:revision-pruner-10-wjiospy2261-b62zg-master-0:openshift-sdn:eth0 {"cniVersion":"0.3.1","interf>
Feb 26 05:54:55 wjiospy2261-b62zg-master-0 crio[2069]: 2020-02-26T05:54:55Z [verbose] Del: openshift-kube-apiserver:revision-pruner-10-wjiospy2261-b62zg-master-0:openshift-sdn:eth0 {"cniVersion":"0.3.1","name":>
Feb 26 05:54:58 wjiospy2261-b62zg-master-0 crio[2069]: 2020-02-26T05:54:58Z [verbose] Add: openshift-kube-controller-manager:revision-pruner-10-wjiospy2261-b62zg-master-0:openshift-sdn:eth0 {"cniVersion":"0.3.1>
Feb 26 05:54:59 wjiospy2261-b62zg-master-0 crio[2069]: 2020-02-26T05:54:59Z [verbose] Del: openshift-kube-controller-manager:revision-pruner-10-wjiospy2261-b62zg-master-0:openshift-sdn:eth0 {"cniVersion":"0.3.1>
Feb 26 05:55:33 wjiospy2261-b62zg-master-0 crio[2069]: 2020-02-26T05:55:33Z [verbose] Add: openshift-kube-scheduler:revision-pruner-6-wjiospy2261-b62zg-master-0:openshift-sdn:eth0 {"cniVersion":"0.3.1","interfa>
Feb 26 05:55:35 wjiospy2261-b62zg-master-0 crio[2069]: 2020-02-26T05:55:35Z [verbose] Del: openshift-kube-scheduler:revision-pruner-6-wjiospy2261-b62zg-master-0:openshift-sdn:eth0 {"cniVersion":"0.3.1","name":">
Feb 26 06:37:47 wjiospy2261-b62zg-master-0 crio[2069]: time="2020-02-26T06:37:47Z" level=error msg="container not running"
Feb 26 06:37:47 wjiospy2261-b62zg-master-0 crio[2069]: container not running
Feb 26 06:38:46 wjiospy2261-b62zg-master-0 crio[2069]: 2020-02-26T06:38:46Z [verbose] Add: openshift-authentication:oauth-openshift-c4b8bd48c-mvnlh:openshift-sdn:eth0 {"cniVersion":"0.3.1","interfaces":[{"name">
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0581
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days.