Bug 1777838

Summary: crio-1.16 does not execute or read in hook description
Product: OpenShift Container Platform
Reporter: Zvonko Kosic <zkosic>
Component: Containers
Assignee: Peter Hunt <pehunt>
Status: CLOSED ERRATA
QA Contact: weiwei jiang <wjiang>
Severity: urgent
Docs Contact:
Priority: unspecified
Version: 4.3.0
CC: aos-bugs, dwalsh, ematysek, jokerman, mifiedle, mpatel, pehunt, tsweeney, vrothber, wabouham
Target Milestone: ---
Keywords: TestBlocker
Target Release: 4.3.0
Flags: vrothber: needinfo? (mpatel), pehunt: needinfo? (zkosic)
Hardware: All
OS: Linux
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-01-23 11:14:45 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Bug Depends On:
Bug Blocks: 1794257, 1769174

Description Zvonko Kosic 2019-11-28 12:25:52 UTC
Description of problem:

It looks like the crio version (1.16.0-0.4.dev.rhaos4.3.giteed6aa1.el8-rc2) in my 4.3 cluster is not executing any hooks. I set log-level=debug, but I cannot see crio trying to read any hooks from /usr/share/containers/oci/hooks.d or /etc/containers/oci/hooks.d, and I do not see any "AddHook" log entries.

Version-Release number of selected component (if applicable):

crio version 1.16.0-0.4.dev.rhaos4.3.giteed6aa1.el8-rc2
commit: "c64f75c08173ed16b4aa09252eb07f669c2c5c39-dirty"

How reproducible:

Steps to Reproduce:

Install a recent 4.3 cluster, enable log-level=debug, and look for any mention of hooks in the logs.

Actual results:

# journalctl -f -u crio | grep hook

Expected results:
# journalctl -f -u crio | grep hook

DEBU[2019-11-27 20:59:13.286157549Z] reading hooks from /usr/share/containers/oci/hooks.d 
DEBU[2019-11-27 20:59:13.286295598Z] added hook /usr/share/containers/oci/hooks.d/oci-systemd-hook.json 
DEBU[2019-11-27 20:59:13.286339362Z] reading hooks from /etc/containers/oci/hooks.d 
DEBU[2019-11-27 20:59:13.544742728Z] monitoring "/usr/share/containers/oci/hooks.d" for hooks 
DEBU[2019-11-27 20:59:13.544770728Z] monitoring "/etc/containers/oci/hooks.d" for hooks 

Additional info:

I ran crio-1.14 on the 4.3 node and got the desired output.

Comment 1 Valentin Rothberg 2019-11-28 12:54:09 UTC
The behaviour w.r.t. hooks has changed a bit with 1.16 (see https://github.com/cri-o/cri-o/pull/2731). The `hooks_dir` option must now be set explicitly in the crio.conf.

Can you check if it's set in the crio.conf and retest?

Comment 3 Zvonko Kosic 2019-11-28 13:46:00 UTC
Setting hooks_dir = [] works only if I restart crio; SIGHUP didn't do anything.

I want to avoid restarting crio at all costs.

Is the oci-systemd-hook.json optional? Shouldn't this thing be activated per default?

Comment 4 Zvonko Kosic 2019-11-28 13:48:08 UTC
Looking at the PR, everything I need to make NVIDIA work was removed from cri-o, which is bad. See this issue https://github.com/cri-o/cri-o/pull/1943 and the following PR https://github.com/cri-o/cri-o/pull/1943

if config.HooksDir == nil {
	for _, hooksDir := range []string{hooks.DefaultDir, hooks.OverrideDir} {
		_, err = os.Stat(hooksDir)
		if err == nil {
			hookDirectories = append(hookDirectories, hooksDir)
			logrus.Warnf("implicit hook directories are deprecated; set --hooks-dir=%q explicitly to continue to load hooks from this directory", hooksDir)
		}
	}
}

this was added ^^ and the new PR removed it

/etc/containers/oci/hooks.d is the only place where custom hooks can be placed on RHCOS; this is needed by several customers.
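
For illustration, the fallback quoted above can be sketched as a standalone Go program. This is a minimal sketch, not the cri-o source: resolveHookDirs and the two constants are hypothetical stand-ins for config.HooksDir, hooks.DefaultDir and hooks.OverrideDir, the directory values are assumed from the paths named in this report, and the warning goes through the standard log package rather than logrus.

```go
package main

import (
	"fmt"
	"log"
	"os"
)

// Hypothetical stand-ins for hooks.DefaultDir and hooks.OverrideDir;
// the values are assumed from the paths mentioned in this bug report.
const (
	defaultHooksDir  = "/usr/share/containers/oci/hooks.d"
	overrideHooksDir = "/etc/containers/oci/hooks.d"
)

// resolveHookDirs returns the explicitly configured directories when the
// setting is non-nil. When it is nil, it falls back to every candidate
// directory that exists on disk and logs a deprecation warning for each.
func resolveHookDirs(configured, fallbacks []string) []string {
	if configured != nil {
		return configured
	}
	var dirs []string
	for _, dir := range fallbacks {
		if _, err := os.Stat(dir); err == nil {
			dirs = append(dirs, dir)
			log.Printf("implicit hook directory %q in use; set hooks_dir explicitly", dir)
		}
	}
	return dirs
}

func main() {
	// nil models an unset hooks_dir option.
	dirs := resolveHookDirs(nil, []string{defaultHooksDir, overrideHooksDir})
	fmt.Println("hook directories:", dirs)
}
```

With a fallback like this, a node with no hooks_dir setting still picks up hooks dropped into /etc/containers/oci/hooks.d, which is the behaviour the operators described here relied on.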

Comment 5 Zvonko Kosic 2019-11-28 13:56:47 UTC
From an OpenShift Operator perspective, I do not want to (and cannot) restart crio, edit files on the node, or send signals to processes (which would mean I need elevated privileges).

The way crio worked before was the desired state that I needed for several customers (NVIDIA, SolarFlare, Intel) so they can easily leverage the hooks in crio.

Comment 6 Valentin Rothberg 2019-11-28 15:21:18 UTC
I think that the breaking upstream change was not reflected in the packages or configurator to properly set the hooks_dir. We can either revert the upstream change or set the hooks_dir.

Mrunal, Peter, what do you think?

In general, I really don't like breaking changes such as with hooks_dir. It's always tricky to catch them, either in package updates or with configurators.

Comment 7 Peter Hunt 2019-12-02 16:24:12 UTC
So I've opened a PR upstream to include a default hooks dir in the crio.conf. That avoids the need for any fixes in the machine config operator, which I think is good.

Zvonko, are the desired hooks in `/usr/share/containers/oci/hooks.d` ?

Comment 9 Zvonko Kosic 2019-12-02 17:38:18 UTC

It is /etc/containers/oci/hooks.d; the /usr/... path is not writable when running OCP with RHCOS.

Comment 10 Zvonko Kosic 2019-12-02 17:40:53 UTC

oci-systemd-hook.json is in /usr/share/containers/oci/hooks.d. Is this hook optional?

Comment 11 Peter Hunt 2019-12-02 18:07:26 UTC
Yes, this hook is optional. In fact, I think it's even deprecated. I'll open a PR in MCO to point towards /etc/containers/oci/hooks.d

Comment 16 weiwei jiang 2019-12-10 02:42:16 UTC
Checked with cri-o-1.16.1-2.dev.rhaos4.3.git7b04b62.el8

sh-4.4# journalctl -u crio | grep -i hook
Dec 09 06:50:31 ip-10-0-72-197 crio[447394]: time="2019-12-09 06:50:31.354430625Z" level=debug msg="reading hooks from /etc/containers/oci/hooks.d" file="hooks/read.go:65"
Dec 09 06:50:31 ip-10-0-72-197 crio[447394]: time="2019-12-09 06:50:31.354491956Z" level=debug msg="reading hooks from /usr/share/containers/oci/hooks.d" file="hooks/read.go:65"
Dec 09 06:50:31 ip-10-0-72-197 crio[447394]: time="2019-12-09 06:50:31.354646242Z" level=debug msg="added hook /usr/share/containers/oci/hooks.d/oci-systemd-hook.json" file="hooks/read.go:91
Dec 09 06:50:33 ip-10-0-72-197 crio[447394]: time="2019-12-09 06:50:33.170015801Z" level=debug msg="monitoring \"/etc/containers/oci/hooks.d\" for hooks" file="hooks/monitor.go:43"
Dec 09 06:50:33 ip-10-0-72-197 crio[447394]: time="2019-12-09 06:50:33.171674411Z" level=debug msg="monitoring \"/usr/share/containers/oci/hooks.d\" for hooks" file="hooks/monitor.go:43"
Dec 09 06:51:01 ip-10-0-72-197 crio[447394]: time="2019-12-09 06:51:01.634261654Z" level=debug msg="hook oci-systemd-hook.json did not match" file="hooks/hooks.go:135"
sh-4.4# rpm-ostree status
State: idle
AutomaticUpdates: disabled
* pivot://quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:2a8d26fe5744bc92205a6df0677b0666cc08fb36f26184aacfc9e748cbe38c9d
              CustomOrigin: Managed by machine-config-operator
                   Version: 43.81.201912081754.0 (2019-12-08T17:59:29Z)

                   Version: 43.81.201911221453.0 (2019-11-22T14:58:44Z)
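
The manual grep above could also be scripted. The following is a minimal Go sketch under stated assumptions: missingHookDirs is a hypothetical helper, not part of cri-o, and it relies only on the "reading hooks from <dir>" message text that appears in the logs quoted in this report. It uses a plain substring match, so a directory whose path is a prefix of another would also match.

```go
package main

import (
	"fmt"
	"strings"
)

// missingHookDirs returns every directory in dirs for which no log line
// contains a "reading hooks from <dir>" message.
func missingHookDirs(logLines, dirs []string) []string {
	var missing []string
	for _, dir := range dirs {
		found := false
		for _, line := range logLines {
			if strings.Contains(line, "reading hooks from "+dir) {
				found = true
				break
			}
		}
		if !found {
			missing = append(missing, dir)
		}
	}
	return missing
}

func main() {
	// Sample lines shortened from the journal output quoted in this report.
	lines := []string{
		`level=debug msg="reading hooks from /etc/containers/oci/hooks.d" file="hooks/read.go:65"`,
		`level=debug msg="reading hooks from /usr/share/containers/oci/hooks.d" file="hooks/read.go:65"`,
	}
	dirs := []string{
		"/etc/containers/oci/hooks.d",
		"/usr/share/containers/oci/hooks.d",
	}
	fmt.Println("directories never read:", missingHookDirs(lines, dirs))
}
```

An empty result means crio logged a read for every expected directory; on the broken 1.16.0 build described above, both directories would be reported as missing.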

Comment 18 errata-xmlrpc 2020-01-23 11:14:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.