Description of problem:
The worker FileIntegrity is working and reporting results, but its status is stuck showing "Initializing":

$ oc get fileintegrity worker-fileintegrity -o=jsonpath='{.status}'
{"phase":"Initializing"}

The aide worker pods are running:

aide-worker-fileintegrity-l4nx6            1/1   Running   2   25h
aide-worker-fileintegrity-nmc6d            1/1   Running   2   25h
aide-worker-fileintegrity-pjpbg            1/1   Running   2   25h
aide-worker-fileintegrity-twt25            1/1   Running   2   25h
aide-worker-fileintegrity-zq86j            1/1   Running   2   25h
file-integrity-operator-6d877d8c59-vmvt6   1/1   Running   0   22h

More details to follow internally.
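As a side note, sampling the phase once can be misleading; a small polling helper like this sketch (hypothetical, not part of the operator; assumes a logged-in `oc` client) shows whether the phase ever leaves Initializing within a timeout:

```shell
#!/bin/sh
# Hypothetical helper: poll a FileIntegrity object's .status.phase until it
# reaches the expected value or a timeout (seconds) expires.
wait_for_phase() {
    fi_name=$1; want=$2; timeout=${3:-300}
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        phase=$(oc get fileintegrity "$fi_name" -o jsonpath='{.status.phase}')
        [ "$phase" = "$want" ] && return 0
        sleep 5; elapsed=$((elapsed + 5))
    done
    echo "timed out waiting for phase $want (last seen: $phase)" >&2
    return 1
}
```

Usage would be, e.g., `wait_for_phase worker-fileintegrity Active 600`.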
Hi Matt,

The bug was reproduced with a v0.1.21 -> v0.1.22 FIO upgrade, so I verified it with a v0.1.21 -> v0.1.24 FIO upgrade. Generally it is fine: the aide-reinit ConfigMap was updated after the FIO upgrade, and the database re-init succeeded when a manual re-init was triggered after the upgrade completed. The only problem is that /hostroot/run/aide.reinit is missing on the node. Is that expected? Thanks.

$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.8.39    True        False         72m     Cluster version is 4.8.39

1. Install FIO v0.1.21, create a FileIntegrity, and trigger a failure:

$ oc get ip
NAME            CSV                               APPROVAL    APPROVED
install-pzl7d   file-integrity-operator.v0.1.21   Automatic   true

$ oc get csv -w
NAME                               DISPLAY                            VERSION     REPLACES   PHASE
elasticsearch-operator.5.2.10-20   OpenShift Elasticsearch Operator   5.2.10-20
file-integrity-operator.v0.1.21    File Integrity Operator            0.1.21                 Installing
file-integrity-operator.v0.1.21    File Integrity Operator            0.1.21                 Succeeded
^C

$ oc get pod
NAME                                       READY   STATUS    RESTARTS   AGE
file-integrity-operator-748cf55bbd-s4v59   1/1     Running   0          30s

$ oc create -f - << EOF
apiVersion: fileintegrity.openshift.io/v1alpha1
kind: FileIntegrity
metadata:
  name: example-fileintegrity
  namespace: openshift-file-integrity
spec:
  # Change to debug: true to enable more verbose logging from the logcollector
  # container in the aide pods
  debug: false
  config:
    gracePeriod: 90
EOF
fileintegrity.fileintegrity.openshift.io/example-fileintegrity created

$ oc extract cm/aide-reinit --confirm
aide.sh
$ cat aide.sh
#!/bin/sh
touch /hostroot/etc/kubernetes/aide.reinit

$ oc extract cm/aide-pause --confirm
pause.sh
$ cat pause.sh
#!/bin/sh
sleep infinity &
PID=$!
trap "kill $PID" INT TERM
wait $PID || true

$ oc get fileintegritynodestatuses
NAME                                                     NODE                               STATUS
example-fileintegrity-xiyuan11-48-bpggz-master-0         xiyuan11-48-bpggz-master-0         Failed
example-fileintegrity-xiyuan11-48-bpggz-master-1         xiyuan11-48-bpggz-master-1         Succeeded
example-fileintegrity-xiyuan11-48-bpggz-master-2         xiyuan11-48-bpggz-master-2         Succeeded
example-fileintegrity-xiyuan11-48-bpggz-worker-0-6rc8j   xiyuan11-48-bpggz-worker-0-6rc8j   Failed
example-fileintegrity-xiyuan11-48-bpggz-worker-0-mnh67   xiyuan11-48-bpggz-worker-0-mnh67   Succeeded
example-fileintegrity-xiyuan11-48-bpggz-worker-0-t4spn   xiyuan11-48-bpggz-worker-0-t4spn   Succeeded

2. Upgrade to v0.1.24:

$ oc get ip
NAME            CSV                               APPROVAL    APPROVED
install-7rvcj   file-integrity-operator.v0.1.24   Automatic   true
install-pzl7d   file-integrity-operator.v0.1.21   Automatic   true

$ oc get csv
NAME                               DISPLAY                            VERSION     REPLACES   PHASE
elasticsearch-operator.5.2.10-20   OpenShift Elasticsearch Operator   5.2.10-20              Succeeded
file-integrity-operator.v0.1.21    File Integrity Operator            0.1.21                 Succeeded

$ oc get csv
NAME                               DISPLAY                            VERSION     REPLACES                          PHASE
elasticsearch-operator.5.2.10-20   OpenShift Elasticsearch Operator   5.2.10-20                                     Succeeded
file-integrity-operator.v0.1.24    File Integrity Operator            0.1.24      file-integrity-operator.v0.1.21   Succeeded

$ oc extract cm/aide-reinit --confirm
aide.sh
$ cat aide.sh
#!/bin/sh
touch /hostroot/run/aide.reinit

3. Trigger a re-init manually:

$ oc debug node/xiyuan11-48-bpggz-master-0 -- chroot /host ls -ltr /etc/kubernetes
Starting pod/xiyuan11-48-bpggz-master-0-debug ...
To use host binaries, run `chroot /host`
total 3860
-rw-r--r--.  1 root root    9179 May 11 06:16 kubeconfig
drwxr-xr-x.  3 root root      19 May 11 06:17 cni
drwxr-xr-x.  3 root root      20 May 11 06:17 kubelet-plugins
drwxr-xr-x. 19 root root    4096 May 11 06:44 static-pod-resources
-rw-r--r--.  1 root root     101 May 11 06:50 apiserver-url.env
drwxr-xr-x.  2 root root     192 May 11 06:50 manifests
-rw-r--r--.  1 root root    5875 May 11 06:50 kubelet-ca.crt
-rw-r--r--.  1 root root    1123 May 11 06:50 ca.crt
-rw-r--r--.  1 root root      94 May 11 06:50 cloud.conf
-rw-r--r--.  1 root root    1076 May 11 06:50 kubelet.conf
-rw-------.  1 root root      67 May 11 07:23 aide.log.backup-20220511T07_23_30
-rw-------.  1 root root 1946990 May 11 07:24 aide.db.gz.new
-rw-------.  1 root root 1946990 May 11 07:24 aide.db.gz
-rw-------.  1 root root     877 May 11 07:45 aide.log.new
-rw-------.  1 root root     877 May 11 07:45 aide.log
Removing debug pod ...

$ oc annotate fileintegrities/example-fileintegrity file-integrity.openshift.io/re-init=
fileintegrity.fileintegrity.openshift.io/example-fileintegrity annotated

$ oc get fileintegrity example-fileintegrity -o=jsonpath={.status}
{"phase":"Initializing"}
$ oc get fileintegrity example-fileintegrity -o=jsonpath={.status}
{"phase":"Active"}

$ oc debug node/xiyuan11-48-bpggz-master-0 -- ls -ltr /hostroot/run/aide.reinit
Starting pod/xiyuan11-48-bpggz-master-0-debug ...
To use host binaries, run `chroot /host`
ls: cannot access '/hostroot/run/aide.reinit': No such file or directory
Removing debug pod ...
error: non-zero exit code from debug container

$ oc debug node/xiyuan11-48-bpggz-master-0 -- chroot /host ls -ltr /etc/kubernetes
Starting pod/xiyuan11-48-bpggz-master-0-debug ...
To use host binaries, run `chroot /host`
total 5764
-rw-r--r--.  1 root root    9179 May 11 06:16 kubeconfig
drwxr-xr-x.  3 root root      19 May 11 06:17 cni
drwxr-xr-x.  3 root root      20 May 11 06:17 kubelet-plugins
drwxr-xr-x. 19 root root    4096 May 11 06:44 static-pod-resources
-rw-r--r--.  1 root root     101 May 11 06:50 apiserver-url.env
drwxr-xr-x.  2 root root     192 May 11 06:50 manifests
-rw-r--r--.  1 root root    5875 May 11 06:50 kubelet-ca.crt
-rw-r--r--.  1 root root    1123 May 11 06:50 ca.crt
-rw-r--r--.  1 root root      94 May 11 06:50 cloud.conf
-rw-r--r--.  1 root root    1076 May 11 06:50 kubelet.conf
-rw-------.  1 root root      67 May 11 07:23 aide.log.backup-20220511T07_23_30
-rw-------.  1 root root 1946990 May 11 07:47 aide.db.gz.backup-20220511T07_47_50
-rw-------.  1 root root     877 May 11 07:47 aide.log.backup-20220511T07_47_50
-rw-------.  1 root root 1947002 May 11 07:48 aide.db.gz.new
-rw-------.  1 root root 1947002 May 11 07:48 aide.db.gz
-rw-------.  1 root root     651 May 11 07:52 aide.log
-rw-------.  1 root root       0 May 11 07:53 aide.log.new
Removing debug pod ...
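To rule out a per-node difference, the same check can be repeated across all nodes. A loop along these lines (a hypothetical helper, not an operator command; it probes both the pre-upgrade path under /etc/kubernetes and the post-upgrade path under /run) would do it:

```shell
#!/bin/sh
# Hypothetical helper: check every node for the re-init trigger file at both
# the old (v0.1.21) and new (v0.1.24) locations, via `oc debug`. Note that
# /hostroot inside the aide pod maps to / on the host.
check_reinit_marker() {
    for node in $(oc get nodes -o jsonpath='{.items[*].metadata.name}'); do
        for path in /etc/kubernetes/aide.reinit /run/aide.reinit; do
            if oc debug "node/$node" -- chroot /host test -e "$path" 2>/dev/null; then
                echo "$node: $path present"
            else
                echo "$node: $path absent"
            fi
        done
    done
}
```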
Corrected the command from https://bugzilla.redhat.com/show_bug.cgi?id=2072058#c23 to check /hostroot/run/aide.reinit from the host's perspective. Same result:

$ oc debug node/xiyuan11-48-bpggz-master-0 -- chroot /host ls -ltr /run/aide.reinit
Starting pod/xiyuan11-48-bpggz-master-0-debug ...
To use host binaries, run `chroot /host`
ls: cannot access '/run/aide.reinit': No such file or directory
Removing debug pod ...
error: non-zero exit code from debug container
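If the expectation is that /run/aide.reinit only exists transiently (created by the re-init script and consumed shortly afterwards), a single ls after the phase has already returned to Active will always miss it. A watch loop like this sketch (hypothetical helper; assumes `oc debug` access) could be started right after annotating the FileIntegrity to see whether the marker ever appears:

```shell
#!/bin/sh
# Hypothetical watcher: repeatedly check one node for the re-init marker,
# printing a timestamp the moment it is seen. Start this immediately after
# `oc annotate fileintegrities/<name> file-integrity.openshift.io/re-init=`.
watch_reinit_marker() {
    node=$1; tries=${2:-60}
    i=0
    while [ "$i" -lt "$tries" ]; do
        if oc debug "node/$node" -- chroot /host test -e /run/aide.reinit 2>/dev/null; then
            echo "$(date -u +%H:%M:%S) /run/aide.reinit seen on $node"
            return 0
        fi
        i=$((i + 1))
    done
    echo "marker never observed on $node" >&2
    return 1
}
```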
Per https://bugzilla.redhat.com/show_bug.cgi?id=2072058#c23 and https://bugzilla.redhat.com/show_bug.cgi?id=2049206#c14, moving this to VERIFIED.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift File Integrity Operator bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2022:1331