Description of problem:
After OCP upgrade, the fileintegritynodestatus object reports Errored status due to an IO error while initializing the AIDE DB:

$ oc describe fileintegritynodestatus example-fileintegrity-ip-10-0-207-34.us-east-2.compute.internal | tail -8
  Node Name:  ip-10-0-207-34.us-east-2.compute.internal
  Results:
    Condition:        Succeeded
    Last Probe Time:  2021-10-20T15:04:07Z
    Condition:        Errored
    Error Message:    Error initializing the AIDE DB: IO error exit status 18   <<----
    Last Probe Time:  2021-10-20T15:12:08Z
Events:  <none>

Version-Release number of selected component (if applicable):
4.6.48-x86_64 > 4.7.34-x86_64 + file-integrity-operator.v0.1.20

How reproducible:
Always

Steps to Reproduce:
1. Install an OCP 4.6.48-x86_64 cluster.
2. Install file-integrity-operator.v0.1.20.
3. Create a FileIntegrity object:

oc create -f - << EOF
apiVersion: fileintegrity.openshift.io/v1alpha1
kind: FileIntegrity
metadata:
  name: example-fileintegrity
  namespace: openshift-file-integrity
spec:
  debug: true
  config:
    gracePeriod: 15
EOF

4. Check the fileintegritynodestatus objects:

$ oc get fileintegritynodestatus
NAME                                                               NODE                                         STATUS
example-fileintegrity-ip-10-0-139-3.us-east-2.compute.internal     ip-10-0-139-3.us-east-2.compute.internal     Succeeded
example-fileintegrity-ip-10-0-153-51.us-east-2.compute.internal    ip-10-0-153-51.us-east-2.compute.internal    Succeeded
example-fileintegrity-ip-10-0-177-128.us-east-2.compute.internal   ip-10-0-177-128.us-east-2.compute.internal   Succeeded
example-fileintegrity-ip-10-0-178-123.us-east-2.compute.internal   ip-10-0-178-123.us-east-2.compute.internal   Succeeded
example-fileintegrity-ip-10-0-192-194.us-east-2.compute.internal   ip-10-0-192-194.us-east-2.compute.internal   Succeeded
example-fileintegrity-ip-10-0-207-34.us-east-2.compute.internal    ip-10-0-207-34.us-east-2.compute.internal    Succeeded

5.
Upgrade OCP to 4.7.34-x86_64:

$ oc adm upgrade --to-image=quay.io/openshift-release-dev/ocp-release:4.7.34-x86_64 --allow-explicit-upgrade=true --force
warning: Using by-tag pull specs is dangerous, and while we still allow it in combination with --force for backward compatibility, it would be much safer to pass a by-digest pull spec instead
warning: The requested upgrade image is not one of the available updates. You have used --allow-explicit-upgrade to the update to proceed anyway
warning: --force overrides cluster verification of your supplied release image and waives any update precondition failures.
Updating to release image quay.io/openshift-release-dev/ocp-release:4.7.34-x86_64

$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.7.34    True        False         17m     Cluster version is 4.7.34

6. Check the fileintegritynodestatus objects again after the OCP upgrade:

$ oc get fileintegritynodestatus
NAME                                                               NODE                                         STATUS
example-fileintegrity-ip-10-0-139-3.us-east-2.compute.internal     ip-10-0-139-3.us-east-2.compute.internal     Errored
example-fileintegrity-ip-10-0-153-51.us-east-2.compute.internal    ip-10-0-153-51.us-east-2.compute.internal    Succeeded
example-fileintegrity-ip-10-0-177-128.us-east-2.compute.internal   ip-10-0-177-128.us-east-2.compute.internal   Succeeded
example-fileintegrity-ip-10-0-178-123.us-east-2.compute.internal   ip-10-0-178-123.us-east-2.compute.internal   Succeeded
example-fileintegrity-ip-10-0-192-194.us-east-2.compute.internal   ip-10-0-192-194.us-east-2.compute.internal   Succeeded
example-fileintegrity-ip-10-0-207-34.us-east-2.compute.internal    ip-10-0-207-34.us-east-2.compute.internal    Errored

Actual results:
After OCP upgrade, the fileintegritynodestatus object reports Errored status due to an IO error while initializing the AIDE DB:

$ oc describe fileintegritynodestatus example-fileintegrity-ip-10-0-207-34.us-east-2.compute.internal | tail -8
  Node Name:  ip-10-0-207-34.us-east-2.compute.internal
  Results:
    Condition:        Succeeded
    Last Probe Time:  2021-10-20T15:04:07Z
    Condition:        Errored
    Error Message:    Error initializing the AIDE DB: IO error exit status 18
    Last Probe Time:  2021-10-20T15:12:08Z
Events:  <none>

$ oc get pods -o wide
NAME                                                     READY   STATUS    RESTARTS   AGE    IP            NODE                                         NOMINATED NODE   READINESS GATES
aide-example-fileintegrity-4qtm7                         1/1     Running   4          161m   10.130.0.7    ip-10-0-153-51.us-east-2.compute.internal    <none>           <none>
aide-example-fileintegrity-dk2xq                         1/1     Running   2          161m   10.131.0.7    ip-10-0-177-128.us-east-2.compute.internal   <none>           <none>
aide-example-fileintegrity-j2b2s                         1/1     Running   0          161m   10.128.0.7    ip-10-0-192-194.us-east-2.compute.internal   <none>           <none>
aide-example-fileintegrity-mqwfp                         1/1     Running   0          161m   10.128.2.8    ip-10-0-139-3.us-east-2.compute.internal     <none>           <none>
aide-example-fileintegrity-rxqrg                         1/1     Running   0          161m   10.129.0.3    ip-10-0-178-123.us-east-2.compute.internal   <none>           <none>
aide-example-fileintegrity-wlz7p                         1/1     Running   0          161m   10.129.2.6    ip-10-0-207-34.us-east-2.compute.internal    <none>           <none>
aide-inif357f8ab11130a12cb08f72ac33dfd2b4654eb99-hmwkc   1/1     Running   0          74m    10.129.0.11   ip-10-0-178-123.us-east-2.compute.internal   <none>           <none>
file-integrity-operator-6c6c545c77-twjvf                 1/1     Running   0          74m    10.130.0.33   ip-10-0-153-51.us-east-2.compute.internal    <none>           <none>

$ oc logs aide-example-fileintegrity-mqwfp | grep "IO error"
2021-10-20T15:15:53Z: Error initializing the AIDE DB: IO error exit status 18

Expected results:
After OCP upgrade, the fileintegritynodestatus object should report Succeeded status for all nodes.

Additional info:
Logs will be uploaded: http://virt-openshift-05.lab.eng.nay.redhat.com/pdhamdhe/
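When triaging this on a live cluster, it helps to list only the nodes whose status is not Succeeded. A minimal sketch, filtering captured sample output with awk so it runs standalone (on a real cluster you would pipe `oc get fileintegritynodestatus` into the same awk program):

```shell
# Print NODE and STATUS for every fileintegritynodestatus row whose third
# column is not "Succeeded". The sample text below is inlined for
# illustration; on a cluster, replace the printf with:
#   oc get fileintegritynodestatus
sample='NAME NODE STATUS
example-fileintegrity-ip-10-0-139-3.us-east-2.compute.internal ip-10-0-139-3.us-east-2.compute.internal Errored
example-fileintegrity-ip-10-0-153-51.us-east-2.compute.internal ip-10-0-153-51.us-east-2.compute.internal Succeeded
example-fileintegrity-ip-10-0-207-34.us-east-2.compute.internal ip-10-0-207-34.us-east-2.compute.internal Errored'

# NR > 1 skips the header line; $3 is the STATUS column.
printf '%s\n' "$sample" | awk 'NR > 1 && $3 != "Succeeded" { print $2, $3 }'
```

This prints the two Errored nodes, one per line, which matches the two affected nodes in the output above.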
This issue has been noticed on AWS and GCP clouds so far, after OCP upgrade. The upgrade paths were:
4.6.48-x86_64 > 4.7.34-x86_64
4.7.35-x86_64 > 4.8.16-x86_64

Adding log file 2016046-aide-example-fileintegrity-l6hbf-GCP-Cloud.txt.
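For reference when reading the daemon logs: per the aide(1) man page (worth double-checking against the installed AIDE version), exit statuses 14-19 indicate runtime errors, with 18 being an IO error, while statuses 1-7 are a bitmask of detected file changes. A small helper to decode the error range (the function name is made up for illustration):

```shell
# Decode an AIDE *error* exit status. Codes 14-19 are taken from the
# aide(1) man page; codes 1-7 are a bitmask of new/removed/changed files
# rather than errors, so they fall through to the default case.
aide_exit_reason() {
  case "$1" in
    14) echo "error writing new AIDE database" ;;
    15) echo "invalid argument error" ;;
    16) echo "unimplemented function error" ;;
    17) echo "invalid configureline error" ;;
    18) echo "IO error" ;;
    19) echo "version mismatch error" ;;
    *)  echo "not an error status; see aide(1)" ;;
  esac
}

aide_exit_reason 18   # prints "IO error", matching the message in the node status
```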
[PR Pre-Merge Testing]

Looks good. After the OCP upgrade, the fileintegritynodestatus object no longer reports Errored status, and the AIDE database no longer reports "IO error exit status 18".

Verified on: 4.6.48-x86_64 -> 4.7.35-x86_64

$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.48    True        False         40m     Cluster version is 4.6.48

$ oc login -u kubeadmin -p TGtaP-sWgtX-rSCfn-jGGrk https://api.example.qe.devcluster.openshift.com:6443/
Login successful.

You have access to 59 projects, the list has been suppressed. You can list all projects with 'oc projects'

Using project "default".

$ gh pr checkout 207
remote: Enumerating objects: 9, done.
remote: Counting objects: 100% (9/9), done.
remote: Compressing objects: 100% (3/3), done.
remote: Total 9 (delta 6), reused 9 (delta 6), pack-reused 0
Unpacking objects: 100% (9/9), 1.23 KiB | 140.00 KiB/s, done.
From https://github.com/openshift/file-integrity-operator
 * [new ref]         refs/pull/207/head -> io-err
Switched to branch 'io-err'
A new release of gh is available: 2.0.0 → v2.1.0
https://github.com/cli/cli/releases/tag/v2.1.0

$ git branch
* io-err
  master

$ make deploy-local
Creating 'openshift-file-integrity' namespace/project
namespace/openshift-file-integrity created
/home/pdhamdhe/go/bin/operator-sdk build quay.io/file-integrity-operator/file-integrity-operator:latest --image-builder podman
INFO[0025] Building OCI image quay.io/file-integrity-operator/file-integrity-operator:latest
STEP 1: FROM registry.access.redhat.com/ubi8/go-toolset AS builder
STEP 2: USER root
--> Using cache 86a3b78bf1c18fae4af4b48de5c3c365ca6e0dd2ab9e9752cc1525d14264a402
--> 86a3b78bf1c
STEP 3: WORKDIR /go/src/github.com/openshift/file-integrity-operator
--> Using cache 18cb87e0b950d19474dbe0f534e43433d7741e15fb222e85c92b2bf3b72f5d58
--> 18cb87e0b95
STEP 4: ENV GOFLAGS="-mod=vendor"
--> Using cache 85900c07e47343dec62129f0a4df308e8bf4cc1405c5e4788834c2b328b65185
--> 85900c07e47
STEP 5: COPY . .
--> b34aa347db7
STEP 6: RUN make operator-bin
GOFLAGS=-mod=vendor GO111MODULE=auto go build -o /go/src/github.com/openshift/file-integrity-operator/build/_output/bin/file-integrity-operator github.com/openshift/file-integrity-operator/cmd/manager
--> 2e49d017c6f
STEP 7: FROM registry.fedoraproject.org/fedora-minimal:34
STEP 8: RUN microdnf -y install aide golang && microdnf clean all
--> Using cache b2932b488b31ba7c3c319b8d1ac743ec869e79e178e7744264c67f88b580d9d2
--> b2932b488b3
STEP 9: ENV OPERATOR=/usr/local/bin/file-integrity-operator USER_UID=1001 USER_NAME=file-integrity-operator
--> Using cache 56a651e02c00a1277ff751b47ce2392e3c9dc88b75a4acaad162f4dc8fa882f7
--> 56a651e02c0
STEP 10: COPY --from=builder /go/src/github.com/openshift/file-integrity-operator/build/_output/bin/file-integrity-operator ${OPERATOR}
--> faf71dff3fd
STEP 11: COPY build/bin /usr/local/bin
--> 2e8b8756509
STEP 12: RUN /usr/local/bin/user_setup
+ mkdir -p /root
+ chown 1001:0 /root
+ chmod ug+rwx /root
+ chmod g+rw /etc/passwd
+ rm /usr/local/bin/user_setup
--> ce3462b6635
STEP 13: ENTRYPOINT ["/usr/local/bin/entrypoint"]
--> c44e4969d46
STEP 14: USER ${USER_UID}
STEP 15: COMMIT quay.io/file-integrity-operator/file-integrity-operator:latest
--> 96ac91a0bce
96ac91a0bce5e819e64b2df79c2ec9af5b47bf9f4052948e896fb3d541210230
INFO[0106] Operator build complete.
podman build -t quay.io/file-integrity-operator/file-integrity-operator-bundle:latest -f bundle.Dockerfile .
STEP 1: FROM scratch
STEP 2: LABEL operators.operatorframework.io.bundle.mediatype.v1=registry+v1
--> Using cache 26a8e91ab2e4f3354de988b8939b27aaeb178ca953f3de3216fc819239e3f191
--> 26a8e91ab2e
STEP 3: LABEL operators.operatorframework.io.bundle.manifests.v1=manifests/
--> Using cache ef3b2a97d47e6e3248dcc4d1502700bec2c42b3de3717e038d9157a341ec9a69
--> ef3b2a97d47
STEP 4: LABEL operators.operatorframework.io.bundle.metadata.v1=metadata/
--> Using cache ee9235ef331c45523adc4a9656f90324cef507779881a1ce87c2225a36502a21
--> ee9235ef331
STEP 5: LABEL operators.operatorframework.io.bundle.package.v1=file-integrity-operator
--> Using cache 28b04557efafe26b93f75483cf8eddb5f003424a3446c09433d1c9441172e3dc
--> 28b04557efa
STEP 6: LABEL operators.operatorframework.io.bundle.channels.v1=alpha
--> Using cache 733a2fce8e3ed670ab012670bbaeec7e8861783d1856d08cafad05050b446a2c
--> 733a2fce8e3
STEP 7: LABEL operators.operatorframework.io.bundle.channel.default.v1=alpha
--> Using cache 1f8b0c2a625387e11069fbb747f91d69bd518d6b3accf076cefc38fab52ad139
--> 1f8b0c2a625
STEP 8: COPY deploy/olm-catalog/file-integrity-operator/manifests /manifests/
--> ee5097eaf84
STEP 9: COPY deploy/olm-catalog/file-integrity-operator/metadata /metadata/
STEP 10: COMMIT quay.io/file-integrity-operator/file-integrity-operator-bundle:latest
--> 73d84f115d1
73d84f115d17107eea9baebdb89b4435bbb652eea686606646c04621ef78e4d4
IMAGE_FROM_CI variable missing. We're in local enviornment.
Temporarily exposing the default route to the image registry
config.imageregistry.operator.openshift.io/cluster patched
Pushing image quay.io/file-integrity-operator/file-integrity-operator:latest to the image registry
IMAGE_REGISTRY_HOST=$(oc get route default-route -n openshift-image-registry --template='{{ .spec.host }}'); \
	podman login --tls-verify=false -u kubeadmin -p sha256~LRHwmjwcp7-xvEHghm4bp-44lwvdvPgpMKgcmHhI9pE ${IMAGE_REGISTRY_HOST}; \
	podman push --tls-verify=false quay.io/file-integrity-operator/file-integrity-operator:latest ${IMAGE_REGISTRY_HOST}/openshift-file-integrity/file-integrity-operator:latest
Login Succeeded!
Getting image source signatures
Copying blob 4c36bf23b6a4 done
Copying blob 5888b325046f done
Copying blob 07f1d9697bbb done
Copying blob 8a9c07290549 done
Copying blob 0bdbb0192544 done
Copying config 96ac91a0bc done
Writing manifest to image destination
Copying config 96ac91a0bc [--------------------------------------] 0.0b / 2.0KiB
Writing manifest to image destination
Storing signatures
Removing the route from the image registry
config.imageregistry.operator.openshift.io/cluster patched
customresourcedefinition.apiextensions.k8s.io/fileintegrities.fileintegrity.openshift.io created
customresourcedefinition.apiextensions.k8s.io/fileintegritynodestatuses.fileintegrity.openshift.io created
Warning: resource namespaces/openshift-file-integrity is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by oc apply. oc apply should only be used on resources created declaratively by either oc create --save-config or oc apply. The missing annotation will be patched automatically.
namespace/openshift-file-integrity configured
deployment.apps/file-integrity-operator created
role.rbac.authorization.k8s.io/file-integrity-operator created
role.rbac.authorization.k8s.io/file-integrity-daemon created
clusterrole.rbac.authorization.k8s.io/file-integrity-operator created
rolebinding.rbac.authorization.k8s.io/file-integrity-operator created
rolebinding.rbac.authorization.k8s.io/file-integrity-daemon created
clusterrolebinding.rbac.authorization.k8s.io/file-integrity-operator created
rolebinding.rbac.authorization.k8s.io/prometheus-k8s created
serviceaccount/file-integrity-operator created
serviceaccount/file-integrity-daemon created
clusterrole.rbac.authorization.k8s.io/file-integrity-operator-metrics created
clusterrolebinding.rbac.authorization.k8s.io/file-integrity-operator-metrics created

$ oc project openshift-file-integrity
Now using project "openshift-file-integrity" on server "https://api.example.qe.devcluster.openshift.com:6443".

$ oc get pods
NAME                                      READY   STATUS    RESTARTS   AGE
file-integrity-operator-9bff4c47f-p9m98   1/1     Running   0          34s

$ oc create -f - << EOF
> apiVersion: fileintegrity.openshift.io/v1alpha1
> kind: FileIntegrity
> metadata:
>   name: example-fileintegrity
>   namespace: openshift-file-integrity
> spec:
>   debug: true
>   config:
>     gracePeriod: 15
> EOF
fileintegrity.fileintegrity.openshift.io/example-fileintegrity created

$ oc get pods
NAME                                      READY   STATUS    RESTARTS   AGE
aide-example-fileintegrity-2fwbs          1/1     Running   0          3m31s
aide-example-fileintegrity-8g8jc          1/1     Running   0          3m31s
aide-example-fileintegrity-d7xwb          1/1     Running   0          3m31s
aide-example-fileintegrity-hxs8g          1/1     Running   0          3m31s
aide-example-fileintegrity-jrptk          1/1     Running   0          3m31s
aide-example-fileintegrity-thpth          1/1     Running   0          3m31s
file-integrity-operator-9bff4c47f-p9m98   1/1     Running   1          5m6s

$ oc get fileintegritynodestatus
NAME                                                               NODE                                         STATUS
example-fileintegrity-ip-10-0-129-41.us-east-2.compute.internal    ip-10-0-129-41.us-east-2.compute.internal    Succeeded
example-fileintegrity-ip-10-0-137-189.us-east-2.compute.internal   ip-10-0-137-189.us-east-2.compute.internal   Succeeded
example-fileintegrity-ip-10-0-166-83.us-east-2.compute.internal    ip-10-0-166-83.us-east-2.compute.internal    Succeeded
example-fileintegrity-ip-10-0-180-75.us-east-2.compute.internal    ip-10-0-180-75.us-east-2.compute.internal    Succeeded
example-fileintegrity-ip-10-0-210-232.us-east-2.compute.internal   ip-10-0-210-232.us-east-2.compute.internal   Succeeded
example-fileintegrity-ip-10-0-212-230.us-east-2.compute.internal   ip-10-0-212-230.us-east-2.compute.internal   Succeeded

$ oc get cm
NAME                           DATA   AGE
aide-pause                     1      3m58s
aide-reinit                    1      3m58s
example-fileintegrity          1      3m58s
file-integrity-operator-lock   0      5m2s

$ oc adm upgrade --to-image=quay.io/openshift-release-dev/ocp-release:4.7.35-x86_64 --allow-explicit-upgrade=true --force
warning: Using by-tag pull specs is dangerous, and while we still allow it in combination with --force for backward compatibility, it would be much safer to pass a by-digest pull spec instead
warning: The requested upgrade image is not one of the available updates. You have used --allow-explicit-upgrade to the update to proceed anyway
warning: --force overrides cluster verification of your supplied release image and waives any update precondition failures.
Updating to release image quay.io/openshift-release-dev/ocp-release:4.7.35-x86_64

$ oc get clusterversion -w
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.48    True        True          6s      Working towards 4.7.35: 1% complete
version   4.6.48    True        True          28s     Working towards quay.io/openshift-release-dev/ocp-release:4.7.35-x86_64
version   4.6.48    True        True          28s     Working towards quay.io/openshift-release-dev/ocp-release:4.7.35-x86_64: downloading update
version   4.6.48    True        True          28s     Working towards quay.io/openshift-release-dev/ocp-release:4.7.35-x86_64: downloading update
version   4.6.48    True        True          28s     Working towards 4.7.35
version   4.6.48    True        True          28s     Working towards 4.7.35: 1 of 668 done (0% complete)
version   4.6.48    True        True          28s     Working towards 4.7.35: 3 of 668 done (0% complete)
version   4.6.48    True        True          28s     Working towards 4.7.35: 4 of 668 done (0% complete)
version   4.6.48    True        True          43s     Working towards 4.7.35: 69 of 668 done (10% complete)

$ oc get clusterversion -w
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.48    True        True          13m     Working towards 4.7.35: 94 of 668 done (14% complete)
version   4.6.48    True        True          18m     Working towards 4.7.35: 94 of 668 done (14% complete), waiting on kube-apiserver
version   4.6.48    True        True          20m     Working towards 4.7.35: 107 of 668 done (16% complete)
version   4.6.48    True        True          20m     Working towards 4.7.35: 108 of 668 done (16% complete)
version   4.6.48    True        True          20m     Working towards 4.7.35: 114 of 668 done (17% complete)
version   4.6.48    True        True          26m     Working towards 4.7.35: 115 of 668 done (17% complete), waiting on kube-scheduler
version   4.6.48    True        True          29m     Working towards 4.7.35: 115 of 668 done (17% complete)
version   4.6.48    True        True          29m     Working towards 4.7.35: 116 of 668 done (17% complete)
version   4.6.48    True        True          29m     Working towards 4.7.35: 152 of 668 done (22% complete)
version   4.6.48    True        True          29m     Working towards 4.7.35: 171 of 668 done (25% complete)
version   4.6.48    True        True          32m     Working towards 4.7.35: 173 of 668 done (25% complete)
version   4.6.48    True        True          34m     Working towards 4.7.35: 188 of 668 done (28% complete)
version   4.6.48    True        True          34m     Working towards 4.7.35: 199 of 668 done (29% complete)
version   4.6.48    True        True          34m     Working towards 4.7.35: 218 of 668 done (32% complete)
version   4.6.48    True        True          34m     Unable to apply 4.7.35: an unknown error has occurred: MultipleErrors
version   4.6.48    True        True          38m     Working towards 4.7.35: 383 of 668 done (57% complete)
version   4.6.48    True        True          38m     Working towards 4.7.35: 472 of 668 done (70% complete)
version   4.6.48    True        True          39m     Working towards 4.7.35: 476 of 668 done (71% complete)
version   4.6.48    True        True          39m     Working towards 4.7.35: 497 of 668 done (74% complete)
version   4.6.48    True        True          39m     Working towards 4.7.35: 498 of 668 done (74% complete)
version   4.6.48    True        True          39m     Working towards 4.7.35: 499 of 668 done (74% complete)
version   4.6.48    True        True          39m     Working towards 4.7.35: 513 of 668 done (76% complete)
version   4.6.48    True        True          39m     Working towards 4.7.35: 519 of 668 done (77% complete)
version   4.6.48    True        True          41m     Working towards 4.7.35: 523 of 668 done (78% complete)
version   4.6.48    True        True          41m     Working towards 4.7.35: 530 of 668 done (79% complete)
version   4.6.48    True        True          43m     Working towards 4.7.35: 530 of 668 done (79% complete), waiting on network
version   4.6.48    True        True          47m     Working towards 4.7.35: 530 of 668 done (79% complete)
version   4.6.48    True        True          48m     Working towards 4.7.35: 559 of 668 done (83% complete)
version   4.6.48    True        True          52m     Working towards 4.7.35: 559 of 668 done (83% complete), waiting on machine-config

$ oc get clusterversion -w
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.7.35    True        False         2m18s   Cluster version is 4.7.35

$ oc get fileintegritynodestatus
NAME                                                               NODE                                         STATUS
example-fileintegrity-ip-10-0-129-41.us-east-2.compute.internal    ip-10-0-129-41.us-east-2.compute.internal    Succeeded
example-fileintegrity-ip-10-0-137-189.us-east-2.compute.internal   ip-10-0-137-189.us-east-2.compute.internal   Succeeded
example-fileintegrity-ip-10-0-166-83.us-east-2.compute.internal    ip-10-0-166-83.us-east-2.compute.internal    Failed
example-fileintegrity-ip-10-0-180-75.us-east-2.compute.internal    ip-10-0-180-75.us-east-2.compute.internal    Succeeded
example-fileintegrity-ip-10-0-210-232.us-east-2.compute.internal   ip-10-0-210-232.us-east-2.compute.internal   Succeeded
example-fileintegrity-ip-10-0-212-230.us-east-2.compute.internal   ip-10-0-212-230.us-east-2.compute.internal   Succeeded

$ oc get cm
NAME                                                                           DATA   AGE
aide-example-fileintegrity-ip-10-0-129-41.us-east-2.compute.internal-failed    1      21m
aide-example-fileintegrity-ip-10-0-137-189.us-east-2.compute.internal-failed   1      21m
aide-example-fileintegrity-ip-10-0-166-83.us-east-2.compute.internal-failed    1      21m
aide-example-fileintegrity-ip-10-0-180-75.us-east-2.compute.internal-failed    1      72m
aide-example-fileintegrity-ip-10-0-210-232.us-east-2.compute.internal-failed   1      22m
aide-example-fileintegrity-ip-10-0-212-230.us-east-2.compute.internal-failed   1      22m
aide-pause                                                                     1      80m
aide-reinit                                                                    1      80m
example-fileintegrity                                                          1      80m
file-integrity-operator-lock                                                   0      8m58s
kube-root-ca.crt                                                               1      50m
openshift-service-ca.crt                                                       1      50m
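Note that the remaining Failed status above is an integrity-check failure (files changed on the node during the upgrade), not the Errored condition this bug is about; the operator records each failure in a per-node configmap with a `-failed` suffix. A quick sketch to pull out just those configmaps from `oc get cm` output, shown against inlined sample text so it runs standalone:

```shell
# List only the per-node "-failed" configmaps. On a cluster, replace the
# printf with: oc get cm -n openshift-file-integrity
sample='NAME DATA AGE
aide-example-fileintegrity-ip-10-0-166-83.us-east-2.compute.internal-failed 1 21m
aide-pause 1 80m
aide-reinit 1 80m'

# NR > 1 skips the header; match names ending in "-failed".
printf '%s\n' "$sample" | awk 'NR > 1 && $1 ~ /-failed$/ { print $1 }'
```

To inspect the AIDE report behind a given failure, you could then run something like `oc extract cm/<name> --to=-` against one of the printed names.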
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (File Integrity Operator version 0.1.21 bug fix and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:4631