Bug 2117268
| Summary: | ocp4-pci-dss-api-checks-pod in CrashLoopBackoff state due to ignition spec.config not in MC | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | mapillai |
| Component: | Compliance Operator | Assignee: | Jakub Hrozek <jhrozek> |
| Status: | CLOSED ERRATA | QA Contact: | xiyuan |
| Severity: | high | Docs Contact: | Jeana Routh <jrouth> |
| Priority: | high | CC: | lbragsta, mrogers, wenshen, xiyuan |
| Version: | 4.11 | Target Release: | 4.12.0 |
| Hardware: | s390 | OS: | Linux |
| Type: | Bug | Doc Type: | Bug Fix |
| Last Closed: | 2022-11-02 16:00:55 UTC | | |
| Doc Text: | Previously, the Compliance Operator failed to fetch API resources when parsing machine configurations without ignition specifications. This caused the `api-check-pods` check to crash loop. With this release, the Compliance Operator is updated to gracefully handle machine config pools without ignition specifications. (link:https://bugzilla.redhat.com/show_bug.cgi?id=2117268[*BZ#2117268*]) | | |
MC content of 99-master-kargs-mpath
oc get mc 99-master-kargs-mpath -o yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"machineconfiguration.openshift.io/v1","kind":"MachineConfig","metadata":{"annotations":{},"labels":{"machineconfiguration.openshift.io/role":"master"},"name":"99-master-kargs-mpath"},"spec":{"kernelArguments":["rd.multipath=default","root=/dev/disk/by-label/dm-mpath-root"]}}
  creationTimestamp: "2022-08-09T17:20:24Z"
  generation: 1
  labels:
    machineconfiguration.openshift.io/role: master
  name: 99-master-kargs-mpath
  resourceVersion: "32552"
  uid: 676bdf51-e8e6-4d47-9b12-eb5b85763193
spec:
  kernelArguments:
  - rd.multipath=default
  - root=/dev/disk/by-label/dm-mpath-root
oc get mc
NAME GENERATEDBYCONTROLLER IGNITIONVERSION AGE
00-master 35d79621a58766190071f95415f0bef74ee204a7 3.2.0 19h
00-worker 35d79621a58766190071f95415f0bef74ee204a7 3.2.0 19h
01-master-container-runtime 35d79621a58766190071f95415f0bef74ee204a7 3.2.0 19h
01-master-kubelet 35d79621a58766190071f95415f0bef74ee204a7 3.2.0 19h
01-worker-container-runtime 35d79621a58766190071f95415f0bef74ee204a7 3.2.0 19h
01-worker-kubelet 35d79621a58766190071f95415f0bef74ee204a7 3.2.0 19h
99-master-fips 3.2.0 19h
99-master-generated-registries 35d79621a58766190071f95415f0bef74ee204a7 3.2.0 19h
99-master-kargs-mpath 19h
99-master-ssh 3.2.0 19h
99-worker-fips 3.2.0 19h
99-worker-generated-registries 35d79621a58766190071f95415f0bef74ee204a7 3.2.0 19h
99-worker-ssh 3.2.0 19h
rendered-master-31752550f462eb064898b4438d3eddc0 35d79621a58766190071f95415f0bef74ee204a7 3.2.0 19h
rendered-master-7775ea95940677bd253bce8f83e30a0f 35d79621a58766190071f95415f0bef74ee204a7 3.2.0 19h
rendered-master-db9a448ba06608879ebcbd032edd2834 35d79621a58766190071f95415f0bef74ee204a7 3.2.0 5h37m
rendered-worker-34e13f499a246c46951d0ae7efee91b7 35d79621a58766190071f95415f0bef74ee204a7 3.2.0 19h
rendered-worker-482af1dbfb11969f0d7236345ceb8b4e 35d79621a58766190071f95415f0bef74ee204a7 3.2.0 5h3
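The blank IGNITIONVERSION column for 99-master-kargs-mpath above is the visible symptom: that MC carries only kernelArguments and no spec.config. A hypothetical one-liner (not part of the original report, assuming jq is available on the workstation) lists every MachineConfig whose Ignition payload is missing or empty:

# List MCs without an Ignition payload; these are the ones that triggered the crash loop.
oc get mc -o json \
  | jq -r '.items[] | select((.spec.config // {}) == {}) | .metadata.name'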
As discussed on Slack, this is a legit bug. This might be an easier MC to test with:
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  creationTimestamp: "2022-08-11T10:20:36Z"
  generation: 1
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 99-worker-kargs-audit
  resourceVersion: "58789"
  uid: bafa4e44-976b-4628-bec3-4b0e07655c05
spec:
  kernelArguments:
  - audit=1
Setting blocker- because even if embarrassing, this bug has an easy workaround: either merge the non-ignition parameters into another MC that has an Ignition section, or the other way around (see the sketch below). Verification passed with the pre-merge process.
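A minimal sketch of the first option, assuming it is acceptable to redefine the original MC: fold the kernel arguments into a MachineConfig that also carries an Ignition section (even a bare one with only a version), so the api-resource-collector always finds a spec.config to parse.

oc apply -f - <<EOF
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: master
  name: 99-master-kargs-mpath
spec:
  config:
    ignition:
      version: 3.2.0
  kernelArguments:
  - rd.multipath=default
  - root=/dev/disk/by-label/dm-mpath-root
EOF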
$ git log |head
commit 3bcc1bd367a6ec8270b5c92b94d4a06a9f479804
Author: Jakub Hrozek <jhrozek>
Date: Thu Aug 11 14:33:26 2022 +0200
    api-resource-collector: Don't attempt to parse empty Ignition

    While filtering files out of MachineConfigs, we tried to also parse
    ignition specification of MachineConfigs that didn't specify any,
    leading to an error.
$ oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.12.0-0.nightly-2022-08-15-150248 True False 8h Cluster version is 4.12.0-0.nightly-2022-08-15-150248
1. Create an MC without ignition defined:
$ oc apply -f -<<EOF
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  creationTimestamp: "2022-08-11T10:20:36Z"
  generation: 1
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 99-worker-kargs-audit
  resourceVersion: "58789"
  uid: bafa4e44-976b-4628-bec3-4b0e07655c05
spec:
  kernelArguments:
  - audit=1
EOF
machineconfig.machineconfiguration.openshift.io/99-worker-kargs-audit created
2. Create a large number of MCs (a rough stand-in for genmc.py is sketched after the size check below):
# python3 genmc.py --in-file=/usr/share/dict/linux.words --num-files=150 --start-num=0 --create
machineconfig.machineconfiguration.openshift.io/75-testmc-1.rule created
machineconfig.machineconfiguration.openshift.io/75-testmc-2.rule created
machineconfig.machineconfiguration.openshift.io/75-testmc-3.rule created
machineconfig.machineconfiguration.openshift.io/75-testmc-4.rule created
machineconfig.machineconfiguration.openshift.io/75-testmc-5.rule created
machineconfig.machineconfiguration.openshift.io/75-testmc-6.rule created
machineconfig.machineconfiguration.openshift.io/75-testmc-7.rule created
machineconfig.machineconfiguration.openshift.io/75-testmc-8.rule created
machineconfig.machineconfiguration.openshift.io/75-testmc-9.rule created
machineconfig.machineconfiguration.openshift.io/75-testmc-10.rule created
...
machineconfig.machineconfiguration.openshift.io/75-testmc-150.rule created
$ oc get mc -o json | pv -b >/dev/null
172MiB
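genmc.py is a QE helper script that is not included in this report. A rough, hypothetical stand-in is the loop below, which creates many small dummy MachineConfigs; it will not approach the ~172MiB payload shown above, since the real script appears to also pad each MC with content from /usr/share/dict/linux.words.

# Create 150 dummy worker MCs with only a bare Ignition version (hypothetical stand-in).
for i in $(seq 1 150); do
  oc apply -f - <<EOF
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 75-testmc-${i}.rule
spec:
  config:
    ignition:
      version: 3.2.0
EOF
done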
3. Trigger the test:
# oc apply -f -<<EOF
apiVersion: compliance.openshift.io/v1alpha1
kind: ScanSettingBinding
metadata:
  name: my-ssb-r
profiles:
- name: ocp4-pci-dss
  kind: Profile
  apiGroup: compliance.openshift.io/v1alpha1
- name: ocp4-pci-dss-node
  kind: Profile
  apiGroup: compliance.openshift.io/v1alpha1
settingsRef:
  name: default
  kind: ScanSetting
  apiGroup: compliance.openshift.io/v1alpha1
EOF
scansettingbinding.compliance.openshift.io/my-ssb-r created
# oc get suite -w
NAME PHASE RESULT
my-ssb-r RUNNING NOT-AVAILABLE
my-ssb-r AGGREGATING NOT-AVAILABLE
my-ssb-r DONE NON-COMPLIANT
my-ssb-r DONE NON-COMPLIANT
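As an extra sanity check (not captured in the output above, and assuming the operator runs in its default openshift-compliance namespace), confirm the api-checks pod is no longer in CrashLoopBackOff:

# The pod should show all containers running rather than Init:CrashLoopBackOff.
oc get pods -n openshift-compliance | grep api-checks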
Verified with 4.12.0-0.nightly-2022-09-20-095559 + compliance-operator.v0.1.55.
$ oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.12.0-0.nightly-2022-09-20-095559 True False 7h40m Cluster version is 4.12.0-0.nightly-2022-09-20-095559
$ oc get csv
NAME DISPLAY VERSION REPLACES PHASE
compliance-operator.v0.1.55 Compliance Operator 0.1.55 Succeeded
$ oc apply -f -<<EOF
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  creationTimestamp: "2022-08-11T10:20:36Z"
  generation: 1
  labels:
    machineconfiguration.openshift.io/role: worker
  name: 99-worker-kargs-audit
  resourceVersion: "58789"
  uid: bafa4e44-976b-4628-bec3-4b0e07655c05
spec:
  kernelArguments:
  - audit=1
EOF
machineconfig.machineconfiguration.openshift.io/99-worker-kargs-audit created
$ oc apply -f -<<EOF
apiVersion: compliance.openshift.io/v1alpha1
kind: ScanSettingBinding
metadata:
  name: my-ssb-r
profiles:
- name: ocp4-pci-dss
  kind: Profile
  apiGroup: compliance.openshift.io/v1alpha1
- name: ocp4-pci-dss-node
  kind: Profile
  apiGroup: compliance.openshift.io/v1alpha1
settingsRef:
  name: default
  kind: ScanSetting
  apiGroup: compliance.openshift.io/v1alpha1
EOF
scansettingbinding.compliance.openshift.io/my-ssb-r created
$ oc get suite -w
NAME       PHASE         RESULT
my-ssb-r RUNNING NOT-AVAILABLE
my-ssb-r RUNNING NOT-AVAILABLE
my-ssb-r RUNNING NOT-AVAILABLE
my-ssb-r AGGREGATING NOT-AVAILABLE
my-ssb-r AGGREGATING NOT-AVAILABLE
my-ssb-r AGGREGATING NOT-AVAILABLE
my-ssb-r DONE NON-COMPLIANT
my-ssb-r DONE NON-COMPLIANT
$ oc get scan
NAME PHASE RESULT
ocp4-pci-dss DONE NON-COMPLIANT
ocp4-pci-dss-node-master DONE COMPLIANT
ocp4-pci-dss-node-worker DONE COMPLIANT
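To see which rules drove the NON-COMPLIANT platform result (a hypothetical follow-up, not part of the original verification), the failing check results can be listed with the operator's check-status label:

# Show only the ComplianceCheckResults that failed.
oc get compliancecheckresults -n openshift-compliance \
  -l compliance.openshift.io/check-status=FAIL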
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Compliance Operator bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:6657
Created attachment 1904711 [details]
The logs from the pod for all the containers.

Description of problem:
While enabling the PCI scan, the ocp4-pci-dss-api-checks-pod is in CrashLoopBackOff because the containers are not starting up. Had a word with jhrozek and it looks like the issue is with ignition spec.config not being present in the MC.

Version-Release number of selected component (if applicable):
CO version 0.1.53
OCP version 4.11.0-rc.2

How reproducible:

Steps to Reproduce:
1. Install the compliance operator on s390x
2. Create/Apply the scan setting and scan binding for the ocp4-pci-dss and ocp4-pci-dss-node
3.

Actual results:
ocp4-pci-dss-api-checks-pod   0/2   Init:CrashLoopBackOff   40 (4m23s ago)   3h13m

Expected results:
ocp4-pci-dss-api-checks-pod   2/2   Running   40 (4m23s ago)   3h13m

Additional info: