Bug 1972559
| Summary: | compliancecheckresults fails with inconsistent results | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | peter ducai <pducai> |
| Component: | Compliance Operator | Assignee: | Matt Rogers <mrogers> |
| Status: | CLOSED ERRATA | QA Contact: | Prashant Dhamdhere <pdhamdhe> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 4.6 | CC: | jhrozek, josorior, knewcome, mrogers, shaising, sople, xiyuan |
| Target Milestone: | --- | | |
| Target Release: | 4.9.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2021-09-07 06:05:14 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | kubelet config (attachment 1791490, see Description) | | |
Sorry for the delay in replying. I think this is just a matter of tailoring the profile to your environment, although I agree that the documentation and the way the variables are exposed are currently not ideal. I'm working on testing a tailored profile that would match your customer's kubelet config.

Sorry again for the late reply. Using the customer's kubelet config, this tailoring file worked for me:
apiVersion: compliance.openshift.io/v1alpha1
kind: TailoredProfile
metadata:
  name: cis-node-bz
spec:
  extends: ocp4-moderate-node
  title: CIS node tailored for BZ-1972559
  setValues:
    # evictionHard
    - name: ocp4-var-kubelet-evictionhard-imagefs-available
      rationale: evictionHard.imagefs.available = 15%
      value: 15%
    - name: ocp4-var-kubelet-evictionhard-nodefs-inodesfree
      rationale: evictionHard.nodefs.inodesFree = 5%
      value: 5%
    - name: ocp4-var-kubelet-evictionhard-nodefs-available
      rationale: evictionHard.nodefs.available = 10%
      value: 10%
    - name: ocp4-var-kubelet-evictionhard-memory-available
      rationale: evictionHard.memory.available = 100Mi
      value: 100Mi
    # evictionSoft
    - name: ocp4-var-kubelet-evictionsoft-imagefs-available
      rationale: evictionSoft.imagefs.available = 15%
      value: 15%
    - name: ocp4-var-kubelet-evictionsoft-nodefs-inodesfree
      rationale: evictionSoft.nodefs.inodesFree = 5%
      value: 5%
    - name: ocp4-var-kubelet-evictionsoft-nodefs-available
      rationale: evictionSoft.nodefs.available = 10%
      value: 10%
    - name: ocp4-var-kubelet-evictionsoft-memory-available
      rationale: evictionSoft.memory.available = 100Mi
      value: 100Mi
  disableRules:
    - name: ocp4-kubelet-eviction-thresholds-set-hard-imagefs-inodesfree
      rationale: The customer's kubelet doesn't seem to set this
    - name: ocp4-kubelet-eviction-thresholds-set-soft-imagefs-inodesfree
      rationale: The customer's kubelet doesn't seem to set this
As you can see, the tailoring file sets the variables to the values that the customer is using. I then created the TailoredProfile out of the file:
$ oc apply -f tailoring/tailored-cis-node-bz.yaml
which created a tailored profile:
$ oc get tailoredprofile
NAME STATE
cis-node-bz READY
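As an aside, if you need to find out which variables exist and what values they accept before writing such a tailoring file, the operator exposes them as Variable objects. A minimal sketch, using the variable names from this bug and run in the openshift-compliance namespace:

$ oc get variables.compliance | grep kubelet-eviction
$ oc describe variables.compliance ocp4-var-kubelet-evictionhard-imagefs-available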
And then I used that in a ScanSettingBinding:
$ cat tailoring/bindings-cis-node-bz.yaml
apiVersion: compliance.openshift.io/v1alpha1
kind: ScanSettingBinding
metadata:
  name: cis-bz
  namespace: openshift-compliance
profiles:
  # Node checks
  - name: cis-node-bz
    kind: TailoredProfile
    apiGroup: compliance.openshift.io/v1alpha1
  # Cluster checks
  - name: ocp4-cis
    kind: Profile
    apiGroup: compliance.openshift.io/v1alpha1
settingsRef:
  name: default
  kind: ScanSetting
  apiGroup: compliance.openshift.io/v1alpha1
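To exercise it, the binding is applied and the resulting suite watched until it finishes; a short sketch using the file name shown above (the same commands appear in the verification later in this bug):

$ oc apply -f tailoring/bindings-cis-node-bz.yaml
$ oc -n openshift-compliance get suite -w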
With this binding in place, all the checks for the kubelet parameters passed against the customer's kubelet config:
$ oc get ccr -l 'compliance.openshift.io/scan-name=cis-node-bz-worker' | grep kubelet
cis-node-bz-worker-file-groupowner-kubelet-conf PASS medium
cis-node-bz-worker-file-owner-kubelet-conf PASS medium
cis-node-bz-worker-file-permissions-kubelet-conf PASS medium
cis-node-bz-worker-kubelet-anonymous-auth PASS medium
cis-node-bz-worker-kubelet-authorization-mode PASS medium
cis-node-bz-worker-kubelet-configure-client-ca PASS medium
cis-node-bz-worker-kubelet-configure-event-creation PASS medium
cis-node-bz-worker-kubelet-configure-tls-cipher-suites PASS medium
cis-node-bz-worker-kubelet-disable-hostname-override PASS medium
cis-node-bz-worker-kubelet-enable-cert-rotation PASS medium
cis-node-bz-worker-kubelet-enable-client-cert-rotation PASS medium
cis-node-bz-worker-kubelet-enable-iptables-util-chains PASS medium
cis-node-bz-worker-kubelet-enable-protect-kernel-defaults FAIL medium
cis-node-bz-worker-kubelet-enable-server-cert-rotation PASS medium
cis-node-bz-worker-kubelet-enable-streaming-connections PASS medium
cis-node-bz-worker-kubelet-eviction-thresholds-set-hard-imagefs-available PASS medium
cis-node-bz-worker-kubelet-eviction-thresholds-set-hard-memory-available PASS medium
cis-node-bz-worker-kubelet-eviction-thresholds-set-hard-nodefs-available PASS medium
cis-node-bz-worker-kubelet-eviction-thresholds-set-hard-nodefs-inodesfree PASS medium
cis-node-bz-worker-kubelet-eviction-thresholds-set-soft-imagefs-available PASS medium
cis-node-bz-worker-kubelet-eviction-thresholds-set-soft-memory-available PASS medium
cis-node-bz-worker-kubelet-eviction-thresholds-set-soft-nodefs-available PASS medium
cis-node-bz-worker-kubelet-eviction-thresholds-set-soft-nodefs-inodesfree PASS medium
The only exception is the protect-kernel-defaults check; if the customer doesn't need to comply with that recommendation, they can also disable that rule in the tailoring.
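If they go that route, a sketch of the additional disableRules entry follows. The rule name here is my assumption derived from the check name above, so confirm the exact name first, for example by listing the operator's Rule objects with oc get rules.compliance.openshift.io | grep protect-kernel-defaults:

  disableRules:
    # ...the two existing entries from the tailoring above...
    - name: ocp4-kubelet-enable-protect-kernel-defaults
      rationale: The customer does not need to comply with this recommendation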
Some additional notes:

1) When I was playing with the tailoring, I realized that changing the TailoredProfile CR is not reflected in the ConfigMap that the Compliance Operator generates from the TailoredProfile for use by the OpenSCAP scanner. This is a bug and I'm going to fix it. In the meantime, deleting the TailoredProfile and recreating it is a workaround. Tracked by https://issues.redhat.com/browse/CMP-1008

2) I realize and acknowledge that the documentation around the variables and the tuning is very poor. Feel free to file a docs bug (and CC me so I can provide some concrete information there and our docs writers don't have to dive too deep). I also filed two follow-up tickets, https://issues.redhat.com/browse/CMP-1006 and https://issues.redhat.com/browse/CMP-1007, to make the variables easier to expose and consume. In the meantime, I'm afraid the best way is to run "oc get variables.compliance" and then "oc describe" the variables to see what is used and what values are available. https://github.com/openshift/compliance-operator/blob/master/doc/tutorials/workshop/content/exercises/05-tailoring-profiles.md#disable-rules-in-a-profile might also help somewhat.

3) The protect-kernel-defaults one is quite tricky. Just setting the variable to true would crash the kubelet with the default sysctls. The sysctls can be set with a MachineConfig file found here: https://github.com/ComplianceAsCode/content/blob/master/ocp-resources/kubelet-sysctls-mc.yaml Nonetheless, I would recommend first testing things out on a test cluster, as the kubelet not coming up in production can be bad.

One more note: we're also tracking setting the kubelet config in a nicer manner in https://issues.redhat.com/browse/CMP-912

[Bug_Verification] This looks good. It checks the configured value in the kubelet threshold checks for all kubelet-eviction rules.
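To make note 1 concrete, the workaround amounts to recreating the TailoredProfile so the operator regenerates the ConfigMap; a sketch using the file and object names from this bug:

$ oc -n openshift-compliance delete tailoredprofile cis-node-bz
$ oc -n openshift-compliance apply -f tailoring/tailored-cis-node-bz.yaml
$ oc -n openshift-compliance get tailoredprofile cis-node-bz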
Verified on: 4.8.0-0.nightly-2021-08-23-234834 + compliance-operator.v0.1.39

$ oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.8.0-0.nightly-2021-08-23-234834   True        False         6h15m   Cluster version is 4.8.0-0.nightly-2021-08-23-234834

$ oc get csv
NAME                              DISPLAY                            VERSION    REPLACES   PHASE
compliance-operator.v0.1.39       Compliance Operator                0.1.39                Succeeded
elasticsearch-operator.5.1.1-42   OpenShift Elasticsearch Operator   5.1.1-42              Succeeded

$ oc get machineconfigpool --show-labels
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE     LABELS
master   rendered-master-bd244583a02138b1ea2968089313c2b0   True      False      False      3              3                   3                     0                      5h29m   machineconfiguration.openshift.io/mco-built-in=,operator.machineconfiguration.openshift.io/required-for-upgrade=,pools.operator.machineconfiguration.openshift.io/master=
worker   rendered-worker-0eaa90354bf63581601be7b39a0ba89c   True      False      False      3              3                   3                     0                      5h29m   machineconfiguration.openshift.io/mco-built-in=,pools.operator.machineconfiguration.openshift.io/worker=

$ oc get nodes
NAME                                         STATUS   ROLES    AGE     VERSION
ip-10-0-135-188.us-east-2.compute.internal   Ready    worker   5h40m   v1.21.1+9807387
ip-10-0-148-205.us-east-2.compute.internal   Ready    master   5h46m   v1.21.1+9807387
ip-10-0-177-244.us-east-2.compute.internal   Ready    master   5h46m   v1.21.1+9807387
ip-10-0-181-138.us-east-2.compute.internal   Ready    worker   5h38m   v1.21.1+9807387
ip-10-0-204-255.us-east-2.compute.internal   Ready    worker   5h40m   v1.21.1+9807387
ip-10-0-211-145.us-east-2.compute.internal   Ready    master   5h46m   v1.21.1+9807387

$ oc label machineconfigpool worker cis-hardening=true
machineconfigpool.machineconfiguration.openshift.io/worker labeled

$ oc get machineconfigpool --show-labels
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE     LABELS
master   rendered-master-bd244583a02138b1ea2968089313c2b0   True      False      False      3              3                   3                     0                      5h50m   machineconfiguration.openshift.io/mco-built-in=,operator.machineconfiguration.openshift.io/required-for-upgrade=,pools.operator.machineconfiguration.openshift.io/master=
worker   rendered-worker-0eaa90354bf63581601be7b39a0ba89c   True      False      False      3              3                   3                     0                      5h50m   cis-hardening=true,machineconfiguration.openshift.io/mco-built-in=,pools.operator.machineconfiguration.openshift.io/worker=

$ oc project default
Now using project "default" on server "https://api.pdhamdhe-2348.qe.devcluster.openshift.com:6443".
$ cat worker-kube-config.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: cis-hardening
spec:
  machineConfigPoolSelector:
    matchLabels:
      cis-hardening: "true"
  kubeletConfig:
    eventRecordQPS: 5
    tlsCipherSuites:
    - TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
    - TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
    - TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256
    - TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256
    protectKernelDefaults: false
    evictionSoftGracePeriod:
      memory.available: "5m"
      nodefs.available: "5m"
      nodefs.inodesFree: "5m"
      imagefs.available: "5m"
    evictionHard:
      memory.available: "100Mi"
      nodefs.available: "10%"
      nodefs.inodesFree: "5%"
      imagefs.available: "15%"
    evictionSoft:
      memory.available: "100Mi"
      nodefs.available: "10%"
      nodefs.inodesFree: "5%"
      imagefs.available: "15%"

$ oc create -f worker-kube-config.yaml
kubeletconfig.machineconfiguration.openshift.io/cis-hardening created

$ oc get mcp -w
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master   rendered-master-bd244583a02138b1ea2968089313c2b0   True      False      False      3              3                   3                     0                      5h55m
worker   rendered-worker-0eaa90354bf63581601be7b39a0ba89c   False     True       False      3              1                   1                     0                      5h55m
worker   rendered-worker-0eaa90354bf63581601be7b39a0ba89c   False     True       False      3              2                   2                     0                      5h56m
worker   rendered-worker-0eaa90354bf63581601be7b39a0ba89c   False     True       False      3              2                   2                     0                      5h56m
worker   rendered-worker-9f03acf618ae1c8d022a0b8cdb2bc204   True      False      False      3              3                   3                     0                      5h58m

$ oc get mcp
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
master   rendered-master-bd244583a02138b1ea2968089313c2b0   True      False      False      3              3                   3                     0                      6h3m
worker   rendered-worker-9f03acf618ae1c8d022a0b8cdb2bc204   True      False      False      3              3                   3                     0                      6h3m

$ oc get nodes
NAME                                         STATUS   ROLES    AGE     VERSION
ip-10-0-135-188.us-east-2.compute.internal   Ready    worker   5h58m   v1.21.1+9807387
ip-10-0-148-205.us-east-2.compute.internal   Ready    master   6h4m    v1.21.1+9807387
ip-10-0-177-244.us-east-2.compute.internal   Ready    master   6h4m    v1.21.1+9807387
ip-10-0-181-138.us-east-2.compute.internal   Ready    worker   5h56m   v1.21.1+9807387
ip-10-0-204-255.us-east-2.compute.internal   Ready    worker   5h58m   v1.21.1+9807387
ip-10-0-211-145.us-east-2.compute.internal   Ready    master   6h4m    v1.21.1+9807387

$ oc debug -q node/ip-10-0-204-255.us-east-2.compute.internal -- jq -r '.evictionHard."imagefs.available"' /host/etc/kubernetes/kubelet.conf
15%

$ oc project openshift-compliance
Now using project "openshift-compliance" on server "https://api.pdhamdhe-2348.qe.devcluster.openshift.com:6443".
$ oc debug -q node/ip-10-0-204-255.us-east-2.compute.internal
sh-4.4# cat /host/etc/kubernetes/kubelet.conf
{
  "kind": "KubeletConfiguration",
  "apiVersion": "kubelet.config.k8s.io/v1beta1",
  "staticPodPath": "/etc/kubernetes/manifests",
  "syncFrequency": "0s",
  "fileCheckFrequency": "0s",
  "httpCheckFrequency": "0s",
  "tlsCipherSuites": [
    "TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384",
    "TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384",
    "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256",
    "TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256"
  ],
  "tlsMinVersion": "VersionTLS12",
  "rotateCertificates": true,
  "serverTLSBootstrap": true,
  "authentication": {
    "x509": { "clientCAFile": "/etc/kubernetes/kubelet-ca.crt" },
    "webhook": { "cacheTTL": "0s" },
    "anonymous": { "enabled": false }
  },
  "authorization": {
    "webhook": { "cacheAuthorizedTTL": "0s", "cacheUnauthorizedTTL": "0s" }
  },
  "eventRecordQPS": 5,
  "clusterDomain": "cluster.local",
  "clusterDNS": [ "172.30.0.10" ],
  "streamingConnectionIdleTimeout": "0s",
  "nodeStatusUpdateFrequency": "0s",
  "nodeStatusReportFrequency": "0s",
  "imageMinimumGCAge": "0s",
  "volumeStatsAggPeriod": "0s",
  "systemCgroups": "/system.slice",
  "cgroupRoot": "/",
  "cgroupDriver": "systemd",
  "cpuManagerReconcilePeriod": "0s",
  "runtimeRequestTimeout": "0s",
  "maxPods": 250,
  "kubeAPIQPS": 50,
  "kubeAPIBurst": 100,
  "serializeImagePulls": false,
  "evictionHard": {
    "imagefs.available": "15%",
    "memory.available": "100Mi",
    "nodefs.available": "10%",
    "nodefs.inodesFree": "5%"
  },
  "evictionSoft": {
    "imagefs.available": "15%",
    "memory.available": "100Mi",
    "nodefs.available": "10%",
    "nodefs.inodesFree": "5%"
  },
  "evictionSoftGracePeriod": {
    "imagefs.available": "5m",
    "memory.available": "5m",
    "nodefs.available": "5m",
    "nodefs.inodesFree": "5m"
  },
  "evictionPressureTransitionPeriod": "0s",
  "featureGates": {
    "APIPriorityAndFairness": true,
    "DownwardAPIHugePages": true,
    "LegacyNodeRoleBehavior": false,
    "NodeDisruptionExclusion": true,
    "RotateKubeletServerCertificate": true,
    "ServiceNodeExclusion": true,
    "SupportPodPidsLimit": true
  },
  "containerLogMaxSize": "50Mi",
  "systemReserved": { "ephemeral-storage": "1Gi" },
  "logging": {},
  "shutdownGracePeriod": "0s",
  "shutdownGracePeriodCriticalPods": "0s"
}
sh-4.4# exit

$ oc get pods
NAME                                            READY   STATUS    RESTARTS   AGE
compliance-operator-bb9f644cc-xwfnq             1/1     Running   1          4h14m
ocp4-openshift-compliance-pp-6d7c7db4bd-jwnnq   1/1     Running   0          4h13m
rhcos4-openshift-compliance-pp-c7b548bd-9hqvz   1/1     Running   0          4h13m

$ oc create -f - << EOF
> apiVersion: compliance.openshift.io/v1alpha1
> kind: TailoredProfile
> metadata:
>   name: cis-node-bz
> spec:
>   extends: ocp4-moderate-node
>   title: CIS node tailored for BZ-1972559
>   setValues:
>     # evictionHard
>     - name: ocp4-var-kubelet-evictionhard-imagefs-available
>       rationale: evictionHard.imagefs.available = 15%
>       value: 15%
>     - name: ocp4-var-kubelet-evictionhard-nodefs-inodesfree
>       rationale: evictionHard.nodefs.inodesFree = 5%
>       value: 5%
>     - name: ocp4-var-kubelet-evictionhard-nodefs-available
>       rationale: evictionHard.nodefs.available = 10%
>       value: 10%
>     - name: ocp4-var-kubelet-evictionhard-memory-available
>       rationale: evictionHard.memory.available = 100Mi
>       value: 100Mi
>     # evictionSoft
>     - name: ocp4-var-kubelet-evictionsoft-imagefs-available
>       rationale: evictionSoft.imagefs.available = 15%
>       value: 15%
>     - name: ocp4-var-kubelet-evictionsoft-nodefs-inodesfree
>       rationale: evictionSoft.nodefs.inodesFree = 5%
>       value: 5%
>     - name: ocp4-var-kubelet-evictionsoft-nodefs-available
>       rationale: evictionSoft.nodefs.available = 10%
>       value: 10%
>     - name: ocp4-var-kubelet-evictionsoft-memory-available
>       rationale: evictionSoft.memory.available = 100Mi
>       value: 100Mi
>   disableRules:
>     - name: ocp4-kubelet-eviction-thresholds-set-hard-imagefs-inodesfree
>       rationale: The customer's kubelet doesn't seem to set this
>     - name: ocp4-kubelet-eviction-thresholds-set-soft-imagefs-inodesfree
>       rationale: The customer's kubelet doesn't seem to set this
> EOF
tailoredprofile.compliance.openshift.io/cis-node-bz created

$ oc get TailoredProfile
NAME          STATE
cis-node-bz   READY

$ oc create -f - << EOF
> apiVersion: compliance.openshift.io/v1alpha1
> kind: ScanSettingBinding
> metadata:
>   name: cis-bz
>   namespace: openshift-compliance
> profiles:
>   # Node checks
>   - name: cis-node-bz
>     kind: TailoredProfile
>     apiGroup: compliance.openshift.io/v1alpha1
>   # Cluster checks
>   - name: ocp4-cis
>     kind: Profile
>     apiGroup: compliance.openshift.io/v1alpha1
> settingsRef:
>   name: default
>   kind: ScanSetting
>   apiGroup: compliance.openshift.io/v1alpha1
> EOF
scansettingbinding.compliance.openshift.io/cis-bz created

$ oc get suite -w
NAME     PHASE       RESULT
cis-bz   PENDING     NOT-AVAILABLE
cis-bz   LAUNCHING   NOT-AVAILABLE
cis-bz   LAUNCHING   NOT-AVAILABLE
cis-bz   LAUNCHING   NOT-AVAILABLE
cis-bz   RUNNING     NOT-AVAILABLE

$ oc get suite
NAME     PHASE   RESULT
cis-bz   DONE    NON-COMPLIANT

$ oc get pods
NAME                                                     READY   STATUS      RESTARTS   AGE
aggregator-pod-cis-node-bz-master                        0/1     Completed   0          78s
aggregator-pod-cis-node-bz-worker                        0/1     Completed   0          89s
aggregator-pod-ocp4-cis                                  0/1     Completed   0          78s
compliance-operator-bb9f644cc-xwfnq                      1/1     Running     1          4h19m
ocp4-cis-api-checks-pod                                  0/2     Completed   0          109s
ocp4-openshift-compliance-pp-6d7c7db4bd-jwnnq            1/1     Running     0          4h17m
openscap-pod-03609e5ad2bea14ed075cccdd5ec9dc470c9e40f    0/2     Completed   0          116s
openscap-pod-64df8668170b580a0de1163d34c8c127ae9e4891    0/2     Completed   0          116s
openscap-pod-6d6aa5225203ba17dda78abbf29ffdc912653a70    0/2     Completed   0          116s
openscap-pod-9b5f85e315dce20ab418c80fb2aef3bd16331c04    0/2     Completed   0          108s
openscap-pod-d1bca77b2de582fe7acc13256246eba50ae9b6e5    0/2     Completed   0          108s
openscap-pod-eb8b23263548944696ea25c82e9b3fa33da451b9    0/2     Completed   0          109s
rhcos4-openshift-compliance-pp-c7b548bd-9hqvz            1/1     Running     0          4h17m

$ oc get ccr -l 'compliance.openshift.io/scan-name=cis-node-bz-worker' | grep kubelet-eviction
cis-node-bz-worker-kubelet-eviction-thresholds-set-hard-imagefs-available    PASS   medium
cis-node-bz-worker-kubelet-eviction-thresholds-set-hard-memory-available    PASS   medium
cis-node-bz-worker-kubelet-eviction-thresholds-set-hard-nodefs-available    PASS   medium
cis-node-bz-worker-kubelet-eviction-thresholds-set-hard-nodefs-inodesfree   PASS   medium
cis-node-bz-worker-kubelet-eviction-thresholds-set-soft-imagefs-available   PASS   medium
cis-node-bz-worker-kubelet-eviction-thresholds-set-soft-memory-available    PASS   medium
cis-node-bz-worker-kubelet-eviction-thresholds-set-soft-nodefs-available    PASS   medium
cis-node-bz-worker-kubelet-eviction-thresholds-set-soft-nodefs-inodesfree   PASS   medium

$ for rule in $(oc get ccr -l 'compliance.openshift.io/scan-name=cis-node-bz-worker' | grep kubelet-eviction | awk '{print $1}'); do echo -e "\n\n >>>>> Print instruction for" $rule; oc get ccr $rule -ojsonpath={.instructions}; done

>>>>> Print instruction for cis-node-bz-worker-kubelet-eviction-thresholds-set-hard-imagefs-available
Run the following command on the kubelet node(s):
$ oc debug -q node/$NODE -- jq -r '.evictionHard."imagefs.available"' /host/etc/kubernetes/kubelet.conf
and make sure it outputs a value.
>>>>> Print instruction for cis-node-bz-worker-kubelet-eviction-thresholds-set-hard-memory-available
Run the following command on the kubelet node(s):
$ oc debug -q node/$NODE -- jq -r '.evictionHard."memory.available"' /host/etc/kubernetes/kubelet.conf
and make sure it outputs a value.

>>>>> Print instruction for cis-node-bz-worker-kubelet-eviction-thresholds-set-hard-nodefs-available
Run the following command on the kubelet node(s):
$ oc debug -q node/$NODE -- jq -r '.evictionHard."nodefs.available"' /host/etc/kubernetes/kubelet.conf
and make sure it outputs a value.

>>>>> Print instruction for cis-node-bz-worker-kubelet-eviction-thresholds-set-hard-nodefs-inodesfree
Run the following command on the kubelet node(s):
$ oc debug -q node/$NODE -- jq -r '.evictionHard."nodefs.inodesFree"' /host/etc/kubernetes/kubelet.conf
and make sure it outputs a value.

>>>>> Print instruction for cis-node-bz-worker-kubelet-eviction-thresholds-set-soft-imagefs-available
Run the following command on the kubelet node(s):
$ oc debug -q node/$NODE -- jq -r '.evictionSoft."imagefs.available"' /host/etc/kubernetes/kubelet.conf
and make sure it outputs a value.

>>>>> Print instruction for cis-node-bz-worker-kubelet-eviction-thresholds-set-soft-memory-available
Run the following command on the kubelet node(s):
$ oc debug -q node/$NODE -- jq -r '.evictionSoft."memory.available"' /host/etc/kubernetes/kubelet.conf
and make sure it outputs a value.

>>>>> Print instruction for cis-node-bz-worker-kubelet-eviction-thresholds-set-soft-nodefs-available
Run the following command on the kubelet node(s):
$ oc debug -q node/$NODE -- jq -r '.evictionSoft."nodefs.available"' /host/etc/kubernetes/kubelet.conf
and make sure it outputs a value.

>>>>> Print instruction for cis-node-bz-worker-kubelet-eviction-thresholds-set-soft-nodefs-inodesfree
Run the following command on the kubelet node(s):
$ oc debug -q node/$NODE -- jq -r '.evictionSoft."nodefs.inodesFree"' /host/etc/kubernetes/kubelet.conf
and make sure it outputs a value.
$ oc get nodes
NAME                                         STATUS   ROLES    AGE     VERSION
ip-10-0-135-188.us-east-2.compute.internal   Ready    worker   6h20m   v1.21.1+9807387
ip-10-0-148-205.us-east-2.compute.internal   Ready    master   6h26m   v1.21.1+9807387
ip-10-0-177-244.us-east-2.compute.internal   Ready    master   6h26m   v1.21.1+9807387
ip-10-0-181-138.us-east-2.compute.internal   Ready    worker   6h19m   v1.21.1+9807387
ip-10-0-204-255.us-east-2.compute.internal   Ready    worker   6h21m   v1.21.1+9807387
ip-10-0-211-145.us-east-2.compute.internal   Ready    master   6h26m   v1.21.1+9807387

$ for NODE in $(oc get node -lnode-role.kubernetes.io/worker= --no-headers | awk '{print $1}'); do echo -n "$NODE "; oc debug -q node/$NODE -- jq -r '.evictionHard."imagefs.available"' /host/etc/kubernetes/kubelet.conf; done
ip-10-0-135-188.us-east-2.compute.internal 15%
ip-10-0-181-138.us-east-2.compute.internal 15%
ip-10-0-204-255.us-east-2.compute.internal 15%

$ for NODE in $(oc get node -lnode-role.kubernetes.io/worker= --no-headers | awk '{print $1}'); do echo -n "$NODE "; oc debug -q node/$NODE -- jq -r '.evictionHard."nodefs.inodesFree"' /host/etc/kubernetes/kubelet.conf; done
ip-10-0-135-188.us-east-2.compute.internal 5%
ip-10-0-181-138.us-east-2.compute.internal 5%
ip-10-0-204-255.us-east-2.compute.internal 5%

$ for NODE in $(oc get node -lnode-role.kubernetes.io/worker= --no-headers | awk '{print $1}'); do echo -n "$NODE "; oc debug -q node/$NODE -- jq -r '.evictionSoft."memory.available"' /host/etc/kubernetes/kubelet.conf; done
ip-10-0-135-188.us-east-2.compute.internal 100Mi
ip-10-0-181-138.us-east-2.compute.internal 100Mi
ip-10-0-204-255.us-east-2.compute.internal 100Mi

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (OpenShift Compliance Operator bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2021:3214

Hi Shailendra, you attached an active case to a bug that was resolved almost a year ago, without any additional details. This is not likely to trigger any case investigation from our end. If the customer is running a recent version of the operator, please open a new bug rather than attaching a case to a closed bug. In general, this should not happen except after manual changes to the nodes (ssh + vi and such). The ComplianceCheckResult CRs should be labeled with the nodes that differ from the rest; this is a good way to start an investigation.

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 365 days
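As a rough illustration of that investigation approach (a sketch; the exact label keys used to mark inconsistent or node-specific results can vary between operator versions, so check them with --show-labels first):

# List the node-scan check results together with their labels and look for
# results flagged as inconsistent or carrying node-specific labels
$ oc -n openshift-compliance get ccr -l 'compliance.openshift.io/scan-name=ocp4-cis-node-master' --show-labels
# Then describe a suspicious result to see the details recorded for it
$ oc -n openshift-compliance describe ccr ocp4-cis-node-master-kubelet-eviction-thresholds-set-hard-imagefs-available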
Created attachment 1791490 [details]
kubelet config

Description of problem:
compliancecheckresults fails with inconsistent results

Version-Release number of selected component (if applicable):
$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.6.27    True        False         34d     Cluster version is 4.6.27

How reproducible:
$ oc -n openshift-compliance get compliancescans
NAME                   PHASE   RESULT
ocp4-cis               DONE    NON-COMPLIANT
ocp4-cis-node-master   DONE    NON-COMPLIANT
ocp4-cis-node-worker   DONE    NON-COMPLIANT

$ oc -n openshift-compliance describe checkresult ocp4-cis-node-master-kubelet-eviction-thresholds-set-hard-imagefs-available
FAILS, but imagefs hard/soft evictions are defined on all nodes:

$ for NODE in $(oc get node --no-headers | awk '{print $1}'); do echo -n "$NODE "; oc debug -q node/$NODE -- jq -r '.evictionHard."imagefs.available"' /host/etc/kubernetes/kubelet.conf; done
ip-10-0-152-208.us-west-2.compute.internal 15%
ip-10-0-159-70.us-west-2.compute.internal 15%
ip-10-0-180-27.us-west-2.compute.internal 15%
ip-10-0-191-133.us-west-2.compute.internal 15%
ip-10-0-214-42.us-west-2.compute.internal 15%
ip-10-0-220-243.us-west-2.compute.internal 15%

and 'oc get kubeletconfig -o yaml' also confirms it.

$ oc get compliancecheckresults
ocp4-cis-node-master-kubelet-eviction-thresholds-set-hard-imagefs-available    FAIL   medium
ocp4-cis-node-master-kubelet-eviction-thresholds-set-hard-imagefs-inodesfree   FAIL   medium
ocp4-cis-node-master-kubelet-eviction-thresholds-set-hard-memory-available     FAIL   medium
ocp4-cis-node-master-kubelet-eviction-thresholds-set-hard-nodefs-available     FAIL   medium
ocp4-cis-node-master-kubelet-eviction-thresholds-set-hard-nodefs-inodesfree    FAIL   medium
ocp4-cis-node-master-kubelet-eviction-thresholds-set-soft-imagefs-available    PASS   medium
ocp4-cis-node-master-kubelet-eviction-thresholds-set-soft-imagefs-inodesfree   FAIL   medium

$ oc get complianceremediations
NAME                                             STATE
ocp4-cis-api-server-encryption-provider-cipher   NotApplied
ocp4-cis-api-server-encryption-provider-config   NotApplied

Actual results:
The attached kubeletconfig.yaml sets:

evictionSoftGracePeriod:
  memory.available: "5m"
  nodefs.available: "5m"
  nodefs.inodesFree: "5m"
  imagefs.available: "5m"
evictionHard:
  memory.available: "100Mi"
  nodefs.available: "10%"
  nodefs.inodesFree: "5%"
  imagefs.available: "15%"
evictionSoft:
  memory.available: "100Mi"
  nodefs.available: "10%"
  nodefs.inodesFree: "5%"
  imagefs.available: "15%"

yet the checks come back FAIL FAIL PASS FAIL, which clearly shows the bug, as there should be 3x PASS and not just one.

Expected results:
should be 3x PASS

Additional info:
The customer confirms they can see this bug even with a fresh installation.
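For completeness, the per-key jq checks shown in this bug can be combined into one loop that dumps every eviction threshold from every node's rendered kubelet.conf (a sketch built from the commands above; adjust the node selector as needed):

# Print all hard and soft eviction thresholds from each node's kubelet.conf
$ for NODE in $(oc get node --no-headers | awk '{print $1}'); do echo "== $NODE =="; oc debug -q node/$NODE -- jq -r '{evictionHard, evictionSoft}' /host/etc/kubernetes/kubelet.conf; done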