Bug 1957374
| Summary: | mcddrainerr doesn't list specific pod | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Kirsten Garrison <kgarriso> |
| Component: | Machine Config Operator | Assignee: | Kirsten Garrison <kgarriso> |
| Status: | CLOSED ERRATA | QA Contact: | Michael Nguyen <mnguyen> |
| Severity: | low | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 4.7 | CC: | jerzhang, rioliu, wking |
| Target Milestone: | --- | Keywords: | Reopened |
| Target Release: | 4.8.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | No Doc Update |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2021-07-27 23:06:36 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |
Description
Kirsten Garrison
2021-05-05 17:17:25 UTC
Created attachment 1780869 [details]
Verification
Verified on 4.8.0-0.nightly-2021-05-07-075528. Triggered the mcd_drain_err with the steps below, then looked at Prometheus.
$ oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.8.0-0.nightly-2021-05-07-075528 True False 67m Cluster version is 4.8.0-0.nightly-2021-05-07-075528
$ cd openshift/
$ cat pdb.yaml
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: dontevict
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: dontevict
$ oc create -f pdb.yaml
poddisruptionbudget.policy/dontevict created
$ oc get pdb
NAME MIN AVAILABLE MAX UNAVAILABLE ALLOWED DISRUPTIONS AGE
dontevict 1 N/A 0 7s
$ oc get nodes
NAME STATUS ROLES AGE VERSION
ci-ln-9wr6012-f76d1-z7bjv-master-0 Ready master 89m v1.21.0-rc.0+291e731
ci-ln-9wr6012-f76d1-z7bjv-master-1 Ready master 89m v1.21.0-rc.0+291e731
ci-ln-9wr6012-f76d1-z7bjv-master-2 Ready master 89m v1.21.0-rc.0+291e731
ci-ln-9wr6012-f76d1-z7bjv-worker-b-gwn8c Ready worker 80m v1.21.0-rc.0+291e731
ci-ln-9wr6012-f76d1-z7bjv-worker-c-c2ndb Ready worker 80m v1.21.0-rc.0+291e731
ci-ln-9wr6012-f76d1-z7bjv-worker-d-2sc2x Ready worker 80m v1.21.0-rc.0+291e731
$ oc run --restart=Never --labels app=dontevict --overrides='{ "spec": { "nodeSelector": { "kubernetes.io/hostname": "ci-ln-9wr6012-f76d1-z7bjv-worker-b-gwn8c"} } }' --image=docker.io/busybox dont-evict-this-pod -- sleep 1h
pod/dont-evict-this-pod created
$ oc get pods
NAME READY STATUS RESTARTS AGE
dont-evict-this-pod 1/1 Running 0 7s
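The PDB reports 0 allowed disruptions before the pod exists, and it will keep reporting 0 once the single matching pod is running, which is what blocks the drain later on. A minimal sketch of that arithmetic, assuming the usual Kubernetes rule that allowed disruptions equal the number of healthy matching pods minus `minAvailable`, floored at zero (the function name `allowed_disruptions` is hypothetical):

```shell
# Hypothetical helper: allowed disruptions for a PDB with an integer minAvailable.
# Assumption: allowed = max(0, healthy - minAvailable).
allowed_disruptions() {
  healthy=$1
  min_available=$2
  allowed=$((healthy - min_available))
  if [ "$allowed" -lt 0 ]; then
    allowed=0
  fi
  echo "$allowed"
}

allowed_disruptions 0 1   # before the pod is created
allowed_disruptions 1 1   # with dont-evict-this-pod running
allowed_disruptions 2 1   # a second matching pod would permit one eviction
```

Only the third case yields a non-zero budget, so with one pod and `minAvailable: 1` every eviction attempt is expected to be refused.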
$ cat file-ig3.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: test-file
spec:
  config:
    ignition:
      version: 3.1.0
    storage:
      files:
      - contents:
          source: data:text/plain;charset=utf;base64,c2VydmVyIGZvby5leGFtcGxlLm5ldCBtYXhkZWxheSAwLjQgb2ZmbGluZQpzZXJ2ZXIgYmFyLmV4YW1wbGUubmV0IG1heGRlbGF5IDAuNCBvZmZsaW5lCnNlcnZlciBiYXouZXhhbXBsZS5uZXQgbWF4ZGVsYXkgMC40IG9mZmxpbmUK
        filesystem: root
        mode: 0644
        path: /etc/test
$ oc create -f file-ig3.yaml
machineconfig.machineconfiguration.openshift.io/test-file created
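The file contents in this MachineConfig are carried as a base64 data URL, so it is not obvious what ends up in /etc/test. A small sketch that decodes the payload locally; the string is copied from the `source:` field above (the part after `base64,`), and the variable names are illustrative only:

```shell
# Payload copied from the source: field of file-ig3.yaml (text after "base64,").
payload='c2VydmVyIGZvby5leGFtcGxlLm5ldCBtYXhkZWxheSAwLjQgb2ZmbGluZQpzZXJ2ZXIgYmFyLmV4YW1wbGUubmV0IG1heGRlbGF5IDAuNCBvZmZsaW5lCnNlcnZlciBiYXouZXhhbXBsZS5uZXQgbWF4ZGVsYXkgMC40IG9mZmxpbmUK'

# Decode it to see the text the MachineConfig asks to be written to /etc/test.
decoded=$(printf '%s' "$payload" | base64 --decode)
printf '%s\n' "$decoded"
```

The decoded text is three `server ... maxdelay 0.4 offline` lines for foo, bar, and baz under example.net. The content itself does not matter for this test; any new file on the worker pool is enough to make the daemon start a drain.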
$ oc get nodes
NAME STATUS ROLES AGE VERSION
ci-ln-9wr6012-f76d1-z7bjv-master-0 Ready master 100m v1.21.0-rc.0+291e731
ci-ln-9wr6012-f76d1-z7bjv-master-1 Ready master 100m v1.21.0-rc.0+291e731
ci-ln-9wr6012-f76d1-z7bjv-master-2 Ready master 100m v1.21.0-rc.0+291e731
ci-ln-9wr6012-f76d1-z7bjv-worker-b-gwn8c Ready,SchedulingDisabled worker 91m v1.21.0-rc.0+291e731
ci-ln-9wr6012-f76d1-z7bjv-worker-c-c2ndb Ready worker 91m v1.21.0-rc.0+291e731
ci-ln-9wr6012-f76d1-z7bjv-worker-d-2sc2x Ready worker 91m v1.21.0-rc.0+291e731
$ oc get mcp
NAME CONFIG UPDATED UPDATING DEGRADED MACHINECOUNT READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE
master rendered-master-617f5a3d5cb1e6c6b2e34b8a7294d683 True False False 3 3 3 0 98m
worker rendered-worker-15776fcc21358742a1f4cb79346b7d50 False True False 3 1 1 0 98m
$ oc -n openshift-monitoring get routes
NAME HOST/PORT PATH SERVICES PORT TERMINATION WILDCARD
alertmanager-main alertmanager-main-openshift-monitoring.apps.ci-ln-9wr6012-f76d1.origin-ci-int-gce.dev.openshift.com alertmanager-main web reencrypt/Redirect None
grafana grafana-openshift-monitoring.apps.ci-ln-9wr6012-f76d1.origin-ci-int-gce.dev.openshift.com grafana https reencrypt/Redirect None
prometheus-k8s prometheus-k8s-openshift-monitoring.apps.ci-ln-9wr6012-f76d1.origin-ci-int-gce.dev.openshift.com prometheus-k8s web reencrypt/Redirect None
thanos-querier thanos-querier-openshift-monitoring.apps.ci-ln-9wr6012-f76d1.origin-ci-int-gce.dev.openshift.com thanos-querier web reencrypt/Redirect None
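The verification note says the metric was checked in Prometheus but does not show how. One possible way to query it from a terminal is sketched below, assuming the thanos-querier route exposes the standard Prometheus HTTP API at `/api/v1/query` and accepts the logged-in user's bearer token. The helper names `prom_query_url` and `prom_query` are hypothetical, and only the URL builder is actually run here because the query itself needs a live cluster session:

```shell
# Route host taken from the `oc -n openshift-monitoring get routes` output above.
THANOS_HOST='thanos-querier-openshift-monitoring.apps.ci-ln-9wr6012-f76d1.origin-ci-int-gce.dev.openshift.com'

# Hypothetical helper: build the instant-query endpoint URL for a given host.
prom_query_url() {
  printf 'https://%s/api/v1/query\n' "$1"
}

# Hypothetical helper: run an instant query ($2) against host ($1) using the
# current user's token. Assumes `oc` is logged in; defined here but not invoked.
prom_query() {
  curl -skG -H "Authorization: Bearer $(oc whoami -t)" \
    --data-urlencode "query=$2" "$(prom_query_url "$1")"
}

prom_query_url "$THANOS_HOST"
```

With a logged-in session, `prom_query "$THANOS_HOST" 'mcd_drain_err'` should return the metric as JSON. The label set depends on the MCO build, so no sample response is shown.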
$ oc -n openshift-console get routes
NAME HOST/PORT PATH SERVICES PORT TERMINATION WILDCARD
console console-openshift-console.apps.ci-ln-9wr6012-f76d1.origin-ci-int-gce.dev.openshift.com console https reencrypt/Redirect None
downloads downloads-openshift-console.apps.ci-ln-9wr6012-f76d1.origin-ci-int-gce.dev.openshift.com downloads http edge/Redirect None
$ oc get pods -A --field-selector spec.nodeName=ci-ln-9wr6012-f76d1-z7bjv-worker-b-gwn8c
NAMESPACE NAME READY STATUS RESTARTS AGE
default dont-evict-this-pod 1/1 Running 0 13m
openshift-cluster-csi-drivers gcp-pd-csi-driver-node-l9sm4 3/3 Running 0 94m
openshift-cluster-node-tuning-operator tuned-qt6d5 1/1 Running 0 94m
openshift-dns dns-default-hl5bn 2/2 Running 0 93m
openshift-dns node-resolver-v7vdx 1/1 Running 0 94m
openshift-image-registry node-ca-r4mnv 1/1 Running 0 94m
openshift-ingress-canary ingress-canary-knnbf 1/1 Running 0 93m
openshift-machine-config-operator machine-config-daemon-zgc8w 2/2 Running 0 94m
openshift-monitoring node-exporter-qkzkz 2/2 Running 0 94m
openshift-multus multus-pv4px 1/1 Running 0 94m
openshift-multus network-metrics-daemon-zk2vz 2/2 Running 0 94m
openshift-network-diagnostics network-check-target-wkr6m 1/1 Running 0 94m
openshift-sdn sdn-dnb8q 2/2 Running 0 94m
$ oc -n openshift-machine-config-operator logs machine-config-daemon-zgc8w -c machine-config-daemon
E0507 18:59:57.986704 1970 daemon.go:330] WARNING: deleting Pods not managed by ReplicationController, ReplicaSet, Job, DaemonSet or StatefulSet: default/dont-evict-this-pod; ignoring DaemonSet-managed Pods: openshift-cluster-csi-drivers/gcp-pd-csi-driver-node-l9sm4, openshift-cluster-node-tuning-operator/tuned-qt6d5, openshift-dns/dns-default-hl5bn, openshift-dns/node-resolver-v7vdx, openshift-image-registry/node-ca-r4mnv, openshift-ingress-canary/ingress-canary-knnbf, openshift-machine-config-operator/machine-config-daemon-zgc8w, openshift-monitoring/node-exporter-qkzkz, openshift-multus/multus-pv4px, openshift-multus/network-metrics-daemon-zk2vz, openshift-network-diagnostics/network-check-target-wkr6m, openshift-sdn/sdn-dnb8q
I0507 18:59:57.992498 1970 daemon.go:330] evicting pod default/dont-evict-this-pod
E0507 18:59:58.001489 1970 daemon.go:330] error when evicting pods/"dont-evict-this-pod" -n "default" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
I0507 19:00:03.004519 1970 daemon.go:330] evicting pod default/dont-evict-this-pod
E0507 19:00:03.012675 1970 daemon.go:330] error when evicting pods/"dont-evict-this-pod" -n "default" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
I0507 19:00:08.013652 1970 daemon.go:330] evicting pod default/dont-evict-this-pod
E0507 19:00:08.023949 1970 daemon.go:330] error when evicting pods/"dont-evict-this-pod" -n "default" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
I0507 19:00:13.027718 1970 daemon.go:330] evicting pod default/dont-evict-this-pod
E0507 19:00:13.038685 1970 daemon.go:330] error when evicting pods/"dont-evict-this-pod" -n "default" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
I0507 19:00:18.042846 1970 daemon.go:330] evicting pod default/dont-evict-this-pod
E0507 19:00:18.054457 1970 daemon.go:330] error when evicting pods/"dont-evict-this-pod" -n "default" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
I0507 19:00:23.055472 1970 daemon.go:330] evicting pod default/dont-evict-this-pod
E0507 19:00:23.067515 1970 daemon.go:330] error when evicting pods/"dont-evict-this-pod" -n "default" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
I0507 19:00:28.071655 1970 daemon.go:330] evicting pod default/dont-evict-this-pod
E0507 19:00:28.081238 1970 daemon.go:330] error when evicting pods/"dont-evict-this-pod" -n "default" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
I0507 19:00:33.082191 1970 daemon.go:330] evicting pod default/dont-evict-this-pod
E0507 19:00:33.092854 1970 daemon.go:330] error when evicting pods/"dont-evict-this-pod" -n "default" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
I0507 19:00:38.097446 1970 daemon.go:330] evicting pod default/dont-evict-this-pod
E0507 19:00:38.105617 1970 daemon.go:330] error when evicting pods/"dont-evict-this-pod" -n "default" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
I0507 19:00:43.108816 1970 daemon.go:330] evicting pod default/dont-evict-this-pod
E0507 19:00:43.118708 1970 daemon.go:330] error when evicting pods/"dont-evict-this-pod" -n "default" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
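The daemon log repeats the same eviction failure every 5 seconds, so the blocking pod is easy to miss in a longer log. A rough sketch for pulling out which pods are blocking the drain and how often eviction failed; the sample lines are copied from the log above, and the function names are hypothetical:

```shell
# Three sample lines copied from the machine-config-daemon log above.
mcd_log() {
  cat <<'EOF'
I0507 18:59:57.992498 1970 daemon.go:330] evicting pod default/dont-evict-this-pod
E0507 18:59:58.001489 1970 daemon.go:330] error when evicting pods/"dont-evict-this-pod" -n "default" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
E0507 19:00:03.012675 1970 daemon.go:330] error when evicting pods/"dont-evict-this-pod" -n "default" (will retry after 5s): Cannot evict pod as it would violate the pod's disruption budget.
EOF
}

# Hypothetical helper: print each distinct namespace/pod that failed eviction.
blocked_pods() {
  sed -n 's/.*error when evicting pods\/"\([^"]*\)" -n "\([^"]*\)".*/\2\/\1/p' | sort -u
}

# Hypothetical helper: count eviction failures on stdin.
failed_evictions() {
  grep -c 'error when evicting'
}

mcd_log | blocked_pods
mcd_log | failed_evictions
```

Against a live cluster, the output of the `oc -n openshift-machine-config-operator logs ...` command shown earlier can be piped into the same two helpers in place of `mcd_log`.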
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2438