Summary: | vSphere Machines stuck in deleting phase if associated Node object is deleted | ||
---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Milind Yadav <miyadav> |
Component: | Cloud Compute | Assignee: | dmoiseev |
Cloud Compute sub component: | Other Providers | QA Contact: | sunzhaohua <zhsun> |
Status: | CLOSED DUPLICATE | Docs Contact: | |
Severity: | medium | ||
Priority: | medium | CC: | dmoiseev, jhou, mimccune, miyadav, ssoto, welin, zhsun |
Version: | 4.7 | ||
Target Milestone: | --- | ||
Target Release: | 4.7.z | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | If docs needed, set a value | |
Doc Text: | Story Points: | --- | |
Clone Of: | 1977369 | Environment: | |
Last Closed: | 2021-08-19 10:23:36 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Bug Depends On: | 1977634 | ||
Bug Blocks: |
Description
Milind Yadav
2021-06-30 08:14:29 UTC
I have also encountered a machine stuck in the Deleting phase for a long time.

# oc get machine -n openshift-machine-api
NAME                             PHASE      TYPE   REGION   ZONE   AGE
cluster1-storage-vlan-40-h8mm5   Running                           44d
cluster1-storage-vlan-40-s2lv4   Running                           44d
cluster1-storage-vlan-40-tqsd6   Running                           44d
cluster1-worker-vlan-40-2w7hx    Running                           7h58m
cluster1-worker-vlan-40-kxbvw    Deleting                          4d1h
cluster1-worker-vlan-50-lqs9m    Deleting                          4d1h

# oc describe machine -n openshift-machine-api cluster1-worker-vlan-40-kxbvw
Name:         cluster1-worker-vlan-40-kxbvw
Namespace:    openshift-machine-api
Labels:       machine.openshift.io/cluster-api-cluster=cluster1-vh9zx
              machine.openshift.io/cluster-api-machine-role=worker
              machine.openshift.io/cluster-api-machine-type=worker
              machine.openshift.io/cluster-api-machineset=cluster1-worker-vlan-40
              machine.openshift.io/region=
              machine.openshift.io/zone=
Annotations:  machine.openshift.io/instance-state: poweredOn
API Version:  machine.openshift.io/v1beta1
Kind:         Machine
Metadata:
  Creation Timestamp:             2021-07-08T00:01:50Z
  Deletion Grace Period Seconds:  0
  Deletion Timestamp:             2021-07-11T15:27:56Z
  Finalizers:
    machine.machine.openshift.io
  Generate Name:  cluster1-worker-vlan-40-
  Generation:     3
  Managed Fields:
    API Version:  machine.openshift.io/v1beta1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:generateName:
        f:labels:
          .:
          f:machine.openshift.io/cluster-api-machine-role:
          f:machine.openshift.io/cluster-api-machine-type:
          f:machine.openshift.io/cluster-api-machineset:
        f:ownerReferences:
          .:
          k:{"uid":"a7bab249-e4f0-45b3-9d7a-9b403462f69b"}:
            .:
            f:apiVersion:
            f:blockOwnerDeletion:
            f:controller:
            f:kind:
            f:name:
            f:uid:
      f:spec:
        .:
        f:metadata:
          .:
          f:labels:
            .:
            f:node-role.kubernetes.io/app:
            f:node-role.kubernetes.io/vlan-40:
        f:providerSpec:
          .:
          f:value:
            .:
            f:apiVersion:
            f:credentialsSecret:
            f:diskGiB:
            f:kind:
            f:memoryMiB:
            f:metadata:
            f:network:
            f:numCPUs:
            f:numCoresPerSocket:
            f:snapshot:
            f:template:
            f:userDataSecret:
            f:workspace:
    Manager:      machineset-controller
    Operation:    Update
    Time:         2021-07-08T00:01:50Z
    API Version:  machine.openshift.io/v1beta1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:machine.openshift.io/instance-state:
        f:finalizers:
          .:
          v:"machine.machine.openshift.io":
        f:labels:
          f:machine.openshift.io/region:
          f:machine.openshift.io/zone:
      f:spec:
        f:providerID:
      f:status:
        .:
        f:addresses:
        f:phase:
        f:providerStatus:
          .:
          f:conditions:
          f:instanceId:
          f:instanceState:
          f:taskRef:
    Manager:      machine-controller-manager
    Operation:    Update
    Time:         2021-07-11T15:28:16Z
    API Version:  machine.openshift.io/v1beta1
    Fields Type:  FieldsV1
    fieldsV1:
      f:status:
        f:lastUpdated:
        f:nodeRef:
          .:
          f:kind:
          f:name:
          f:uid:
    Manager:      nodelink-controller
    Operation:    Update
    Time:         2021-07-11T18:24:10Z
  Owner References:
    API Version:           machine.openshift.io/v1beta1
    Block Owner Deletion:  true
    Controller:            true
    Kind:                  MachineSet
    Name:                  cluster1-worker-vlan-40
    UID:                   a7bab249-e4f0-45b3-9d7a-9b403462f69b
  Resource Version:        82676685
  Self Link:               /apis/machine.openshift.io/v1beta1/namespaces/openshift-machine-api/machines/cluster1-worker-vlan-40-kxbvw
  UID:                     0c3b83b1-9dfb-495d-ac58-a6b831d38435
Spec:
  Metadata:
    Labels:
      node-role.kubernetes.io/app:
      node-role.kubernetes.io/vlan-40:
  Provider ID:  vsphere://422b6bc8-fabd-939d-0d45-ed987f81334d
  Provider Spec:
    Value:
      API Version:  vsphereprovider.openshift.io/v1beta1
      Credentials Secret:
        Name:       vsphere-cloud-credentials
      Disk Gi B:    120
      Kind:         VSphereMachineProviderSpec
      Memory Mi B:  8192
      Metadata:
        Creation Timestamp:  <nil>
      Network:
        Devices:
          Network Name:  OCP4-Lab2-vLan-40
          Network Name:  OCP4-Lab2-vLan-Trunk
          Network Name:  OCP4-Lab2-vLan-Trunk
      Num CP Us:             2
      Num Cores Per Socket:  1
      Snapshot:
      Template:              rhcos-4.7.7-x86_64
      User Data Secret:
        Name:  worker-user-data
      Workspace:
        Datacenter:  Datacenter1
        Datastore:   datastore3
        Folder:      /Datacenter1/vm/OCP4-Lab2/03-DataCenter/OCP/Cluster-1/cluster1-worker-vlan-40
        Server:      192.168.1.6
Status:
  Addresses:
    Address:     192.168.40.55
    Type:        InternalIP
    Address:     fe80::5c56:1229:ae1e:dfbd
    Type:        InternalIP
    Address:     cluster1-worker-vlan-40-kxbvw
    Type:        InternalDNS
  Last Updated:  2021-07-12T00:43:14Z
  Node Ref:
    Kind:  Node
    Name:  cluster1-worker-vlan-40-kxbvw
    UID:   6cb0b173-c2f1-44aa-8cca-cc3671edff10
  Phase:   Deleting
  Provider Status:
    Conditions:
      Last Probe Time:       2021-07-08T00:01:50Z
      Last Transition Time:  2021-07-08T00:01:50Z
      Message:               Machine successfully created
      Reason:                MachineCreationSucceeded
      Status:                True
      Type:                  MachineCreation
    Instance Id:             422b6bc8-fabd-939d-0d45-ed987f81334d
    Instance State:          poweredOn
    Task Ref:                task-61052
Events:                      <none>

# oc get nodes
NAME                                       STATUS                     ROLES                AGE     VERSION
cluster1-storage-vlan-40-h8mm5             Ready                      storage,worker       44d     v1.20.0+2817867
cluster1-storage-vlan-40-s2lv4             Ready                      storage,worker       44d     v1.20.0+2817867
cluster1-storage-vlan-40-tqsd6             Ready                      storage,worker       44d     v1.20.0+2817867
cluster1-worker-vlan-40-2w7hx              Ready                      app,vlan-40,worker   7h51m   v1.20.0+2817867
cluster1-worker-vlan-40-kxbvw              Ready,SchedulingDisabled   app,vlan-40,worker   4d1h    v1.20.0+2817867
cluster1-worker-vlan-50-lqs9m              Ready,SchedulingDisabled   app,vlan-50,worker   4d1h    v1.20.0+2817867
master-01.cluster1.ocp4.example.internal   Ready                      master               56d     v1.20.0+2817867
master-02.cluster1.ocp4.example.internal   Ready                      master               56d     v1.20.0+2817867
master-03.cluster1.ocp4.example.internal   Ready                      master               56d     v1.20.0+2817867

# oc describe nodes cluster1-worker-vlan-40-kxbvw
Name:               cluster1-worker-vlan-40-kxbvw
Roles:              app,vlan-40,worker
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=cluster1-worker-vlan-40-kxbvw
                    kubernetes.io/os=linux
                    node-role.kubernetes.io/app=
                    node-role.kubernetes.io/vlan-40=
                    node-role.kubernetes.io/worker=
                    node.openshift.io/os_id=rhcos
Annotations:        csi.volume.kubernetes.io/nodeid: {"openshift-storage.cephfs.csi.ceph.com":"cluster1-worker-vlan-40-kxbvw","openshift-storage.rbd.csi.ceph.com":"cluster1-worker-vlan-40-kxb...
                    machine.openshift.io/machine: openshift-machine-api/cluster1-worker-vlan-40-kxbvw
                    machineconfiguration.openshift.io/currentConfig: rendered-worker-a7aa7de76b7ef645f66b332beb7766dd
                    machineconfiguration.openshift.io/desiredConfig: rendered-worker-a7aa7de76b7ef645f66b332beb7766dd
                    machineconfiguration.openshift.io/reason:
                    machineconfiguration.openshift.io/state: Done
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Thu, 08 Jul 2021 08:09:33 +0800
Taints:             node.kubernetes.io/unschedulable:NoSchedule
Unschedulable:      true
Lease:
  HolderIdentity:  cluster1-worker-vlan-40-kxbvw
  AcquireTime:     <unset>
  RenewTime:       Mon, 12 Jul 2021 09:16:32 +0800
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  MemoryPressure   False   Mon, 12 Jul 2021 09:12:01 +0800   Mon, 12 Jul 2021 08:41:58 +0800   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Mon, 12 Jul 2021 09:12:01 +0800   Mon, 12 Jul 2021 08:41:58 +0800   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False   Mon, 12 Jul 2021 09:12:01 +0800   Mon, 12 Jul 2021 08:41:58 +0800   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            True    Mon, 12 Jul 2021 09:12:01 +0800   Mon, 12 Jul 2021 08:41:58 +0800   KubeletReady                 kubelet is posting ready status
Addresses:
  ExternalIP:  192.168.40.55
  InternalIP:  192.168.40.55
  Hostname:    cluster1-worker-vlan-40-kxbvw
Capacity:
  cpu:                2
  ephemeral-storage:  125293548Ki
  hugepages-2Mi:      0
  memory:             8153700Ki
  pods:               250
Allocatable:
  cpu:                1500m
  ephemeral-storage:  114396791822
  hugepages-2Mi:      0
  memory:             7002724Ki
  pods:               250
System Info:
  Machine ID:                 6a4377b6bbff45ba9b177c0418ee0291
  System UUID:                c86b2b42-bdfa-9d93-0d45-ed987f81334d
  Boot ID:                    71ec56c1-6526-4465-900a-2aaf347f1230
  Kernel Version:             4.18.0-240.22.1.el8_3.x86_64
  OS Image:                   Red Hat Enterprise Linux CoreOS 47.83.202106032343-0 (Ootpa)
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  cri-o://1.20.3-2.rhaos4.7.gitb53fa9d.el8
  Kubelet Version:            v1.20.0+2817867
  Kube-Proxy Version:         v1.20.0+2817867
ProviderID:                   vsphere://422b6bc8-fabd-939d-0d45-ed987f81334d
Non-terminated Pods:          (14 in total)
  Namespace                               Name                          CPU Requests  CPU Limits  Memory Requests  Memory Limits  AGE
  ---------                               ----                          ------------  ----------  ---------------  -------------  ---
  openshift-cluster-node-tuning-operator  tuned-c4cg8                   10m (0%)      0 (0%)      50Mi (0%)        0 (0%)         4d1h
  openshift-dns                           dns-default-j8lcv             65m (4%)      0 (0%)      131Mi (1%)       0 (0%)         4d1h
  openshift-image-registry                node-ca-j5x2h                 10m (0%)      0 (0%)      10Mi (0%)        0 (0%)         4d1h
  openshift-ingress-canary                ingress-canary-j994v          10m (0%)      0 (0%)      20Mi (0%)        0 (0%)         4d1h
  openshift-machine-config-operator       machine-config-daemon-jxszk   40m (2%)      0 (0%)      100Mi (1%)       0 (0%)         4d1h
  openshift-monitoring                    node-exporter-dn5kx           9m (0%)       0 (0%)      210Mi (3%)       0 (0%)         4d1h
  openshift-multus                        multus-nn76n                  10m (0%)      0 (0%)      150Mi (2%)       0 (0%)         4d1h
  openshift-multus                        network-metrics-daemon-s9h7p  20m (1%)      0 (0%)      120Mi (1%)       0 (0%)         4d1h
  openshift-network-diagnostics           network-check-target-7sh8h    10m (0%)      0 (0%)      15Mi (0%)        0 (0%)         4d1h
  openshift-nmstate                       nmstate-handler-g79nz         0 (0%)        0 (0%)      0 (0%)           0 (0%)         4d
  openshift-sdn                           sdn-jv4g5                     110m (7%)     0 (0%)      220Mi (3%)       0 (0%)         4d1h
  openshift-storage                       csi-cephfsplugin-nfrlg        0 (0%)        0 (0%)      0 (0%)           0 (0%)         4d1h
  openshift-storage                       csi-rbdplugin-9tn4h           0 (0%)        0 (0%)      0 (0%)           0 (0%)         4d1h
  percona-test-1                          cluster1-haproxy-1            200m (13%)    0 (0%)      1G (13%)         0 (0%)         4d
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests          Limits
  --------           --------          ------
  cpu                494m (32%)        0 (0%)
  memory             2075838976 (28%)  0 (0%)
  ephemeral-storage  0 (0%)            0 (0%)
  hugepages-2Mi      0 (0%)            0 (0%)
Events:
  Type    Reason                   Age                  From     Message
  ----    ------                   ----                 ----     -------
  Normal  NodeHasSufficientMemory  98m (x60 over 4d1h)  kubelet  Node cluster1-worker-vlan-40-kxbvw status is now: NodeHasSufficientMemory

My OCP version is 4.7.16, vSphere UPI + machineset.

@welin Not related to this bug. Node is there.
```
cluster1-worker-vlan-40-kxbvw Ready,SchedulingDisabled app,vlan-40,worker 4d1h v1.20.0+2817867
```

*** This bug has been marked as a duplicate of bug 1989648 ***
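
For anyone triaging a similar report: the reply above turns on whether the Node object still exists for a Machine stuck in Deleting. This bug covers the case where the Node has been deleted, and in the output above the Node was still present. Below is a minimal sketch of that cross-check, assuming the Machine and Node share a name (as they do in the output above; in general the link is the Machine's `status.nodeRef`). The helper names are hypothetical, and the tables are trimmed copies of the `oc get` output pasted in this thread.

```python
# Sketch only: cross-check pasted `oc get machine` and `oc get nodes` output.
# Assumption: a Machine and its Node have the same name, as in this report.

MACHINES = """\
NAME                             PHASE      TYPE   REGION   ZONE   AGE
cluster1-storage-vlan-40-h8mm5   Running                           44d
cluster1-worker-vlan-40-2w7hx    Running                           7h58m
cluster1-worker-vlan-40-kxbvw    Deleting                          4d1h
cluster1-worker-vlan-50-lqs9m    Deleting                          4d1h
"""

NODES = """\
NAME                            STATUS                     ROLES                AGE     VERSION
cluster1-storage-vlan-40-h8mm5  Ready                      storage,worker       44d     v1.20.0+2817867
cluster1-worker-vlan-40-2w7hx   Ready                      app,vlan-40,worker   7h51m   v1.20.0+2817867
cluster1-worker-vlan-40-kxbvw   Ready,SchedulingDisabled   app,vlan-40,worker   4d1h    v1.20.0+2817867
cluster1-worker-vlan-50-lqs9m   Ready,SchedulingDisabled   app,vlan-50,worker   4d1h    v1.20.0+2817867
"""


def first_columns(table, count):
    """Return the first `count` whitespace-separated fields of each data row."""
    rows = [line.split() for line in table.strip().splitlines()[1:]]
    return [row[:count] for row in rows if row]


def deleting_machines_with_node(machines_table, nodes_table):
    """Names of Machines in the Deleting phase whose Node is still listed."""
    node_names = {row[0] for row in first_columns(nodes_table, 1)}
    return [
        name
        for name, phase in first_columns(machines_table, 2)
        if phase == "Deleting" and name in node_names
    ]


if __name__ == "__main__":
    for name in deleting_machines_with_node(MACHINES, NODES):
        print(f"{name}: Deleting, but its Node still exists")
```

A Machine reported by this check still has its Node, so it does not match the scenario in this bug's summary and is likely a different problem.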