Bug 1927373 - NoExecute taint violates pdb; VMIs are not live migrated
Summary: NoExecute taint violates pdb; VMIs are not live migrated
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Container Native Virtualization (CNV)
Classification: Red Hat
Component: Virtualization
Version: 2.5.3
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 2.6.0
Assignee: Igor Bezukh
QA Contact: Israel Pinto
URL:
Whiteboard:
Depends On: 1913532 1927836
Blocks:
 
Reported: 2021-02-10 15:51 UTC by Ruth Netser
Modified: 2021-03-10 11:24 UTC (History)
8 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-03-10 11:23:40 UTC
Target Upstream Version:
Embargoed:
ibezukh: needinfo+




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2021:0799 0 None None None 2021-03-10 11:24:48 UTC

Description Ruth Netser 2021-02-10 15:51:37 UTC
Description of problem:
After OCP upgrade, VMI with runStrategy: Manual and evictionStrategy: LiveMigrate is not running.

Version-Release number of selected component (if applicable):
OCP 4.6.16+ CNV 2.5.3 -> OCP 4.7.0 rc0

How reproducible:
100% (seen on 2 clusters)

Steps to Reproduce:
1. Cluster with OCP 4.6.16+ CNV 2.5.3
2. Create a VM with runStrategy: Manual and evictionStrategy: LiveMigrate
The VM is using a RWX PVC
3. Upgrade OCP
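
The fields driving this behavior, condensed from the reproduction steps above (a minimal sketch, not the complete object; the full VM manifest appears later in this report, and the dataVolume name is shortened here):

```yaml
# Condensed sketch of the VM used in the reproduction.
apiVersion: kubevirt.io/v1alpha3
kind: VirtualMachine
metadata:
  name: rhel8-nfs
spec:
  runStrategy: Manual               # VMI is not restarted automatically if it stops
  dataVolumeTemplates:
  - metadata:
      name: rhel8-nfs-rootdisk      # shortened; actual name is generated
    spec:
      pvc:
        accessModes:
        - ReadWriteMany             # RWX volume, a prerequisite for live migration
  template:
    spec:
      evictionStrategy: LiveMigrate # eviction should trigger a live migration
```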

Actual results:
VMI phase is "Succeeded"
The VMI is not running

Expected results:
The VMI should be running throughout the upgrade and after it

Additional info:
$ oc get vmi
rhel8-nfs      3h49m   Succeeded   10.131.1.93   ssp05-4fpbn-worker-0-vsfvz


$ oc get events -A | grep -vi normal | grep rhel8-nfs
b4-upgrade                               158m        Warning   SyncFailed                                   virtualmachineinstance/rhel8-nfs                                unknown error encountered sending command SyncVMI: rpc error: code = DeadlineExceeded desc = context deadline exceeded
b4-upgrade                               165m        Warning   NodeNotReady                                 pod/virt-launcher-rhel8-nfs-6l62f                               Node is not ready

153m        Warning   NodeNotReady                 pod/virt-launcher-rhel8-nfs-6l62f                           Node is not ready
148m        Normal    TaintManagerEviction         pod/virt-launcher-rhel8-nfs-6l62f                           Marking for deletion Pod b4-upgrade/virt-launcher-rhel8-nfs-6l62f
144m        Normal    TaintManagerEviction         pod/virt-launcher-rhel8-nfs-6l62f                           Cancelling deletion of Pod b4-upgrade/virt-launcher-rhel8-nfs-6l62f
144m        Normal    Killing                      pod/virt-launcher-rhel8-nfs-6l62f                           Stopping container compute


===========================

$ oc describe vmi rhel8-nfs
Name:         rhel8-nfs
Namespace:    b4-upgrade
Labels:       flavor.template.kubevirt.io/tiny=true
              kubevirt.io/domain=rhel8-nfs
              kubevirt.io/nodeName=ssp05-4fpbn-worker-0-vsfvz
              kubevirt.io/size=tiny
              os.template.kubevirt.io/rhel8.3=true
              vm.kubevirt.io/name=rhel8-nfs
              workload.template.kubevirt.io/server=true
Annotations:  kubevirt.io/latest-observed-api-version: v1alpha3
              kubevirt.io/storage-observed-api-version: v1alpha3
API Version:  kubevirt.io/v1alpha3
Kind:         VirtualMachineInstance
Metadata:
  Creation Timestamp:  2021-02-10T11:55:04Z
  Generation:          45
  Managed Fields:
    API Version:  kubevirt.io/v1alpha3
    Fields Type:  FieldsV1
    fieldsV1:
      f:status:
        f:interfaces:
        f:migrationMethod:
        f:phase:
    Manager:      virt-handler
    Operation:    Update
    Time:         2021-02-10T13:11:40Z
    API Version:  kubevirt.io/v1alpha3
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:kubevirt.io/latest-observed-api-version:
          f:kubevirt.io/storage-observed-api-version:
        f:labels:
          .:
          f:flavor.template.kubevirt.io/tiny:
          f:kubevirt.io/domain:
          f:kubevirt.io/nodeName:
          f:kubevirt.io/size:
          f:os.template.kubevirt.io/rhel8.3:
          f:vm.kubevirt.io/name:
          f:workload.template.kubevirt.io/server:
        f:ownerReferences:
      f:spec:
        .:
        f:domain:
          .:
          f:cpu:
            .:
            f:cores:
            f:sockets:
            f:threads:
          f:devices:
            .:
            f:disks:
            f:interfaces:
            f:networkInterfaceMultiqueue:
            f:rng:
          f:firmware:
            .:
            f:uuid:
          f:machine:
            .:
            f:type:
          f:resources:
            .:
            f:requests:
              .:
              f:memory:
        f:evictionStrategy:
        f:hostname:
        f:networks:
        f:terminationGracePeriodSeconds:
        f:volumes:
      f:status:
        .:
        f:conditions:
        f:guestOSInfo:
        f:nodeName:
        f:qosClass:
    Manager:    virt-controller
    Operation:  Update
    Time:       2021-02-10T13:11:43Z
  Owner References:
    API Version:           kubevirt.io/v1alpha3
    Block Owner Deletion:  true
    Controller:            true
    Kind:                  VirtualMachine
    Name:                  rhel8-nfs
    UID:                   e715a3b4-4b49-4da6-a527-c903a3990e77
  Resource Version:        997581
  Self Link:               /apis/kubevirt.io/v1alpha3/namespaces/b4-upgrade/virtualmachineinstances/rhel8-nfs
  UID:                     b6e26b13-77cd-4382-b3d4-abcfbb0b14b5
Spec:
  Domain:
    Cpu:
      Cores:    1
      Sockets:  1
      Threads:  1
    Devices:
      Disks:
        Disk:
          Bus:       virtio
        Name:        cloudinitdisk
        Boot Order:  1
        Disk:
          Bus:  virtio
        Name:   rootdisk
      Interfaces:
        Masquerade:
        Model:                       virtio
        Name:                        nic-0
      Network Interface Multiqueue:  true
      Rng:
    Features:
      Acpi:
        Enabled:  true
    Firmware:
      Uuid:  8ce719db-adcd-5abe-99ec-813760c30897
    Machine:
      Type:  pc-q35-rhel8.2.0
    Resources:
      Requests:
        Cpu:          100m
        Memory:       1536Mi
  Eviction Strategy:  LiveMigrate
  Hostname:           rhel8-nfs
  Networks:
    Name:  nic-0
    Pod:
  Termination Grace Period Seconds:  180
  Volumes:
    Cloud Init No Cloud:
      User Data:  #cloud-config
user: cloud-user
password: redhat
chpasswd:
  expire: false

    Name:  cloudinitdisk
    Data Volume:
      Name:  rhel8-nfs-rootdisk-9k9cs
    Name:    rootdisk
Status:
  Conditions:
    Last Probe Time:       <nil>
    Last Transition Time:  <nil>
    Status:                True
    Type:                  LiveMigratable
  Guest OS Info:
  Interfaces:
    Interface Name:  eth0
    Ip Address:      10.131.1.93
    Ip Addresses:
      10.131.1.93
    Mac:             02:00:00:6a:8d:cd
    Name:            nic-0
  Migration Method:  BlockMigration
  Node Name:         ssp05-4fpbn-worker-0-vsfvz
  Phase:             Succeeded
  Qos Class:         Burstable
Events:
  Type     Reason            Age                    From                         Message
  ----     ------            ----                   ----                         -------
  Normal   SuccessfulCreate  3h48m                  disruptionbudget-controller  Created PodDisruptionBudget kubevirt-disruption-budget-szvgb
  Normal   SuccessfulCreate  3h48m                  virtualmachine-controller    Created virtual machine pod virt-launcher-rhel8-nfs-6l62f
  Normal   Started           3h48m                  virt-handler                 VirtualMachineInstance started.
  Normal   Created           163m (x84 over 3h48m)  virt-handler                 VirtualMachineInstance defined.
  Warning  SyncFailed        154m (x2 over 155m)    virt-handler                 unknown error encountered sending command SyncVMI: rpc error: code = DeadlineExceeded desc = context deadline exceeded


===========================
$ oc get vm rhel8-nfs -oyaml
apiVersion: kubevirt.io/v1alpha3
kind: VirtualMachine
metadata:
  annotations:
    kubevirt.io/latest-observed-api-version: v1alpha3
    kubevirt.io/storage-observed-api-version: v1alpha3
    name.os.template.kubevirt.io/rhel8.3: Red Hat Enterprise Linux 8.0 or higher
  creationTimestamp: "2021-02-10T10:56:15Z"
  generation: 4
  labels:
    app: rhel8-nfs
    flavor.template.kubevirt.io/tiny: "true"
    os.template.kubevirt.io/rhel8.3: "true"
    vm.kubevirt.io/template: rhel8-server-tiny-v0.11.3
    vm.kubevirt.io/template.namespace: openshift
    vm.kubevirt.io/template.revision: "1"
    vm.kubevirt.io/template.version: v0.12.4
    workload.template.kubevirt.io/server: "true"
  managedFields:
  - apiVersion: kubevirt.io/v1alpha3
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .: {}
          f:name.os.template.kubevirt.io/rhel8.3: {}
        f:labels:
          .: {}
          f:app: {}
          f:flavor.template.kubevirt.io/tiny: {}
          f:os.template.kubevirt.io/rhel8.3: {}
          f:vm.kubevirt.io/template: {}
          f:vm.kubevirt.io/template.namespace: {}
          f:vm.kubevirt.io/template.revision: {}
          f:vm.kubevirt.io/template.version: {}
          f:workload.template.kubevirt.io/server: {}
      f:spec:
        .: {}
        f:dataVolumeTemplates: {}
        f:runStrategy: {}
        f:template:
          .: {}
          f:metadata:
            .: {}
            f:labels:
              .: {}
              f:flavor.template.kubevirt.io/tiny: {}
              f:kubevirt.io/domain: {}
              f:kubevirt.io/size: {}
              f:os.template.kubevirt.io/rhel8.3: {}
              f:vm.kubevirt.io/name: {}
              f:workload.template.kubevirt.io/server: {}
          f:spec:
            .: {}
            f:domain:
              .: {}
              f:cpu:
                .: {}
                f:cores: {}
                f:sockets: {}
                f:threads: {}
              f:devices:
                .: {}
                f:disks: {}
                f:interfaces: {}
                f:networkInterfaceMultiqueue: {}
                f:rng: {}
              f:machine:
                .: {}
                f:type: {}
              f:resources:
                .: {}
                f:requests:
                  .: {}
                  f:memory: {}
            f:evictionStrategy: {}
            f:hostname: {}
            f:networks: {}
            f:terminationGracePeriodSeconds: {}
            f:volumes: {}
    manager: Mozilla
    operation: Update
    time: "2021-02-10T11:55:00Z"
  - apiVersion: kubevirt.io/v1alpha3
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          f:kubevirt.io/latest-observed-api-version: {}
          f:kubevirt.io/storage-observed-api-version: {}
      f:status:
        .: {}
        f:created: {}
    manager: virt-controller
    operation: Update
    time: "2021-02-10T13:11:40Z"
  name: rhel8-nfs
  namespace: b4-upgrade
  resourceVersion: "997362"
  selfLink: /apis/kubevirt.io/v1alpha3/namespaces/b4-upgrade/virtualmachines/rhel8-nfs
  uid: e715a3b4-4b49-4da6-a527-c903a3990e77
spec:
  dataVolumeTemplates:
  - apiVersion: cdi.kubevirt.io/v1alpha1
    kind: DataVolume
    metadata:
      creationTimestamp: null
      name: rhel8-nfs-rootdisk-9k9cs
    spec:
      pvc:
        accessModes:
        - ReadWriteMany
        resources:
          requests:
            storage: 30Gi
        storageClassName: nfs
        volumeMode: Filesystem
      source:
        pvc:
          name: rhel8
          namespace: openshift-virtualization-os-images
  runStrategy: Manual
  template:
    metadata:
      creationTimestamp: null
      labels:
        flavor.template.kubevirt.io/tiny: "true"
        kubevirt.io/domain: rhel8-nfs
        kubevirt.io/size: tiny
        os.template.kubevirt.io/rhel8.3: "true"
        vm.kubevirt.io/name: rhel8-nfs
        workload.template.kubevirt.io/server: "true"
    spec:
      domain:
        cpu:
          cores: 1
          sockets: 1
          threads: 1
        devices:
          disks:
          - disk:
              bus: virtio
            name: cloudinitdisk
          - bootOrder: 1
            disk:
              bus: virtio
            name: rootdisk
          interfaces:
          - masquerade: {}
            model: virtio
            name: nic-0
          networkInterfaceMultiqueue: true
          rng: {}
        machine:
          type: pc-q35-rhel8.2.0
        resources:
          requests:
            memory: 1536Mi
      evictionStrategy: LiveMigrate
      hostname: rhel8-nfs
      networks:
      - name: nic-0
        pod: {}
      terminationGracePeriodSeconds: 180
      volumes:
      - cloudInitNoCloud:
          userData: |
            #cloud-config
            user: cloud-user
            password: redhat
            chpasswd:
              expire: false
        name: cloudinitdisk
      - dataVolume:
          name: rhel8-nfs-rootdisk-9k9cs
        name: rootdisk
status:
  created: true




Comment 2 sgott 2021-02-10 20:01:59 UTC
This excerpt from an email thread is a comment from Roman:

--------------

Was looking with Ruth at the latest occurrence: in the events, a taint on the node appears to cause the pods to be deleted before the eviction-triggered migration kicks in:

```

153m        Warning   NodeNotReady                 pod/virt-launcher-rhel8-nfs-6l62f                           Node is not ready
148m        Normal    TaintManagerEviction         pod/virt-launcher-rhel8-nfs-6l62f                           Marking for deletion Pod b4-upgrade/virt-launcher-rhel8-nfs-6l62f
144m        Normal    TaintManagerEviction         pod/virt-launcher-rhel8-nfs-6l62f                           Cancelling deletion of Pod b4-upgrade/virt-launcher-rhel8-nfs-6l62f
144m        Normal    Killing                      pod/virt-launcher-rhel8-nfs-6l62f                           Stopping container compute

```

--------------

So we know that the VMI was stopped because the node was tainted. The VMI wasn't re-started because the RunStrategy was set to Manual. We need to ascertain what that taint was and why it was added.

Ruth, are you able to reproduce this error, and note what taints are being applied to the unresponsive nodes?
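
To note taints during a reproduction, the check can be scripted. The live `oc` command is shown in a comment (it requires cluster access); the filtering itself is demonstrated against a captured `oc describe node` excerpt (sample data, hypothetical file path) so the snippet stands alone:

```shell
# On a live cluster this would be:
#   oc get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints
# Below, the same information is extracted from a saved `oc describe node`
# excerpt, matching the output seen in this bug.
cat <<'EOF' > /tmp/node-describe.txt
Name:               ssp04-rvkqg-worker-0-zvfkw
Taints:             node.kubernetes.io/unreachable:NoExecute
                    node.kubernetes.io/unreachable:NoSchedule
Unschedulable:      false
EOF
# Print the "Taints:" line plus its indented continuation lines,
# stopping at the next top-level (non-indented) field.
awk '/^Taints:/{p=1} p&&/^[^[:space:]]/&&!/^Taints:/{p=0} p{print}' /tmp/node-describe.txt
```

On the affected node this prints both `unreachable` taints; the `NoExecute` one is what triggers TaintManagerEviction of the virt-launcher pod.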

Comment 4 Ruth Netser 2021-02-11 15:28:45 UTC
Reproduced: the taint is added, the virt-launcher pods are Terminating, and no live migration occurs.

The VMI is running on node ssp04-rvkqg-worker-0-zvfkw:

rhel8-nfs      42m   Running   10.131.0.47    ssp04-rvkqg-worker-0-zvfkw


During the upgrade, the following taints are added to the node:

Taints:             node.kubernetes.io/unreachable:NoExecute
                    node.kubernetes.io/unreachable:NoSchedule
Unschedulable:      false


$ oc get node ssp04-rvkqg-worker-0-zvfkw
NAME                         STATUS     ROLES    AGE   VERSION
ssp04-rvkqg-worker-0-zvfkw   NotReady   worker   22h   v1.19.0+e49167a



The virt-launcher pods of the VMIs running on the node are Terminating; live migration is not performed:

===========================================
$ oc get pod
NAME                               READY   STATUS        RESTARTS   AGE
virt-launcher-rhel8-nfs-rr9tx      1/1     Terminating   0          102m
virt-launcher-win10-ocs-2kx2v      1/1     Terminating   0          89m


===========================================
$ oc describe node ssp04-rvkqg-worker-0-zvfkw
Name:               ssp04-rvkqg-worker-0-zvfkw
Roles:              worker
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/instance-type=ci.nested.virt.m1.xlarge
                    beta.kubernetes.io/os=linux
                    cluster.ocs.openshift.io/openshift-storage=
                    cpumanager=true
                    failure-domain.beta.kubernetes.io/zone=nova
                    feature.node.kubernetes.io/cpu-feature-aes=true
                    feature.node.kubernetes.io/cpu-feature-avx=true
                    feature.node.kubernetes.io/cpu-feature-avx2=true
                    feature.node.kubernetes.io/cpu-feature-bmi1=true
                    feature.node.kubernetes.io/cpu-feature-bmi2=true
                    feature.node.kubernetes.io/cpu-feature-erms=true
                    feature.node.kubernetes.io/cpu-feature-f16c=true
                    feature.node.kubernetes.io/cpu-feature-fma=true
                    feature.node.kubernetes.io/cpu-feature-fsgsbase=true
                    feature.node.kubernetes.io/cpu-feature-invpcid=true
                    feature.node.kubernetes.io/cpu-feature-movbe=true
                    feature.node.kubernetes.io/cpu-feature-pcid=true
                    feature.node.kubernetes.io/cpu-feature-pclmuldq=true
                    feature.node.kubernetes.io/cpu-feature-popcnt=true
                    feature.node.kubernetes.io/cpu-feature-rdrand=true
                    feature.node.kubernetes.io/cpu-feature-rdtscp=true
                    feature.node.kubernetes.io/cpu-feature-smep=true
                    feature.node.kubernetes.io/cpu-feature-spec-ctrl=true
                    feature.node.kubernetes.io/cpu-feature-sse4.2=true
                    feature.node.kubernetes.io/cpu-feature-svm=true
                    feature.node.kubernetes.io/cpu-feature-tsc-deadline=true
                    feature.node.kubernetes.io/cpu-feature-vme=true
                    feature.node.kubernetes.io/cpu-feature-x2apic=true
                    feature.node.kubernetes.io/cpu-feature-xsave=true
                    feature.node.kubernetes.io/cpu-model-Haswell-noTSX=true
                    feature.node.kubernetes.io/cpu-model-Haswell-noTSX-IBRS=true
                    feature.node.kubernetes.io/cpu-model-IvyBridge=true
                    feature.node.kubernetes.io/cpu-model-IvyBridge-IBRS=true
                    feature.node.kubernetes.io/cpu-model-Nehalem=true
                    feature.node.kubernetes.io/cpu-model-Nehalem-IBRS=true
                    feature.node.kubernetes.io/cpu-model-Opteron_G1=true
                    feature.node.kubernetes.io/cpu-model-Opteron_G2=true
                    feature.node.kubernetes.io/cpu-model-Penryn=true
                    feature.node.kubernetes.io/cpu-model-SandyBridge=true
                    feature.node.kubernetes.io/cpu-model-SandyBridge-IBRS=true
                    feature.node.kubernetes.io/cpu-model-Westmere=true
                    feature.node.kubernetes.io/cpu-model-Westmere-IBRS=true
                    feature.node.kubernetes.io/cpu-model-kvm32=true
                    feature.node.kubernetes.io/cpu-model-kvm64=true
                    feature.node.kubernetes.io/cpu-model-qemu32=true
                    feature.node.kubernetes.io/cpu-model-qemu64=true
                    feature.node.kubernetes.io/kvm-info-cap-hyperv-base=true
                    feature.node.kubernetes.io/kvm-info-cap-hyperv-frequencies=true
                    feature.node.kubernetes.io/kvm-info-cap-hyperv-ipi=true
                    feature.node.kubernetes.io/kvm-info-cap-hyperv-reenlightenment=true
                    feature.node.kubernetes.io/kvm-info-cap-hyperv-reset=true
                    feature.node.kubernetes.io/kvm-info-cap-hyperv-runtime=true
                    feature.node.kubernetes.io/kvm-info-cap-hyperv-synic=true
                    feature.node.kubernetes.io/kvm-info-cap-hyperv-synic2=true
                    feature.node.kubernetes.io/kvm-info-cap-hyperv-synictimer=true
                    feature.node.kubernetes.io/kvm-info-cap-hyperv-time=true
                    feature.node.kubernetes.io/kvm-info-cap-hyperv-tlbflush=true
                    feature.node.kubernetes.io/kvm-info-cap-hyperv-vpindex=true
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=ssp04-rvkqg-worker-0-zvfkw
                    kubernetes.io/os=linux
                    kubevirt.io/schedulable=false
                    node-role.kubernetes.io/worker=
                    node.kubernetes.io/instance-type=ci.nested.virt.m1.xlarge
                    node.openshift.io/os_id=rhcos
                    topology.cinder.csi.openstack.org/zone=nova
                    topology.kubernetes.io/zone=nova
                    topology.rook.io/rack=rack2
Annotations:        csi.volume.kubernetes.io/nodeid:
                      {"cinder.csi.openstack.org":"885b9301-803e-43b9-a6c8-c07324f015e6","manila.csi.openstack.org":"ssp04-rvkqg-worker-0-zvfkw","openshift-stor...
                    kubevirt.io/heartbeat: 2021-02-11T14:46:59Z
                    machine.openshift.io/machine: openshift-machine-api/ssp04-rvkqg-worker-0-zvfkw
                    machineconfiguration.openshift.io/currentConfig: rendered-worker-e32d0c5fbb975fcd1cd2ff67ba0f0965
                    machineconfiguration.openshift.io/desiredConfig: rendered-worker-e32d0c5fbb975fcd1cd2ff67ba0f0965
                    machineconfiguration.openshift.io/reason: 
                    machineconfiguration.openshift.io/state: Done
                    node-labeller-feature.node.kubernetes.io/cpu-feature-aes: true
                    node-labeller-feature.node.kubernetes.io/cpu-feature-avx: true
                    node-labeller-feature.node.kubernetes.io/cpu-feature-avx2: true
                    node-labeller-feature.node.kubernetes.io/cpu-feature-bmi1: true
                    node-labeller-feature.node.kubernetes.io/cpu-feature-bmi2: true
                    node-labeller-feature.node.kubernetes.io/cpu-feature-erms: true
                    node-labeller-feature.node.kubernetes.io/cpu-feature-f16c: true
                    node-labeller-feature.node.kubernetes.io/cpu-feature-fma: true
                    node-labeller-feature.node.kubernetes.io/cpu-feature-fsgsbase: true
                    node-labeller-feature.node.kubernetes.io/cpu-feature-invpcid: true
                    node-labeller-feature.node.kubernetes.io/cpu-feature-movbe: true
                    node-labeller-feature.node.kubernetes.io/cpu-feature-pcid: true
                    node-labeller-feature.node.kubernetes.io/cpu-feature-pclmuldq: true
                    node-labeller-feature.node.kubernetes.io/cpu-feature-popcnt: true
                    node-labeller-feature.node.kubernetes.io/cpu-feature-rdrand: true
                    node-labeller-feature.node.kubernetes.io/cpu-feature-rdtscp: true
                    node-labeller-feature.node.kubernetes.io/cpu-feature-smep: true
                    node-labeller-feature.node.kubernetes.io/cpu-feature-spec-ctrl: true
                    node-labeller-feature.node.kubernetes.io/cpu-feature-sse4.2: true
                    node-labeller-feature.node.kubernetes.io/cpu-feature-svm: true
                    node-labeller-feature.node.kubernetes.io/cpu-feature-tsc-deadline: true
                    node-labeller-feature.node.kubernetes.io/cpu-feature-vme: true
                    node-labeller-feature.node.kubernetes.io/cpu-feature-x2apic: true
                    node-labeller-feature.node.kubernetes.io/cpu-feature-xsave: true
                    node-labeller-feature.node.kubernetes.io/cpu-model-Haswell-noTSX: true
                    node-labeller-feature.node.kubernetes.io/cpu-model-Haswell-noTSX-IBRS: true
                    node-labeller-feature.node.kubernetes.io/cpu-model-IvyBridge: true
                    node-labeller-feature.node.kubernetes.io/cpu-model-IvyBridge-IBRS: true
                    node-labeller-feature.node.kubernetes.io/cpu-model-Nehalem: true
                    node-labeller-feature.node.kubernetes.io/cpu-model-Nehalem-IBRS: true
                    node-labeller-feature.node.kubernetes.io/cpu-model-Opteron_G1: true
                    node-labeller-feature.node.kubernetes.io/cpu-model-Opteron_G2: true
                    node-labeller-feature.node.kubernetes.io/cpu-model-Penryn: true
                    node-labeller-feature.node.kubernetes.io/cpu-model-SandyBridge: true
                    node-labeller-feature.node.kubernetes.io/cpu-model-SandyBridge-IBRS: true
                    node-labeller-feature.node.kubernetes.io/cpu-model-Westmere: true
                    node-labeller-feature.node.kubernetes.io/cpu-model-Westmere-IBRS: true
                    node-labeller-feature.node.kubernetes.io/cpu-model-kvm32: true
                    node-labeller-feature.node.kubernetes.io/cpu-model-kvm64: true
                    node-labeller-feature.node.kubernetes.io/cpu-model-qemu32: true
                    node-labeller-feature.node.kubernetes.io/cpu-model-qemu64: true
                    node-labeller-feature.node.kubernetes.io/kvm-info-cap-hyperv-base: true
                    node-labeller-feature.node.kubernetes.io/kvm-info-cap-hyperv-frequencies: true
                    node-labeller-feature.node.kubernetes.io/kvm-info-cap-hyperv-ipi: true
                    node-labeller-feature.node.kubernetes.io/kvm-info-cap-hyperv-reenlightenment: true
                    node-labeller-feature.node.kubernetes.io/kvm-info-cap-hyperv-reset: true
                    node-labeller-feature.node.kubernetes.io/kvm-info-cap-hyperv-runtime: true
                    node-labeller-feature.node.kubernetes.io/kvm-info-cap-hyperv-synic: true
                    node-labeller-feature.node.kubernetes.io/kvm-info-cap-hyperv-synic2: true
                    node-labeller-feature.node.kubernetes.io/kvm-info-cap-hyperv-synictimer: true
                    node-labeller-feature.node.kubernetes.io/kvm-info-cap-hyperv-time: true
                    node-labeller-feature.node.kubernetes.io/kvm-info-cap-hyperv-tlbflush: true
                    node-labeller-feature.node.kubernetes.io/kvm-info-cap-hyperv-vpindex: true
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Wed, 10 Feb 2021 17:12:28 +0000
Taints:             node.kubernetes.io/unreachable:NoExecute
                    node.kubernetes.io/unreachable:NoSchedule
Unschedulable:      false
Lease:
  HolderIdentity:  ssp04-rvkqg-worker-0-zvfkw
  AcquireTime:     <unset>
  RenewTime:       Thu, 11 Feb 2021 14:47:18 +0000
Conditions:
  Type             Status    LastHeartbeatTime                 LastTransitionTime                Reason              Message
  ----             ------    -----------------                 ------------------                ------              -------
  MemoryPressure   Unknown   Thu, 11 Feb 2021 14:47:20 +0000   Thu, 11 Feb 2021 14:48:01 +0000   NodeStatusUnknown   Kubelet stopped posting node status.
  DiskPressure     Unknown   Thu, 11 Feb 2021 14:47:20 +0000   Thu, 11 Feb 2021 14:48:01 +0000   NodeStatusUnknown   Kubelet stopped posting node status.
  PIDPressure      Unknown   Thu, 11 Feb 2021 14:47:20 +0000   Thu, 11 Feb 2021 14:48:01 +0000   NodeStatusUnknown   Kubelet stopped posting node status.
  Ready            Unknown   Thu, 11 Feb 2021 14:47:20 +0000   Thu, 11 Feb 2021 14:48:01 +0000   NodeStatusUnknown   Kubelet stopped posting node status.
Addresses:
  InternalIP:  192.168.1.21
  Hostname:    ssp04-rvkqg-worker-0-zvfkw
Capacity:
  attachable-volumes-cinder:        256
  cpu:                              8
  devices.kubevirt.io/kvm:          110
  devices.kubevirt.io/tun:          110
  devices.kubevirt.io/vhost-net:    110
  ephemeral-storage:                41391084Ki
  hugepages-1Gi:                    0
  hugepages-2Mi:                    0
  memory:                           16418260Ki
  ovs-cni.network.kubevirt.io/br0:  1k
  pods:                             250
Allocatable:
  attachable-volumes-cinder:        256
  cpu:                              7500m
  devices.kubevirt.io/kvm:          110
  devices.kubevirt.io/tun:          110
  devices.kubevirt.io/vhost-net:    110
  ephemeral-storage:                37072281128
  hugepages-1Gi:                    0
  hugepages-2Mi:                    0
  memory:                           15267284Ki
  ovs-cni.network.kubevirt.io/br0:  1k
  pods:                             250
System Info:
  Machine ID:                             885b9301803e43b9a6c8c07324f015e6
  System UUID:                            885b9301-803e-43b9-a6c8-c07324f015e6
  Boot ID:                                8e64b1bd-f422-411c-a633-c451e40469ec
  Kernel Version:                         4.18.0-193.41.1.el8_2.x86_64
  OS Image:                               Red Hat Enterprise Linux CoreOS 46.82.202101301821-0 (Ootpa)
  Operating System:                       linux
  Architecture:                           amd64
  Container Runtime Version:              cri-o://1.19.1-7.rhaos4.6.git6377f68.el8
  Kubelet Version:                        v1.19.0+e49167a
  Kube-Proxy Version:                     v1.19.0+e49167a
ProviderID:                               openstack:///885b9301-803e-43b9-a6c8-c07324f015e6
Non-terminated Pods:                      (70 in total)
  Namespace                               Name                                                               CPU Requests  CPU Limits  Memory Requests   Memory Limits  AGE
  ---------                               ----                                                               ------------  ----------  ---------------   -------------  ---
  b4-upgarde                              virt-launcher-rhel8-nfs-rr9tx                                      100m (1%)     100m (1%)   1709Mi (11%)      40M (0%)       95m
  b4-upgarde                              virt-launcher-win10-ocs-2kx2v                                      100m (1%)     100m (1%)   4481613825 (28%)  40M (0%)       82m
  openshift-cluster-csi-drivers           openstack-cinder-csi-driver-node-h4pp8                             20m (0%)      0 (0%)      100Mi (0%)        0 (0%)         41m
  openshift-cluster-node-tuning-operator  tuned-7ll2f                                                        10m (0%)      0 (0%)      50Mi (0%)         0 (0%)         40m
  openshift-cnv                           bridge-marker-n9pqq                                                0 (0%)        0 (0%)      0 (0%)            0 (0%)         20h
  openshift-cnv                           cdi-apiserver-fd7f4fb6-q2b8v                                       0 (0%)        0 (0%)      0 (0%)            0 (0%)         20h
  openshift-cnv                           cdi-deployment-65c8dbfdc7-hxms2                                    0 (0%)        0 (0%)      0 (0%)            0 (0%)         20h
  openshift-cnv                           cdi-uploadproxy-c9ff9fc78-qp99p                                    0 (0%)        0 (0%)      0 (0%)            0 (0%)         20h
  openshift-cnv                           cluster-network-addons-operator-69dd97cb44-l26h9                   0 (0%)        0 (0%)      0 (0%)            0 (0%)         14m
  openshift-cnv                           hostpath-provisioner-operator-55d7b8b595-zhbk9                     0 (0%)        0 (0%)      0 (0%)            0 (0%)         14m
  openshift-cnv                           hostpath-provisioner-vhvr7                                         0 (0%)        0 (0%)      0 (0%)            0 (0%)         20h
  openshift-cnv                           kube-cni-linux-bridge-plugin-zvvcf                                 60m (0%)      0 (0%)      30Mi (0%)         0 (0%)         20h
  openshift-cnv                           kubevirt-node-labeller-qmmb6                                       0 (0%)        0 (0%)      0 (0%)            0 (0%)         20h
  openshift-cnv                           nmstate-handler-ssqsc                                              0 (0%)        0 (0%)      0 (0%)            0 (0%)         20h
  openshift-cnv                           ovs-cni-amd64-t8cbz                                                0 (0%)        0 (0%)      0 (0%)            0 (0%)         20h
  openshift-cnv                           virt-api-6d765b6dd5-t2v5l                                          0 (0%)        0 (0%)      0 (0%)            0 (0%)         20h
  openshift-cnv                           virt-controller-68985f6974-mv2vb                                   0 (0%)        0 (0%)      0 (0%)            0 (0%)         20h
  openshift-cnv                           virt-handler-mzfgg                                                 0 (0%)        0 (0%)      0 (0%)            0 (0%)         20h
  openshift-cnv                           virt-operator-7d57664b89-nzjcl                                     0 (0%)        0 (0%)      0 (0%)            0 (0%)         14m
  openshift-cnv                           virt-template-validator-65dbfc87d8-56dks                           0 (0%)        0 (0%)      0 (0%)            0 (0%)         20h
  openshift-cnv                           virt-template-validator-65dbfc87d8-89rvn                           0 (0%)        0 (0%)      0 (0%)            0 (0%)         7m25s
  openshift-cnv                           vm-import-controller-74d785b999-4vrkx                              0 (0%)        0 (0%)      0 (0%)            0 (0%)         20h
  openshift-cnv                           vm-import-operator-7d856d4f4b-kdzh6                                0 (0%)        0 (0%)      0 (0%)            0 (0%)         7m25s
  openshift-dns                           dns-default-pn9k5                                                  65m (0%)      0 (0%)      110Mi (0%)        512Mi (3%)     20m
  openshift-image-registry                node-ca-lmnqz                                                      10m (0%)      0 (0%)      10Mi (0%)         0 (0%)         40m
  openshift-ingress-canary                ingress-canary-gn4fs                                               10m (0%)      0 (0%)      20Mi (0%)         0 (0%)         41m
  openshift-ingress                       router-default-6c5c8d967b-vm8qb                                    100m (1%)     0 (0%)      256Mi (1%)        0 (0%)         14m
  openshift-local-storage                 local-block-local-diskmaker-8zkvh                                  0 (0%)        0 (0%)      0 (0%)            0 (0%)         20h
  openshift-local-storage                 local-block-local-provisioner-phdhq                                0 (0%)        0 (0%)      0 (0%)            0 (0%)         20h
  openshift-machine-config-operator       machine-config-daemon-sqjbv                                        40m (0%)      0 (0%)      100Mi (0%)        0 (0%)         17m
  openshift-manila-csi-driver             csi-nodeplugin-nfsplugin-frgkn                                     10m (0%)      0 (0%)      50Mi (0%)         0 (0%)         40m
  openshift-manila-csi-driver             openstack-manila-csi-nodeplugin-v9889                              15m (0%)      0 (0%)      70Mi (0%)         0 (0%)         40m
  openshift-marketplace                   certified-operators-5ghdn                                          10m (0%)      0 (0%)      50Mi (0%)         0 (0%)         13m
  openshift-marketplace                   hco-catalogsource-s5p9d                                            10m (0%)      0 (0%)      50Mi (0%)         0 (0%)         20h
  openshift-marketplace                   ocs-catalogsource-m5974                                            10m (0%)      0 (0%)      50Mi (0%)         0 (0%)         7m25s
  openshift-marketplace                   redhat-marketplace-qfmcp                                           10m (0%)      0 (0%)      50Mi (0%)         0 (0%)         7m26s
  openshift-marketplace                   redhat-operators-vhszj                                             10m (0%)      0 (0%)      50Mi (0%)         0 (0%)         14m
  openshift-monitoring                    alertmanager-main-0                                                8m (0%)       0 (0%)      270Mi (1%)        0 (0%)         14m
  openshift-monitoring                    alertmanager-main-1                                                8m (0%)       0 (0%)      270Mi (1%)        0 (0%)         40m
  openshift-monitoring                    kube-state-metrics-6c798c69f5-sdkph                                4m (0%)       0 (0%)      120Mi (0%)        0 (0%)         7m28s
  openshift-monitoring                    node-exporter-x6lq2                                                9m (0%)       0 (0%)      210Mi (1%)        0 (0%)         41m
  openshift-monitoring                    openshift-state-metrics-6bd7d6f65-b7c9x                            3m (0%)       0 (0%)      190Mi (1%)        0 (0%)         14m
  openshift-monitoring                    prometheus-adapter-6575b658bf-qnxj6                                1m (0%)       0 (0%)      25Mi (0%)         0 (0%)         14m
  openshift-monitoring                    prometheus-adapter-6575b658bf-rzxht                                1m (0%)       0 (0%)      25Mi (0%)         0 (0%)         7m27s
  openshift-monitoring                    prometheus-k8s-0                                                   76m (1%)      0 (0%)      1204Mi (8%)       0 (0%)         7m26s
  openshift-monitoring                    prometheus-k8s-1                                                   76m (1%)      0 (0%)      1204Mi (8%)       0 (0%)         41m
  openshift-monitoring                    thanos-querier-565fc8859b-pkzx2                                    9m (0%)       0 (0%)      92Mi (0%)         0 (0%)         14m
  openshift-multus                        multus-4w72h                                                       10m (0%)      0 (0%)      150Mi (1%)        0 (0%)         24m
  openshift-multus                        network-metrics-daemon-nm5zq                                       20m (0%)      0 (0%)      120Mi (0%)        0 (0%)         25m
  openshift-network-diagnostics           network-check-target-4bsnh                                         10m (0%)      0 (0%)      15Mi (0%)         0 (0%)         27m
  openshift-openstack-infra               coredns-ssp04-rvkqg-worker-0-zvfkw                                 100m (1%)     0 (0%)      200Mi (1%)        0 (0%)         21h
  openshift-openstack-infra               keepalived-ssp04-rvkqg-worker-0-zvfkw                              200m (2%)     0 (0%)      400Mi (2%)        0 (0%)         21h
  openshift-openstack-infra               mdns-publisher-ssp04-rvkqg-worker-0-zvfkw                          100m (1%)     0 (0%)      200Mi (1%)        0 (0%)         21h
  openshift-sdn                           ovs-k77kz                                                          15m (0%)      0 (0%)      400Mi (2%)        0 (0%)         23m
  openshift-sdn                           sdn-srksb                                                          110m (1%)     0 (0%)      220Mi (1%)        0 (0%)         27m
  openshift-storage                       csi-cephfsplugin-provisioner-6bc7cbf6f-88vxh                       0 (0%)        0 (0%)      0 (0%)            0 (0%)         14m
  openshift-storage                       csi-cephfsplugin-xjcf6                                             0 (0%)        0 (0%)      0 (0%)            0 (0%)         20h
  openshift-storage                       csi-rbdplugin-8rtwc                                                0 (0%)        0 (0%)      0 (0%)            0 (0%)         20h
  openshift-storage                       csi-rbdplugin-provisioner-7699b8c4b8-vtf9b                         0 (0%)        0 (0%)      0 (0%)            0 (0%)         14m
  openshift-storage                       noobaa-endpoint-7584c5969f-rwn7z                                   0 (0%)        0 (0%)      0 (0%)            0 (0%)         7m25s
  openshift-storage                       noobaa-operator-5855b5688-cfn67                                    0 (0%)        0 (0%)      0 (0%)            0 (0%)         14m
  openshift-storage                       ocs-metrics-exporter-5d66d5fc59-fbqm6                              0 (0%)        0 (0%)      0 (0%)            0 (0%)         7m27s
  openshift-storage                       ocs-operator-cd5b866f5-29q72                                       0 (0%)        0 (0%)      0 (0%)            0 (0%)         7m26s
  openshift-storage                       rook-ceph-crashcollector-ssp04-rvkqg-worker-0-zvfkw-666867qr7vg    0 (0%)        0 (0%)      0 (0%)            0 (0%)         20h
  openshift-storage                       rook-ceph-mds-ocs-storagecluster-cephfilesystem-b-786bdbf7qv4qj    0 (0%)        0 (0%)      0 (0%)            0 (0%)         14m
  openshift-storage                       rook-ceph-mgr-a-5576d7458b-sdsz2                                   0 (0%)        0 (0%)      0 (0%)            0 (0%)         14m
  openshift-storage                       rook-ceph-mon-b-76cb57664f-v7bhl                                   0 (0%)        0 (0%)      0 (0%)            0 (0%)         20h
  openshift-storage                       rook-ceph-operator-6cfc658d4b-prp6k                                0 (0%)        0 (0%)      0 (0%)            0 (0%)         14m
  openshift-storage                       rook-ceph-osd-1-5946d84494-rkmbx                                   0 (0%)        0 (0%)      0 (0%)            0 (0%)         20h
  openshift-storage                       rook-ceph-rgw-ocs-storagecluster-cephobjectstore-b-8b6b579tmgtf    0 (0%)        0 (0%)      0 (0%)            0 (0%)         19h
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource                         Requests           Limits
  --------                         --------           ------
  cpu                              1350m (18%)        200m (2%)
  memory                           12943622145 (82%)  616870912 (3%)
  ephemeral-storage                0 (0%)             0 (0%)
  hugepages-1Gi                    0 (0%)             0 (0%)
  hugepages-2Mi                    0 (0%)             0 (0%)
  attachable-volumes-cinder        0                  0
  devices.kubevirt.io/kvm          2                  2
  devices.kubevirt.io/tun          2                  2
  devices.kubevirt.io/vhost-net    1                  1
  ovs-cni.network.kubevirt.io/br0  0                  0
Events:                            <none>


===========================================
$ oc describe pod virt-launcher-rhel8-nfs-rr9tx
Name:                      virt-launcher-rhel8-nfs-rr9tx
Namespace:                 b4-upgarde
Priority:                  0
Node:                      ssp04-rvkqg-worker-0-zvfkw/192.168.1.21
Start Time:                Thu, 11 Feb 2021 13:17:48 +0000
Labels:                    kubevirt.io=virt-launcher
                           kubevirt.io/created-by=fee21d27-2344-4720-8472-c7b9550c1d71
                           kubevirt.io/domain=rhel8-nfs
                           kubevirt.io/size=tiny
Annotations:               k8s.v1.cni.cncf.io/network-status:
                             [{
                                 "name": "",
                                 "interface": "eth0",
                                 "ips": [
                                     "10.131.0.47"
                                 ],
                                 "default": true,
                                 "dns": {}
                             }]
                           k8s.v1.cni.cncf.io/networks-status:
                             [{
                                 "name": "",
                                 "interface": "eth0",
                                 "ips": [
                                     "10.131.0.47"
                                 ],
                                 "default": true,
                                 "dns": {}
                             }]
                           kubevirt.io/domain: rhel8-nfs
                           openshift.io/scc: kubevirt-controller
                           traffic.sidecar.istio.io/kubevirtInterfaces: k6t-eth0
Status:                    Terminating (lasts 4m55s)
Termination Grace Period:  210s
IP:                        10.131.0.47
IPs:
  IP:           10.131.0.47
Controlled By:  VirtualMachineInstance/rhel8-nfs
Init Containers:
  container-disk-binary:
    Container ID:  cri-o://c2d1b35e4ba2ad1ac71bb84e506375ead0baf769591313cae4f209fc37cadfd1
    Image:         registry.redhat.io/container-native-virtualization/virt-launcher@sha256:61b2083d39a867d87b09d56ab8eaca4734ffddfe02e4b25ac50798f7b672811b
    Image ID:      registry.redhat.io/container-native-virtualization/virt-launcher@sha256:61b2083d39a867d87b09d56ab8eaca4734ffddfe02e4b25ac50798f7b672811b
    Port:          <none>
    Host Port:     <none>
    Command:
      /usr/bin/cp
      /usr/bin/container-disk
      /init/usr/bin/container-disk
    State:          Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Thu, 11 Feb 2021 13:17:52 +0000
      Finished:     Thu, 11 Feb 2021 13:17:52 +0000
    Ready:          True
    Restart Count:  0
    Limits:
      cpu:     100m
      memory:  40M
    Requests:
      cpu:        10m
      memory:     1M
    Environment:  <none>
    Mounts:
      /init/usr/bin from virt-bin-share-dir (rw)
Containers:
  compute:
    Container ID:  cri-o://b9b7d0d53452e821c48c3b1e7f90b41853d5442f9811b1c4f7de076933711f2a
    Image:         registry.redhat.io/container-native-virtualization/virt-launcher@sha256:61b2083d39a867d87b09d56ab8eaca4734ffddfe02e4b25ac50798f7b672811b
    Image ID:      registry.redhat.io/container-native-virtualization/virt-launcher@sha256:61b2083d39a867d87b09d56ab8eaca4734ffddfe02e4b25ac50798f7b672811b
    Port:          <none>
    Host Port:     <none>
    Command:
      /usr/bin/virt-launcher
      --qemu-timeout
      5m
      --name
      rhel8-nfs
      --uid
      fee21d27-2344-4720-8472-c7b9550c1d71
      --namespace
      b4-upgarde
      --kubevirt-share-dir
      /var/run/kubevirt
      --ephemeral-disk-dir
      /var/run/kubevirt-ephemeral-disks
      --container-disk-dir
      /var/run/kubevirt/container-disks
      --grace-period-seconds
      195
      --hook-sidecars
      0
      --less-pvc-space-toleration
      10
      --ovmf-path
      /usr/share/OVMF
    State:          Running
      Started:      Thu, 11 Feb 2021 13:17:52 +0000
    Ready:          True
    Restart Count:  0
    Limits:
      devices.kubevirt.io/kvm:        1
      devices.kubevirt.io/tun:        1
      devices.kubevirt.io/vhost-net:  1
    Requests:
      cpu:                            100m
      devices.kubevirt.io/kvm:        1
      devices.kubevirt.io/tun:        1
      devices.kubevirt.io/vhost-net:  1
      memory:                         1709Mi
    Environment:                      <none>
    Mounts:
      /var/run/kubevirt-ephemeral-disks from ephemeral-disks (rw)
      /var/run/kubevirt-private/vmi-disks/rootdisk from rootdisk (rw)
      /var/run/kubevirt/container-disks from container-disks (rw)
      /var/run/kubevirt/sockets from sockets (rw)
      /var/run/libvirt from libvirt-runtime (rw)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  sockets:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  rootdisk:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  rhel8
    ReadOnly:   false
  virt-bin-share-dir:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  libvirt-runtime:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  ephemeral-disks:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     
    SizeLimit:  <unset>
  container-disks:
    Type:        EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:      
    SizeLimit:   <unset>
QoS Class:       Burstable
Node-Selectors:  kubevirt.io/schedulable=true
Tolerations:     node.kubernetes.io/memory-pressure:NoSchedule op=Exists
                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason          Age   From               Message
  ----     ------          ----  ----               -------
  Normal   Scheduled       103m  default-scheduler  Successfully assigned b4-upgarde/virt-launcher-rhel8-nfs-rr9tx to ssp04-rvkqg-worker-0-zvfkw
  Normal   AddedInterface  103m  multus             Add eth0 [10.131.0.47/23]
  Normal   Pulled          103m  kubelet            Container image "registry.redhat.io/container-native-virtualization/virt-launcher@sha256:61b2083d39a867d87b09d56ab8eaca4734ffddfe02e4b25ac50798f7b672811b" already present on machine
  Normal   Created         103m  kubelet            Created container container-disk-binary
  Normal   Started         103m  kubelet            Started container container-disk-binary
  Normal   Pulled          103m  kubelet            Container image "registry.redhat.io/container-native-virtualization/virt-launcher@sha256:61b2083d39a867d87b09d56ab8eaca4734ffddfe02e4b25ac50798f7b672811b" already present on machine
  Normal   Created         103m  kubelet            Created container compute
  Normal   Started         103m  kubelet            Started container compute
  Warning  NodeNotReady    13m   node-controller    Node is not ready

===========================================

Comment 6 Fabian Deutsch 2021-02-15 07:37:59 UTC
Ruth, is this only for runStrategy: Manual or also happening with running: true?

Comment 7 Ruth Netser 2021-02-15 08:26:52 UTC
Happens for either runStrategy (Manual or Always) or running.
With runStrategy:Manual, the VMI is not restarted.
With runStrategy:Always/running:true, the VMI is restarted. 

Steps to reproduce:
* Start a VMI 
rhel8-nfs        14s     Running   10.128.3.14    ssp04-rvkqg-worker-0-9knk9

* Taint the node the VMI is running on:
oc adm taint nodes <node name> kubevirt.io/drain="":NoExecute

* The virt-launcher pod is terminating
virt-launcher-rhel8-nfs-gfstw      1/1     Terminating   0          102s

* A new pod is created
virt-launcher-rhel8-nfs-prqk5      1/1     Running       0          21s

* No live migration.
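The delete-instead-of-evict behavior described above can be spotted in saved pod events. The snippet below is an illustration only: the sample event lines are hand-written stand-ins for `oc get events` output, and the `TaintManagerEviction` reason string is an assumption about what the NoExecute taint manager emits.

```shell
# Illustration only: if the NoExecute taint manager deleted the pod
# directly, the events show a taint-manager deletion rather than an
# eviction-API entry, meaning the PDB was never consulted.
events='Normal  TaintManagerEviction  taint-controller           Marking for deletion Pod b4-upgarde/virt-launcher-rhel8-nfs-gfstw
Normal  SuccessfulCreate      virtualmachine-controller  Created virtual machine pod virt-launcher-rhel8-nfs-prqk5'

if printf '%s\n' "$events" | grep -q 'TaintManagerEviction'; then
  echo 'pod was deleted by the taint manager; PDB and live migration were bypassed'
fi
```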

Comment 8 aschuett 2021-02-15 09:58:28 UTC
Ruth, it looks like the taint used to reproduce this issue with `runStrategy:Always/running:true` is different than that of the issue with `runStrategy:Manual`. 

Does this issue occur when using taint `node.kubernetes.io/unreachable:NoExecute` with `runStrategy:Always/running: true` as well? The example above uses `kubevirt.io/drain=""` which seems like it would be a different issue?

Comment 9 Roman Mohr 2021-02-15 10:52:36 UTC
(In reply to aschuett from comment #8)
> Ruth, it looks like the taint used to reproduce this issue with
> `runStrategy:Always/running:true` is different than that of the issue with
> `runStrategy:Manual`. 
> 
> Does this issue occur when using taint
> `node.kubernetes.io/unreachable:NoExecute` with `runStrategy:Always/running:
> true` as well? The example above uses `kubevirt.io/drain=""` which seems
> like it would be a different issue?

Ruth just verified that *a* NoExecute taint has the effect of bluntly deleting pods instead of going through an /evict call on the API.

Comment 10 Roman Mohr 2021-02-15 14:39:46 UTC
So, I talked with Ryan and also checked the taint again:

Ryan confirms that during a normal healthy upgrade no NoExecute taints are applied.

Further after he asked me again which taint it is which we see applied, I realized that it is `node.kubernetes.io/unreachable:NoExecute` and NOT `node.kubernetes.io/unschedulable:NoExecute`.

This means that everything is fine on our side. The NotReady of the node is clearly the reason for this being added.
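The two look-alike taint keys can be told apart by pulling them straight out of the node object. A minimal sketch follows; the JSON is a hand-made sample, whereas on a live cluster the equivalent input would come from `oc get node <name> -o json`.

```shell
# Hand-made sample of a node spec; only the taint key/effect pair matters.
node_json='{"spec":{"taints":[{"key":"node.kubernetes.io/unreachable","effect":"NoExecute"}]}}'

# Extract the taint keys so unreachable vs unschedulable is obvious at a glance.
printf '%s\n' "$node_json" | grep -o '"key":"[^"]*"'
```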

Comment 11 Ruth Netser 2021-02-16 06:44:07 UTC
Testing with small VMs (2 VMs running with Cirros DVs), the VMIs were live migrated.

Comment 13 Roman Mohr 2021-02-16 10:03:13 UTC
We have final confirmation that, if there is no issue during the upgrade, no NoExecute taints appear; both Ruth's verification and the word from the node team confirm it.
Removing the blocker here.

Comment 14 sgott 2021-02-16 18:53:45 UTC
Marking this as TEST_ONLY because we expect it will be resolved when https://bugzilla.redhat.com/show_bug.cgi?id=1913532 is fixed.

Comment 15 Israel Pinto 2021-02-18 10:51:40 UTC
The fix is at: https://bugzilla.redhat.com/show_bug.cgi?id=1929278

Comment 16 Ruth Netser 2021-02-18 17:02:59 UTC
Tested with  4.7.0-0.nightly-2021-02-17-224627.
Everything was running after the upgrade, but after some time the 2 migratable VMs were signaled to be shut down.

VMI with runstrategy: Always:
{"component":"virt-handler","level":"info","msg":"Processing event b4-ugrade/win10-vm-ocs","pos":"vm.go:1175","timestamp":"2021-02-18T16:09:47.623885Z"}
{"component":"virt-handler","kind":"","level":"info","msg":"VMI is in phase: Running\n","name":"win10-vm-ocs","namespace":"b4-ugrade","pos":"vm.go:1177","timestamp":"2021-02-18T16:09:47.623910Z","uid":"83dc4f8d-
415b-4e0a-a983-2bf61a97bc74"}
{"component":"virt-handler","kind":"Domain","level":"info","msg":"Domain status: Running, reason: Unknown\n","name":"win10-vm-ocs","namespace":"b4-ugrade","pos":"vm.go:1182","timestamp":"2021-02-18T16:09:47.6239
31Z","uid":"83dc4f8d-415b-4e0a-a983-2bf61a97bc74"}
{"component":"virt-handler","kind":"Domain","level":"info","msg":"Received Domain Event of type MODIFIED","name":"win10-vm-ocs","namespace":"b4-ugrade","pos":"server.go:78","timestamp":"2021-02-18T16:09:47.63361
5Z","uid":"83dc4f8d-415b-4e0a-a983-2bf61a97bc74"}
{"component":"virt-handler","kind":"","level":"info","msg":"Signaled graceful shutdown for win10-vm-ocs","name":"win10-vm-ocs","namespace":"b4-ugrade","pos":"vm.go:1649","timestamp":"2021-02-18T16:09:47.659708Z","uid":"83dc4f8d-415b-4e0a-a983-2bf61a97bc74"}


VMI with runstrategy: Manual:
{"component":"virt-handler","level":"info","msg":"Processing event b4-ugrade/fed-nfs-vm","pos":"vm.go:1175","timestamp":"2021-02-18T10:44:20.705166Z"}
{"component":"virt-handler","kind":"","level":"info","msg":"VMI is in phase: Running\n","name":"fed-nfs-vm","namespace":"b4-ugrade","pos":"vm.go:1177","timestamp":"2021-02-18T10:44:20.705199
Z","uid":"218bb948-3110-4f77-ab9b-0d31403eae89"}
{"component":"virt-handler","kind":"Domain","level":"info","msg":"Domain status: Paused, reason: Migration\n","name":"fed-nfs-vm","namespace":"b4-ugrade","pos":"vm.go:1182","timestamp":"2021
-02-18T10:44:20.705219Z","uid":"218bb948-3110-4f77-ab9b-0d31403eae89"}
{"component":"virt-handler","kind":"Domain","level":"info","msg":"Received Domain Event of type MODIFIED","name":"fed-nfs-vm","namespace":"b4-ugrade","pos":"server.go:78","timestamp":"2021-0
2-18T10:44:21.179196Z","uid":"218bb948-3110-4f77-ab9b-0d31403eae89"}
{"component":"virt-handler","kind":"Domain","level":"info","msg":"Domain is in state Shutoff reason Migrated","name":"fed-nfs-vm","namespace":"b4-ugrade","pos":"vm.go:2175","timestamp":"2021-02-18T10:44:21.179311Z","uid":"218bb948-3110-4f77-ab9b-0d31403eae89"}
{"component":"virt-handler","level":"info","msg":"Processing event b4-ugrade/fed-nfs-vm","pos":"vm.go:1175","timestamp":"2021-02-18T10:44:21.179384Z"}
{"component":"virt-handler","kind":"","level":"info","msg":"VMI is in phase: Running\n","name":"fed-nfs-vm","namespace":"b4-ugrade","pos":"vm.go:1177","timestamp":"2021-02-18T10:44:21.179454Z","uid":"218bb948-3110-4f77-ab9b-0d31403eae89"}
{"component":"virt-handler","kind":"Domain","level":"info","msg":"Domain status: Shutoff, reason: Migrated\n","name":"fed-nfs-vm","namespace":"b4-ugrade","pos":"vm.go:1182","timestamp":"2021-02-18T10:44:21.179474Z","uid":"218bb948-3110-4f77-ab9b-0d31403eae89"}
{"component":"virt-handler","kind":"VirtualMachineInstance","level":"info","msg":"Using cached UID for vmi found in domain cache","name":"fed-nfs-vm","namespace":"b4-ugrade","pos":"vm.go:1350","timestamp":"2021-02-18T10:44:21.207149Z","uid":"218bb948-3110-4f77-ab9b-0d31403eae89"}
{"component":"virt-handler","level":"info","msg":"Processing event b4-ugrade/fed-nfs-vm","pos":"vm.go:1175","timestamp":"2021-02-18T10:44:21.207216Z"}
{"component":"virt-handler","kind":"Domain","level":"info","msg":"Domain status: Shutoff, reason: Migrated\n","name":"fed-nfs-vm","namespace":"b4-ugrade","pos":"vm.go:1182","timestamp":"2021-02-18T10:44:21.207263Z","uid":"218bb948-3110-4f77-ab9b-0d31403eae89"}
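Sifting shutdown signals out of structured virt-handler logs like the ones above can be done with a simple filter. A sketch, using one log line copied from this report as sample input:

```shell
# One virt-handler log line from this bug, used here as sample input.
log='{"component":"virt-handler","kind":"","level":"info","msg":"Signaled graceful shutdown for win10-vm-ocs","name":"win10-vm-ocs","namespace":"b4-ugrade","pos":"vm.go:1649","timestamp":"2021-02-18T16:09:47.659708Z"}'

# Keep only lines where a VMI was signaled to shut down, then pull the VMI name.
printf '%s\n' "$log" | grep '"msg":"Signaled graceful shutdown' \
  | grep -o '"name":"[^"]*"'
```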



- 3 running VMs:
Windows10, OCS, runstrategy: Always
Fedora33, NFS, runstrategy: Manual
Rhel8.3, HPP

Started off from OCP 4.6.17, CNV 2.5.3
Upgraded OCP
VMs were live migrated (checked the running processes in the migrated VMIs):
  ----     ------            ----                   ----                         -------
  Normal   SuccessfulCreate  4h8m                   disruptionbudget-controller  Created PodDisruptionBudget kubevirt-disruption-budget-78g8k
  Normal   SuccessfulCreate  4h8m                   virtualmachine-controller    Created virtual machine pod virt-launcher-win10-vm-ocs-vxjps
  Normal   Started           4h8m                   virt-handler                 VirtualMachineInstance started.
  Warning  SyncFailed        162m                   virt-handler                 unknown error encountered sending command SyncVMI: rpc error: code = DeadlineExceeded desc = context deadline exceeded
  Normal   Created           126m (x141 over 4h8m)  virt-handler                 VirtualMachineInstance defined.
  Normal   SuccessfulCreate  126m                   disruptionbudget-controller  Created Migration kubevirt-evacuation-xjj9z
  Normal   PreparingTarget   123m (x2 over 123m)    virt-handler                 VirtualMachineInstance Migration Target Prepared.
  Normal   PreparingTarget   123m                   virt-handler                 Migration Target is listening at 10.131.0.5, on ports: 39759,40051
  Warning  SyncFailed        122m                   virt-handler                 server error. command Migrate failed: "migration job already executed"
  Normal   SuccessfulCreate  122m                   disruptionbudget-controller  Created Migration kubevirt-evacuation-wfhcz
  Normal   PreparingTarget   120m (x2 over 120m)    virt-handler                 VirtualMachineInstance Migration Target Prepared.
  Normal   PreparingTarget   120m                   virt-handler                 Migration Target is listening at 10.129.2.4, on ports: 34763,43953
  Normal   Created           27m (x132 over 119m)   virt-handler                 VirtualMachineInstance defined.
  Normal   ShuttingDown      25s (x369 over 27m)    virt-handler                 Signaled Graceful Shutdown


$ oc get node
NAME                         STATUS   ROLES    AGE   VERSION
ssp09-c7g7r-master-0         Ready    master   26h   v1.20.0+ba45583
ssp09-c7g7r-master-1         Ready    master   26h   v1.20.0+ba45583
ssp09-c7g7r-master-2         Ready    master   26h   v1.20.0+ba45583
ssp09-c7g7r-worker-0-624qp   Ready    worker   26h   v1.20.0+ba45583
ssp09-c7g7r-worker-0-kwzsk   Ready    worker   26h   v1.20.0+ba45583
ssp09-c7g7r-worker-0-ndrjw   Ready    worker   26h   v1.20.0+ba45583

$ oc get vmi
NAME           AGE     PHASE     IP            NODENAME
fed-nfs-vm     8h      Running   10.129.2.46   ssp09-c7g7r-worker-0-624qp
rhel8-hpp-vm   53m     Running   10.131.0.16   ssp09-c7g7r-worker-0-ndrjw
win10-vm-ocs   3h10m   Running   10.129.2.48   ssp09-c7g7r-worker-0-624qp

Comment 17 Fabian Deutsch 2021-02-19 10:01:42 UTC
@rnetser the most recent symptoms in comment 16 sound like a different/new bug. Please create a new bug to track its resolution.

Comment 18 Fabian Deutsch 2021-02-19 10:07:12 UTC
NVM; I've split it to bug #1930630

Moving this bug back to ON_QA and TestOnly to cover the original issue

Comment 19 Ruth Netser 2021-02-19 10:11:34 UTC
Moving this to VERIFIED; with the fix, the VMIs are migrated and the nodes are Ready after the upgrade.

Comment 22 errata-xmlrpc 2021-03-10 11:23:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Virtualization 2.6.0 security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:0799

