Bug 2213262
| Summary: | Lost connectivity after live migration of a VM with a hot-plugged disk | | |
|---|---|---|---|
| Product: | Container Native Virtualization (CNV) | Reporter: | Freddy E. Montero <fmontero> |
| Component: | Networking | Assignee: | Petr Horáček <phoracek> |
| Status: | CLOSED MIGRATED | QA Contact: | Yossi Segev <ysegev> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 4.13.0 | CC: | nrozen, thance, ysegev |
| Target Milestone: | --- | Keywords: | Reopened |
| Target Release: | 4.14.2 | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2023-12-14 16:14:18 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Freddy E. Montero
2023-06-07 16:53:00 UTC
Thanks for collecting all that data and describing the reproducer in detail. Nir, since this depends on integration between storage and networking, could QE try to reproduce it internally first? If we see it is reproducible downstream, engineering would take over.

CNV Network QE: I am currently starting to work on this bug and attempting to reproduce it. As a first step I will try to reproduce it on an OCP/CNV 4.14 bare-metal cluster which I have available. In parallel, I am deploying a 4.13 bare-metal cluster, in case the issue is not reproduced on the 4.14 cluster and I need an environment as close as possible to the one where the bug was originally found and reproduced. I'll keep updating here.

@fmontero Hi,
Thanks for the detailed description. There are some steps here I am missing, and it would help to reproduce the bug accurately if you provided them:

1. In the NetworkManager output you provided, I see that the bond interface (bond0) is bound to a VLAN. Can you please specify the exact method you used for creating this bond (i.e. the NodeNetworkConfigurationPolicy spec, the nmcli command(s), or any other method you used)?
2. Also for the bond setup - I need the bond mode you used, please (i.e. active-backup, balance-tlb, etc.).
3. If you have the original VM (and not VMI) yaml you used, it is preferred. If you created a VMI and not a VM, then the VMI *input* yaml you used is also good. If neither is possible, I'll try to continue with the VMI *output* yaml you provided in the description.
4. Currently, when I apply the VMI yaml, it is pending on the PVC:

       Type     Reason             Age                From                       Message
       ----     ------             ----               ----                       -------
       Normal   FailedPvcNotFound  10m (x3 over 10m)  virtualmachine-controller  PVC disk-hotplug-debug-ns/dvuulvocpvmi02 does not exist, waiting for it to appear

   Please also provide the PVC used for this scenario. Generally, a step-by-step reproduction scenario would be best; that way I can (a) avoid wasting time trying to understand how to run certain steps, and (b) be sure I am reproducing the scenario as accurately as possible.
5. (I also asked you this in slack, so you can ignore that message and answer everything here:) What are the exact OCP and CNV versions that run on your cluster? I may have to deploy a cluster with the exact same attributes as yours; currently I'm using what I have available, which is an OCP/CNV 4.14 bare-metal cluster.

Thank you very much for your cooperation.
Yoss

Oh, and one more thing @fmontero:

6. Please also specify how you hot-plugged the disk. This will also help me save a lot of time reproducing it. Thanks

Versions of components:
OCP: 4.13.1
ODF: odf-operator.v4.12.4-rhodf
CNV: kubevirt-hyperconverged-operator.v4.13.2

During the initial agent-based cluster install, the bonds were created via the agent-install.yaml and baked into the CoreOS ISO.
A snippet from the AgentConfig looks like the following:

    - hostname: dvuuopwkr01
      role: worker
      rootDeviceHints:
        deviceName: "/dev/nvme0n1"
      interfaces:
        - name: eno1np0
          macAddress: 3C:EC:EF:74:4D:80
        - name: eno2np1
          macAddress: 3C:EC:EF:74:4D:81
      networkConfig:
        interfaces:
          - name: bond0.104
            type: vlan
            state: up
            vlan:
              base-iface: bond0
              id: 104
            ipv4:
              enabled: true
              address:
                - ip: 10.176.104.170
                  prefix-length: 22
              dhcp: false
          - name: bond0
            type: bond
            state: up
            mac-address: 3C:EC:EF:74:4D:80
            ipv4:
              enabled: false
            ipv6:
              enabled: false
            link-aggregation:
              mode: 802.3ad
              options:
                miimon: "100"
              port:
                - eno1np0
                - eno2np1
                - enp1s0f0
                - enp1s0f1
        dns-resolver:
          config:
            search:
              - corp.clientname.com
            server:
              - 209.196.203.128
        routes:
          config:
            - destination: 0.0.0.0/0
              next-hop-address: 10.176.107.254
              next-hop-interface: bond0.104
              table-id: 254

Our bond mode is "802.3ad LACP".

The VMs used for this testing were manually created "From Template", and the built-in default templates were used (rhel9-server-small in this instance). The VM's yaml is attached:

    apiVersion: kubevirt.io/v1
    kind: VirtualMachine
    metadata:
      annotations:
        kubemacpool.io/transaction-timestamp: '2023-06-08T20:01:51.930107092Z'
        kubevirt.io/latest-observed-api-version: v1
        kubevirt.io/storage-observed-api-version: v1alpha3
        vm.kubevirt.io/validations: |
          [
            {
              "name": "minimal-required-memory",
              "path": "jsonpath::.spec.domain.resources.requests.memory",
              "rule": "integer",
              "message": "This VM requires more memory.",
              "min": 1610612736
            }
          ]
      resourceVersion: '67992259'
      name: rhel9-ass
      uid: 4defd534-8c19-4491-a81d-ab2d9bbbedc9
      creationTimestamp: '2023-06-08T20:00:17Z'
      generation: 2
      managedFields:
        - apiVersion: kubevirt.io/v1
          fieldsType: FieldsV1
          fieldsV1:
            'f:metadata':
              'f:annotations':
                .: {}
                'f:kubemacpool.io/transaction-timestamp': {}
                'f:vm.kubevirt.io/validations': {}
              'f:labels':
                .: {}
                'f:app': {}
                'f:vm.kubevirt.io/template': {}
                'f:vm.kubevirt.io/template.namespace': {}
                'f:vm.kubevirt.io/template.revision': {}
            'f:spec':
              .: {}
              'f:dataVolumeTemplates': {}
              'f:running': {}
              'f:template':
                .: {}
                'f:metadata':
                  .: {}
                  'f:annotations':
                    .: {}
                    'f:vm.kubevirt.io/flavor': {}
                    'f:vm.kubevirt.io/os': {}
                    'f:vm.kubevirt.io/workload': {}
                  'f:creationTimestamp': {}
                  'f:labels':
                    .: {}
                    'f:kubevirt.io/domain': {}
                    'f:kubevirt.io/size': {}
                'f:spec':
                  .: {}
                  'f:domain':
                    .: {}
                    'f:cpu':
                      .: {}
                      'f:cores': {}
                      'f:sockets': {}
                      'f:threads': {}
                    'f:devices':
                      .: {}
                      'f:interfaces': {}
                      'f:networkInterfaceMultiqueue': {}
                      'f:rng': {}
                    'f:features':
                      .: {}
                      'f:acpi': {}
                      'f:smm':
                        .: {}
                        'f:enabled': {}
                    'f:firmware':
                      .: {}
                      'f:bootloader':
                        .: {}
                        'f:efi': {}
                    'f:machine':
                      .: {}
                      'f:type': {}
                    'f:resources':
                      .: {}
                      'f:requests':
                        .: {}
                        'f:memory': {}
                  'f:evictionStrategy': {}
                  'f:networks': {}
                  'f:terminationGracePeriodSeconds': {}
          manager: Mozilla
          operation: Update
          time: '2023-06-08T20:00:17Z'
        - apiVersion: kubevirt.io/v1alpha3
          fieldsType: FieldsV1
          fieldsV1:
            'f:metadata':
              'f:annotations':
                'f:kubevirt.io/latest-observed-api-version': {}
                'f:kubevirt.io/storage-observed-api-version': {}
              'f:finalizers':
                .: {}
                'v:"kubevirt.io/virtualMachineControllerFinalize"': {}
            'f:spec':
              'f:template':
                'f:spec':
                  'f:domain':
                    'f:devices':
                      'f:disks': {}
                  'f:volumes': {}
          manager: Go-http-client
          operation: Update
          time: '2023-06-08T20:01:51Z'
        - apiVersion: kubevirt.io/v1alpha3
          fieldsType: FieldsV1
          fieldsV1:
            'f:status':
              .: {}
              'f:conditions': {}
              'f:created': {}
              'f:printableStatus': {}
              'f:ready': {}
              'f:volumeSnapshotStatuses': {}
          manager: Go-http-client
          operation: Update
          subresource: status
          time: '2023-07-11T16:16:20Z'
      namespace: test-vmis
      finalizers:
        - kubevirt.io/virtualMachineControllerFinalize
      labels:
        app: rhel9-ass
        vm.kubevirt.io/template: rhel9-server-small-5taulps1t
        vm.kubevirt.io/template.namespace: test-vmis
        vm.kubevirt.io/template.revision: '1'
    spec:
      dataVolumeTemplates:
        - metadata:
            creationTimestamp: null
            name: rhel9-ass
          spec:
            preallocation: false
            sourceRef:
              kind: DataSource
              name: rhel9
              namespace: openshift-virtualization-os-images
            storage:
              accessModes:
                - ReadWriteMany
              resources:
                requests:
                  storage: 30Gi
              storageClassName: ocs-storagecluster-ceph-rbd
              volumeMode: Block
      running: true
      template:
        metadata:
          annotations:
            vm.kubevirt.io/flavor: small
            vm.kubevirt.io/os: rhel9
            vm.kubevirt.io/workload: server
          creationTimestamp: null
          labels:
            kubevirt.io/domain: rhel9-ass
            kubevirt.io/size: small
        spec:
          domain:
            cpu:
              cores: 2
              sockets: 1
              threads: 1
            devices:
              disks:
                - bootOrder: 1
                  disk:
                    bus: virtio
                  name: rootdisk
                - bootOrder: 2
                  disk:
                    bus: virtio
                  name: cloudinitdisk
                - disk:
                    bus: scsi
                  name: disk-naughty-angelfish
              interfaces:
                - bridge: {}
                  macAddress: '02:bb:06:00:00:43'
                  model: virtio
                  name: nic1
              networkInterfaceMultiqueue: true
              rng: {}
            features:
              acpi: {}
              smm:
                enabled: true
            firmware:
              bootloader:
                efi: {}
            machine:
              type: pc-q35-rhel9.2.0
            resources:
              requests:
                memory: 4Gi
          evictionStrategy: LiveMigrate
          networks:
            - multus:
                networkName: test-vmis/br1-192
              name: nic1
          terminationGracePeriodSeconds: 180
          volumes:
            - dataVolume:
                name: rhel9-ass
              name: rootdisk
            - cloudInitNoCloud:
                networkData: |
                  version: 2
                  ethernets:
                    eth0:
                      addresses:
                        - 10.176.192.150/22
                      gateway4: 10.176.195.254
                      nameservers:
                        search: [corp.clientname.com]
                        addresses: [209.196.203.128]
                      routes:
                        - to: 0.0.0.0/0
                          via: 10.176.195.254
                          metric: 3
                userData: |
                  #cloud-config
                  user: usgadmin
                  password: passwordhere
                  chpasswd:
                    expire: false
              name: cloudinitdisk
            - dataVolume:
                hotpluggable: true
                name: rhel9-ass-disk-naughty-angelfish
              name: disk-naughty-angelfish
    status:
      conditions:
        - lastProbeTime: null
          lastTransitionTime: '2023-07-11T16:16:20Z'
          status: 'True'
          type: Ready
        - lastProbeTime: null
          lastTransitionTime: null
          status: 'True'
          type: LiveMigratable
        - lastProbeTime: '2023-06-08T20:01:17Z'
          lastTransitionTime: null
          status: 'True'
          type: AgentConnected
      created: true
      printableStatus: Running
      ready: true
      volumeSnapshotStatuses:
        - enabled: true
          name: rootdisk
        - enabled: false
          name: cloudinitdisk
          reason: 'Snapshot is not supported for this volumeSource type [cloudinitdisk]'
        - enabled: true
          name: disk-naughty-angelfish

(In reply to Yossi Segev from comment #6)
> Oh, and one more thing @fmontero :
> 6. Please also specify how you hot-plugged the disk. This will also help me
> save a lot of time reproducing it.
> Thanks

Via the web UI for the VM: in the Disks section of the VM, we added a new SCSI disk (only the SCSI option was allowed while the VM was on).
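For completeness, the same hot-plug can also be done from the CLI with virtctl. The following is only a rough sketch of an equivalent to the web UI action above, reusing the VM and hot-pluggable DataVolume names from the spec earlier; the exact options available depend on the CNV version.

    # Hot-plug the DataVolume into the running VM (hot-plugged disks use the SCSI bus).
    # --persist also records the volume in the VM spec so it survives restarts.
    virtctl addvolume rhel9-ass --volume-name=rhel9-ass-disk-naughty-angelfish --persist

    # To unplug it again:
    virtctl removevolume rhel9-ass --volume-name=rhel9-ass-disk-naughty-angelfish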
According to additional info provided by Freddy in slack, the setup for this test (including the DV/PVC) was all done via the UI:

1. Use the RHEL8/9 template, nothing special. That creates 2 disks: the root disk and the cloud-init disk. After the VM is up and running and has been live-migrated to different hosts, we added a hot-plugged disk. On the original host where you add the disk, everything works. Once you migrate the VM, that's when the issue starts.
2. The second scenario: we migrated a VM from RHEV. Then again, we live-migrated the VM to other hosts. It was after we added a hot-plugged disk and live-migrated that the issue started.

I set up a session with storage QE in order to obtain the storage side of the setup.

I tried reproducing this bug locally. This BZ is missing a lot of necessary data (e.g. the specs of the rootdisk PVC, the hot-plugged PVC, etc.), so Freddy tried to assist me as much as possible (thanks again, Freddy). In addition, I got the help of CNV storage QE. The result was that when running the VM (before even getting to the part of hot-plugging an additional disk), the VM got to the `Running` state, but when trying to access its console, the following appears:

    BdsDxe: failed to load Boot0001 "UEFI Misc Device" from PciRoot(0x0)/Pci(0x2,0x6)/Pci(0x0,0x0): Not Found
    BdsDxe: No bootable option or device was found.
    BdsDxe: Press any key to enter the Boot Manager Menu.

In addition, according to the description, the customer was trying to bridge the VM's secondary interface over a bond of mode 802.3ad, which we currently don't support, so I'm not even sure the customer was trying a valid scenario. Therefore I'm closing this bug.

I'm specifying the scenario I was running (until getting to the start-up failure; spec resources are attached). On a bare-metal cluster with OCP/CNV 4.14:

1. Create a bond:

       $ oc apply -f bond-nncp.yaml

2. Create a VLAN interface over the bond:

       $ oc apply -f vlan-bond-nncp.yaml

3. Create a bridge interface over the bond:

       $ oc apply -f bridge-nncp.yaml

4. Create a NetworkAttachmentDefinition which utilizes the bridge (an example NAD is sketched after this comment):

       $ oc apply -f bridge-nad.yaml

5. Via the web UI, create the rootdisk PVC:
   Storage -> PersistentVolumeClaims -> Create PersistentVolumeClaims -> With Form
   - StorageClass: ocs-storagecluster-ceph-rbd
   - PersistentVolumeClaim name: rootdisk-pvc
   - Access mode: Shared Access (RWX)
   - Size: 100 GiB
   - Volume mode: Block
6. Via the web UI, create the VM:
   Virtualization -> VirtualMachines -> Create -> From template
   - Template: Red Hat Enterprise Linux 9 VM
   - Customize VirtualMachine -> Storage:
     - Disk source: PVC (clone PVC)
     - PVC project: the same namespace where the PVC and the NetworkAttachmentDefinition exist
     - PVC name: rootdisk-pvc
     - Disk size: 100 GiB
   - Hit the Next button
   - Move to the "Network interfaces" tab
   - Add network interface
     - Model: virtio (keep the default)
     - Network: NetworkAttachmentDefinition name
     - Type: Bridge
   - Save
   - Press the "Create VirtualMachine" button
   * If "Start this VirtualMachine after creation" is checked, the VM should start immediately; otherwise run "virtctl start <vm-name>" from the CLI.
7. Starting the VMI triggers cloning of the DV; you need to wait for it to complete (this takes a few minutes) before the VMI can move to the running state:

       $ oc get dv -w
       NAME                 PHASE             PROGRESS   RESTARTS   AGE
       rhel9-crazy-pigeon   CloneInProgress   17.13%                3m10s
       rhel9-crazy-pigeon   CloneInProgress   17.46%                3m12s
       rhel9-crazy-pigeon   CloneInProgress   17.78%                3m13s
       ...
       rhel9-crazy-pigeon   CloneInProgress   96.23%                11m
       rhel9-crazy-pigeon   CloneInProgress   96.83%                11m
       rhel9-crazy-pigeon   CloneInProgress   97.10%                11m
       ...
       rhel9-crazy-pigeon   CloneInProgress   99.35%                12m
       rhel9-crazy-pigeon   CloneInProgress   99.66%                12m
       rhel9-crazy-pigeon   CloneInProgress   100.00%               12m
       rhel9-crazy-pigeon   CloneInProgress   100.00%               12m
       rhel9-crazy-pigeon   Succeeded         100.0%                12m
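The spec resources referenced in steps 1-4 were attached to the BZ and are not reproduced in this export. For orientation only, a bridge NetworkAttachmentDefinition of the kind used in step 4 might look like the sketch below; the NAD name, namespace, and bridge name (br1) are assumptions chosen to match the test-vmis/br1-192 network referenced in the VM yaml above, not the actual attachment.

    # Hypothetical sketch, not the attachment from this BZ.
    apiVersion: k8s.cni.cncf.io/v1
    kind: NetworkAttachmentDefinition
    metadata:
      name: br1-192
      namespace: test-vmis
      annotations:
        # Ties the NAD to the Linux bridge created by the bridge NNCP (assumed to be named br1).
        k8s.v1.cni.cncf.io/resourceName: bridge.network.kubevirt.io/br1
    spec:
      config: |
        {
          "cniVersion": "0.3.1",
          "name": "br1-192",
          "type": "cnv-bridge",
          "bridge": "br1",
          "macspoofchk": true
        }

A VM interface of type bridge bound to network test-vmis/br1-192, as in the VM yaml above, would then be connected to br1 on whichever node the VM is scheduled on.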
802.3ad is supported: https://docs.openshift.com/container-platform/4.13/networking/k8s_nmstate/k8s-nmstate-updating-node-network-config.html#virt-example-bond-nncp_k8s_nmstate-updating-node-network-config. We may not be testing this mode in the scope of CNV due to environment complexities, but customers can use it; we just rely on nmstate and NetworkManager configuring it correctly.

Thanks for investigating this @ysegev. Would you please continue working with Freddy, looking for a reproducer?

The error does not seem related to LACP, but just to be sure, perhaps the customer can help us by trying to reproduce it with a simpler bond mode, e.g. active-backup, to help us get as close to their environment as possible.

(In reply to Petr Horáček from comment #17)
> Thanks for investigating this @ysegev. Would you please continue
> working with Freddy, looking for a reproducer?
>
> The error does not seem related to LACP, but just to be sure, perhaps the
> customer can help us by trying to reproduce it with a simpler bond mode,
> e.g. active-backup, to help us get as close to their environment as possible.

Yoss,
What additional information do you need? The client is willing to help by adding additional context to this BZ. Let us know what is needed and we will try to get it. Thanks

(In reply to Freddy E. Montero from comment #18)
> Yoss,
> What additional information do you need.
> Client is willing to help adding additional context to this BZ.
> Let us know what is needed and will try to get. Thanks

What I need is a step-by-step reproducible scenario covering all the steps: the bond creation, the bridge, the original VM (i.e. before hot-plugging a disk), the PVC, hot-plugging the disk, and every other step involved in this bug. The description above includes parts of setups, from which I tried to reverse-engineer and guess the actual scenario; obviously, this methodology doesn't work :). Besides, I think the description also includes parts that are not related to this bug; for example, the VLAN interface bond0.104 seems unrelated, as I don't see this interface used after it is created. The best input would be a yaml resource for every part of this setup; if that is not possible, then I would need an exact cookbook. Otherwise, I don't have any way to reproduce it.

@fmontero Any new info, so I can try reproducing this BZ again? The specification of what is needed is in comment #19; please let me know if something is not clear. Thanks

The blockers-only phase begins in a week, and realistically we won't be able to reproduce and fix this by then. Deferring this BZ to 4.14.1.
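For the simpler reproduction attempt suggested above, a minimal NodeNetworkConfigurationPolicy for an active-backup bond could look roughly like the following sketch. The policy name and node selector are hypothetical; the port names are taken from the AgentConfig earlier in this report.

    # Hypothetical sketch of an active-backup bond NNCP; not an attachment from this BZ.
    apiVersion: nmstate.io/v1
    kind: NodeNetworkConfigurationPolicy
    metadata:
      name: bond0-active-backup    # hypothetical name
    spec:
      nodeSelector:
        node-role.kubernetes.io/worker: ""
      desiredState:
        interfaces:
          - name: bond0
            type: bond
            state: up
            ipv4:
              enabled: false
            ipv6:
              enabled: false
            link-aggregation:
              mode: active-backup
              options:
                miimon: "100"
              port:
                - eno1np0
                - eno2np1

Applying a policy like this and recreating the bridge on top of the resulting bond would let the hot-plug plus live-migration flow be retested without LACP in the picture.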