Bug 2213262
| Summary: | Lost connectivity after live migration of a VM with a hot-plugged disk | | |
|---|---|---|---|
| Product: | Container Native Virtualization (CNV) | Reporter: | Freddy E. Montero <fmontero> |
| Component: | Networking | Assignee: | Petr Horáček <phoracek> |
| Status: | CLOSED MIGRATED | QA Contact: | Yossi Segev <ysegev> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 4.13.0 | CC: | nrozen, thance, ysegev |
| Target Milestone: | --- | Keywords: | Reopened |
| Target Release: | 4.14.2 | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2023-12-14 16:14:18 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Freddy E. Montero
2023-06-07 16:53:00 UTC
Thanks for collecting all that data and describing the reproducer in detail. Nir, since this depends on integration between storage and networking, could QE try to reproduce it internally first? If we see it is reproducible downstream, engineering would take over.

CNV Network QE: I am currently starting to work on this bug and attempting to reproduce it. As a first step I will try to reproduce it on an OCP/CNV 4.14 bare-metal cluster which I have available. In parallel, I am deploying a 4.13 bare-metal cluster, in case the issue is not reproduced on the 4.14 cluster and I need an environment as close as possible to the one where the bug was originally found and reproduced. I'll keep updating here.

@fmontero Hi,
Thanks for the detailed description. There are some steps here I am missing, and it would help to reproduce the bug accurately if you provided them:

1. In the NetworkManager output you provided, I see that the bond interface (bond0) is bound to a VLAN. Can you please specify the exact method you used for creating this bond (i.e. the NodeNetworkConfigurationPolicy spec, the nmcli command(s), or any other method you used)?
2. Also for the bond setup - I need the bond mode you used, please (i.e. active-backup, balance-tlb, etc.).
3. If you have the original VM (and not VMI) yaml you used, it is preferred. If you created a VMI and not a VM, then the VMI *input* yaml you used is also good. If neither is possible, I'll try to continue with the VMI *output* yaml you provided in the description.
4. Currently, when I apply the VMI yaml, it is pending on the PVC:

       Type     Reason             Age                From                       Message
       ----     ------             ----               ----                       -------
       Normal   FailedPvcNotFound  10m (x3 over 10m)  virtualmachine-controller  PVC disk-hotplug-debug-ns/dvuulvocpvmi02 does not exist, waiting for it to appear

   Please also provide the PVC used for this scenario. Generally, a step-by-step reproduction scenario would be best; that way I can (a) avoid wasting time trying to understand how to run certain steps, and (b) be sure I am reproducing the scenario as accurately as possible.
5. (I also asked you this in slack, so you can ignore that message and answer everything here:) What are the exact OCP and CNV versions that run on your cluster? I may have to deploy a cluster with the exact same attributes as yours; currently I'm using what I have available, which is an OCP/CNV 4.14 bare-metal cluster.

Thank you very much for your cooperation.
Yoss

Oh, and one more thing @fmontero:

6. Please also specify how you hot-plugged the disk. This will also help me save a lot of time reproducing it. Thanks

Versions of components:
OCP: 4.13.1
ODF: odf-operator.v4.12.4-rhodf
CNV: kubevirt-hyperconverged-operator.v4.13.2

During the initial agent-based cluster install, the bonds were created via the agent-install.yaml and baked into the CoreOS ISO.
A snippet from the AgentConfig looks like the following:

    - hostname: dvuuopwkr01
      role: worker
      rootDeviceHints:
        deviceName: "/dev/nvme0n1"
      interfaces:
        - name: eno1np0
          macAddress: 3C:EC:EF:74:4D:80
        - name: eno2np1
          macAddress: 3C:EC:EF:74:4D:81
      networkConfig:
        interfaces:
          - name: bond0.104
            type: vlan
            state: up
            vlan:
              base-iface: bond0
              id: 104
            ipv4:
              enabled: true
              address:
                - ip: 10.176.104.170
                  prefix-length: 22
              dhcp: false
          - name: bond0
            type: bond
            state: up
            mac-address: 3C:EC:EF:74:4D:80
            ipv4:
              enabled: false
            ipv6:
              enabled: false
            link-aggregation:
              mode: 802.3ad
              options:
                miimon: "100"
              port:
                - eno1np0
                - eno2np1
                - enp1s0f0
                - enp1s0f1
        dns-resolver:
          config:
            search:
              - corp.clientname.com
            server:
              - 209.196.203.128
        routes:
          config:
            - destination: 0.0.0.0/0
              next-hop-address: 10.176.107.254
              next-hop-interface: bond0.104
              table-id: 254

Our bond mode is "802.3ad LACP".

The VMs used for this testing were manually created "From Template", and the built-in default templates were used (rhel9-server-small in this instance). The VM's yaml is attached:

    apiVersion: kubevirt.io/v1
    kind: VirtualMachine
    metadata:
      annotations:
        kubemacpool.io/transaction-timestamp: '2023-06-08T20:01:51.930107092Z'
        kubevirt.io/latest-observed-api-version: v1
        kubevirt.io/storage-observed-api-version: v1alpha3
        vm.kubevirt.io/validations: |
          [
            {
              "name": "minimal-required-memory",
              "path": "jsonpath::.spec.domain.resources.requests.memory",
              "rule": "integer",
              "message": "This VM requires more memory.",
              "min": 1610612736
            }
          ]
      resourceVersion: '67992259'
      name: rhel9-ass
      uid: 4defd534-8c19-4491-a81d-ab2d9bbbedc9
      creationTimestamp: '2023-06-08T20:00:17Z'
      generation: 2
      managedFields:
        - apiVersion: kubevirt.io/v1
          fieldsType: FieldsV1
          fieldsV1:
            'f:metadata':
              'f:annotations':
                .: {}
                'f:kubemacpool.io/transaction-timestamp': {}
                'f:vm.kubevirt.io/validations': {}
              'f:labels':
                .: {}
                'f:app': {}
                'f:vm.kubevirt.io/template': {}
                'f:vm.kubevirt.io/template.namespace': {}
                'f:vm.kubevirt.io/template.revision': {}
            'f:spec':
              .: {}
              'f:dataVolumeTemplates': {}
              'f:running': {}
              'f:template':
                .: {}
                'f:metadata':
                  .: {}
                  'f:annotations':
                    .: {}
                    'f:vm.kubevirt.io/flavor': {}
                    'f:vm.kubevirt.io/os': {}
                    'f:vm.kubevirt.io/workload': {}
                  'f:creationTimestamp': {}
                  'f:labels':
                    .: {}
                    'f:kubevirt.io/domain': {}
                    'f:kubevirt.io/size': {}
                'f:spec':
                  .: {}
                  'f:domain':
                    .: {}
                    'f:cpu':
                      .: {}
                      'f:cores': {}
                      'f:sockets': {}
                      'f:threads': {}
                    'f:devices':
                      .: {}
                      'f:interfaces': {}
                      'f:networkInterfaceMultiqueue': {}
                      'f:rng': {}
                    'f:features':
                      .: {}
                      'f:acpi': {}
                      'f:smm':
                        .: {}
                        'f:enabled': {}
                    'f:firmware':
                      .: {}
                      'f:bootloader':
                        .: {}
                        'f:efi': {}
                    'f:machine':
                      .: {}
                      'f:type': {}
                    'f:resources':
                      .: {}
                      'f:requests':
                        .: {}
                        'f:memory': {}
                  'f:evictionStrategy': {}
                  'f:networks': {}
                  'f:terminationGracePeriodSeconds': {}
          manager: Mozilla
          operation: Update
          time: '2023-06-08T20:00:17Z'
        - apiVersion: kubevirt.io/v1alpha3
          fieldsType: FieldsV1
          fieldsV1:
            'f:metadata':
              'f:annotations':
                'f:kubevirt.io/latest-observed-api-version': {}
                'f:kubevirt.io/storage-observed-api-version': {}
              'f:finalizers':
                .: {}
                'v:"kubevirt.io/virtualMachineControllerFinalize"': {}
            'f:spec':
              'f:template':
                'f:spec':
                  'f:domain':
                    'f:devices':
                      'f:disks': {}
                  'f:volumes': {}
          manager: Go-http-client
          operation: Update
          time: '2023-06-08T20:01:51Z'
        - apiVersion: kubevirt.io/v1alpha3
          fieldsType: FieldsV1
          fieldsV1:
            'f:status':
              .: {}
              'f:conditions': {}
              'f:created': {}
              'f:printableStatus': {}
              'f:ready': {}
              'f:volumeSnapshotStatuses': {}
          manager: Go-http-client
          operation: Update
          subresource: status
          time: '2023-07-11T16:16:20Z'
      namespace: test-vmis
      finalizers:
        - kubevirt.io/virtualMachineControllerFinalize
      labels:
        app: rhel9-ass
        vm.kubevirt.io/template: rhel9-server-small-5taulps1t
        vm.kubevirt.io/template.namespace: test-vmis
        vm.kubevirt.io/template.revision: '1'
    spec:
      dataVolumeTemplates:
        - metadata:
            creationTimestamp: null
            name: rhel9-ass
          spec:
            preallocation: false
            sourceRef:
              kind: DataSource
              name: rhel9
              namespace: openshift-virtualization-os-images
            storage:
              accessModes:
                - ReadWriteMany
              resources:
                requests:
                  storage: 30Gi
              storageClassName: ocs-storagecluster-ceph-rbd
              volumeMode: Block
      running: true
      template:
        metadata:
          annotations:
            vm.kubevirt.io/flavor: small
            vm.kubevirt.io/os: rhel9
            vm.kubevirt.io/workload: server
          creationTimestamp: null
          labels:
            kubevirt.io/domain: rhel9-ass
            kubevirt.io/size: small
        spec:
          domain:
            cpu:
              cores: 2
              sockets: 1
              threads: 1
            devices:
              disks:
                - bootOrder: 1
                  disk:
                    bus: virtio
                  name: rootdisk
                - bootOrder: 2
                  disk:
                    bus: virtio
                  name: cloudinitdisk
                - disk:
                    bus: scsi
                  name: disk-naughty-angelfish
              interfaces:
                - bridge: {}
                  macAddress: '02:bb:06:00:00:43'
                  model: virtio
                  name: nic1
              networkInterfaceMultiqueue: true
              rng: {}
            features:
              acpi: {}
              smm:
                enabled: true
            firmware:
              bootloader:
                efi: {}
            machine:
              type: pc-q35-rhel9.2.0
            resources:
              requests:
                memory: 4Gi
          evictionStrategy: LiveMigrate
          networks:
            - multus:
                networkName: test-vmis/br1-192
              name: nic1
          terminationGracePeriodSeconds: 180
          volumes:
            - dataVolume:
                name: rhel9-ass
              name: rootdisk
            - cloudInitNoCloud:
                networkData: |
                  version: 2
                  ethernets:
                    eth0:
                      addresses:
                        - 10.176.192.150/22
                      gateway4: 10.176.195.254
                      nameservers:
                        search: [corp.clientname.com]
                        addresses: [209.196.203.128]
                      routes:
                        - to: 0.0.0.0/0
                          via: 10.176.195.254
                          metric: 3
                userData: |
                  #cloud-config
                  user: usgadmin
                  password: passwordhere
                  chpasswd:
                    expire: false
              name: cloudinitdisk
            - dataVolume:
                hotpluggable: true
                name: rhel9-ass-disk-naughty-angelfish
              name: disk-naughty-angelfish
    status:
      conditions:
        - lastProbeTime: null
          lastTransitionTime: '2023-07-11T16:16:20Z'
          status: 'True'
          type: Ready
        - lastProbeTime: null
          lastTransitionTime: null
          status: 'True'
          type: LiveMigratable
        - lastProbeTime: '2023-06-08T20:01:17Z'
          lastTransitionTime: null
          status: 'True'
          type: AgentConnected
      created: true
      printableStatus: Running
      ready: true
      volumeSnapshotStatuses:
        - enabled: true
          name: rootdisk
        - enabled: false
          name: cloudinitdisk
          reason: 'Snapshot is not supported for this volumeSource type [cloudinitdisk]'
        - enabled: true
          name: disk-naughty-angelfish

(In reply to Yossi Segev from comment #6)
> Oh, and one more thing @fmontero :
> 6. Please also specify how you hot-plugged the disk. This will also help me
> save a lot of time reproducing it.
> Thanks

Via the web UI for the VM: in the Disks section of the VM, we added a new SCSI disk (only the SCSI option was allowed while the VM was on).
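For completeness, the same hot-plug can also be done from the CLI with virtctl. The following is only a rough sketch of an equivalent to the web UI action above, reusing the VM and hot-pluggable DataVolume names from the spec earlier; the exact options available depend on the CNV version.

    # Hot-plug the DataVolume into the running VM (hot-plugged disks use the SCSI bus).
    # --persist also records the volume in the VM spec so it survives restarts.
    virtctl addvolume rhel9-ass --volume-name=rhel9-ass-disk-naughty-angelfish --persist

    # To unplug it again:
    virtctl removevolume rhel9-ass --volume-name=rhel9-ass-disk-naughty-angelfish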
According to additional info provided by Freddy in slack, the setup for this test (including the DV/PVC) was all done via the UI:

1. Use the RHEL8/9 template, nothing special. That creates 2 disks: the root disk and the cloud-init disk. After the VM is up and running and has been live-migrated to different hosts, we added a hot-plugged disk. On the original host where you add the disk, everything works. Once you migrate the VM, that's when the issue starts.
2. The second scenario: we migrated a VM from RHEV. Then again, we live-migrated the VM to other hosts. It was after we added a hot-plugged disk and live-migrated that the issue started.

I set up a session with storage QE in order to obtain the storage side of the setup.

I tried reproducing this bug locally. This BZ is missing a lot of necessary data (e.g. the specs of the rootdisk PVC, the hot-plugged PVC, etc.), so Freddy tried to assist me as much as possible (thanks again, Freddy). In addition, I got the help of CNV storage QE. The result was that when running the VM (before even getting to the part of hot-plugging an additional disk), the VM got to the `Running` state, but when trying to access its console, the following appears:

    BdsDxe: failed to load Boot0001 "UEFI Misc Device" from PciRoot(0x0)/Pci(0x2,0x6)/Pci(0x0,0x0): Not Found
    BdsDxe: No bootable option or device was found.
    BdsDxe: Press any key to enter the Boot Manager Menu.

In addition, according to the description, the customer was trying to bridge the VM's secondary interface over a bond of mode 802.3ad, which we currently don't support, so I'm not even sure the customer was trying a valid scenario. Therefore I'm closing this bug.

I'm specifying the scenario I was running (until getting to the start-up failure; spec resources are attached). On a bare-metal cluster with OCP/CNV 4.14:

1. Create a bond:

       $ oc apply -f bond-nncp.yaml

2. Create a VLAN interface over the bond:

       $ oc apply -f vlan-bond-nncp.yaml

3. Create a bridge interface over the bond:

       $ oc apply -f bridge-nncp.yaml

4. Create a NetworkAttachmentDefinition which utilizes the bridge (an example NAD is sketched after this comment):

       $ oc apply -f bridge-nad.yaml

5. Via the web UI, create the rootdisk PVC:
   Storage -> PersistentVolumeClaims -> Create PersistentVolumeClaims -> With Form
   - StorageClass: ocs-storagecluster-ceph-rbd
   - PersistentVolumeClaim name: rootdisk-pvc
   - Access mode: Shared Access (RWX)
   - Size: 100 GiB
   - Volume mode: Block
6. Via the web UI, create the VM:
   Virtualization -> VirtualMachines -> Create -> From template
   - Template: Red Hat Enterprise Linux 9 VM
   - Customize VirtualMachine -> Storage:
     - Disk source: PVC (clone PVC)
     - PVC project: the same namespace where the PVC and the NetworkAttachmentDefinition exist
     - PVC name: rootdisk-pvc
     - Disk size: 100 GiB
   - Hit the Next button
   - Move to the "Network interfaces" tab
   - Add network interface
     - Model: virtio (keep the default)
     - Network: NetworkAttachmentDefinition name
     - Type: Bridge
   - Save
   - Press the "Create VirtualMachine" button
   * If "Start this VirtualMachine after creation" is checked, the VM should start immediately; otherwise run "virtctl start <vm-name>" from the CLI.
7. Starting the VMI triggers cloning of the DV; you need to wait for it to complete (this takes a few minutes) before the VMI can move to the running state:

       $ oc get dv -w
       NAME                 PHASE             PROGRESS   RESTARTS   AGE
       rhel9-crazy-pigeon   CloneInProgress   17.13%                3m10s
       rhel9-crazy-pigeon   CloneInProgress   17.46%                3m12s
       rhel9-crazy-pigeon   CloneInProgress   17.78%                3m13s
       ...
       rhel9-crazy-pigeon   CloneInProgress   96.23%                11m
       rhel9-crazy-pigeon   CloneInProgress   96.83%                11m
       rhel9-crazy-pigeon   CloneInProgress   97.10%                11m
       ...
       rhel9-crazy-pigeon   CloneInProgress   99.35%                12m
       rhel9-crazy-pigeon   CloneInProgress   99.66%                12m
       rhel9-crazy-pigeon   CloneInProgress   100.00%               12m
       rhel9-crazy-pigeon   CloneInProgress   100.00%               12m
       rhel9-crazy-pigeon   Succeeded         100.0%                12m
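The spec resources referenced in steps 1-4 were attached to the BZ and are not reproduced in this export. For orientation only, a bridge NetworkAttachmentDefinition of the kind used in step 4 might look like the sketch below; the NAD name, namespace, and bridge name (br1) are assumptions chosen to match the test-vmis/br1-192 network referenced in the VM yaml above, not the actual attachment.

    # Hypothetical sketch, not the attachment from this BZ.
    apiVersion: k8s.cni.cncf.io/v1
    kind: NetworkAttachmentDefinition
    metadata:
      name: br1-192
      namespace: test-vmis
      annotations:
        # Ties the NAD to the Linux bridge created by the bridge NNCP (assumed to be named br1).
        k8s.v1.cni.cncf.io/resourceName: bridge.network.kubevirt.io/br1
    spec:
      config: |
        {
          "cniVersion": "0.3.1",
          "name": "br1-192",
          "type": "cnv-bridge",
          "bridge": "br1",
          "macspoofchk": true
        }

A VM interface of type bridge bound to network test-vmis/br1-192, as in the VM yaml above, would then be connected to br1 on whichever node the VM is scheduled on.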
802.3ad is supported: https://docs.openshift.com/container-platform/4.13/networking/k8s_nmstate/k8s-nmstate-updating-node-network-config.html#virt-example-bond-nncp_k8s_nmstate-updating-node-network-config. We may not be testing this mode in the scope of CNV due to environment complexities, but customers can use it; we just rely on nmstate and NetworkManager configuring it correctly.

Thanks for investigating this @ysegev. Would you please continue working with Freddy, looking for a reproducer?

The error does not seem related to LACP, but just to be sure, perhaps the customer can help us by trying to reproduce it with a simpler bond mode, e.g. active-backup, to help us get as close to their environment as possible.

(In reply to Petr Horáček from comment #17)
> Thanks for investigating this @ysegev. Would you please continue
> working with Freddy, looking for a reproducer?
>
> The error does not seem related to LACP, but just to be sure, perhaps the
> customer can help us by trying to reproduce it with a simpler bond mode,
> e.g. active-backup, to help us get as close to their environment as possible.

Yoss,
What additional information do you need? The client is willing to help by adding additional context to this BZ. Let us know what is needed and we will try to get it. Thanks

(In reply to Freddy E. Montero from comment #18)
> Yoss,
> What additional information do you need.
> Client is willing to help adding additional context to this BZ.
> Let us know what is needed and will try to get. Thanks

What I need is a step-by-step reproducible scenario covering all the steps: the bond creation, the bridge, the original VM (i.e. before hot-plugging a disk), the PVC, hot-plugging the disk, and every other step involved in this bug. The description above includes parts of setups, from which I tried to reverse-engineer and guess the actual scenario; obviously, this methodology doesn't work :). Besides, I think the description also includes parts that are not related to this bug; for example, the VLAN interface bond0.104 seems unrelated, as I don't see this interface used after it is created. The best input would be a yaml resource for every part of this setup; if that is not possible, then I would need an exact cookbook. Otherwise, I don't have any way to reproduce it.

@fmontero Any new info, so I can try reproducing this BZ again? The specification of what is needed is in comment #19; please let me know if something is not clear. Thanks

The blockers-only phase begins in a week, and realistically we won't be able to reproduce and fix this by then. Deferring this BZ to 4.14.1.
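For the simpler reproduction attempt suggested above, a minimal NodeNetworkConfigurationPolicy for an active-backup bond could look roughly like the following sketch. The policy name and node selector are hypothetical; the port names are taken from the AgentConfig earlier in this report.

    # Hypothetical sketch of an active-backup bond NNCP; not an attachment from this BZ.
    apiVersion: nmstate.io/v1
    kind: NodeNetworkConfigurationPolicy
    metadata:
      name: bond0-active-backup    # hypothetical name
    spec:
      nodeSelector:
        node-role.kubernetes.io/worker: ""
      desiredState:
        interfaces:
          - name: bond0
            type: bond
            state: up
            ipv4:
              enabled: false
            ipv6:
              enabled: false
            link-aggregation:
              mode: active-backup
              options:
                miimon: "100"
              port:
                - eno1np0
                - eno2np1

Applying a policy like this and recreating the bridge on top of the resulting bond would let the hot-plug plus live-migration flow be retested without LACP in the picture.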