Description Benjamin Schmaus
2016-10-27 11:22:54 UTC
Description of problem:
Cold migration of an instance that has an SR-IOV interface fails because nova on the destination compute tries to use the PCI device/address that was allocated on the source compute. This fails because that PCI device is not present on the destination compute.
See the error "libvirtError: Device 0000:83:10.6 not found: could not access /sys/bus/pci/devices/0000:83:10.6/config: No such file or directory" in the log in the attachment.
Nova should allocate a new PCI device based on the hardware configuration of the compute to which the instance is being migrated, and this PCI device should be used to create the instance XML.
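The expected behaviour can be sketched roughly as follows. This is an illustration only, not nova code: the function names, the interface dictionaries, and the destination addresses are all hypothetical. Only the source address 0000:83:10.6 comes from the error above. The idea is that each PCI address held on the source is paired with a free device on the destination before the instance definition is built.

```python
# Illustrative sketch only; not nova code. All names are hypothetical.

def claim_destination_devices(old_addresses, destination_pool):
    """Map each source PCI address to a free address on the destination host."""
    free = list(destination_pool)
    if len(free) < len(old_addresses):
        raise RuntimeError("destination host has too few free PCI devices")
    # Pair every source address with a distinct destination address.
    return {old: free.pop(0) for old in old_addresses}


def rewrite_interface_addresses(interfaces, mapping):
    """Return interface definitions that point at destination PCI addresses."""
    return [dict(iface, pci_address=mapping[iface["pci_address"]])
            for iface in interfaces]


# Address taken from the error in this report.
source_interfaces = [{"name": "sriov0", "pci_address": "0000:83:10.6"}]

mapping = claim_destination_devices(
    [iface["pci_address"] for iface in source_interfaces],
    ["0000:05:10.2", "0000:05:10.4"],  # hypothetical free VFs on the destination
)
destination_interfaces = rewrite_interface_addresses(source_interfaces, mapping)
```

With a mapping like this, the instance XML on the destination would be generated from destination_interfaces rather than from the addresses recorded on the source.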
Version-Release number of selected component (if applicable):
OSP9 - Fix requested for OSP9
How reproducible:
100%
Steps to Reproduce:
1.
2.
3.
Actual results:
Expected results:
Additional info:
Customer gets "message": "'MigrationContext' object has no attribute 'old_pci_devices'" when trying to migrate an instance with an SR-IOV port with the hotfix applied.
Awaiting logs and will put into collab for engineering review
I'm pretty sure I've identified the cause of the message they're seeing. I've filed bz 1393561 against rhos9 to track it. To repeat what I've said in that bz, we're missing [1] in rhos9. That's the patch that introduced version 1.1 of MigrationContext, which added the old_pci_devices and new_pci_devices fields to the object. It's a big patch (240 lines spread over a dozen files), so its backportability remains to be determined. In the meantime, we'll need to revert the fix that we made for this bz 1389284 from rhos9, because without [1] it breaks migrations.
[1] https://review.openstack.org/#/c/307124/
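A minimal sketch of why the hotfix fails without [1], assuming the fix reads the two PCI fields from the migration context. The classes below are hypothetical stand-ins, not nova's real MigrationContext. They only mimic an object definition without the fields and one with them.

```python
# Illustrative stand-ins only; these are not nova's real classes.

class MigrationContextV10:
    """Mimics an object definition that predates the PCI device fields."""
    VERSION = "1.0"

    def __init__(self, instance_uuid):
        self.instance_uuid = instance_uuid


class MigrationContextV11(MigrationContextV10):
    """Mimics the later definition that carries both PCI device lists."""
    VERSION = "1.1"

    def __init__(self, instance_uuid, old_pci_devices, new_pci_devices):
        super().__init__(instance_uuid)
        self.old_pci_devices = old_pci_devices
        self.new_pci_devices = new_pci_devices


def pci_mapping(context):
    """Pair old and new PCI addresses; requires the 1.1-style fields."""
    return dict(zip(context.old_pci_devices, context.new_pci_devices))


# Code written against the newer fields fails on the older definition.
try:
    pci_mapping(MigrationContextV10("hypothetical-uuid"))
    error_message = None
except AttributeError as exc:
    error_message = str(exc)
```

Here error_message names the missing old_pci_devices attribute, which is the same kind of failure as the message the customer reported.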