Created attachment 1331830 [details]
engine, vdsm and supervdsm logs

Description of problem:
After VM migration, port mirroring is not set on the host for all port mirroring networks.

Version-Release number of selected component (if applicable):
ovirt-engine-4.2.0-0.0.master.20170926175518.git0d20200.el7.centos.noarch
vdsm-4.20.3-105.git216d21d.el7.centos.x86_64

Steps to Reproduce (automation port mirroring case01):
1. Create a VM with 2 NICs; the VM NICs have a profile with port_mirroring enabled.
2. Migrate the VM to another host and back to the original host.
3. Check whether port_mirroring is enabled using tc filter show <network-name>.

Actual results:
The second VM network does not have port_mirroring enabled.

Expected results:
Port mirroring is enabled on both VM networks after migration.

Additional info:
I have 2 vNICs on the VM with port_mirroring enabled (ovirtmgmt and pm_net_1). After migrating the VM, only ovirtmgmt is set with port_mirroring.

2017-09-28 11:18:53,491+0300 ERROR (vm/b9f9b466) [virt.vm] (vmId='b9f9b466-a7c6-45d3-a6ff-b550389a97f1') PARAMS: {'nicModel': 'virtio', 'macAddr': '00:1a:4a:16:20:16', 'filterParameters': [], 'vm_custom': {}, 'alias': 'net0', 'custom': {}, 'filter': 'vdsm-no-mac-spoofing', 'specParams': {}, 'address': {'slot': '0x03', 'bus': '0x00', 'domain': '0x0000', 'type': 'pci', 'function': '0x0'}, 'device': 'bridge', 'type': 'bridge', 'network': 'ovirtmgmt'} (network:115)
2017-09-28 11:18:53,491+0300 ERROR (vm/b9f9b466) [virt.vm] (vmId='b9f9b466-a7c6-45d3-a6ff-b550389a97f1') META: {'vmid': 'b9f9b466-a7c6-45d3-a6ff-b550389a97f1', 'portMirroring': ['ovirtmgmt'], 'custom': {}} (network:116)
2017-09-28 11:18:53,491+0300 ERROR (vm/b9f9b466) [virt.vm] (vmId='b9f9b466-a7c6-45d3-a6ff-b550389a97f1') PARAMS: {'nicModel': 'virtio', 'macAddr': '00:1a:4a:16:20:26', 'filterParameters': [], 'vm_custom': {}, 'alias': 'net1', 'custom': {}, 'filter': 'vdsm-no-mac-spoofing', 'specParams': {}, 'address': {'slot': '0x0a', 'bus': '0x00', 'domain': '0x0000', 'type': 'pci', 'function': '0x0'}, 'device': 'bridge', 'type': 'bridge', 'network': 'pm_net_1'} (network:115)
2017-09-28 11:18:53,491+0300 ERROR (vm/b9f9b466) [virt.vm] (vmId='b9f9b466-a7c6-45d3-a6ff-b550389a97f1') META: {'vmid': 'b9f9b466-a7c6-45d3-a6ff-b550389a97f1'} (network:116)
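The tc check in step 3 can be scripted. Below is a minimal sketch, assuming vdsm installs mirroring as a tc "mirred" action among the filters of the source network device; the function name and the sample output string are illustrative, not taken from vdsm or from real tc output.

```python
import subprocess

def mirroring_enabled(network, tc_output=None):
    """Return True if a mirred (mirror) action appears among the tc
    filters of the given network device. Sketch only, not vdsm code."""
    if tc_output is None:
        # "parent ffff:" selects the ingress qdisc, where mirroring
        # filters are typically attached.
        tc_output = subprocess.check_output(
            ["tc", "filter", "show", "dev", network, "parent", "ffff:"],
            universal_newlines=True)
    return "mirred" in tc_output

# Abbreviated, hypothetical output for a mirrored network:
SAMPLE = ("filter protocol all pref 49152 u32 ... "
          "action order 1: mirred (Egress Mirror to device vnet0)")
print(mirroring_enabled("pm_net_1", SAMPLE))  # True
print(mirroring_enabled("pm_net_1", ""))      # False
```

Running the same check on the migration destination for each mirrored network (here ovirtmgmt and pm_net_1) is exactly what distinguishes the passing network from the failing one.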
I have a little guess: it seems that pm_net_1 is added with portMirroring via hotplugNic, while ovirtmgmt is set with it on VM startup.

2017-09-28 11:08:14,804+0300 INFO (jsonrpc/0) [api.virt] START hotplugNic(params={'nic': {'nicModel': 'pv', 'macAddr': '00:1a:4a:16:20:26', 'linkActive': 'true', 'network': 'pm_net_1', 'filterParameters': [], 'filter': 'vdsm-no-mac-spoofing', 'specParams': {'inbound': {}, 'outbound': {}}, 'deviceId': '1e033140-b2d7-406e-ba5d-7d36132985d0', 'device': 'bridge', 'type': 'interface', 'portMirroring': ['pm_net_1']}, 'vmId': 'b9f9b466-a7c6-45d3-a6ff-b550389a97f1'}) from=::ffff:10.35.161.250,53958, flow_id=3e50a252 (api:46)

Francesco, could it be that the device XML is not passed to the hotplugNic verb? Did we have a plan for where device custom properties are to be placed? After hotplug they have to be incorporated into the VM metadata area. If so, port mirroring would work on the host where the hotplug took place (since the old-style conf is looked at), but would not be reinstated on the migration destination (because the XML exists, but has no request for portMirroring).
This bug report has Keywords: Regression or TestBlocker. Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.
(In reply to Dan Kenigsberg from comment #1)
> I have a little guess: it seems that pm_net_1 is added with portMirroring
> via hotplugNic, while ovirtmgmt is set with it on vm startup.
>
> 2017-09-28 11:08:14,804+0300 INFO (jsonrpc/0) [api.virt] START
> hotplugNic(params={'nic': {'nicModel': 'pv', 'macAddr': '00:1a:4a:16:20:26',
> 'linkActive': 'true', 'network': 'pm_net_1', 'filterParameters': [],
> 'filter': 'vdsm-no-mac-spoofing', 'specParams': {'inbound': {}, 'outbound':
> {}}, 'deviceId': '1e033140-b2d7-406e-ba5d-7d36132985d0', 'device': 'bridge',
> 'type': 'interface', 'portMirroring': ['pm_net_1']}, 'vmId':
> 'b9f9b466-a7c6-45d3-a6ff-b550389a97f1'}) from=::ffff:10.35.161.250,53958,
> flow_id=3e50a252 (api:46)
>
> Francesco, could it be that device xml is not passed to the hotplugNic
> verb?

Yes, the hotplugNic verb receives params in vm.conf format (a Python dict) and builds the device XML itself. Please note this hasn't changed after the introduction of engine XML.

> Did we have a plan for where device custom properties are to be
> placed? After hotplug they have to be incorporated into the vm metadata area.

It should be enough to send them with the "custom" key among the nicParams. Storing metadata is simple, just a matter of adding a couple of lines if hotplugNic succeeds. Something like (to be tested):

  attrs = vmdevices.common.get_drive_conf_identifying_attrs(nicParams)
  with self._md_desc.device(**attrs) as dev:
      dev.clear()
      dev.update(nicParams['custom'])

We could probably add one simple helper in the VM class to do this. Currently storage devices do not have this issue, but they will once we enable them to consume the engine XML. The fix is the same for both (actually for all) the device classes.
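To illustrate the metadata-update pattern sketched above, here is a self-contained toy model. DeviceMetadata is an invented stand-in for vdsm's _md_desc descriptor, and the attribute names are illustrative; this is not the real vdsm implementation.

```python
from contextlib import contextmanager

class DeviceMetadata:
    """Toy stand-in for vdsm's per-device metadata descriptor."""
    def __init__(self):
        self._store = {}

    @contextmanager
    def device(self, **attrs):
        # Identify the device by its identifying attributes (e.g. MAC).
        key = tuple(sorted(attrs.items()))
        dev = self._store.setdefault(key, {})
        yield dev  # the caller mutates the dict in place
        # On exit the updated dict stays stored, so a later migration
        # flow can look the custom properties up again.

md = DeviceMetadata()
nic_params = {'macAddr': '00:1a:4a:16:20:26',
              'custom': {'portMirroring': ['pm_net_1']}}

# The "couple of lines" to run after a successful hotplugNic:
with md.device(mac_address=nic_params['macAddr']) as dev:
    dev.clear()
    dev.update(nic_params['custom'])
```

The point of the context-manager shape is that the caller never touches the store directly; it only edits the per-device dict, and persistence happens on exit.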
Dan, seems you are taking ownership of it, moving to network.
is it not a better fix to actually migrate hotplug to vm xml? i know we still do that dirty stuff with metadata there...but at least it would be consistent, no?
(In reply to Michal Skrivanek from comment #5)
> is it not a better fix to actually migrate hotplug to vm xml? i know we
> still do that dirty stuff with metadata there...but at least it would be
> consistent, no?

Yes, it surely seems a better option - even though we still must support the old API for the time being.
I don't see why it's marked as a blocker. Removing blocker flag.
(In reply to Yaniv Kaul from comment #7)
> I don't see why it's marked as a blocker. Removing blocker flag.

It is a regression due to EngineDomXML. I believe that people are using portMirroring, and unless fixed, they are going to lose it after migration.
The alternative fix is to finish the hotplug -> xml move. Milan just started on that
(In reply to Dan Kenigsberg from comment #8)
> (In reply to Yaniv Kaul from comment #7)
> > I don't see why it's marked as a blocker. Removing blocker flag.
>
> It is a regression due to EngineDomXML. I believe that people are using
> portMirroring, and unless fixed, they are going to loose it after migration.

We can't have port mirroring broken after the upgrade. I consider this a blocker.
note it's for hotplugged devices only
Looks like we unfortunately need some Engine changes to work with Milan's https://gerrit.ovirt.org/84602 : Engine needs to send new parameters for the XML and its metadata. Also, we need another patch (my https://gerrit.ovirt.org/#/c/84788/) to make sure the portMirroring info is stored when the NIC is hotplugged using the legacy params.
- another attempt to remove the blocker flag
- moving under virt
- the original problem is supposedly fixed; keeping the bug open to finish the hotplug vm xml changes
(In reply to Michal Skrivanek from comment #13)
> - another attempt to remove the blocker flag
> - moving under virt
> - the original problem is supposedly fix, keeping the bug open to finish
> hotplug vm xml changes

With http://gerrit.ovirt.org/84825 merged, we are good to go. Tests on my setup passed successfully. The BZ can move to MODIFIED.
The hotplug needs to be done correctly for 4.2 GA so we can at least have a chance to eliminate the old code when we stop supporting 4.2. Since the problem is fixed, let me try to remove the blocker+ one more time and keep the bug for the hotplug implementation; that should at least get it off the radar.
http://gerrit.ovirt.org/84866 is not relevant for this BZ.
I believe that the bug is properly solved, but let us get QA's certification.
How come this bug is ON_QA while 3 patches are in POST status?
(In reply to Meni Yakove from comment #19)
> How come this bug is ON_QA while 3 patches are in POST status?

Good question. And does the d/s build have this fix? vdsm-4.20.9-1.el7ev.x86_64?
I don't think that the recently-added patches are related to this bug, let's call in Milan to explain why he re-used the same bugurl. vdsm-4.20.9-1.el7ev.x86_64 and the later vdsm-4.20.9.1 has Edy's fix.
Hi Milan,

Please remove the 4 patches you added to this bug; it's the wrong BZ process and the patches are not relevant for this bug.

Many thanks)
OK. So, Dan, to make things clear here: vdsm-4.20.9-1.el7ev.x86_64 didn't have the fix and it failedQA. Today we already got vdsm-4.20.9.1-1.el7ev.x86_64, and only now is it fixed and working as expected. I wanted to fail this bug on vdsm-4.20.9-1.el7ev.x86_64, but you are lucky and we just got the new build, so we could re-test it on the latest vdsm.

vdsm-4.20.9-1.el7ev.x86_64 - failedQA
Verified only on vdsm-4.20.9.1-1.el7ev.x86_64 with 4.2.0.2-0.1.el7
Hi Michael, removed, sorry for the confusion.
This bugzilla is included in oVirt 4.2.0 release, published on Dec 20th 2017. Since the problem described in this bug report should be resolved in oVirt 4.2.0 release, published on Dec 20th 2017, it has been closed with a resolution of CURRENT RELEASE. If the solution does not work for you, please open a new bug report.