Bug 1496719 - Port mirroring is not set after VM migration
Summary: Port mirroring is not set after VM migration
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: vdsm
Classification: oVirt
Component: Core
Version: ---
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ovirt-4.2.0
Target Release: 4.20.9.1
Assignee: Michal Skrivanek
QA Contact: Michael Burman
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2017-09-28 08:32 UTC by Meni Yakove
Modified: 2018-01-03 07:58 UTC (History)
8 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2017-12-20 11:34:21 UTC
oVirt Team: Virt
Embargoed:
rule-engine: ovirt-4.2+
ylavi: testing_plan_complete?


Attachments
engine, vdsm and supervdsm logs (1.30 MB, application/zip)
2017-09-28 08:32 UTC, Meni Yakove
no flags


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 82506 0 master ABANDONED [WIP] update VM customer properties after hotplug 2020-09-07 10:16:45 UTC
oVirt gerrit 84788 0 master MERGED virt: domainxml: save metadata when hot{,un}plug 2020-09-07 10:16:45 UTC
oVirt gerrit 84825 0 master MERGED virt: metadata: correctly store portMirroring 2020-09-07 10:16:45 UTC
oVirt gerrit 84866 0 master MERGED vm: net: save device metadata on update 2020-09-07 10:16:45 UTC
oVirt gerrit 85146 0 master MERGED virt: Use hotplugged XML, not original XML, in NIC hotplug rollback 2020-09-07 10:16:44 UTC

Description Meni Yakove 2017-09-28 08:32:43 UTC
Created attachment 1331830 [details]
engine, vdsm and supervdsm logs

Description of problem:
After VM migration port mirroring is not set on the host for all port mirroring networks

Version-Release number of selected component (if applicable):
ovirt-engine-4.2.0-0.0.master.20170926175518.git0d20200.el7.centos.noarch
vdsm-4.20.3-105.git216d21d.el7.centos.x86_64


Steps to Reproduce:
Automation port mirroring case01
1. Create a VM with 2 NICs; the VM NICs have a profile with port_mirroring enabled.
2. Migrate the VM to another host and back to the original host.
3. Check that port_mirroring is enabled using tc filter show <network-name>.
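The check in step 3 can be scripted. The helper and the sample output below are illustrative only (real tc output varies by kernel version and is not taken from this bug); the relevant signal is a mirred (mirror) action in the filter list:

```python
def has_mirror_filter(tc_output: str) -> bool:
    """Return True if the given `tc filter show` output contains a
    mirred (mirror) action, i.e. traffic mirroring is configured."""
    text = tc_output.lower()
    return "mirred" in text and "mirror" in text

# Illustrative sample of `tc filter show` output with mirroring active
# (invented for this sketch; real output differs by kernel version):
sample = """\
filter protocol all pref 49152 u32 chain 0 fh 800::800 terminal
  match 00000000/00000000 at 0
  action order 1: mirred (Egress Mirror to device vnet0) pipe
"""

print(has_mirror_filter(sample))  # True
print(has_mirror_filter("filter protocol all pref 49152 u32"))  # False
```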

Actual results:
Port mirroring is not enabled on the second VM network.

Expected results:
Port mirroring is enabled on all the VM's networks after migration.

Additional info:
I have 2 vNICs on the VM with port_mirroring enabled (ovirtmgmt and pm_net_1).
While migrating the VM, only ovirtmgmt is set with port_mirroring.

2017-09-28 11:18:53,491+0300 ERROR (vm/b9f9b466) [virt.vm] (vmId='b9f9b466-a7c6-45d3-a6ff-b550389a97f1') PARAMS: {'nicModel': 'virtio', 'macAddr': '00:1a:4a:16:20:16', 'filterParameters': [], 'vm_custom': {}, 'alias': 'net0', 'custom': {}, 'filter': 'vdsm-no-mac-spoofing', 'specParams': {}, 'address': {'slot': '0x03', 'bus': '0x00', 'domain': '0x0000', 'type': 'pci', 'function': '0x0'}, 'device': 'bridge', 'type': 'bridge', 'network': 'ovirtmgmt'} (network:115)
2017-09-28 11:18:53,491+0300 ERROR (vm/b9f9b466) [virt.vm] (vmId='b9f9b466-a7c6-45d3-a6ff-b550389a97f1') META: {'vmid': 'b9f9b466-a7c6-45d3-a6ff-b550389a97f1', 'portMirroring': ['ovirtmgmt'], 'custom': {}} (network:116)

2017-09-28 11:18:53,491+0300 ERROR (vm/b9f9b466) [virt.vm] (vmId='b9f9b466-a7c6-45d3-a6ff-b550389a97f1') PARAMS: {'nicModel': 'virtio', 'macAddr': '00:1a:4a:16:20:26', 'filterParameters': [], 'vm_custom': {}, 'alias': 'net1', 'custom': {}, 'filter': 'vdsm-no-mac-spoofing', 'specParams': {}, 'address': {'slot': '0x0a', 'bus': '0x00', 'domain': '0x0000', 'type': 'pci', 'function': '0x0'}, 'device': 'bridge', 'type': 'bridge', 'network': 'pm_net_1'} (network:115)
2017-09-28 11:18:53,491+0300 ERROR (vm/b9f9b466) [virt.vm] (vmId='b9f9b466-a7c6-45d3-a6ff-b550389a97f1') META: {'vmid': 'b9f9b466-a7c6-45d3-a6ff-b550389a97f1'} (network:116)

Comment 1 Dan Kenigsberg 2017-09-28 17:22:43 UTC
I have a little guess: it seems that pm_net_1 is added with portMirroring via hotplugNic, while ovirtmgmt is set with it on vm startup.

2017-09-28 11:08:14,804+0300 INFO  (jsonrpc/0) [api.virt] START hotplugNic(params={'nic': {'nicModel': 'pv', 'macAddr': '00:1a:4a:16:20:26', 'linkActive': 'true', 'network': 'pm_net_1', 'filterParameters': [], 'filter': 'vdsm-no-mac-spoofing', 'specParams': {'inbound': {}, 'outbound': {}}, 'deviceId': '1e033140-b2d7-406e-ba5d-7d36132985d0', 'device': 'bridge', 'type': 'interface', 'portMirroring': ['pm_net_1']}, 'vmId': 'b9f9b466-a7c6-45d3-a6ff-b550389a97f1'}) from=::ffff:10.35.161.250,53958, flow_id=3e50a252 (api:46)

Francesco, could it be that the device XML is not passed to the hotplugNic verb? Did we have a plan for where device custom properties are to be placed? After hotplug they have to be incorporated into the VM metadata area.

If so, port mirroring would work on the host where hotplug took place (since the old-style conf is looked at), but would not be reinstated on the migration destination (because the XML exists, but contains no request for portMirroring).
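The guess above can be pictured with a toy model (the dicts and the helper below are invented for illustration, not vdsm code): the hotplug host consults the old-style device conf, which carries portMirroring, while the migration destination rebuilds the device from the domain XML metadata, where hotplug never stored it:

```python
# Toy model of the suspected bug. On the hotplug host vdsm still reads
# the old-style device conf; the migration destination only sees what
# was stored in the domain XML metadata.
old_style_conf = {"network": "pm_net_1", "portMirroring": ["pm_net_1"]}
xml_metadata = {"network": "pm_net_1"}  # portMirroring never stored here

def mirrored_networks(device_info: dict) -> list:
    """Networks for which port mirroring should be (re)installed."""
    return device_info.get("portMirroring", [])

print(mirrored_networks(old_style_conf))  # ['pm_net_1'] on hotplug host
print(mirrored_networks(xml_metadata))    # [] on migration destination
```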

Comment 2 Red Hat Bugzilla Rules Engine 2017-09-28 17:22:47 UTC
This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

Comment 3 Francesco Romani 2017-09-29 10:05:36 UTC
(In reply to Dan Kenigsberg from comment #1)
> I have a little guess: it seems that pm_net_1 is added with portMirroring
> via hotplugNic, while ovirtmgmt is set with it on vm startup.
> 
> 2017-09-28 11:08:14,804+0300 INFO  (jsonrpc/0) [api.virt] START
> hotplugNic(params={'nic': {'nicModel': 'pv', 'macAddr': '00:1a:4a:16:20:26',
> 'linkActive': 'true', 'network': 'pm_net_1', 'filterParameters': [],
> 'filter': 'vdsm-no-mac-spoofing', 'specParams': {'inbound': {}, 'outbound':
> {}}, 'deviceId': '1e033140-b2d7-406e-ba5d-7d36132985d0', 'device': 'bridge',
> 'type': 'interface', 'portMirroring': ['pm_net_1']}, 'vmId':
> 'b9f9b466-a7c6-45d3-a6ff-b550389a97f1'}) from=::ffff:10.35.161.250,53958,
> flow_id=3e50a252 (api:46)
> 
> Francesco, could it be that device xml is not passed to the hostplugNic
> verb?

Yes, the hotplugNic verb receives params in vm.conf format (a Python dict) and builds the device XML itself. Please note this hasn't changed after the introduction of engine XML.

> Did we have a plan for where would device custom properties are to be
> placed? After hotplug they have to be incorporated into the vm metadata area.

It should be enough to send them with the "custom" key among the nicParams.

Storing metadata is simple, just a matter of adding a couple of lines if hotplugNic succeeds.

Something like (to be tested):

  attrs = vmdevices.common.get_drive_conf_identifying_attrs(nicParams)
  with self._md_desc.device(**attrs) as dev:
      dev.clear()
      dev.update(nicParams['custom'])

We could probably add one simple helper in VM class to do this.

Currently storage devices do not have this issue, but they will once we enable them to consume the engine XML.
The fix is the same for both (actually for all) the device classes.
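A runnable version of the sketch in comment 3, with a toy stand-in for vdsm's metadata descriptor (the MetadataDescriptor class, the store_nic_metadata helper, and the mac_address key are invented for illustration; only the `with self._md_desc.device(**attrs)` pattern comes from the comment):

```python
from contextlib import contextmanager

class MetadataDescriptor:
    """Toy stand-in for vdsm's metadata descriptor: maps device
    identifying attributes to a mutable per-device metadata dict."""
    def __init__(self):
        self._devices = {}

    @contextmanager
    def device(self, **attrs):
        key = tuple(sorted(attrs.items()))
        dev = self._devices.setdefault(key, {})
        yield dev  # caller mutates in place; "saved" on context exit

def store_nic_metadata(md_desc, nic_params):
    """Persist custom properties (and portMirroring) for a NIC after a
    successful hotplugNic, mirroring the sketch from comment 3."""
    attrs = {"mac_address": nic_params["macAddr"]}
    with md_desc.device(**attrs) as dev:
        dev.clear()
        dev.update(nic_params.get("custom", {}))
        if "portMirroring" in nic_params:
            dev["portMirroring"] = nic_params["portMirroring"]

md = MetadataDescriptor()
store_nic_metadata(md, {"macAddr": "00:1a:4a:16:20:26",
                        "custom": {},
                        "portMirroring": ["pm_net_1"]})
with md.device(mac_address="00:1a:4a:16:20:26") as dev:
    print(dev["portMirroring"])  # ['pm_net_1']
```

With this in place, the migration destination could rebuild the NIC's portMirroring request from the stored metadata instead of the old-style conf.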

Comment 4 Tomas Jelinek 2017-10-04 08:04:20 UTC
Dan, it seems you are taking ownership of it; moving to network.

Comment 5 Michal Skrivanek 2017-10-18 17:42:29 UTC
Is it not a better fix to actually migrate hotplug to VM XML? I know we still do that dirty stuff with metadata there... but at least it would be consistent, no?

Comment 6 Francesco Romani 2017-10-30 10:42:33 UTC
(In reply to Michal Skrivanek from comment #5)
> is it not a better fix to actually migrate hotplug to vm xml? i know we
> still do that dirty stuff with metadata there...but at least it would be
> consistent, no?

Yes, it surely seems a better option, even though we still must support the old API for the time being.

Comment 7 Yaniv Kaul 2017-11-16 08:11:21 UTC
I don't see why it's marked as a blocker. Removing blocker flag.

Comment 8 Dan Kenigsberg 2017-11-21 16:43:32 UTC
(In reply to Yaniv Kaul from comment #7)
> I don't see why it's marked as a blocker. Removing blocker flag.

It is a regression due to EngineDomXML. I believe that people are using portMirroring, and unless fixed, they are going to lose it after migration.

Comment 9 Michal Skrivanek 2017-11-21 18:03:22 UTC
The alternative fix is to finish the hotplug -> xml move. Milan just started on that

Comment 10 Yaniv Lavi 2017-11-22 08:54:49 UTC
(In reply to Dan Kenigsberg from comment #8)
> (In reply to Yaniv Kaul from comment #7)
> > I don't see why it's marked as a blocker. Removing blocker flag.
> 
> It is a regression due to EngineDomXML. I believe that people are using
> portMirroring, and unless fixed, they are going to lose it after migration.

We can't have port mirroring broken after the upgrade. I consider this a blocker.

Comment 11 Michal Skrivanek 2017-11-22 08:59:28 UTC
note it's for hotplugged devices only

Comment 12 Francesco Romani 2017-11-28 14:39:18 UTC
Looks like we unfortunately need some Engine changes to work with Milan's https://gerrit.ovirt.org/84602 ; Engine needs to send new parameters for the XML and its metadata.

Also, we need another patch (my https://gerrit.ovirt.org/#/c/84788/) to make sure the portMirroring info is stored when the NIC is hotplugged using the legacy params

Comment 13 Michal Skrivanek 2017-11-29 10:38:18 UTC
- another attempt to remove the blocker flag
- moving under virt
- the original problem is supposedly fixed; keeping the bug open to finish the hotplug VM XML changes

Comment 14 Red Hat Bugzilla Rules Engine 2017-11-29 10:38:25 UTC
This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

Comment 15 Edward Haas 2017-11-29 11:56:09 UTC
(In reply to Michal Skrivanek from comment #13)
> - another attempt to remove the blocker flag
> - moving under virt
> - the original problem is supposedly fixed; keeping the bug open to finish
> the hotplug VM XML changes

With http://gerrit.ovirt.org/84825 merged, we are good to go.
Tests on my setup passed successfully.

The BZ can move to MODIFIED.

Comment 16 Michal Skrivanek 2017-11-29 12:02:33 UTC
The hotplug needs to be done correctly for 4.2 GA so we at least have a chance to eliminate the old code when we stop supporting 4.2.

Since the problem is fixed, let me try to remove the blocker+ one more time and keep the bug for the hotplug implementation; that should at least get it off the radar.

Comment 17 Edward Haas 2017-11-30 11:56:23 UTC
http://gerrit.ovirt.org/84866 is not relevant for this BZ.

Comment 18 Dan Kenigsberg 2017-11-30 12:03:32 UTC
I believe that the bug is properly solved, but let us get QA's certification.

Comment 19 Meni Yakove 2017-12-12 07:17:51 UTC
How come this bug is ON_QA while 3 patches are in POST status?

Comment 20 Michael Burman 2017-12-12 07:21:54 UTC
(In reply to Meni Yakove from comment #19)
> How come this bug is ON_QA while 3 patches are in POST status?

Good question. Also, does the d/s build have this fix?

vdsm-4.20.9-1.el7ev.x86_64?

Comment 21 Dan Kenigsberg 2017-12-12 09:55:05 UTC
I don't think that the recently-added patches are related to this bug; let's call in Milan to explain why he re-used the same bug URL.

vdsm-4.20.9-1.el7ev.x86_64 and the later vdsm-4.20.9.1 have Edy's fix.

Comment 22 Michael Burman 2017-12-13 07:34:18 UTC
Hi Milan,
Please remove the 4 patches you added to this bug; it's the wrong BZ process and the patches are not relevant for this bug.

Many thanks.

Comment 23 Michael Burman 2017-12-13 09:29:32 UTC
OK, so,

Dan, to make things clear here: vdsm-4.20.9-1.el7ev.x86_64 didn't have the fix and it failed QA.
Today we already got vdsm-4.20.9.1-1.el7ev.x86_64, and only now is it fixed and working as expected.

I wanted to fail this bug on vdsm-4.20.9-1.el7ev.x86_64, but you are lucky and we just got the new build, so we could re-test it on the latest vdsm.

vdsm-4.20.9-1.el7ev.x86_64 - failedQA

Verified only on - vdsm-4.20.9.1-1.el7ev.x86_64 with 4.2.0.2-0.1.el7

Comment 24 Milan Zamazal 2017-12-13 09:33:47 UTC
Hi Michael, removed, sorry for the confusion.

Comment 25 Sandro Bonazzola 2017-12-20 11:34:21 UTC
This bugzilla is included in oVirt 4.2.0 release, published on Dec 20th 2017.

Since the problem described in this bug report should be
resolved in the oVirt 4.2.0 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

