
Bug 1533778

Summary: Hotplug VM vNIC with port mirroring failed
Product: [oVirt] vdsm
Reporter: Meni Yakove <myakove>
Component: Core
Assignee: Francesco Romani <fromani>
Status: CLOSED CURRENTRELEASE
QA Contact: Mor <mkalfon>
Severity: high
Docs Contact:
Priority: high
Version: 4.20.11
CC: bugs, danken, fromani, lveyde, mburman, michal.skrivanek, myakove, ylavi
Target Milestone: ovirt-4.2.2
Keywords: Automation, Regression
Target Release: ---
Flags: rule-engine: ovirt-4.2+, ylavi: blocker-, ylavi: exception+
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: vdsm v4.20.15
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-03-29 11:19:36 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Network
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1542013
Bug Blocks:
Attachments:
Description Flags
engine, vdsm and supervdsm logs none

Description Meni Yakove 2018-01-12 08:17:45 UTC
Created attachment 1380326
engine, vdsm and supervdsm logs

Description of problem:
Hotplug VM vNIC with port mirroring failed on RHEL 7.5

from vdsm.log:
2018-01-12 10:05:33,082+0200 ERROR (jsonrpc/7) [virt.vm] (vmId='72b85ef7-dbd0-4a07-9afe-a3b90e2317fb') setPortMirroring for network pm_net_1 failed (vm:2891)
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/vdsm/virt/vm.py", line 2883, in hotplugNic
    supervdsm.getProxy().setPortMirroring(network, nic.name)
  File "/usr/lib/python2.7/site-packages/vdsm/common/supervdsm.py", line 55, in __call__
    return callMethod()
  File "/usr/lib/python2.7/site-packages/vdsm/common/supervdsm.py", line 53, in <lambda>
    **kwargs)
  File "<string>", line 2, in setPortMirroring
  File "/usr/lib64/python2.7/multiprocessing/managers.py", line 773, in _callmethod
    raise convert_to_error(kind, result)
TrafficControlException: (22, 'RTNETLINK answers: Invalid argument', ['/sbin/tc', 'filter', 'replace', 'dev', 'pm_net_1', 'protocol', 'all', 'parent', 'ffff:', 'handle', '800::800', 'pref', '49152', 'u32', 'match', 'u8', '0', '0', 'action', 'mirred', 'egress', 'mirror', 'dev', 'vnet5', 'action', 'mirred', 'egress', 'mirror', 'dev', 'vnet7'])
2018-01-12 10:05:33,087+0200 INFO  (jsonrpc/7) [virt.vm] (vmId='72b85ef7-dbd0-4a07-9afe-a3b90e2317fb') Hotunplug NIC xml: <?xml version='1.0' encoding='utf-8'?>
<interface type="bridge">
    <address bus="0x00" domain="0x0000" function="0x0" slot="0x0a" type="pci" />
    <mac address="00:1a:4a:16:20:23" />
    <model type="virtio" />
    <source bridge="pm_net_1" />
    <filterref filter="vdsm-no-mac-spoofing" />
    <link state="up" />
</interface>
 (vm:3198)


from engine.log:
2018-01-12 10:05:31,760+02 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.HotPlugNicVDSCommand] (default task-1) [344bc439] NIC hot-set: <?xml version="1.0" encoding="UTF-8"?><hotplug>
  <devices>
    <interface type="bridge">
      <model type="virtio"/>
      <link state="up"/>
      <source bridge="pm_net_1"/>
      <address bus="0x00" domain="0x0000" function="0x0" slot="0x0a" type="pci"/>
      <mac address="00:1a:4a:16:20:23"/>
      <filterref filter="vdsm-no-mac-spoofing"/>
      <bandwidth/>
    </interface>
  </devices>
  <metadata xmlns:ovirt-vm="http://ovirt.org/vm/1.0">
    <ovirt-vm:vm>
      <ovirt-vm:device mac_address="00:1a:4a:16:20:23">
        <ovirt-vm:portMirroring>
          <ovirt-vm:network>pm_net_1</ovirt-vm:network>
        </ovirt-vm:portMirroring>
        <ovirt-vm:custom/>
      </ovirt-vm:device>
    </ovirt-vm:vm>
  </metadata>
</hotplug>

2018-01-12 10:05:32,712+02 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.DumpXmlsVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-90) [] START, DumpXmlsVDSCommand(HostName = host_mixed_1, Params:{hostId='d0f198af-6eea-4539-8501-2b752f2e8db4', vmIds='[72b85ef7-dbd0-4a07-9afe-a3b90e2317fb]'}), log id: 765f2cfe
2018-01-12 10:05:32,841+02 INFO  [org.ovirt.engine.core.vdsbroker.vdsbroker.DumpXmlsVDSCommand] (EE-ManagedThreadFactory-engineScheduled-Thread-90) [] FINISH, DumpXmlsVDSCommand, return: [{vmId=72b85ef7-dbd0-4a07-9afe-a3b90e2317fb, devices=[Ljava.util.Map;@5588bdef, guestDiskMapping={36ec24a1-ab7c-4066-85ac-d020ddc60c77={name=/dev/vda}}}], log id: 765f2cfe
2018-01-12 10:05:33,787+02 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.HotPlugNicVDSCommand] (default task-1) [344bc439] Failed in 'HotPlugNicVDS' method
2018-01-12 10:05:33,797+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-1) [344bc439] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,

Version-Release number of selected component (if applicable):
vdsm-4.20.11-1.el7ev.x86_64
ovirt-engine-4.2.1-0.2.el7.noarch
libvirt-3.9.0-6.el7.x86_64
Red Hat Enterprise Linux Server 7.5 Beta (Maipo)


Steps to Reproduce:
1. Start VM with vNIC (no port mirroring profile)
2. Hot unplug the vNIC from the VM
3. Change the vNIC profile to profile with port mirroring
4. Hotplug the vNIC

Comment 1 Yaniv Kaul 2018-01-14 08:15:34 UTC
I assume it works on a RHEL 7.4 based host? So it's somewhat of a 7.5 issue?

Comment 2 Dan Kenigsberg 2018-01-14 14:08:52 UTC
Meni, can you try port mirroring of multiple VMs without hotplug?

If hotplug is failing due to bug 1533762, you cannot expect port-mirroring to work. I am asking for multiple VMs because the failing command tries to set it on two:
  /sbin/tc filter replace dev pm_net_1 protocol all parent 800c: u32 match u8 0 0 action mirred egress mirror dev vnet5

Comment 3 Meni Yakove 2018-01-15 09:17:19 UTC
(In reply to Yaniv Kaul from comment #1)
> I assume it works on a RHEL 7.4 based host? So it's somewhat of a 7.5 issue?

It worked on 7.4, but I guess VDSM has changed since then as well.

(In reply to Dan Kenigsberg from comment #2)
> Meni, can you try port mirroring of multiple VMs without hotplug?

Port mirroring on a fresh VM works just fine.

> 
> If hotplug is failing due to bug 1533762, you cannot expect port-mirroring
> to work. 

bug 1533762 is limited to Empty vNICs; here the vNIC is connected.

> I am asking for multiple VMs because the failing command tries to
> set it on two:
>   /sbin/tc filter replace dev pm_net_1 protocol all parent 800c: u32 match
> u8 0 0 action mirred egress mirror dev vnet5

I'll try and report back.

Comment 4 Meni Yakove 2018-01-15 12:04:52 UTC
/sbin/tc filter replace dev pm_net_1 protocol all parent ffff: handle 800::800 pref 49152 u32 match u8 0 0 action mirred egress mirror dev vnet5 action mirred egress mirror dev vnet7
Cannot find device "vnet7"
bad action parsing
parse_action: bad value (5:mirred)!
Illegal "action"

It seems that we try to update vnet7, but this vnet does not exist on the host.
brctl show
bridge name	bridge id		STP enabled	interfaces
;vdsmdummy;		8000.000000000000	no		
ovirtmgmt		8000.002590c6d982	no		enp1s0f0
							vnet0
							vnet1
							vnet2
							vnet3
							vnet4
pm_net_1		8000.002590c6d983	no		enp1s0f1.162
							vnet11
							vnet13
							vnet5
							vnet9
pm_net_2		8000.00e0ed33c092	no		bond01.163
							vnet10
							vnet12
							vnet14
							vnet6
							vnet8

I guess that vnet7 is the vNIC that we hot-unplugged.
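
To illustrate the failure mode described in this comment, here is a minimal Python sketch. It is not the VDSM fix and does not reflect how VDSM actually builds the command; the helper names `existing_devices` and `build_mirror_cmd` are hypothetical, and reading /sys/class/net to list devices is an assumption made for the example. Only the tc argument layout is copied from the failing command in the log. The point is that one stale target (vnet7 here) makes the whole tc invocation fail, so dropping targets that no longer exist before building the command would avoid the error:

```python
import os


def existing_devices(sysfs_net="/sys/class/net"):
    """Return the names listed under sysfs_net (assumed to be net devices)."""
    try:
        return set(os.listdir(sysfs_net))
    except OSError:
        return set()


def build_mirror_cmd(network, targets, present):
    """Build a tc command like the one in the log, skipping stale targets.

    `targets` are the mirror destinations we would like to use and
    `present` is the set of devices that exist right now.
    Returns None when no target is left.
    """
    live = [dev for dev in targets if dev in present]
    if not live:
        return None
    cmd = ['/sbin/tc', 'filter', 'replace', 'dev', network,
           'protocol', 'all', 'parent', 'ffff:', 'handle', '800::800',
           'pref', '49152', 'u32', 'match', 'u8', '0', '0']
    for dev in live:
        cmd += ['action', 'mirred', 'egress', 'mirror', 'dev', dev]
    return cmd


if __name__ == '__main__':
    # vnet7 was hot-unplugged, so only vnet5 remains as a mirror target.
    print(build_mirror_cmd('pm_net_1', ['vnet5', 'vnet7'],
                           {'vnet5', 'vnet9'}))
```

On a real host one would pass `existing_devices()` as the third argument; the literal set above just mirrors the brctl output in this comment.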

Comment 5 Meni Yakove 2018-01-15 12:34:09 UTC
Before hotplug:
brctl show
bridge name	bridge id		STP enabled	interfaces
;vdsmdummy;		8000.000000000000	no		
ovirtmgmt		8000.002590c6d982	no		enp1s0f0
							vnet0
							vnet1
							vnet2
							vnet3
							vnet4
pm_net_1		8000.002590c6d983	no		enp1s0f1.162
							vnet11
							vnet13
							vnet5
							vnet7
							vnet9
pm_net_2		8000.00e0ed33c092	no		bond01.163
							vnet10
							vnet12
							vnet14
							vnet6
							vnet8


After hot-unplug:
brctl show
bridge name	bridge id		STP enabled	interfaces
;vdsmdummy;		8000.000000000000	no		
ovirtmgmt		8000.002590c6d982	no		enp1s0f0
							vnet0
							vnet1
							vnet2
							vnet3
							vnet4
pm_net_1		8000.002590c6d983	no		enp1s0f1.162
							vnet11
							vnet13
							vnet5
							vnet9
pm_net_2		8000.00e0ed33c092	no		bond01.163
							vnet10
							vnet12
							vnet14
							vnet6
							vnet8
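
The before/after comparison above can also be done mechanically. The following is a rough Python sketch, assuming the whitespace-separated layout that brctl show prints in this comment; the function names `parse_brctl_show` and `removed_ports` are made up for the example and the sample strings are shortened copies of the output above:

```python
def parse_brctl_show(text):
    """Map each bridge name to the set of interfaces attached to it."""
    bridges = {}
    current = None
    for line in text.splitlines():
        if not line.strip() or line.startswith('bridge name'):
            continue  # skip blank lines and the header
        fields = line.split()
        if line[0].isspace():
            # Continuation line: another interface of the current bridge.
            if current is not None:
                bridges[current].update(fields)
        else:
            # New bridge: name, id, STP flag, optional first interface.
            current = fields[0]
            bridges[current] = set(fields[3:])
    return bridges


def removed_ports(before, after):
    """Return {bridge: ports present in `before` but missing in `after`}."""
    diff = {}
    for bridge, ports in before.items():
        gone = ports - after.get(bridge, set())
        if gone:
            diff[bridge] = gone
    return diff


HEADER = "bridge name\tbridge id\t\tSTP enabled\tinterfaces\n"
BEFORE = (HEADER +
          ";vdsmdummy;\t\t8000.000000000000\tno\t\t\n"
          "pm_net_1\t\t8000.002590c6d983\tno\t\tenp1s0f1.162\n"
          "\t\t\t\t\t\t\tvnet5\n"
          "\t\t\t\t\t\t\tvnet7\n")
AFTER = (HEADER +
         ";vdsmdummy;\t\t8000.000000000000\tno\t\t\n"
         "pm_net_1\t\t8000.002590c6d983\tno\t\tenp1s0f1.162\n"
         "\t\t\t\t\t\t\tvnet5\n")

if __name__ == '__main__':
    print(removed_ports(parse_brctl_show(BEFORE), parse_brctl_show(AFTER)))
```

With the shortened samples, the only difference reported is vnet7 on pm_net_1, which matches what the two listings show.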

Comment 6 Red Hat Bugzilla Rules Engine 2018-01-16 11:25:33 UTC
This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

Comment 7 Francesco Romani 2018-01-17 15:05:11 UTC
http://gerrit.ovirt.org/84866 merged -> MODIFIED

Comment 8 Meni Yakove 2018-01-23 08:40:01 UTC
The fix is not in the current build vdsm-4.20.14-1.el7ev.x86_64

/usr/lib/python2.7/site-packages/vdsm/virt/vm.py:
with self.setLinkAndNetwork(netDev, netConf, linkValue, network,
                            custom, specParams):
    with self.updatePortMirroring(netConf, netsToMirror):
        return {'status': doneCode, 'vmList': self.status()}

Comment 9 Meni Yakove 2018-01-29 08:37:00 UTC
With vdsm-4.20.17-1.el7ev.x86_64 we still fail with the same error.

Comment 10 Michal Skrivanek 2018-01-29 14:49:17 UTC
sure, because tag 4.20.17 doesn't contain the code. It shouldn't have been moved to 4.2.1 and the bot shouldn't move it to ON_QA

Comment 11 Dan Kenigsberg 2018-01-29 20:40:47 UTC
(In reply to Michal Skrivanek from comment #10)
> sure, because tag 4.20.17 doesn't contain the code. It shouldn't have been
> moved to 4.2.1 and the bot shouldn't move it to ON_QA

Would you point me to that code? The only patch mentioned in this bug is https://gerrit.ovirt.org/#/c/84866/ and it is in v4.20.15

Comment 12 Michal Skrivanek 2018-01-29 21:12:49 UTC
Oh, my bad, I misread the comments. I meant 4.20.14, the one from comment #8. Reading it again, I do not know what I was thinking. My apologies.

I wonder if the warning during unplug is relevant: it looks like the NIC perhaps wasn't removed from the internal device list, and then we try to set portMirroring on a non-existent device.
2018-01-12 10:05:29,610+0200 WARN  (libvirt/events) [virt.vm] (vmId='72b85ef7-dbd0-4a07-9afe-a3b90e2317fb') Removed device not found in conf: net1 (vm:5956)

Comment 13 Francesco Romani 2018-01-30 17:03:58 UTC
This is the call that fails:
2018-01-12 10:05:31,759+0200 INFO  (jsonrpc/7) [api.virt] START hotplugNic(params={'xml': '<?xml version="1.0" encoding="UTF-8"?><hotplug><devices><interface type="bridge"><model type="virtio"></model><link state="up"></link><source bridge="pm_net_1"></source><address bus="0x00" domain="0x0000" function="0x0" slot="0x0a" type="pci"></address><mac address="00:1a:4a:16:20:23"></mac><filterref filter="vdsm-no-mac-spoofing"></filterref><bandwidth></bandwidth></interface></devices><metadata xmlns:ovirt-vm="http://ovirt.org/vm/1.0"><ovirt-vm:vm><ovirt-vm:device mac_address="00:1a:4a:16:20:23"><ovirt-vm:portMirroring><ovirt-vm:network>pm_net_1</ovirt-vm:network></ovirt-vm:portMirroring><ovirt-vm:custom></ovirt-vm:custom></ovirt-vm:device></ovirt-vm:vm></metadata></hotplug>', 'nic': {'nicModel': 'pv', 'macAddr': '00:1a:4a:16:20:23', 'linkActive': 'true', 'network': 'pm_net_1', 'filterParameters': [], 'filter': 'vdsm-no-mac-spoofing', 'specParams': {'inbound': {}, 'outbound': {}}, 'deviceId': 'eba87202-035f-45e6-8201-f0c0f1d4e251', 'address': {'function': '0x0', 'bus': '0x00', 'domain': '0x0000', 'type': 'pci', 'slot': '0x0a'}, 'device': 'bridge', 'type': 'interface', 'portMirroring': ['pm_net_1']}, 'vmId': '72b85ef7-dbd0-4a07-9afe-a3b90e2317fb'})

I find it suspicious that we are trying to mirror the same network to which the vNIC itself seems to be attached:

network='pm_net_1'
portMirroring=['pm_net_1']

(same in XML)

Dan, is this a supported operation?
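
For anyone who wants to check other hotplug requests for the same pattern, here is a small Python sketch that extracts the bridge and the portMirroring networks from the hotplug XML and reports the overlap. It only detects the condition; it makes no claim about whether the configuration is supported. The function name `mirrored_onto_own_network` is hypothetical, and the sample XML is a trimmed copy of the request above with the XML declaration removed:

```python
import xml.etree.ElementTree as ET

OVIRT_VM_NS = 'http://ovirt.org/vm/1.0'

SAMPLE = (
    '<hotplug><devices><interface type="bridge">'
    '<source bridge="pm_net_1"/>'
    '<mac address="00:1a:4a:16:20:23"/>'
    '</interface></devices>'
    '<metadata xmlns:ovirt-vm="http://ovirt.org/vm/1.0"><ovirt-vm:vm>'
    '<ovirt-vm:device mac_address="00:1a:4a:16:20:23">'
    '<ovirt-vm:portMirroring>'
    '<ovirt-vm:network>pm_net_1</ovirt-vm:network>'
    '</ovirt-vm:portMirroring>'
    '</ovirt-vm:device></ovirt-vm:vm></metadata></hotplug>'
)


def mirrored_onto_own_network(hotplug_xml):
    """Return networks that are both the vNIC bridge and a mirrored network."""
    root = ET.fromstring(hotplug_xml)
    # Bridges the hot-plugged interface(s) attach to.
    bridges = {src.get('bridge')
               for src in root.findall('./devices/interface/source')}
    # Networks listed under <ovirt-vm:portMirroring> in the metadata.
    path = './/{%s}portMirroring/{%s}network' % (OVIRT_VM_NS, OVIRT_VM_NS)
    mirrored = {net.text for net in root.findall(path)}
    return bridges & mirrored


if __name__ == '__main__':
    print(mirrored_onto_own_network(SAMPLE))
```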

Comment 14 Francesco Romani 2018-02-01 13:51:32 UTC
patch https://gerrit.ovirt.org/87011 will not fix this issue.

I tried to reproduce with

Vdsm master snapshot 2fe542a9c
Engine 4.2.1.5
libvirt-daemon-3.2.0-14.el7_4.7.x86_64
libvirt-python-3.2.0-3.el7_4.1.x86_64
qemu-kvm-ev-2.9.0-16.el7_4.13.1.x86_64

tried the following:
1. run VM, hotplug one NIC with port mirroring -> OK
2. run VM, hotplug 6 (SIX) NICs withOUT port mirroring, then one with port mirroring -> OK

Notes:
1. during tests, I killed QEMU and Vdsm failed to clean up, hence patch 87011
2. hotUNplug failed, known libvirt issue

So, does it reproduce on every box or just on some?
Does it work if you just run one VM, let it go UP, then attach one NIC with port mirroring?

Comment 15 Dan Kenigsberg 2018-02-01 15:49:44 UTC
Michael, maybe you can reproduce this bug manually, as the automated reproduction is too convoluted to follow?

Comment 16 Michael Burman 2018-02-01 16:01:26 UTC
(In reply to Dan Kenigsberg from comment #15)
> Michael, maybe you can reproduce this bug manually, as the automated
> reproduction is too convoluted to follow?

I didn't manage to reproduce it manually, only with automation.
Meni?

Comment 17 Meni Yakove 2018-02-04 09:58:32 UTC
We saw this only in automation.
Francesco, I have a setup that reproduces it. Please contact me.

Comment 18 Francesco Romani 2018-02-13 11:12:49 UTC
Looks like it is actually a kernel bug

Comment 19 Dan Kenigsberg 2018-02-13 11:14:25 UTC
I believe that the ovirt-side bug has been fixed. However, we now depend on kernel bug 1542013.

Comment 20 Francesco Romani 2018-02-14 09:48:15 UTC
no doc_text needed, should Just Work

Comment 21 RHV bug bot 2018-02-16 16:25:08 UTC
INFO: Bug status wasn't changed from MODIFIED to ON_QA due to the following reason:

[Open patch attached]

For more info please contact: infra

Comment 22 Dan Kenigsberg 2018-02-21 21:25:18 UTC
please retest with kernel-3.10.0-854.el7

Comment 23 Meni Yakove 2018-02-26 08:31:08 UTC
ovirt-engine-4.2.2.1-0.1.el7.noarch
kernel 3.10.0-855.el7.x86_64

Comment 24 Sandro Bonazzola 2018-03-29 11:19:36 UTC
This bugzilla is included in oVirt 4.2.2 release, published on March 28th 2018.

Since the problem described in this bug report should be
resolved in oVirt 4.2.2 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.