Bug 1261352

Summary: [SR-IOV] - 'pci-passthrough' vNIC reported as unplugged in UI once the VM is running, although the vNIC's state is UP and plugged
Product: Red Hat Enterprise Virtualization Manager
Reporter: Michael Burman <mburman>
Component: vdsm
Assignee: Dan Kenigsberg <danken>
Status: CLOSED CURRENTRELEASE
QA Contact: Michael Burman <mburman>
Severity: urgent
Docs Contact:
Priority: medium
Version: 3.6.0
CC: alkaplan, bazulay, danken, gklein, lsurette, mpoledni, myakove, rbalakri, Rhev-m-bugs, srevivo, ycui, ykaul, ylavi
Target Milestone: ovirt-3.6.1
Flags: ylavi: Triaged+
Target Release: 3.6.1
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Clones: 1274316 (view as bug list)
Environment:
Last Closed: 2016-04-20 01:39:41 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Network
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1274316
Attachments: logs and screen shots (no flags)

Description Michael Burman 2015-09-09 08:06:23 UTC
Created attachment 1071604 [details]
logs and screen shots

Description of problem:
[SR-IOV] - 'pci-passthrough' vNIC reported as unplugged in UI once the VM is running, although the vNIC's state is UP and plugged.

After starting a VM with a 'pci-passthrough' vNIC (the vNIC is plugged), the vNIC's state changes to unplugged in the UI, but the guest OS reports the vNIC as plugged.


vdsm log -->

Traceback (most recent call last):
  File "/usr/share/vdsm/virt/vm.py", line 1384, in _getRunningVmStats
    vm_sample.interval)
  File "/usr/share/vdsm/virt/vmstats.py", line 42, in produce
    networks(vm, stats, first_sample, last_sample, interval)
  File "/usr/share/vdsm/virt/vmstats.py", line 213, in networks
    first_indexes = _find_bulk_stats_reverse_map(first_sample, 'net')
  File "/usr/share/vdsm/virt/vmstats.py", line 340, in _find_bulk_stats_reverse_map
    name_to_idx[stats['%s.%d.name' % (group, idx)]] = idx
KeyError: 'net.0.name'
Thread-35436::INFO::2015-09-09 08:53:55,272::xmlrpc::92::vds.XMLRPCServer::(_process_requests) Request handler for 127.0.0.1:47532 stopped
Thread-35437::ERROR::2015-09-09 08:54:01,080::vm::1387::virt.vm::(_getRunningVmStats) vmId=`c7758e8d-610e-4f43-8504-ed6acf5e2513`::Error fetching vm stats
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/vm.py", line 1384, in _getRunningVmStats
    vm_sample.interval)
  File "/usr/share/vdsm/virt/vmstats.py", line 42, in produce
    networks(vm, stats, first_sample, last_sample, interval)
  File "/usr/share/vdsm/virt/vmstats.py", line 213, in networks
    first_indexes = _find_bulk_stats_reverse_map(first_sample, 'net')
  File "/usr/share/vdsm/virt/vmstats.py", line 340, in _find_bulk_stats_reverse_map
    name_to_idx[stats['%s.%d.name' % (group, idx)]] = idx
KeyError: 'net.0.name'


Version-Release number of selected component (if applicable):
3.6.0-0.13.master.el6

How reproducible:
50%-90% of attempts

Steps to Reproduce:
1. SR-IOV setup; enable 1 VF on the host
2. Create a network with a 'passthrough' profile and add a vNIC with this profile to a VM (pci-passthrough type)
3. Run the VM

Actual results:
The vNIC's state changes to unplugged in the UI once the VM is started.
The guest OS reports the vNIC as plugged, and the vNIC got an IP from DHCP.

Expected results:
The vNIC's state shouldn't change to unplugged in the UI; it should stay plugged, and the UI should report the vNIC as plugged.

Comment 1 Michael Burman 2015-09-10 15:08:39 UTC
*** Bug 1261369 has been marked as a duplicate of this bug. ***

Comment 2 Michael Burman 2015-09-10 15:09:33 UTC
*** Bug 1261368 has been marked as a duplicate of this bug. ***

Comment 3 Michael Burman 2015-09-10 15:11:12 UTC
*** Bug 1261365 has been marked as a duplicate of this bug. ***

Comment 4 Michael Burman 2015-09-10 15:11:15 UTC
*** Bug 1261363 has been marked as a duplicate of this bug. ***

Comment 5 Michael Burman 2015-09-10 15:11:45 UTC
*** Bug 1261357 has been marked as a duplicate of this bug. ***

Comment 6 Michael Burman 2015-10-14 05:59:57 UTC
This bug is causing problems testing the SR-IOV feature and may be blocking it.

- I can't get an IP for VLAN-tagged passthrough profiles.
- Sometimes the VM comes up and the vNIC is not reported at all in the guest OS.
- This bug should be fixed as soon as possible.

Comment 7 Yaniv Lavi 2015-10-14 13:29:01 UTC
Can we assign someone to look?

Comment 8 Alona Kaplan 2015-10-19 12:48:27 UTC
I think the bug is caused by the output of `vdsClient -s 0 list` containing the same device twice:

devices = [{'device': 'pci_0000_05_10_1', 'specParams': {'macAddr': '00:00:00:00:00:22'}, 'type': 'hostdev', 'deviceId': '6940d5e7-9814-4ae0-94ef-f78e68229e76'},
{'nicModel': 'passthrough', 'macAddr': '00:00:00:00:00:22', 'linkActive': True, 'name': 'hostdev0', 'alias': 'hostdev0', 'address': {'slot': '0x10', 'bus': '0x05', 'domain': '0x0000', 'type': 'pci', 'function': '0x1'}, 'device': 'hostdev', 'type': 'interface'}]

Both entries represent the same device. I'm not sure (I didn't debug it), but I think the engine tries to get the device's info from the first entry, while the actual info about the device resides in the second entry.
Those two entries should be merged into one entry that contains the deviceId as reported by the engine together with the device's actual data.
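The proposed merge could be sketched roughly like this; a hypothetical illustration (not vdsm code), keying the duplicated entries on their MAC address and using the field names from the listing above:

```python
# Sketch: fold the 'hostdev' placeholder entry (which carries the
# engine's deviceId) into the 'interface' entry that holds the
# device's actual runtime data. Function name is hypothetical.
def merge_passthrough_devices(devices):
    hostdevs = {}   # macAddr -> entry carrying the engine's deviceId
    merged = []
    for dev in devices:
        if dev.get('type') == 'hostdev':
            hostdevs[dev['specParams']['macAddr']] = dev
        else:
            merged.append(dev)
    for dev in merged:
        extra = hostdevs.get(dev.get('macAddr'))
        if extra is not None:
            # attach the deviceId reported by the engine to the entry
            # with the actual device data
            dev['deviceId'] = extra['deviceId']
    return merged
```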

Ido/Martin P- This bug was discovered a long time ago, and should have been already fixed by one of you.
Do you know why it still exists?

Comment 9 Martin Polednik 2015-10-20 07:36:58 UTC
The code wasn't really touched afaik. The fix requires some expertise in networking as I'm not aware of the implications of not using the interface device.

note:
This happens right when we parse the XML libvirt has constructed (getUnderlying*). We use hostdev for the device creation, but then expect the "nic device" to be populated. "Merging" is the answer, but I'm not sure which element will eventually be a better choice.
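For illustration, the domain XML libvirt constructs for such a device uses an `<interface type='hostdev'>` element rather than a regular bridge-backed `<interface>`, which may be why the code expecting a plain "nic device" comes up empty. A minimal assumed fragment, using the PCI address from the listing in comment 8 (exact attributes may differ):

```xml
<!-- Assumed libvirt domain XML fragment for a pci-passthrough vNIC -->
<interface type='hostdev' managed='yes'>
  <mac address='00:00:00:00:00:22'/>
  <source>
    <address type='pci' domain='0x0000' bus='0x05' slot='0x10' function='0x1'/>
  </source>
</interface>
```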

Comment 11 Michael Burman 2015-11-29 09:38:29 UTC
Verified on - 3.6.1-0.2.el6 and vdsm-4.17.11-0.el7ev.noarch