Bug 1370036 - Management interface is going down when trying to boot PF-VM when there is already VM associated to VF port
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-nova
Version: 10.0 (Newton)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 10.0 (Newton)
Assignee: Vladik Romanovsky
QA Contact: Prasanth Anbalagan
URL:
Whiteboard:
Depends On:
Blocks: 1233921
 
Reported: 2016-08-25 06:36 UTC by Eran Kuris
Modified: 2019-09-09 15:21 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-01-08 15:10:49 UTC
Target Upstream Version:


Attachments (Terms of Use)
network log (172.12 KB, text/plain)
2016-08-25 06:36 UTC, Eran Kuris


Links
System ID Private Priority Status Summary Last Updated
OpenStack gerrit 363884 0 None MERGED PCI: Fix PCI with fully qualified address 2020-02-06 17:17:47 UTC

Description Eran Kuris 2016-08-25 06:36:22 UTC
Created attachment 1193886
network log

Description of problem:
I booted a VM with a direct port (regular SR-IOV); it booted as expected.
After that I booted a VM with a direct-physical port (I expected this to fail), and the management port on the compute node went down.


The network log and SOS reports are attached.
Version-Release number of selected component (if applicable):
]# rpm -qa |grep nova
python-novaclient-5.0.1-0.20160724130722.6b11a1c.el7ost.noarch
openstack-nova-api-14.0.0-0.20160817225441.04cef3b.el7ost.noarch
puppet-nova-9.1.0-0.20160813014843.b94f0a0.el7ost.noarch
openstack-nova-common-14.0.0-0.20160817225441.04cef3b.el7ost.noarch
openstack-nova-novncproxy-14.0.0-0.20160817225441.04cef3b.el7ost.noarch
openstack-nova-conductor-14.0.0-0.20160817225441.04cef3b.el7ost.noarch
python-nova-14.0.0-0.20160817225441.04cef3b.el7ost.noarch
openstack-nova-scheduler-14.0.0-0.20160817225441.04cef3b.el7ost.noarch
openstack-nova-cert-14.0.0-0.20160817225441.04cef3b.el7ost.noarch
openstack-nova-console-14.0.0-0.20160817225441.04cef3b.el7ost.noarch
[root@controller1 ~(keystone_admin)]# rpm -qa |grep neutron
python-neutron-lib-0.3.0-0.20160803002107.405f896.el7ost.noarch
openstack-neutron-9.0.0-0.20160817153328.b9169e3.el7ost.noarch
puppet-neutron-9.1.0-0.20160813031056.7cf5e07.el7ost.noarch
python-neutron-9.0.0-0.20160817153328.b9169e3.el7ost.noarch
openstack-neutron-lbaas-9.0.0-0.20160816191643.4e7301e.el7ost.noarch
python-neutron-fwaas-9.0.0-0.20160817171450.e1ac68f.el7ost.noarch
python-neutron-lbaas-9.0.0-0.20160816191643.4e7301e.el7ost.noarch
openstack-neutron-ml2-9.0.0-0.20160817153328.b9169e3.el7ost.noarch
openstack-neutron-metering-agent-9.0.0-0.20160817153328.b9169e3.el7ost.noarch
openstack-neutron-openvswitch-9.0.0-0.20160817153328.b9169e3.el7ost.noarch
python-neutronclient-5.0.0-0.20160812094704.ec20f7f.el7ost.noarch
openstack-neutron-common-9.0.0-0.20160817153328.b9169e3.el7ost.noarch
openstack-neutron-fwaas-9.0.0-0.20160817171450.e1ac68f.el7ost.noarch


How reproducible:
Always

Steps to Reproduce:
1. On an SR-IOV environment, set up dynamic allocation of Physical Functions and Virtual Functions:
https://docs.google.com/document/d/1qQbJlLI1hSlE4uwKpmVd0BoGSDBd8Z0lTzx5itQ6WL0/edit#
2. Create a direct port and a direct-physical port.
3. Boot a VM with the direct port; this should work.
4. Boot a VM with the direct-physical port --> the management NIC on the compute node goes down.
Actual results:


Expected results:


Additional info:
https://drive.google.com/a/redhat.com/file/d/0B_izhJVSkOTDT0htZUhvZXFEVWc/view?usp=sharing

https://drive.google.com/a/redhat.com/file/d/0B_izhJVSkOTDOFRWdGgxVlB2ak0/view?usp=sharing

Comment 2 Eran Kuris 2016-08-25 06:38:17 UTC
The RFE is blocked by this bug: https://bugzilla.redhat.com/show_bug.cgi?id=1370036

Comment 3 Eran Kuris 2016-09-08 07:24:05 UTC
I think this happens because the MGMT interface and the SR-IOV port have the same vendor:product ID [8086:154d].
In nova.conf I am using pci_passthrough_whitelist in the form [{"vendor_id":"vendor_id_value", "product_id":"product_id_value", "physical_network":"physical_network_label"}],

because while devname seems to work fine for virtual functions, it does not seem to work for physical functions.


05:00.0 Ethernet controller [0200]: Intel Corporation Ethernet 10G 2P X520 Adapter [8086:154d] (rev 01)
05:00.1 Ethernet controller [0200]: Intel Corporation Ethernet 10G 2P X520 Adapter [8086:154d] (rev 01)

Maybe we need to use another, more specific form, such as bus-info: 0000:05:00.1
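
The overlap described in this comment can be illustrated with a minimal Python sketch. It is not nova's matching code: the `matches` helper and the `physnet1` label are hypothetical, and only the two PCI addresses and the [8086:154d] ID come from the lspci output above. It suggests why an entry keyed on vendor and product ID alone selects both ports of the adapter, whichever one carries the management traffic.

```python
# Illustrative sketch only; nova's real whitelist matching is more involved.
# The two devices mirror the lspci output in this comment.
devices = [
    {"address": "0000:05:00.0", "vendor_id": "8086", "product_id": "154d"},
    {"address": "0000:05:00.1", "vendor_id": "8086", "product_id": "154d"},
]

# Hypothetical whitelist entry in the vendor/product form used in nova.conf.
# "physnet1" is an assumed label, not taken from the bug report.
whitelist_entry = {
    "vendor_id": "8086",
    "product_id": "154d",
    "physical_network": "physnet1",
}


def matches(spec, dev):
    """Return True when every device-identifying key in spec equals the device's value."""
    # physical_network is a tag attached to the device, not a match criterion.
    return all(dev.get(k) == v for k, v in spec.items() if k != "physical_network")


matched = [d["address"] for d in devices if matches(whitelist_entry, d)]
print(matched)  # ['0000:05:00.0', '0000:05:00.1']
```

Both functions match, so the port used for host management is also eligible for passthrough.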

Comment 5 Vladik Romanovsky 2016-10-17 13:30:53 UTC
Hi Eran,

There is a patch upstream [1] that will allow whitelisting a specific PF device by its address, or by a partial address using a wildcard:
"address":"0000:05:00.1" or "address":"0000:05:00.*"
I'll backport it once it's merged, if you find it necessary.

However, I think the SR-IOV functionality was never designed to consider one of the ports to be the host management interface.

Thanks,
Vladik

[1] https://review.openstack.org/#/c/363884/
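
The address-based forms quoted in this comment can be sketched in Python as follows. This is an assumption-laden illustration, not the code from the upstream patch: it uses the standard library's `fnmatchcase` as a stand-in for the wildcard handling, and the `address_matches` helper and the `physnet1` label are hypothetical.

```python
import json
from fnmatch import fnmatchcase


def address_matches(pattern, address):
    """Glob-style comparison of a PCI address against a whitelist pattern."""
    # Stand-in for nova's own address parsing; only '*' wildcards are exercised here.
    return fnmatchcase(address, pattern)


# The two ports of the X520 adapter from comment 3.
addresses = ["0000:05:00.0", "0000:05:00.1"]

# A fully qualified address selects exactly one function.
exact = [a for a in addresses if address_matches("0000:05:00.1", a)]

# A wildcard on the function number selects every function on that slot.
wild = [a for a in addresses if address_matches("0000:05:00.*", a)]

print(exact)  # ['0000:05:00.1']
print(wild)   # ['0000:05:00.0', '0000:05:00.1']

# Hypothetical value for pci_passthrough_whitelist that exposes only 05:00.1;
# "physnet1" is an assumed physical network label.
entry = json.dumps([{"address": "0000:05:00.1", "physical_network": "physnet1"}])
print(entry)
```

With the fully qualified address, only the intended port is whitelisted, so the other port can stay dedicated to host management.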

Comment 6 Ollie Walsh 2018-01-08 15:10:49 UTC
Closing this as https://review.openstack.org/#/c/363884/ was merged over a year ago.

