Bug 1454624 - SRIOV Minor update OSP10 to OSP10z3 failed when PF assign to instance
Summary: SRIOV Minor update OSP10 to OSP10z3 failed when PF assign to instance
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: puppet-tripleo
Version: 10.0 (Newton)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: urgent
Target Milestone: z4
Target Release: 10.0 (Newton)
Assignee: Brent Eagles
QA Contact: Eran Kuris
URL:
Whiteboard:
Depends On: 1482390 1485452
Blocks: 1454634 1479029
 
Reported: 2017-05-23 08:13 UTC by Eran Kuris
Modified: 2017-09-06 17:09 UTC (History)

Fixed In Version: puppet-tripleo-5.6.0-6.el7ost
Doc Type: Release Note
Doc Text:
Workaround: Before you update or upgrade OpenStack, delete the guest that is attached to the PF. The update or upgrade will then complete successfully.
Clone Of:
: 1454634 (view as bug list)
Environment:
Last Closed: 2017-09-06 17:09:30 UTC
Target Upstream Version:
Embargoed:


Attachments
openstack stack failures list --long overcloud (22.77 KB, text/plain)
2017-05-23 08:13 UTC, Eran Kuris


Links
System ID Private Priority Status Summary Last Updated
Launchpad 1701284 0 None None None 2017-06-29 17:43:02 UTC
OpenStack gerrit 483919 0 None MERGED Do not fail if PCI device is missing 2020-09-17 02:57:06 UTC
Red Hat Product Errata RHBA-2017:2654 0 normal SHIPPED_LIVE Red Hat OpenStack Platform 10 director Bug Fix Advisory 2017-09-06 20:55:36 UTC

Description Eran Kuris 2017-05-23 08:13:40 UTC
Created attachment 1281362
openstack stack failures list --long overcloud

Description of problem:
Deployed OSP10 with OVS 2.5 (1 controller, 2 computes) and created three types of instances: normal port, direct port (VF), and direct-physical port (PF).
When I ran an update to OSP10z3 with OVS 2.6, the process failed because the system could not find the PF NIC.

   Warning: Scope(Class[Nova]): Could not look up qualified variable '::nova::scheduler::filter::cpu_allocation_ratio'; class ::nova::scheduler::filter has not been evaluated
    Warning: Scope(Class[Nova]): Could not look up qualified variable '::nova::scheduler::filter::ram_allocation_ratio'; class ::nova::scheduler::filter has not been evaluated
    Warning: Scope(Class[Nova]): Could not look up qualified variable '::nova::scheduler::filter::disk_allocation_ratio'; class ::nova::scheduler::filter has not been evaluated
    Warning: Scope(Class[Nova::Compute]): compute_manager is marked as deprecated in Nova but still needed when Ironic is used. It will be removed once Nova removes it.
    Warning: Scope(Class[Nova::Vncproxy::Common]): Could not look up qualified variable '::nova::vncproxy::host'; class ::nova::vncproxy has not been evaluated
    Warning: Scope(Class[Nova::Vncproxy::Common]): Could not look up qualified variable '::nova::vncproxy::vncproxy_protocol'; class ::nova::vncproxy has not been evaluated
    Warning: Scope(Class[Nova::Vncproxy::Common]): Could not look up qualified variable '::nova::vncproxy::port'; class ::nova::vncproxy has not been evaluated
    Warning: Scope(Class[Nova::Vncproxy::Common]): Could not look up qualified variable '::nova::vncproxy::vncproxy_path'; class ::nova::vncproxy has not been evaluated
    Warning: Scope(Class[Ceilometer]): Both $metering_secret and $telemetry_secret defined, using $telemetry_secret
    Warning: Scope(Class[Ceilometer::Agent::Compute]): This class is deprecated. Please use ceilometer::agent::polling with compute namespace instead.
    Error: /sys/class/net/p1p1/device/sriov_numvfs doesn't exist. Check if p1p1 is a valid network interface supporting SR-IOV
    Error: /Stage[main]/Tripleo::Host::Sriov/Sriov_vf_config[p1p1:5]/ensure: change from absent to present failed: /sys/class/net/p1p1/device/sriov_numvfs doesn't exist. Check if p1p1 is a valid network interface supporting SR-IOV                                                                                                                                                      
    Warning: /Firewall[998 log all]: Skipping because of failed dependencies
    Warning: /Firewall[999 drop all]: Skipping because of failed dependencies
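
The error above refers to a sysfs file under /sys/class/net/p1p1. As a rough diagnostic sketch (not part of the original report), the following shell function distinguishes an interface that is absent from the host from one that is present but lacks sriov_numvfs. The function name check_sriov_iface and the SYSFS_NET override are hypothetical, and the assumption that a PF assigned to a guest disappears from the host's /sys/class/net is inferred from the failure described here.

```shell
# Hypothetical helper: report how the host currently sees an SR-IOV interface.
# SYSFS_NET is an assumed override so the check can run without real hardware.
check_sriov_iface() {
    iface="$1"
    sysfs_net="${SYSFS_NET:-/sys/class/net}"
    numvfs_file="$sysfs_net/$iface/device/sriov_numvfs"

    if [ ! -e "$sysfs_net/$iface" ]; then
        # Interface is not visible on the host at all
        # (assumed to be the case when the PF is assigned to an instance).
        echo "missing"
    elif [ ! -e "$numvfs_file" ]; then
        # Interface exists but does not expose the SR-IOV control file.
        echo "no-sriov"
    else
        # Interface exists and reports its current number of VFs.
        echo "vfs=$(cat "$numvfs_file")"
    fi
}
```

On a compute node, running check_sriov_iface p1p1 before the update would show which of the three cases applies.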


I think this issue also affects the upgrade process from OSP10 to OSP11.

Version-Release number of selected component (if applicable):

python-neutron-lib-0.4.0-1.el7ost.noarch
openstack-neutron-common-9.2.0-2.el7ost.noarch
puppet-neutron-9.5.0-1.el7ost.noarch
openstack-neutron-9.2.0-2.el7ost.noarch
python-neutronclient-6.0.0-2.el7ost.noarch
openstack-neutron-ml2-9.2.0-2.el7ost.noarch
openstack-neutron-openvswitch-9.2.0-2.el7ost.noarch
python-neutron-9.2.0-2.el7ost.noarch
openstack-tripleo-heat-templates-5.2.0-15.el7ost.noarch

How reproducible:
Always

Steps to Reproduce:
1. Deploy the latest OSP10 SR-IOV setup (at least 2 computes).
2. Create three types of instances on the overcloud: normal port, direct port (VF), and direct-physical port (PF).
3. Run the update process to OSP10z3 with OVS 2.6:

openstack overcloud deploy --update-plan-only \
--templates \
--environment-file "$HOME/extra_env.yaml" \
--libvirt-type kvm \
--ntp-server clock.redhat.com \
-e /home/stack/ospd-10-multiple-nic-vlans-ovs-dpdk-single-port/network-environment.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/neutron-ovs-dpdk.yaml \
--log-file overcloud_install.log &> overcloud_install.log


openstack overcloud update stack -i overcloud

Actual results:
The update failed.

Expected results:
The update succeeds.

Additional info:

Comment 1 Eran Kuris 2017-05-23 08:36:47 UTC
Workaround: Delete the PF instance and rerun the update/upgrade process; it will then pass.

Comment 13 Brent Eagles 2017-07-14 12:51:11 UTC
Hi, the backport has not merged upstream yet. We'll push and try and get it in today.

Comment 15 Brent Eagles 2017-07-14 17:12:04 UTC
The patches have merged upstream and should be in the next respin.

Comment 16 Brent Eagles 2017-07-14 17:27:01 UTC
Posted downstream patch in case we are not planning on a rebase before next release.

https://code.engineering.redhat.com/gerrit/#/c/112349/
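
The linked upstream change is titled "Do not fail if PCI device is missing". The patch itself is not shown in this report, so the following is only an illustrative shell sketch of that idea, not the actual puppet-tripleo provider code. The function name set_numvfs_tolerant, the SYSFS_NET override, and the choice to still fail when the interface exists without sriov_numvfs are all assumptions.

```shell
# Hypothetical sketch: set the VF count, but tolerate a missing interface.
# SYSFS_NET is an assumed override so the logic can run without real hardware.
set_numvfs_tolerant() {
    iface="$1"
    count="$2"
    sysfs_net="${SYSFS_NET:-/sys/class/net}"
    numvfs_file="$sysfs_net/$iface/device/sriov_numvfs"

    if [ ! -e "$sysfs_net/$iface" ]; then
        # Device is not visible on the host (for example, a PF in use by a guest):
        # warn and carry on instead of failing the whole run.
        echo "warning: $iface not found, skipping VF configuration" >&2
        return 0
    fi

    if [ ! -e "$numvfs_file" ]; then
        # Interface is present but is not SR-IOV capable: still an error.
        echo "error: $numvfs_file doesn't exist" >&2
        return 1
    fi

    # Write the requested number of VFs to the sysfs control file.
    echo "$count" > "$numvfs_file"
}
```

With this behaviour, a configuration entry such as p1p1:5 would be skipped with a warning while the PF is attached to an instance, rather than aborting the update.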

Comment 19 Eran Kuris 2017-08-22 05:41:39 UTC
blocked by https://bugzilla.redhat.com/show_bug.cgi?id=1482390

Comment 20 Eran Kuris 2017-09-04 05:21:54 UTC
Fix verified on a minor update from OSP10z3 to the latest OSP10z4.
puppet-tripleo-5.6.1-2.el7ost.noarch
$ nova list
+--------------------------------------+------+--------+------------+-------------+--------------------+
| ID                                   | Name | Status | Task State | Power State | Networks           |
+--------------------------------------+------+--------+------------+-------------+--------------------+
| 915b91e7-7592-40ff-bf73-d0b371b78455 | PF   | ACTIVE | -          | Running     | net-64-2=10.0.2.5  |
| 5ea2f497-ad6d-4fb8-ae73-cc8ec4c9232f | VF   | ACTIVE | -          | Running     | net-64-2=10.0.2.10 |
| 49e22ce5-95ba-44ff-bf98-96ae6312a3c8 | VM   | ACTIVE | -          | Running     | net-64-2=10.0.2.6  |
+--------------------------------------+------+--------+------------+-------------+--------------------+

Checked full connectivity before & after update process.

Comment 22 errata-xmlrpc 2017-09-06 17:09:30 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:2654

