Description of problem:
After running the upgrade from an OSP12 deployment with SR-IOV to OSP13, the SR-IOV agent container does not exist on the compute nodes. The agent is running, but it is not containerized.

[root@compute-0 ~]# systemctl status neutron-sriov-nic-agent.service
● neutron-sriov-nic-agent.service - OpenStack Neutron SR-IOV NIC Agent
   Loaded: loaded (/usr/lib/systemd/system/neutron-sriov-nic-agent.service; enabled; vendor preset: disabled)
   Active: active (running) since Tue 2018-05-29 14:05:24 UTC; 16h ago
 Main PID: 369369 (neutron-sriov-n)
    Tasks: 4
   Memory: 107.3M
   CGroup: /system.slice/neutron-sriov-nic-agent.service
           ├─369369 /usr/bin/python2 /usr/bin/neutron-sriov-nic-agent --config-file /usr/share/neutron/neutron-dist.conf --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/...
           ├─369471 sudo neutron-rootwrap-daemon /etc/neutron/rootwrap.conf
           └─369472 /usr/bin/python2 /usr/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf

May 29 14:05:24 compute-0 systemd[1]: Started OpenStack Neutron SR-IOV NIC Agent.
May 29 14:05:24 compute-0 systemd[1]: Starting OpenStack Neutron SR-IOV NIC Agent...
May 29 14:05:28 compute-0 sudo[369471]: neutron : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/bin/neutron-rootwrap-daemon /etc/neutron/rootwrap.conf
May 29 19:31:06 compute-0 neutron-sriov-nic-agent[369369]: /usr/lib/python2.7/site-packages/amqp/connection.py:304: AMQPDeprecationWarning: The .transport attribute on the conne...ssed before
May 29 19:31:06 compute-0 neutron-sriov-nic-agent[369369]: the connection was established. This is supported for now, but will
May 29 19:31:06 compute-0 neutron-sriov-nic-agent[369369]: be deprecated in amqp 2.2.0.
May 29 19:31:06 compute-0 neutron-sriov-nic-agent[369369]: Since amqp 2.0 you have to explicitly call Connection.connect()
May 29 19:31:06 compute-0 neutron-sriov-nic-agent[369369]: before using the connection.
May 29 19:31:06 compute-0 neutron-sriov-nic-agent[369369]: W_FORCE_CONNECT.format(attr=attr)))
Hint: Some lines were ellipsized, use -l to show in full.

Version-Release number of selected component (if applicable):
OSP12 latest to OSP13 latest.

[root@compute-0 ~]# rpm -qa | grep sriov
openstack-neutron-sriov-nic-agent-12.0.2-0.20180421011359.0ec54fd.el7ost.noarch

[root@compute-0 ~]# rpm -qa | grep neutron
openstack-neutron-metering-agent-12.0.2-0.20180421011359.0ec54fd.el7ost.noarch
openstack-neutron-common-12.0.2-0.20180421011359.0ec54fd.el7ost.noarch
puppet-neutron-12.4.1-0.20180412211913.el7ost.noarch
openstack-neutron-lbaas-12.0.1-0.20180424200349.cdbf25c.el7ost.noarch
openstack-neutron-openvswitch-12.0.2-0.20180421011359.0ec54fd.el7ost.noarch
openstack-neutron-ml2-12.0.2-0.20180421011359.0ec54fd.el7ost.noarch
openstack-neutron-lbaas-ui-4.0.1-0.20180326210834.a2c502e.el7ost.noarch
python-neutron-lbaas-12.0.1-0.20180424200349.cdbf25c.el7ost.noarch
openstack-neutron-linuxbridge-12.0.2-0.20180421011359.0ec54fd.el7ost.noarch
openstack-neutron-12.0.2-0.20180421011359.0ec54fd.el7ost.noarch
python2-neutron-lib-1.13.0-1.el7ost.noarch
openstack-neutron-sriov-nic-agent-12.0.2-0.20180421011359.0ec54fd.el7ost.noarch
python2-neutronclient-6.7.0-1.el7ost.noarch
python-neutron-12.0.2-0.20180421011359.0ec54fd.el7ost.noarch

(overcloud) [stack@undercloud-0 ~]$ rpm -qa | grep triple
python-tripleoclient-9.2.1-11.el7ost.noarch
openstack-tripleo-ui-8.3.1-2.el7ost.noarch
openstack-tripleo-common-8.6.1-17.el7ost.noarch
openstack-tripleo-puppet-elements-8.0.0-2.el7ost.noarch
puppet-tripleo-8.3.2-6.el7ost.noarch
openstack-tripleo-common-containers-8.6.1-17.el7ost.noarch
ansible-tripleo-ipsec-8.1.1-0.20180308133440.8f5369a.el7ost.noarch
openstack-tripleo-heat-templates-8.0.2-27.el7ost.noarch
openstack-tripleo-validations-8.4.1-5.el7ost.noarch
openstack-tripleo-image-elements-8.0.1-1.el7ost.noarch
(overcloud) [stack@undercloud-0 ~]$

How reproducible:
100%

Steps to Reproduce:
1. Deploy OSP12 with SR-IOV.
2. Run the upgrade to OSP13 with SR-IOV.
3. Check whether the SR-IOV agent container exists on the compute nodes.
The SR-IOV agent was removed from the compute role in https://review.openstack.org/#/c/501999/, which I think would produce this kind of behavior. I believe the NFV team has upgrade guidance on this. Adding needinfo on Saravanan.
The guidance for upgrading from OSP12 to OSP13 with SR-IOV in the compute role is in the release notes:
https://review.openstack.org/#/c/501999/8/releasenotes/notes/sriov-role-1ef30615048239c7.yaml

The roles_data.yaml file has to be customized to support such deployments by adding the SR-IOV agent service to the compute role.
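As a rough illustration of the roles_data.yaml customization described above, here is a minimal Python sketch that adds the SR-IOV agent service to the Compute role in an already-parsed roles list. It assumes the file has been loaded into a list of dicts with `name` and `ServicesDefault` keys, and that the service is named `OS::TripleO::Services::NeutronSriovAgent`; the helper name `ensure_sriov_agent` is hypothetical. Check the service name against the tripleo-heat-templates version in use before relying on it.

```python
# Assumed service name; verify it against your tripleo-heat-templates version.
SRIOV_AGENT_SERVICE = "OS::TripleO::Services::NeutronSriovAgent"


def ensure_sriov_agent(roles, role_name="Compute"):
    """Return a copy of `roles` with the SR-IOV agent service in `role_name`.

    `roles` is the parsed content of roles_data.yaml: a list of dicts, each
    with a "name" and a "ServicesDefault" list. The input is not modified.
    """
    updated = []
    found = False
    for role in roles:
        role = dict(role)  # shallow copy so the caller's dict stays untouched
        if role.get("name") == role_name:
            found = True
            services = list(role.get("ServicesDefault", []))
            if SRIOV_AGENT_SERVICE not in services:
                services.append(SRIOV_AGENT_SERVICE)
            role["ServicesDefault"] = services
        updated.append(role)
    if not found:
        raise ValueError("role %r not found in roles data" % role_name)
    return updated


if __name__ == "__main__":
    # Hypothetical, heavily shortened roles data for demonstration only.
    sample = [
        {"name": "Controller", "ServicesDefault": ["OS::TripleO::Services::NeutronApi"]},
        {"name": "Compute", "ServicesDefault": ["OS::TripleO::Services::NovaCompute"]},
    ]
    for role in ensure_sriov_agent(sample):
        print(role["name"], role["ServicesDefault"])
```

The function is idempotent, so running it against a roles file that already lists the service leaves the role unchanged.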
I was mistaken about the root cause here. From what I can tell, the deployment is actually using the ComputeSriov role. The fix appears to have been to add support for deprecated params to the ComputeSriov role that was being used during the upgrade:

uses_deprecated_params: True
deprecated_param_image: 'NovaImage'
deprecated_param_extraconfig: 'NovaComputeExtraConfig'
deprecated_param_metadata: 'NovaComputeServerMetadata'
deprecated_param_scheduler_hints: 'NovaComputeSchedulerHints'
deprecated_param_ips: 'NovaComputeIPs'
deprecated_server_resource_name: 'NovaCompute'
deprecated_nic_config_name: 'compute.yaml'
(In reply to Brent Eagles from comment #5)
> I was mistaken about the root cause here. From what I can tell, the
> deployment is actually using the ComputeSriov role. The fix appears to have
> been to add support for deprecated params to the ComputeSriov role that was
> being used during upgrading:
>
> uses_deprecated_params: True
> deprecated_param_image: 'NovaImage'
> deprecated_param_extraconfig: 'NovaComputeExtraConfig'
> deprecated_param_metadata: 'NovaComputeServerMetadata'
> deprecated_param_scheduler_hints: 'NovaComputeSchedulerHints'
> deprecated_param_ips: 'NovaComputeIPs'
> deprecated_server_resource_name: 'NovaCompute'
> deprecated_nic_config_name: 'compute.yaml'

These deprecated parameters are specific to the "Compute" role only. New roles should not have these deprecated parameters.
So, should this go in the upgrade documentation then? Following https://review.openstack.org/#/c/501999/8/releasenotes/notes/sriov-role-1ef30615048239c7.yaml and detailing the parameters?
Infrared

(In reply to Bernard Cafarelli from comment #8)
> Marking as documentation from comment 4 release notes
>
> skramaja, can you confirm the correct fix here?

If the deployment is using SR-IOV in the Compute role, then comment #4 has the notes for the steps.

If the deployment is using the ComputeSriov role, then there should be no deprecated parameters in the deployment. The QE samples have those, and they should be removed.

And finally, as per the description, for the issue of the SR-IOV agent not being containerized, the environment file environments/services-docker/neutron-sriov.yaml should be used.
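To make the point about deprecated parameters concrete, the following Python sketch flags any role other than Compute that carries `uses_deprecated_params` or a `deprecated_*` key. It assumes the roles file has already been parsed into a list of dicts; the function names and the sample data are hypothetical and only illustrate the check.

```python
def deprecated_params(role):
    """Return the sorted deprecated-parameter keys present in one role dict."""
    return sorted(
        key for key in role
        if key == "uses_deprecated_params" or key.startswith("deprecated_")
    )


def unexpected_deprecated_params(roles, allowed_roles=("Compute",)):
    """Map role name -> deprecated keys for roles that should not have any.

    Only roles listed in `allowed_roles` may carry deprecated parameters.
    """
    report = {}
    for role in roles:
        keys = deprecated_params(role)
        if keys and role.get("name") not in allowed_roles:
            report[role.get("name")] = keys
    return report


if __name__ == "__main__":
    # Hypothetical, shortened roles data resembling the case in this report.
    sample = [
        {"name": "Compute", "uses_deprecated_params": True,
         "deprecated_param_image": "NovaImage"},
        {"name": "ComputeSriov", "uses_deprecated_params": True,
         "deprecated_nic_config_name": "compute.yaml"},
    ]
    for name, keys in unexpected_deprecated_params(sample).items():
        print("%s should not define: %s" % (name, ", ".join(keys)))
```

An empty result means only the allowed roles define deprecated parameters, which matches the guidance above for deployments that use the ComputeSriov role.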