Bug 1310745 - Scale test raises "Failed to allocate the network(s), not rescheduling", preventing instance launches
Status: POST
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-neutron
Version: 7.0 (Kilo)
Hardware: x86_64 Linux
Priority: high  Severity: high
Target Release: 7.0 (Kilo)
Assigned To: anil venkata
QA Contact: Toni Freger
Keywords: Triaged, ZStream
Blocks: 1393437
Cloned To: 1393437
Reported: 2016-02-22 10:25 EST by Yuri Obshansky
Modified: 2017-07-20 06:29 EDT
CC: 18 users
Doc Type: Bug Fix
Type: Bug


Attachments (Terms of Use)
- nova-compute.log file (4.11 MB, text/plain), 2016-02-22 10:25 EST, Yuri Obshansky
- ospd node neutron-server log file (2.52 MB, application/x-gzip), 2016-02-22 10:28 EST, Yuri Obshansky
- ospd node neutron-openswitch log file (1.51 MB, application/x-gzip), 2016-02-22 10:28 EST, Yuri Obshansky
- openvswitch-agent.log file from overcloud-compute-2 (9.64 MB, application/x-gzip), 2016-06-16 07:47 EDT, Yuri Obshansky
- neutron server log contr-0 after disable firewall driver (14.22 MB, application/x-gzip), 2016-06-21 08:31 EDT, Yuri Obshansky
- neutron server log contr-1 after disable firewall driver (14.25 MB, application/x-gzip), 2016-06-21 08:32 EDT, Yuri Obshansky
- neutron server log contr-2 after disable firewall driver (215.79 KB, application/x-gzip), 2016-06-21 08:33 EDT, Yuri Obshansky
- openvswitch-agent.log compute 2 - part aa (16.00 MB, application/x-gzip), 2016-06-21 08:36 EDT, Yuri Obshansky
- openvswitch-agent.log compute 2 - part ab (16.00 MB, application/octet-stream), 2016-06-21 08:38 EDT, Yuri Obshansky
- openvswitch-agent.log compute 2 - part ac (16.00 MB, application/octet-stream), 2016-06-21 08:40 EDT, Yuri Obshansky
- openvswitch-agent.log compute 2 - part ad (12.01 MB, application/octet-stream), 2016-06-21 08:42 EDT, Yuri Obshansky
- error (7.69 KB, text/plain), 2017-05-03 16:21 EDT, Yuri Obshansky
- openvswitch-agent.log and ovs-vswitchd.log (13.68 KB, text/plain), 2017-05-04 10:08 EDT, Yuri Obshansky
- sosreport controller part aa (15.00 MB, application/x-xz), 2017-05-04 12:35 EDT, Yuri Obshansky
- sosreport controller part ab (15.00 MB, application/octet-stream), 2017-05-04 12:36 EDT, Yuri Obshansky
- sosreport controller part ac (12.84 MB, application/octet-stream), 2017-05-04 12:37 EDT, Yuri Obshansky


External Trackers
- OpenStack gerrit 337064 (last updated 2016-07-26 10:53 EDT)
- OpenStack gerrit 351773 (last updated 2017-07-20 06:29 EDT)
Description Yuri Obshansky 2016-02-22 10:25:22 EST
Created attachment 1129364 [details]
nova-compute.log file

Description of problem:
During a scale test, 3217 instances/VMs were launched successfully; all subsequent launches failed with the error: "Build of instance 3899eea1-1367-4932-9207-0474fbaaafd5 aborted: Failed to allocate the network(s), not rescheduling"

The following exceptions appear in nova-compute.log:
2016-02-18 10:38:12.474 17443 WARNING nova.virt.libvirt.driver [req-03e59ac6-649c-4571-951b-ef496440a4e9 ea29bb495c1f43f89e8dc4f812f8f180 178100f38c934c29824db1a6ee40d7e2 - - -] Timeout waiting for vif plugging callback for instance 9e39ceb1-1177-4267-b7ed-299d116927fc
2016-02-18 10:38:15.919 17443 ERROR nova.compute.manager [req-03e59ac6-649c-4571-951b-ef496440a4e9 ea29bb495c1f43f89e8dc4f812f8f180 178100f38c934c29824db1a6ee40d7e2 - - -] [instance: 9e39ceb1-1177-4267-b7ed-299d116927fc] Instance failed to spawn
2016-02-18 10:38:15.919 17443 TRACE nova.compute.manager [instance: 9e39ceb1-1177-4267-b7ed-299d116927fc] Traceback (most recent call last):
2016-02-18 10:38:15.919 17443 TRACE nova.compute.manager [instance: 9e39ceb1-1177-4267-b7ed-299d116927fc]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2486, in _build_resources
2016-02-18 10:38:15.919 17443 TRACE nova.compute.manager [instance: 9e39ceb1-1177-4267-b7ed-299d116927fc]     yield resources
2016-02-18 10:38:15.919 17443 TRACE nova.compute.manager [instance: 9e39ceb1-1177-4267-b7ed-299d116927fc]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2358, in _build_and_run_instance
2016-02-18 10:38:15.919 17443 TRACE nova.compute.manager [instance: 9e39ceb1-1177-4267-b7ed-299d116927fc]     block_device_info=block_device_info)
2016-02-18 10:38:15.919 17443 TRACE nova.compute.manager [instance: 9e39ceb1-1177-4267-b7ed-299d116927fc]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 2492, in spawn
2016-02-18 10:38:15.919 17443 TRACE nova.compute.manager [instance: 9e39ceb1-1177-4267-b7ed-299d116927fc]     block_device_info=block_device_info)
2016-02-18 10:38:15.919 17443 TRACE nova.compute.manager [instance: 9e39ceb1-1177-4267-b7ed-299d116927fc]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 4592, in _create_domain_and_network
2016-02-18 10:38:15.919 17443 TRACE nova.compute.manager [instance: 9e39ceb1-1177-4267-b7ed-299d116927fc]     raise exception.VirtualInterfaceCreateException()
2016-02-18 10:38:15.919 17443 TRACE nova.compute.manager [instance: 9e39ceb1-1177-4267-b7ed-299d116927fc] VirtualInterfaceCreateException: Virtual Interface creation failed
2016-02-18 10:38:15.919 17443 TRACE nova.compute.manager [instance: 9e39ceb1-1177-4267-b7ed-299d116927fc]
2016-02-18 10:38:16.873 17443 ERROR nova.compute.manager [req-03e59ac6-649c-4571-951b-ef496440a4e9 ea29bb495c1f43f89e8dc4f812f8f180 178100f38c934c29824db1a6ee40d7e2 - - -] [instance: 9e39ceb1-1177-4267-b7ed-299d116927fc] Failed to allocate network(s)
2016-02-18 10:38:16.873 17443 TRACE nova.compute.manager [instance: 9e39ceb1-1177-4267-b7ed-299d116927fc] Traceback (most recent call last):
2016-02-18 10:38:16.873 17443 TRACE nova.compute.manager [instance: 9e39ceb1-1177-4267-b7ed-299d116927fc]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2358, in _build_and_run_instance
2016-02-18 10:38:16.873 17443 TRACE nova.compute.manager [instance: 9e39ceb1-1177-4267-b7ed-299d116927fc]     block_device_info=block_device_info)
2016-02-18 10:38:16.873 17443 TRACE nova.compute.manager [instance: 9e39ceb1-1177-4267-b7ed-299d116927fc]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 2492, in spawn
2016-02-18 10:38:16.873 17443 TRACE nova.compute.manager [instance: 9e39ceb1-1177-4267-b7ed-299d116927fc]     block_device_info=block_device_info)
2016-02-18 10:38:16.873 17443 TRACE nova.compute.manager [instance: 9e39ceb1-1177-4267-b7ed-299d116927fc]   File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 4592, in _create_domain_and_network
2016-02-18 10:38:16.873 17443 TRACE nova.compute.manager [instance: 9e39ceb1-1177-4267-b7ed-299d116927fc]     raise exception.VirtualInterfaceCreateException()
2016-02-18 10:38:16.873 17443 TRACE nova.compute.manager [instance: 9e39ceb1-1177-4267-b7ed-299d116927fc] VirtualInterfaceCreateException: Virtual Interface creation failed
2016-02-18 10:38:16.873 17443 TRACE nova.compute.manager [instance: 9e39ceb1-1177-4267-b7ed-299d116927fc]
2016-02-18 10:38:16.876 17443 ERROR nova.compute.manager [req-03e59ac6-649c-4571-951b-ef496440a4e9 ea29bb495c1f43f89e8dc4f812f8f180 178100f38c934c29824db1a6ee40d7e2 - - -] [instance: 9e39ceb1-1177-4267-b7ed-299d116927fc] Build of instance 9e39ceb1-1177-4267-b7ed-299d116927fc aborted: Failed to allocate the network(s), not rescheduling.
2016-02-18 10:38:16.876 17443 TRACE nova.compute.manager [instance: 9e39ceb1-1177-4267-b7ed-299d116927fc] Traceback (most recent call last):
2016-02-18 10:38:16.876 17443 TRACE nova.compute.manager [instance: 9e39ceb1-1177-4267-b7ed-299d116927fc]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2261, in _do_build_and_run_instance
2016-02-18 10:38:16.876 17443 TRACE nova.compute.manager [instance: 9e39ceb1-1177-4267-b7ed-299d116927fc]     filter_properties)
2016-02-18 10:38:16.876 17443 TRACE nova.compute.manager [instance: 9e39ceb1-1177-4267-b7ed-299d116927fc]   File "/usr/lib/python2.7/site-packages/nova/compute/manager.py", line 2393, in _build_and_run_instance
2016-02-18 10:38:16.876 17443 TRACE nova.compute.manager [instance: 9e39ceb1-1177-4267-b7ed-299d116927fc]     reason=msg)
2016-02-18 10:38:16.876 17443 TRACE nova.compute.manager [instance: 9e39ceb1-1177-4267-b7ed-299d116927fc] BuildAbortException: Build of instance 9e39ceb1-1177-4267-b7ed-299d116927fc aborted: Failed to allocate the network(s), not rescheduling.
2016-02-18 10:38:16.876 17443 TRACE nova.compute.manager [instance: 9e39ceb1-1177-4267-b7ed-299d116927fc]
2016-02-18 10:38:17.536 17443 ERROR nova.network.neutronv2.api [req-03e59ac6-649c-4571-951b-ef496440a4e9 ea29bb495c1f43f89e8dc4f812f8f180 178100f38c934c29824db1a6ee40d7e2 - - -] Unable to clear device ID for port 'None'
2016-02-18 10:38:17.536 17443 TRACE nova.network.neutronv2.api Traceback (most recent call last):
2016-02-18 10:38:17.536 17443 TRACE nova.network.neutronv2.api   File "/usr/lib/python2.7/site-packages/nova/network/neutronv2/api.py", line 366, in _unbind_ports
2016-02-18 10:38:17.536 17443 TRACE nova.network.neutronv2.api     port_client.update_port(port_id, port_req_body)
2016-02-18 10:38:17.536 17443 TRACE nova.network.neutronv2.api   File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 102, in with_params
2016-02-18 10:38:17.536 17443 TRACE nova.network.neutronv2.api     ret = self.function(instance, *args, **kwargs)
2016-02-18 10:38:17.536 17443 TRACE nova.network.neutronv2.api   File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 549, in update_port
2016-02-18 10:38:17.536 17443 TRACE nova.network.neutronv2.api     return self.put(self.port_path % (port), body=body)
2016-02-18 10:38:17.536 17443 TRACE nova.network.neutronv2.api   File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 302, in put
2016-02-18 10:38:17.536 17443 TRACE nova.network.neutronv2.api     headers=headers, params=params)
2016-02-18 10:38:17.536 17443 TRACE nova.network.neutronv2.api   File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 270, in retry_request
2016-02-18 10:38:17.536 17443 TRACE nova.network.neutronv2.api     headers=headers, params=params)
2016-02-18 10:38:17.536 17443 TRACE nova.network.neutronv2.api   File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 211, in do_request
2016-02-18 10:38:17.536 17443 TRACE nova.network.neutronv2.api     self._handle_fault_response(status_code, replybody)
2016-02-18 10:38:17.536 17443 TRACE nova.network.neutronv2.api   File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 185, in _handle_fault_response
2016-02-18 10:38:17.536 17443 TRACE nova.network.neutronv2.api     exception_handler_v20(status_code, des_error_body)
2016-02-18 10:38:17.536 17443 TRACE nova.network.neutronv2.api   File "/usr/lib/python2.7/site-packages/neutronclient/v2_0/client.py", line 83, in exception_handler_v20
2016-02-18 10:38:17.536 17443 TRACE nova.network.neutronv2.api     message=message)
2016-02-18 10:38:17.536 17443 TRACE nova.network.neutronv2.api NeutronClientException: 404 Not Found
2016-02-18 10:38:17.536 17443 TRACE nova.network.neutronv2.api
2016-02-18 10:38:17.536 17443 TRACE nova.network.neutronv2.api The resource could not be found.

OpenStack environment:
3 controller nodes and 6 compute nodes, HA; each node has 24 CPUs and 64 GB RAM.
Instances/VMs were created using:
- image: cirros-0.3.2-sc (12.6 MB)
- flavor: m1.nano (1 VCPU, 64 MB RAM, 1 GB disk)

Version-Release number of selected component (if applicable):
rhos-release 7 -p 2015-12-16.1
# rpm -qa | grep nova
openstack-nova-console-2015.1.2-7.el7ost.noarch
openstack-nova-novncproxy-2015.1.2-7.el7ost.noarch
python-nova-2015.1.2-7.el7ost.noarch
openstack-nova-api-2015.1.2-7.el7ost.noarch
openstack-nova-common-2015.1.2-7.el7ost.noarch
python-novaclient-2.23.0-2.el7ost.noarch
openstack-nova-compute-2015.1.2-7.el7ost.noarch
openstack-nova-conductor-2015.1.2-7.el7ost.noarch
openstack-nova-scheduler-2015.1.2-7.el7ost.noarch
openstack-nova-cert-2015.1.2-7.el7ost.noarch

How reproducible:
Run a scale test that creates 50 tenants with 50 networks per tenant, then launch as many instances/VMs as possible.

Steps to Reproduce:
1.
2.
3.

Actual results:
Could not create more than 3217 instances; exceptions were raised in nova-compute.log.

Expected results:


Additional info:
Comment 2 Yuri Obshansky 2016-02-22 10:27:10 EST
See test results
https://mojo.redhat.com/docs/DOC-1067371
Comment 3 Yuri Obshansky 2016-02-22 10:28 EST
Created attachment 1129365 [details]
ospd node neutron-server log file
Comment 4 Yuri Obshansky 2016-02-22 10:28 EST
Created attachment 1129366 [details]
ospd node neutron-openswitch log file
Comment 5 Dan Smith 2016-02-24 14:59:10 EST
The problem is that neutron has not called nova to deliver the vif-plugged event, which causes nova to time out. I don't see any errors in the neutron log indicating that something failed and prevented this call, but after nova fails the request, it also isn't able to talk to the neutron API to deallocate the resources.

Suggest we point some neutron people at this to take a look.
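The handshake described above (nova plugs the vif, then blocks until neutron delivers a network-vif-plugged event, governed by the vif_plugging_timeout and vif_plugging_is_fatal options) can be sketched as follows. This is an illustrative Python sketch, not nova's actual (eventlet-based) implementation; the class and method names are hypothetical.

```python
import threading

class VifPlugWaiter:
    """Wait for a network-vif-plugged event, with a timeout.

    Simplified sketch of the nova-side behaviour described in this bug;
    the defaults mirror nova's vif_plugging_timeout=300 and
    vif_plugging_is_fatal=True configuration options.
    """

    def __init__(self, vif_plugging_timeout=300, vif_plugging_is_fatal=True):
        self.timeout = vif_plugging_timeout
        self.fatal = vif_plugging_is_fatal
        self._event = threading.Event()

    def notify_plugged(self):
        # Called when neutron reports the vif as plugged.
        self._event.set()

    def wait(self):
        # True if the event arrived before the timeout expired.
        if self._event.wait(self.timeout):
            return True
        if self.fatal:
            # This is the "Timeout waiting for vif plugging callback" path,
            # which aborts the instance build.
            raise RuntimeError("Timeout waiting for vif plugging callback")
        return False  # non-fatal mode: proceed without the event
```

In this bug, neutron's ovs agent never wires up the port, so `notify_plugged` is never invoked and every build hits the fatal timeout path.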
Comment 7 Yuri Obshansky 2016-03-07 02:29:20 EST
Bug reproduced (OSP 7.3) on rhos-release 7 -p 2016-02-24.1
Comment 8 Yuri Obshansky 2016-03-07 02:30:27 EST
See test results on 
https://mojo.redhat.com/docs/DOC-1070135
Comment 9 Assaf Muller 2016-06-07 19:23:09 EDT
Would it be possible to gain access to a setup where the issue reproduces?
Comment 10 Yuri Obshansky 2016-06-08 06:57:44 EDT
I don't have an OSP 7 setup right now; the bug was reported 4-5 months ago.
I have an OSP 8 setup (-p 2016-05-31.1) and I'm going to rerun the same scale test soon.
If you would like to look at the issue on OSP 8, let me know, because I applied a workaround to the setup for "Failed to allocate the network(s), not rescheduling": on the compute nodes, set vif_plugging_is_fatal=False (instead of vif_plugging_is_fatal=True) in /etc/nova/nova.conf.
I'll update you when the test starts.
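The workaround above, expressed as a nova.conf fragment (compute nodes only). Note this only masks the symptom: with the event made non-fatal, instances may boot before their network is actually wired up.

```ini
# /etc/nova/nova.conf on each compute node (workaround, not a fix)
[DEFAULT]
# Do not abort the boot when the network-vif-plugged event never arrives
vif_plugging_is_fatal = False
# The event wait that was timing out (nova's default is 300 seconds)
vif_plugging_timeout = 300
```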
Comment 11 Yuri Obshansky 2016-06-14 03:05:12 EDT
The issue reproduced on OSP 8 -p 2016-05-31.1.
Tonight I executed the same scale test and only 3661 instances were created; all the others ended up in Error status.
I provided Assaf with setup details and access.
Comment 12 Yuri Obshansky 2016-06-16 07:47 EDT
Created attachment 1168705 [details]
openvswitch-agent.log file from overcloud-compute-2
Comment 13 anil venkata 2016-06-16 09:40:24 EDT
Here, nova is able to create a port for the VM and bind it to the compute node (verified through the neutron server log). It then plugs the vif (confirmed by the ovsdb log message in the ovs agent log) and waits 5 minutes for the network-vif-plugged event from the neutron server. (When the timeout happens, nova deletes the vif, also visible in the ovsdb log messages in the ovs agent log.)

When nova plugged the vif, the ovs agent received information about the new port through ovsdb, but it was unable to process the port because it was busy in its rpc_loop handling existing ports (>600 ports on the compute node). While processing those ports, the agent called the update_device_list RPC method, which failed with an RPC timeout error; I see the response for this RPC arriving after 2 minutes (the default RPC timeout is 1 minute). Because of the timeout, the agent starts processing those 600 ports all over again.

Since the ovs agent is busy processing these ports the whole time, it never gets to wire up the new port added by nova, so nova never receives the network-vif-plugged notification and the instance build fails with "Timeout waiting for vif plugging callback for instance".

One possible solution is to increase the RPC timeout, but there is an upstream patch https://review.openstack.org/#/c/314319/ which adds an exponential backoff mechanism for RPC call timeouts. Yuri will be testing with this patch on his OSP 8 setup (-p 2016-05-31.1).
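The exponential-backoff idea in that patch can be sketched as follows. This is a minimal illustration of the mechanism, not the actual neutron/oslo.messaging code; the class name, the 60-second base, and the 600-second cap are illustrative assumptions.

```python
class BackoffTimeout:
    """Grow an RPC timeout exponentially after each timeout, up to a cap.

    Sketch of the exponential-backoff mechanism for RPC call timeouts:
    instead of retrying forever with a fixed 1-minute timeout (and
    re-triggering the same failure when the server needs ~2 minutes to
    respond), each timeout doubles the wait for the next attempt.
    """

    def __init__(self, base=60, cap=600, factor=2):
        self.base = base      # starting timeout in seconds
        self.cap = cap        # never wait longer than this
        self.factor = factor  # multiplier applied after each timeout
        self._current = base

    @property
    def current(self):
        return self._current

    def on_timeout(self):
        # An RPC call timed out: back off for the next attempt.
        self._current = min(self._current * self.factor, self.cap)

    def on_success(self):
        # A successful call resets the timeout to its base value.
        self._current = self.base


t = BackoffTimeout()
t.on_timeout()  # 60s -> 120s for the next attempt
t.on_timeout()  # 120s -> 240s, now longer than the ~2 min server response
```

With a fixed 60-second timeout, the agent in comment 13 kept aborting a call whose reply arrived after 2 minutes; after two backoff steps the timeout (240s) comfortably covers that response time.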
Comment 14 Yuri Obshansky 2016-06-21 08:31 EDT
Created attachment 1170248 [details]
neutron server log contr-0 after disable firewall driver
Comment 15 Yuri Obshansky 2016-06-21 08:32 EDT
Created attachment 1170249 [details]
neutron server log contr-1 after disable firewall driver
Comment 16 Yuri Obshansky 2016-06-21 08:33 EDT
Created attachment 1170251 [details]
neutron server log contr-2 after disable firewall driver
Comment 17 Yuri Obshansky 2016-06-21 08:36 EDT
Created attachment 1170254 [details]
openvswitch-agent.log compute 2 - part aa
Comment 18 Yuri Obshansky 2016-06-21 08:38 EDT
Created attachment 1170255 [details]
openvswitch-agent.log compute 2 - part ab
Comment 19 Yuri Obshansky 2016-06-21 08:40 EDT
Created attachment 1170257 [details]
openvswitch-agent.log compute 2 - part ac
Comment 20 Yuri Obshansky 2016-06-21 08:42 EDT
Created attachment 1170258 [details]
openvswitch-agent.log compute 2 - part ad
Comment 21 anil venkata 2016-07-26 10:53:57 EDT
We need to try this change: https://review.openstack.org/#/c/337064/
Comment 22 anil venkata 2016-08-29 09:56:53 EDT
@Yuri

Any update on getting a setup to try the fix https://review.openstack.org/#/c/337064/?

Thanks
Anil
Comment 23 Yuri Obshansky 2016-08-30 01:41:35 EDT
Hi Anil, 

I still don't have hardware for reproducing the bug, but hopefully I'll get the servers next week.
I'll update you with the status.

Thanks,
Yuri
Comment 24 anil venkata 2016-11-21 10:33:46 EST
These changes improve the performance of the ovs agent:

[1] https://review.openstack.org/#/c/337064/
[2] https://review.openstack.org/#/c/352440/
[3] https://review.openstack.org/#/c/351773/

Change [1] is already merged upstream and is also part of OSP 9. Changes [2] and [3] are awaiting review upstream. These changes avoid unnecessary duplicate ipset calls, making the ovs agent run faster when many ports are added.
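The "avoid duplicate ipset calls" idea can be sketched as follows: remember the member set last applied to each ipset and skip the external call when nothing has changed. This is a hypothetical illustration of the optimization; the names and structure are not taken from the actual neutron patches.

```python
class IpsetDedup:
    """Skip ipset updates when the desired member set has not changed.

    Illustrative sketch: run_cmd stands in for whatever actually shells
    out to the ipset binary; under load (hundreds of ports per rpc_loop
    iteration), skipping redundant external calls is what speeds the
    agent up.
    """

    def __init__(self, run_cmd):
        self._run = run_cmd   # callable(name, members): applies the set
        self._applied = {}    # ipset name -> frozenset of members applied

    def update(self, name, members):
        members = frozenset(members)
        if self._applied.get(name) == members:
            return False      # identical request: no external call made
        self._run(name, members)
        self._applied[name] = members
        return True
```

For example, two security-group updates that resolve to the same address set trigger only one ipset invocation.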
Comment 26 Yuri Obshansky 2016-11-23 09:11:57 EST
Hi Toni, 

Not yet; I'll reproduce it as soon as I have a working setup.
I've opened a ticket with eng-ops:
https://engineering.redhat.com/rt/Ticket/Display.html?id=427891
Waiting for the fix.

Yuri
Comment 28 anil venkata 2017-02-09 05:14:47 EST
Thanks. Can you please provide the access details so I can investigate further?
Comment 30 Yuri Obshansky 2017-04-30 22:01:22 EDT
Hi, Anil
Finally, I've got hardware for reproducing the bug.
It is now reproduced on RHOS 8:
[root@overcloud-compute-0 nova]#  rpm -qa | grep nova
openstack-nova-novncproxy-12.0.5-9.el7ost.noarch
openstack-nova-compute-12.0.5-9.el7ost.noarch
openstack-nova-api-12.0.5-9.el7ost.noarch
python-novaclient-3.1.0-2.el7ost.noarch
python-nova-12.0.5-9.el7ost.noarch
openstack-nova-conductor-12.0.5-9.el7ost.noarch
openstack-nova-common-12.0.5-9.el7ost.noarch
openstack-nova-scheduler-12.0.5-9.el7ost.noarch
openstack-nova-console-12.0.5-9.el7ost.noarch
openstack-nova-cert-12.0.5-9.el7ost.noarch
with the same error:
2017-05-01 01:56:40.239 33560 ERROR nova.compute.manager [instance: ef090168-e140-45ba-894b-7bad74172217] BuildAbortException: Build of instance ef090168-e140-45ba-894b-7bad74172217 aborted: Failed to allocate the network(s), not rescheduling.
Now, how can we apply your fix and rerun the test?
Yuri
Comment 31 Yuri Obshansky 2017-05-01 16:26:03 EDT
Forgot to add the neutron versions:
[stack@c02-h02-r620 ~]$ rpm -qa | grep neutron
openstack-neutron-7.2.0-5.el7ost.noarch
openstack-neutron-common-7.2.0-5.el7ost.noarch
openstack-neutron-ml2-7.2.0-5.el7ost.noarch
openstack-neutron-openvswitch-7.2.0-5.el7ost.noarch
python-neutron-7.2.0-5.el7ost.noarch
python-neutronclient-3.1.0-3.el7ost.noarch
Comment 32 Yuri Obshansky 2017-05-03 16:20:36 EDT
Hi Anil,
Unfortunately, the bug also reproduced on rhos-release 9 -p 2017-04-07.5:
[root@c02-h02-r620 ~]# rpm -qa | grep neutron
python-neutron-lib-0.0.2-1.el7ost.noarch
openstack-neutron-8.3.0-5.el7ost.noarch
python-neutronclient-4.1.1-2.el7ost.noarch
python-neutron-vpnaas-tests-8.0.0-1.el7ost.noarch
python-neutron-tests-8.3.0-5.el7ost.noarch
python-neutron-8.3.0-5.el7ost.noarch
python-neutron-fwaas-tests-8.0.0-3.el7ost.noarch
openstack-neutron-ml2-8.3.0-5.el7ost.noarch
python-neutron-lbaas-tests-8.1.0-2.el7ost.noarch
python-neutron-vpnaas-8.0.0-1.el7ost.noarch
openstack-neutron-openvswitch-8.3.0-5.el7ost.noarch
python-neutron-lbaas-8.1.0-2.el7ost.noarch
openstack-neutron-common-8.3.0-5.el7ost.noarch
python-neutron-fwaas-8.0.0-3.el7ost.noarch

See the attached txt file with the error.
I didn't clean up the stack; it's waiting for you to take a look.

Yuri
Comment 33 Yuri Obshansky 2017-05-03 16:21 EDT
Created attachment 1275991 [details]
error
Comment 34 Sahid Ferdjaoui 2017-05-04 07:59:51 EDT
(In reply to Yuri Obshansky from comment #33)
> Created attachment 1275991 [details]
> error

Can you provide the neutron and ovs agent logs related to that error?

Also, can you ensure that "/etc/neutron/neutron.conf" is configured with:

 notify_nova_on_port_status_changes=True
 notify_nova_on_port_data_changes=True
 nova_url=http://<IP>/v2.1

Thanks
Comment 35 Yuri Obshansky 2017-05-04 10:07:07 EDT
Hi, 
I checked /etc/neutron/neutron.conf on compute node
...
# Send notification to nova when port status changes (boolean value)
#notify_nova_on_port_status_changes = true

# Send notification to nova when port data (fixed_ips/floatingip) changes so
# nova can update its cache. (boolean value)
#notify_nova_on_port_data_changes = true
....

There is no nova_url=http://<IP>/v2.1 setting in it.

Attached are parts of openvswitch-agent.log and ovs-vswitchd.log from the same time frame.

Thanks
Comment 36 Yuri Obshansky 2017-05-04 10:08 EDT
Created attachment 1276361 [details]
openvswitch-agent.log and ovs-vswitchd.log
Comment 37 Sahid Ferdjaoui 2017-05-04 10:36:45 EDT
(In reply to Yuri Obshansky from comment #35)
> Hi, 
> I checked /etc/neutron/neutron.conf on compute node
> ...
> # Send notification to nova when port status changes (boolean value)
> #notify_nova_on_port_status_changes = true
> 
> # Send notification to nova when port data (fixed_ips/floatingip) changes so
> # nova can update its cache. (boolean value)
> #notify_nova_on_port_data_changes = true
> ....
> 
> There is not  nova_url=http://<IP>/v2.1 setting in it.

OK, can you check against what is indicated here [0] and let us know the result?

[0] https://access.redhat.com/solutions/1212643
Comment 38 Yuri Obshansky 2017-05-04 10:49:03 EDT
Hi, 
I cannot see the solution; it is subscriber-exclusive content.
Ping me in IRC - yobshans on neutron/rhos-dev/rhos-qe channels
Thanks
Comment 39 Yuri Obshansky 2017-05-04 12:21:26 EDT
Hi, 
We double-checked the configuration on the controller node:
[root@overcloud-controller-0 ~]# egrep '(^nova_|^notify_nova)' /etc/neutron/neutron.conf
notify_nova_on_port_status_changes=True
notify_nova_on_port_data_changes=True
nova_url=http://172.16.0.10:8774/v2.1/%(tenant_id)s 
It looks OK.
I'll generate sosreports for the controller and compute nodes.
Thanks
Comment 40 Yuri Obshansky 2017-05-04 12:35 EDT
Created attachment 1276394 [details]
sosreport controller part aa

The original file is large, about 44 MB. It was split using:
split -b 15M sosreport-overcloud-controller-0.localdomain-20170504153651.tar.xz "sos-contr-"
The parts can be rejoined with:
cat sos-contr-* > sosreport-contr.tar.xz
Comment 41 Yuri Obshansky 2017-05-04 12:36 EDT
Created attachment 1276395 [details]
sosreport controller part ab
Comment 42 Yuri Obshansky 2017-05-04 12:37 EDT
Created attachment 1276396 [details]
sosreport controller part ac
