Bug 1512274

Summary: OVS timeout while adding router ports
Product: Red Hat OpenStack Reporter: Pablo Iranzo Gómez <pablo.iranzo>
Component: openstack-neutron Assignee: anil venkata <vkommadi>
Status: CLOSED ERRATA QA Contact: Roee Agiman <ragiman>
Severity: urgent Docs Contact:
Priority: high    
Version: 10.0 (Newton) CC: acavalla, amuller, chrisw, cshastri, ealcaniz, ihrachys, jlibosva, lasilva, lpeer, majopela, nyechiel, ragiman, srevivo, ssigwald, vkommadi
Target Milestone: z7 Keywords: Triaged, ZStream
Target Release: 10.0 (Newton)   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: openstack-neutron-9.4.1-9.el7ost Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
: 1513849 1513850 1513853 1513860 Environment:
Last Closed: 2018-02-27 16:41:21 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1513849, 1513850, 1513853, 1513860    

Description Pablo Iranzo Gómez 2017-11-12 11:19:00 UTC
Description of problem:

After upgrading from OSP7 to OSP10, load balancers are not coming up on the hosts.

MariaDB [neutron]> select * from  lbaas_loadbalancers where not operating_status='ONLINE'\G;
*************************** 1. row ***************************
         project_id: 416cea51612d4818a238eac544d71444
                 id: 3c52a1b5-2ad3-42c9-aa29-4b7291a3e9ca
               name: Load Balancer http-torcedores
        description:
        vip_port_id: 6c2e2d4c-65bf-4a7e-bf3e-c11a001fbaa9
      vip_subnet_id: a4884476-0935-4519-89e1-97c8d7ff854c
        vip_address: 172.20.10.100
     admin_state_up: 1
provisioning_status: PENDING_CREATE
   operating_status: OFFLINE
          flavor_id: NULL
*************************** 2. row ***************************
         project_id: 416cea51612d4818a238eac544d71444
                 id: 51523dc2-0d3d-4b1a-bb7b-261dc7481032
               name: LoadBalancer Torcedores
        description:
        vip_port_id: 3a561985-77d7-4b9b-aa52-5a2843d806ee
      vip_subnet_id: a4884476-0935-4519-89e1-97c8d7ff854c
        vip_address: 172.20.10.19
     admin_state_up: 1
provisioning_status: PENDING_CREATE
   operating_status: OFFLINE
          flavor_id: NULL
*************************** 3. row ***************************
         project_id: 416cea51612d4818a238eac544d71444
                 id: d543d72a-41f3-4df3-abaf-ce61bc72a667
               name: Load Balancer Torcedores
        description:
        vip_port_id: 8ae61c03-37a9-4077-b792-7cd7ab4f0a11
      vip_subnet_id: a4884476-0935-4519-89e1-97c8d7ff854c
        vip_address: 172.20.10.18
     admin_state_up: 1
provisioning_status: PENDING_CREATE
   operating_status: OFFLINE
          flavor_id: NULL
3 rows in set (0.01 sec)


Version-Release number of selected component (if applicable):
openstack-neutron-9.2.0-2.el7ost.noarch                     Sat Nov  4 03:55:09 2017
openstack-neutron-common-9.2.0-2.el7ost.noarch              Sat Nov  4 03:55:09 2017
openstack-neutron-lbaas-9.1.0-4.el7ost.noarch               Sat Nov  4 03:55:20 2017
openstack-neutron-metering-agent-9.2.0-2.el7ost.noarch      Sat Nov  4 03:55:20 2017
openstack-neutron-ml2-9.2.0-2.el7ost.noarch                 Sat Nov  4 03:55:20 2017
openstack-neutron-openvswitch-9.2.0-2.el7ost.noarch         Sat Nov  4 03:55:20 2017
python-neutron-9.2.0-2.el7ost.noarch                        Sat Nov  4 03:55:08 2017
python-neutron-lbaas-9.1.0-4.el7ost.noarch                  Sat Nov  4 03:55:09 2017
python-neutron-lib-0.4.0-1.el7ost.noarch                    Sat Nov  4 03:55:06 2017
python-neutron-tests-9.2.0-2.el7ost.noarch                  Sat Nov  4 03:55:22 2017
python-neutronclient-6.0.0-2.el7ost.noarch                  Sat Nov  4 03:54:58 2017


During OVS cleanup we get:

_mysql_exceptions.IntegrityError: (1452, 'Cannot add or update a child row: a foreign key constraint fails (`neutron`.`lbaas_loadbalanceragentbindings`, CONSTRAINT `lbaas_loadbalanceragentbindings_ibfk_2` FOREIGN KEY (`agent_id`) REFERENCES `agents` (`id`) ON DELETE CASCADE)')
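
The error above is a foreign key violation: a row in lbaas_loadbalanceragentbindings points at an agent_id that has no matching row in agents. The following is only a minimal sketch of that mechanism, using an in-memory SQLite database as a stand-in for MariaDB. The two-column binding table (including the loadbalancer_id column) and the helper names make_db and try_bind are assumptions for illustration, not the actual neutron schema or code.

```python
import sqlite3


def make_db():
    """Create a toy schema mirroring only the constraint named in the error."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE agents (id TEXT PRIMARY KEY)")
    conn.execute(
        """
        CREATE TABLE lbaas_loadbalanceragentbindings (
            loadbalancer_id TEXT PRIMARY KEY,  -- assumed column, for illustration
            agent_id TEXT NOT NULL
                REFERENCES agents (id) ON DELETE CASCADE
        )
        """
    )
    return conn


def try_bind(conn, loadbalancer_id, agent_id):
    """Try to bind a load balancer to an agent; return False on an FK violation."""
    try:
        conn.execute(
            "INSERT INTO lbaas_loadbalanceragentbindings VALUES (?, ?)",
            (loadbalancer_id, agent_id),
        )
        return True
    except sqlite3.IntegrityError:
        # Same class of failure as MariaDB error 1452: the parent row is missing.
        return False


if __name__ == "__main__":
    conn = make_db()
    print(try_bind(conn, "lb-1", "agent-that-does-not-exist"))
    conn.execute("INSERT INTO agents (id) VALUES ('agent-1')")
    print(try_bind(conn, "lb-1", "agent-1"))
```

In this sketch the first bind is rejected because the agent row is absent, and the second succeeds once the agent exists. Because of ON DELETE CASCADE, deleting the agent row afterwards also removes its binding.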



How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:
LB

Expected results:


Additional info:

Comment 8 anil venkata 2017-11-13 11:27:44 UTC
Can you please share the ovs-vswitchd.log and ovsdb-server.log logs (from /var/log/openvswitch/) along with the neutron logs from when you see the OVS timeout issue?

Comment 9 Jakub Libosvar 2017-11-13 14:31:45 UTC
Anil is looking into this one and he'll triage it.

Comment 11 anil venkata 2017-11-13 17:55:44 UTC
Hi Andrea

I have already gone through those logs (but couldn't find the openvswitch logs from when the OVS timeout happened). When the OVS timeout issue is seen, I need the neutron logs and the openvswitch (/var/log/openvswitch/) logs.

Comment 13 anil venkata 2017-11-14 02:11:53 UTC
Can you please let me know the time?

Comment 17 anil venkata 2017-11-15 04:52:53 UTC
It looks like the L3 agent does not have access rights to the "/var/lib/neutron/lock/" directory; see the error below:

2017-11-06 03:36:05.778 22148 ERROR neutron.agent.linux.iptables_manager IOError: [Errno 13] Permission denied: u'/var/lib/neutron/lock/neutron-iptables-qrouter-78eba66f-44f9-4ff8-b29b-d456ae5dfac3'


from http://collab-shell.usersys.redhat.com/01963268/neutron2.tar.gz-1510676806/var/log/neutron/l3-agent.log


Can the customer check and set the proper permissions?

Comment 18 anil venkata 2017-11-15 05:08:50 UTC
/var/lib/neutron should have neutron as owner and group and 755 as permissions, as shown below:
drwxr-xr-x.  6 neutron          neutron           115 Oct 31 11:22 neutron
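
A quick way to verify this on a node is sketched below. It is only an illustration, not part of neutron: the helper names describe_dir and check_dir are hypothetical, and the expected mode of 755 is taken from the drwxr-xr-x listing above. It reports the owner, group and octal mode of a directory and lists any mismatch.

```python
import grp
import os
import pwd
import stat


def describe_dir(path):
    """Return (owner, group, octal mode string) for the given path."""
    st = os.stat(path)
    try:
        owner = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        owner = str(st.st_uid)  # uid without a passwd entry
    try:
        group = grp.getgrgid(st.st_gid).gr_name
    except KeyError:
        group = str(st.st_gid)  # gid without a group entry
    mode = format(stat.S_IMODE(st.st_mode), "03o")
    return owner, group, mode


def check_dir(path, owner="neutron", group="neutron", mode="755"):
    """Return a list of mismatches; an empty list means everything matches."""
    actual_owner, actual_group, actual_mode = describe_dir(path)
    problems = []
    if actual_owner != owner:
        problems.append("owner is %s, expected %s" % (actual_owner, owner))
    if actual_group != group:
        problems.append("group is %s, expected %s" % (actual_group, group))
    if actual_mode != mode:
        problems.append("mode is %s, expected %s" % (actual_mode, mode))
    return problems


if __name__ == "__main__":
    target = "/var/lib/neutron"
    try:
        issues = check_dir(target)
    except FileNotFoundError:
        issues = ["directory does not exist"]
    print(target, "OK" if not issues else "; ".join(issues))
```

The same check_dir call can be pointed at /var/lib/neutron/lock, the directory named in the Permission denied error, with whatever mode is expected there passed explicitly.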

Comment 19 Ihar Hrachyshka 2017-11-15 17:47:00 UTC
This upstream bug may be related to ovsdb timeouts.

Comment 20 Ihar Hrachyshka 2017-11-15 17:47:32 UTC
Eh, forgot the link to the bug: https://bugs.launchpad.net/neutron/+bug/1627106

Comment 21 Miguel Angel Ajo 2017-11-21 14:40:30 UTC
There's a hotfix that could be used on the customer deployment.

Comment 29 errata-xmlrpc 2018-02-27 16:41:21 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0357