Bug 1809088

Summary: Error creating port when two servers create a port group simultaneously
Product: Red Hat OpenStack Reporter: Eduardo Olivares <eolivare>
Component: python-networking-ovnAssignee: Jakub Libosvar <jlibosva>
Status: CLOSED EOL QA Contact: Eran Kuris <ekuris>
Severity: high Docs Contact:
Priority: medium    
Version: 15.0 (Stein)CC: afazekas, apevec, jlibosva, lhh, majopela, scohen
Target Milestone: ---Keywords: Triaged, ZStream
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-09-18 09:12:07 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
neutron server logs none

Description Eduardo Olivares 2020-03-02 12:20:42 UTC
Description of problem:
Tempest failed during a test setup, specifically during port creation:
2020-02-29 02:54:16.416 286732 INFO tempest.lib.common.rest_client [req-ab175c93-b83b-4345-bec6-deef1829aec6 ] Request (ExtraDHCPOptionsIpV6TestJSON:setUpClass): 500 POST http://10.0.0.147:9696/v2.0/ports 1.993s
2020-02-29 02:54:16.417 286732 DEBUG tempest.lib.common.rest_client [req-ab175c93-b83b-4345-bec6-deef1829aec6 ] Request - Headers: {'Content-Type': 'application/json', 'Accept': 'application/json', 'X-Auth-Token': '<omitted>'}
        Body: {"port": {"network_id": "b20fa426-2874-4807-a863-f2d3d5817854"}}
    Response - Headers: {'content-type': 'application/json', 'content-length': '150', 'x-openstack-request-id': 'req-ab175c93-b83b-4345-bec6-deef1829aec6', 'date': 'Sat, 29 Feb 2020 02:54:16 GMT', 'connection': 'close', 'status': '500', '
content-location': 'http://10.0.0.147:9696/v2.0/ports'}
        Body: b'{"NeutronError": {"type": "HTTPInternalServerError", "message": "Request Failed: internal server error while processing your request.", "detail": ""}}' _log_request_full /usr/lib/python3.6/site-packages/tempest/lib/common/
rest_client.py:440


controller-1 neutron logs show that PgAddPortCommand fails here, while creating neutron_pg_drop:
2020-02-29 02:54:15.684 22 DEBUG ovsdbapp.backend.ovs_idl.transaction [-] Running txn n=1 command(idx=1): PgAddCommand(name=neutron_pg_drop, may_exist=True, columns={'acls': []}) do_commit /usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py:84
...
2020-02-29 02:54:15.690 22 ERROR ovsdbapp.backend.ovs_idl.transaction [-] OVSDB Error: {"details":"Transaction causes multiple rows in \"Port_Group\" table to have identical values (neutron_pg_drop) for index on column \"name\".  First row, with UUID 2025ff8b-f8b0-4f39-9fdf-9b48b1a02b7a, was inserted by this transaction.  Second row, with UUID f516334e-d700-4f65-8aaa-abb5c4fd465a, existed in the database before this transaction and was not modified by the transaction.","error":"constraint violation"}
2020-02-29 02:54:15.691 22 ERROR ovsdbapp.backend.ovs_idl.transaction [req-ab175c93-b83b-4345-bec6-deef1829aec6 3cbd95c02380422ca96bc3de8418bcc0 99ee3dee19754d168d47c35e9337db20 - default default] Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/connection.py", line 122, in run
    txn.results.put(txn.do_commit())
  File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py", line 115, in do_commit
    raise RuntimeError(msg)
RuntimeError: OVSDB Error: {"details":"Transaction causes multiple rows in \"Port_Group\" table to have identical values (neutron_pg_drop) for index on column \"name\".  First row, with UUID 2025ff8b-f8b0-4f39-9fdf-9b48b1a02b7a, was inserted by this transaction.  Second row, with UUID f516334e-d700-4f65-8aaa-abb5c4fd465a, existed in the database before this transaction and was not modified by the transaction.","error":"constraint violation"}


controller-2 logs show that neutron_pg_drop was created simultaneously (time difference is 1 ms):
2020-02-29 02:54:15.683 23 DEBUG ovsdbapp.backend.ovs_idl.transaction [-] Running txn n=1 command(idx=1): PgAddCommand(name=neutron_pg_drop, may_exist=True, columns={'acls': []}) do_commit /usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py:84



There was a similar bug that is fixed on this puddle (https://bugzilla.redhat.com/show_bug.cgi?id=1737432), but seems that not all the cases were covered by this fix.

This has been reproduced with a CI job, but is an unlikely race condition. With the same CI job with previous OSP15 puddle (which included the same networking-ovn rpm), all tempest tests passed.



Version-Release number of selected component (if applicable):
RHOS_TRUNK-15.0-RHEL-8-20200227.n.0
python3-networking-ovn-6.0.1-0.20200116080451.e669382.el8ost.noarch.rpm


How reproducible:
it was been reproduced 1/2, but probably it's even less than 50% 

Steps to Reproduce:
1. run https://rhos-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/DFG-network-networking-ovn-15_director-rhel-virthost-3cont_2comp_3net-ipv4-geneve-composable/
2.
3.



Actual results:
Tempest failed during
.setUpClass (tempest.api.network.test_extra_dhcp_options.ExtraDHCPOptionsIpV6TestJSON) 
See https://rhos-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/DFG-network-networking-ovn-15_director-rhel-virthost-3cont_2comp_3net-ipv4-geneve-composable/59/testReport/junit/(root)/(empty)/setUpClass__tempest_api_network_test_extra_dhcp_options_ExtraDHCPOptionsIpV6TestJSON_/
(port was not created)


Expected results:
all tempest test should pass (ports should be successfully created)


Additional info:

Comment 1 Eduardo Olivares 2020-03-02 12:53:52 UTC
Created attachment 1666953 [details]
neutron server logs