Bug 1809088 - Error creating port when two servers create a port group simultaneously
Summary: Error creating port when two servers create a port group simultaneously
Keywords:
Status: CLOSED EOL
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: python-networking-ovn
Version: 15.0 (Stein)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: ---
Assignee: Jakub Libosvar
QA Contact: Eran Kuris
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-03-02 12:20 UTC by Eduardo Olivares
Modified: 2022-08-16 08:24 UTC
CC List: 6 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-09-18 09:12:07 UTC
Target Upstream Version:
Embargoed:


Attachments
neutron server logs (5.11 MB, application/gzip)
2020-03-02 12:53 UTC, Eduardo Olivares


Links
System ID Private Priority Status Summary Last Updated
Launchpad 1866068 0 None None None 2020-03-04 13:26:41 UTC
OpenStack gerrit 711404 0 None MERGED [ovn]: Create neutron_pg_drop Port Group on init 2020-12-03 14:27:38 UTC
Red Hat Issue Tracker OSP-5231 0 None None None 2022-08-16 08:24:44 UTC

Description Eduardo Olivares 2020-03-02 12:20:42 UTC
Description of problem:
Tempest failed during a test setup, specifically during port creation:
2020-02-29 02:54:16.416 286732 INFO tempest.lib.common.rest_client [req-ab175c93-b83b-4345-bec6-deef1829aec6 ] Request (ExtraDHCPOptionsIpV6TestJSON:setUpClass): 500 POST http://10.0.0.147:9696/v2.0/ports 1.993s
2020-02-29 02:54:16.417 286732 DEBUG tempest.lib.common.rest_client [req-ab175c93-b83b-4345-bec6-deef1829aec6 ] Request - Headers: {'Content-Type': 'application/json', 'Accept': 'application/json', 'X-Auth-Token': '<omitted>'}
        Body: {"port": {"network_id": "b20fa426-2874-4807-a863-f2d3d5817854"}}
    Response - Headers: {'content-type': 'application/json', 'content-length': '150', 'x-openstack-request-id': 'req-ab175c93-b83b-4345-bec6-deef1829aec6', 'date': 'Sat, 29 Feb 2020 02:54:16 GMT', 'connection': 'close', 'status': '500', 'content-location': 'http://10.0.0.147:9696/v2.0/ports'}
        Body: b'{"NeutronError": {"type": "HTTPInternalServerError", "message": "Request Failed: internal server error while processing your request.", "detail": ""}}' _log_request_full /usr/lib/python3.6/site-packages/tempest/lib/common/rest_client.py:440


controller-1 neutron logs show that PgAddCommand fails while creating the neutron_pg_drop port group:
2020-02-29 02:54:15.684 22 DEBUG ovsdbapp.backend.ovs_idl.transaction [-] Running txn n=1 command(idx=1): PgAddCommand(name=neutron_pg_drop, may_exist=True, columns={'acls': []}) do_commit /usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py:84
...
2020-02-29 02:54:15.690 22 ERROR ovsdbapp.backend.ovs_idl.transaction [-] OVSDB Error: {"details":"Transaction causes multiple rows in \"Port_Group\" table to have identical values (neutron_pg_drop) for index on column \"name\".  First row, with UUID 2025ff8b-f8b0-4f39-9fdf-9b48b1a02b7a, was inserted by this transaction.  Second row, with UUID f516334e-d700-4f65-8aaa-abb5c4fd465a, existed in the database before this transaction and was not modified by the transaction.","error":"constraint violation"}
2020-02-29 02:54:15.691 22 ERROR ovsdbapp.backend.ovs_idl.transaction [req-ab175c93-b83b-4345-bec6-deef1829aec6 3cbd95c02380422ca96bc3de8418bcc0 99ee3dee19754d168d47c35e9337db20 - default default] Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/connection.py", line 122, in run
    txn.results.put(txn.do_commit())
  File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py", line 115, in do_commit
    raise RuntimeError(msg)
RuntimeError: OVSDB Error: {"details":"Transaction causes multiple rows in \"Port_Group\" table to have identical values (neutron_pg_drop) for index on column \"name\".  First row, with UUID 2025ff8b-f8b0-4f39-9fdf-9b48b1a02b7a, was inserted by this transaction.  Second row, with UUID f516334e-d700-4f65-8aaa-abb5c4fd465a, existed in the database before this transaction and was not modified by the transaction.","error":"constraint violation"}


controller-2 logs show that neutron_pg_drop was created simultaneously (time difference is 1 ms):
2020-02-29 02:54:15.683 23 DEBUG ovsdbapp.backend.ovs_idl.transaction [-] Running txn n=1 command(idx=1): PgAddCommand(name=neutron_pg_drop, may_exist=True, columns={'acls': []}) do_commit /usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py:84
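
The race appears to be that may_exist=True is evaluated against each neutron-server worker's local OVSDB IDL cache, so two workers that have not yet seen each other's insert both create the row, and ovsdb-server rejects the second commit on the Port_Group "name" index. A minimal sketch of that interleaving with ovsdbapp follows (the NB endpoint and the two separate in-process connections are illustrative assumptions, not taken from this environment):

from ovsdbapp.backend.ovs_idl import connection
from ovsdbapp.schema.ovn_northbound import impl_idl

OVN_NB = 'tcp:127.0.0.1:6641'  # illustrative endpoint

def make_api():
    # Each neutron-server worker holds its own IDL connection and cache.
    nb_idl = connection.OvsdbIdl.from_server(OVN_NB, 'OVN_Northbound')
    return impl_idl.OvnNbApiIdlImpl(connection.Connection(idl=nb_idl, timeout=10))

api_a = make_api()  # stands in for the worker on controller-1
api_b = make_api()  # stands in for the worker on controller-2

# Neither local cache contains neutron_pg_drop yet, so may_exist=True does not
# prevent either insert.
api_a.pg_add('neutron_pg_drop', acls=[], may_exist=True).execute(check_error=True)
# If api_b's cache has not caught up with the insert above, this second commit
# is rejected by ovsdb-server with the "constraint violation" error shown above.
api_b.pg_add('neutron_pg_drop', acls=[], may_exist=True).execute(check_error=True)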



A similar bug was fixed in this puddle (https://bugzilla.redhat.com/show_bug.cgi?id=1737432), but it seems that not all cases were covered by that fix.
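
One illustrative way to tolerate the remaining race (a sketch only, not the actual networking-ovn change; the gerrit review linked above instead creates neutron_pg_drop on init) is for the losing worker to treat the server-side constraint violation as "port group already created":

def ensure_drop_port_group(nb_api):
    # nb_api is assumed to be an ovsdbapp OVN Northbound API instance
    # (e.g. OvnNbApiIdlImpl); the helper name is hypothetical.
    try:
        nb_api.pg_add('neutron_pg_drop', acls=[],
                      may_exist=True).execute(check_error=True)
    except RuntimeError as exc:
        # may_exist=True only consults the local IDL cache, so a concurrent
        # insert by another worker can still surface here as a server-side
        # constraint violation; treat that specific failure as success.
        if 'constraint violation' not in str(exc):
            raise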

This has been reproduced with a CI job, but it is an unlikely race condition. With the same CI job on the previous OSP15 puddle (which included the same networking-ovn rpm), all tempest tests passed.



Version-Release number of selected component (if applicable):
RHOS_TRUNK-15.0-RHEL-8-20200227.n.0
python3-networking-ovn-6.0.1-0.20200116080451.e669382.el8ost.noarch.rpm


How reproducible:
It has been reproduced in 1 out of 2 runs, but the actual reproduction rate is probably lower than 50%.

Steps to Reproduce:
1. run https://rhos-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/DFG-network-networking-ovn-15_director-rhel-virthost-3cont_2comp_3net-ipv4-geneve-composable/
2.
3.



Actual results:
Tempest failed during setUpClass (tempest.api.network.test_extra_dhcp_options.ExtraDHCPOptionsIpV6TestJSON)
See https://rhos-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/job/DFG-network-networking-ovn-15_director-rhel-virthost-3cont_2comp_3net-ipv4-geneve-composable/59/testReport/junit/(root)/(empty)/setUpClass__tempest_api_network_test_extra_dhcp_options_ExtraDHCPOptionsIpV6TestJSON_/
(port was not created)


Expected results:
All tempest tests should pass (ports should be created successfully).


Additional info:

Comment 1 Eduardo Olivares 2020-03-02 12:53:52 UTC
Created attachment 1666953 [details]
neutron server logs

