Bug 1720947 - OSP15: after controller restart, ovn-controller containers are in Dead state
Summary: OSP15: after controller restart, ovn-controller containers are in Dead state
Keywords:
Status: VERIFIED
Alias: None
Product: Red Hat Enterprise Linux Fast Datapath
Classification: Red Hat
Component: openvswitch2.11
Version: FDP 19.F
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: ---
Assignee: Numan Siddique
QA Contact: haidong li
URL:
Whiteboard:
Duplicates: 1714949 1721560 1726217 1732070 (view as bug list)
Depends On:
Blocks: 1731269 1726217 1740114 1740115 1757254
 
Reported: 2019-06-16 18:21 UTC by pkomarov
Modified: 2020-03-23 14:53 UTC
CC: 33 users

Fixed In Version: puppet-tripleo-10.5.1-0.20190812120435.ed6c6b0.el8ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1746120 1746200 1757254 (view as bug list)
Environment:
Last Closed:
Target Upstream Version:


Attachments


Links
System ID Priority Status Summary Last Updated
OpenStack gerrit 669610 None MERGED Close OVN VIP race by adding an ordering constraint 2020-04-01 16:08:28 UTC
Red Hat Bugzilla 1760211 medium ASSIGNED Bump up default pacemaker monitor timeout value for OVN DBs 2019-12-01 05:46:26 UTC
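
The merged gerrit change closes the VIP race by adding a pacemaker ordering constraint so that the OVN DB bundle is started before the virtual IP that clients connect to. A hypothetical illustration of such a constraint follows; the resource names (ovn-dbs-bundle, ip-ovn-dbs-vip) are assumptions for illustration, not taken from the actual patch:

```
# Ensure the OVN DBs bundle is started before its VIP resource,
# so ovsdb-server holds the database lock before clients connect.
# Resource names here are illustrative.
pcs constraint order start ovn-dbs-bundle then ip-ovn-dbs-vip kind=Mandatory
```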

Description pkomarov 2019-06-16 18:21:06 UTC
Description of problem:
After a controller restart, the ovn-controller containers are in the Dead state.

Version-Release number of selected component (if applicable):
RHOS_TRUNK-15.0-RHEL-8-20190523.n.1

How reproducible:
Re-run the following job:
https://rhos-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/DFG/view/pidone/view/sanity/job/DFG-pidone-sanity-15_director-rhel-virthost-3cont_2comp_3ceph-ipv4-geneve-sanity/

Steps to Reproduce:
1. Deploy OSP15.
2. Restart a controller.
3. Observe that the network agents are in the Dead state.

Actual results:
(overcloud) [stack@undercloud-0 ~]$ openstack network agent list 
+--------------------------------------+----------------------+--------------------------+-------------------+-------+-------+-------------------------------+
| ID                                   | Agent Type           | Host                     | Availability Zone | Alive | State | Binary                        |
+--------------------------------------+----------------------+--------------------------+-------------------+-------+-------+-------------------------------+
| a70204c4-c5a6-4fa6-8927-39b5cb4392e1 | OVN Controller agent | controller-0.localdomain | n/a               | XXX   | UP    | ovn-controller                |
| 6a19511d-8bb2-4485-b338-41165481ebac | OVN Controller agent | compute-0.localdomain    | n/a               | XXX   | UP    | ovn-controller                |
| 51d095a9-3e19-4f04-bcbd-70e74f7e302e | OVN Metadata agent   | compute-0.localdomain    | n/a               | :-)   | UP    | networking-ovn-metadata-agent |
| 7ac54a99-2a0a-44a1-9a77-e48e8c81de87 | OVN Controller agent | controller-1.localdomain | n/a               | XXX   | UP    | ovn-controller                |
| f41e8e6a-e6a2-4a36-8736-32080a6cc8fd | OVN Controller agent | compute-1.localdomain    | n/a               | XXX   | UP    | ovn-controller                |
| e9bdb550-4cc6-4ab5-bfc9-55d68d0caa30 | OVN Metadata agent   | compute-1.localdomain    | n/a               | :-)   | UP    | networking-ovn-metadata-agent |
| 0018c150-8e34-4806-b795-d772a6bbac52 | OVN Controller agent | controller-2.localdomain | n/a               | XXX   | UP    | ovn-controller                |
+--------------------------------------+----------------------+--------------------------+-------------------+-------+-------+-------------------------------+


Expected results:


Additional info:

Comment 1 pkomarov 2019-06-16 18:49:45 UTC
sosreports and stack home are at : http://rhos-release.virt.bos.redhat.com/log/pkomarov_sosreports/BZ_1720947/

Comment 2 pkomarov 2019-06-16 18:55:31 UTC
I'm seeing an ovsdbapp backend error:

server.log.11.gz:2019-06-15 20:27:12.946 76 ERROR ovsdbapp.event [req-cc22c728-3c61-4d2e-97a5-a945e5d04759 - - - - -] Unexpected exception in notify_loop: RuntimeError: OVSDB Error: The transaction failed because the IDL has been configured to require a database lock but didn't get it yet or has already lost it
server.log.11.gz:2019-06-15 20:27:12.946 76 ERROR ovsdbapp.event Traceback (most recent call last):
server.log.11.gz:2019-06-15 20:27:12.946 76 ERROR ovsdbapp.event   File "/usr/lib/python3.6/site-packages/ovsdbapp/api.py", line 104, in transaction
server.log.11.gz:2019-06-15 20:27:12.946 76 ERROR ovsdbapp.event     yield self._nested_txns_map[cur_thread_id]
server.log.11.gz:2019-06-15 20:27:12.946 76 ERROR ovsdbapp.event KeyError: 139842817010120
server.log.11.gz:2019-06-15 20:27:12.946 76 ERROR ovsdbapp.event 
server.log.11.gz:2019-06-15 20:27:12.946 76 ERROR ovsdbapp.event During handling of the above exception, another exception occurred:
server.log.11.gz:2019-06-15 20:27:12.946 76 ERROR ovsdbapp.event 
server.log.11.gz:2019-06-15 20:27:12.946 76 ERROR ovsdbapp.event Traceback (most recent call last):
server.log.11.gz:2019-06-15 20:27:12.946 76 ERROR ovsdbapp.event   File "/usr/lib/python3.6/site-packages/ovsdbapp/event.py", line 137, in notify_loop
server.log.11.gz:2019-06-15 20:27:12.946 76 ERROR ovsdbapp.event     match.run(event, row, updates)
server.log.11.gz:2019-06-15 20:27:12.946 76 ERROR ovsdbapp.event   File "/usr/lib/python3.6/site-packages/networking_ovn/ovsdb/ovsdb_monitor.py", line 183, in run
server.log.11.gz:2019-06-15 20:27:12.946 76 ERROR ovsdbapp.event     self.driver.set_port_status_up(row.name)
server.log.11.gz:2019-06-15 20:27:12.946 76 ERROR ovsdbapp.event   File "/usr/lib/python3.6/site-packages/networking_ovn/ml2/mech_driver.py", line 756, in set_port_status_up
server.log.11.gz:2019-06-15 20:27:12.946 76 ERROR ovsdbapp.event     self._update_dnat_entry_if_needed(port_id)
server.log.11.gz:2019-06-15 20:27:12.946 76 ERROR ovsdbapp.event   File "/usr/lib/python3.6/site-packages/networking_ovn/ml2/mech_driver.py", line 744, in _update_dnat_entry_if_needed
server.log.11.gz:2019-06-15 20:27:12.946 76 ERROR ovsdbapp.event     ('external_mac', mac)).execute(check_error=True)
server.log.11.gz:2019-06-15 20:27:12.946 76 ERROR ovsdbapp.event   File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/command.py", line 40, in execute
server.log.11.gz:2019-06-15 20:27:12.946 76 ERROR ovsdbapp.event     txn.add(self)
server.log.11.gz:2019-06-15 20:27:12.946 76 ERROR ovsdbapp.event   File "/usr/lib64/python3.6/contextlib.py", line 88, in __exit__
server.log.11.gz:2019-06-15 20:27:12.946 76 ERROR ovsdbapp.event     next(self.gen)
server.log.11.gz:2019-06-15 20:27:12.946 76 ERROR ovsdbapp.event   File "/usr/lib/python3.6/site-packages/networking_ovn/ovsdb/impl_idl_ovn.py", line 183, in transaction
server.log.11.gz:2019-06-15 20:27:12.946 76 ERROR ovsdbapp.event     yield t
server.log.11.gz:2019-06-15 20:27:12.946 76 ERROR ovsdbapp.event   File "/usr/lib64/python3.6/contextlib.py", line 88, in __exit__
server.log.11.gz:2019-06-15 20:27:12.946 76 ERROR ovsdbapp.event     next(self.gen)
server.log.11.gz:2019-06-15 20:27:12.946 76 ERROR ovsdbapp.event   File "/usr/lib/python3.6/site-packages/ovsdbapp/api.py", line 112, in transaction
server.log.11.gz:2019-06-15 20:27:12.946 76 ERROR ovsdbapp.event     del self._nested_txns_map[cur_thread_id]
server.log.11.gz:2019-06-15 20:27:12.946 76 ERROR ovsdbapp.event   File "/usr/lib/python3.6/site-packages/ovsdbapp/api.py", line 69, in __exit__
server.log.11.gz:2019-06-15 20:27:12.946 76 ERROR ovsdbapp.event     self.result = self.commit()
server.log.11.gz:2019-06-15 20:27:12.946 76 ERROR ovsdbapp.event   File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py", line 62, in commit
server.log.11.gz:2019-06-15 20:27:12.946 76 ERROR ovsdbapp.event     raise result.ex
server.log.11.gz:2019-06-15 20:27:12.946 76 ERROR ovsdbapp.event   File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/connection.py", line 122, in run
server.log.11.gz:2019-06-15 20:27:12.946 76 ERROR ovsdbapp.event     txn.results.put(txn.do_commit())
server.log.11.gz:2019-06-15 20:27:12.946 76 ERROR ovsdbapp.event   File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py", line 115, in do_commit
server.log.11.gz:2019-06-15 20:27:12.946 76 ERROR ovsdbapp.event     raise RuntimeError(msg)
server.log.11.gz:2019-06-15 20:27:12.946 76 ERROR ovsdbapp.event RuntimeError: OVSDB Error: The transaction failed because the IDL has been configured to require a database lock but didn't get it yet or has already lost it
server.log.11.gz:2019-06-15 20:27:12.946 76 ERROR ovsdbapp.event 
server.log.11.gz:2019-06-15 20:27:26.703 76 ERROR ovsdbapp.backend.ovs_idl.transaction [-] OVSDB Error: The transaction failed because the IDL has been configured to require a database lock but didn't get it yet or has already lost it
server.log.11.gz:2019-06-15 20:27:26.704 76 ERROR ovsdbapp.backend.ovs_idl.transaction [req-b1096a70-2581-4e5b-a879-02f0a662240e - - - - -] Traceback (most recent call last):
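
The traceback above shows ovsdbapp refusing to commit because the IDL never acquired, or has lost, the OVSDB database lock (e.g. after the lock owner changed during the controller restart). A minimal, self-contained sketch of that failure mode follows; the classes here (Connection, Transaction, LockLostError) are simplified illustrations, not ovsdbapp's real API:

```python
# Simplified sketch of an IDL-style transaction that requires a database
# lock before committing. Classes are hypothetical; only the failure mode
# mirrors the ovsdbapp behavior seen in the traceback.

class LockLostError(RuntimeError):
    pass


class Connection:
    """Stand-in for an OVSDB IDL connection; tracks lock ownership."""
    def __init__(self):
        self.has_lock = False  # e.g. never granted, or lost on failover


class Transaction:
    """Commit succeeds only while the owning connection holds the DB lock."""
    def __init__(self, connection):
        self._conn = connection
        self._ops = []

    def add(self, op):
        self._ops.append(op)

    def commit(self):
        if not self._conn.has_lock:
            raise LockLostError(
                "OVSDB Error: The transaction failed because the IDL has "
                "been configured to require a database lock but didn't get "
                "it yet or has already lost it")
        return list(self._ops)


def run():
    conn = Connection()          # lock was never acquired (or was lost)
    txn = Transaction(conn)
    txn.add(("db_set", "NAT", "external_mac"))
    try:
        txn.commit()
    except LockLostError as e:
        return str(e)            # what networking-ovn logs as RuntimeError
    return "committed"
```

In the real deployment, ovn-northd/ovsdb-server ownership moves during the restart, so the networking-ovn worker that thought it held the lock fails exactly like the sketch until it reconnects.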

Comment 3 pkomarov 2019-06-16 18:56:46 UTC
A workaround is to run:
podman restart ovn_controller
after which the container comes back up in a healthy state.

Comment 5 Lucas Alvares Gomes 2019-06-25 12:10:26 UTC
*** Bug 1721560 has been marked as a duplicate of this bug. ***

Comment 12 Jakub Libosvar 2019-07-25 11:31:40 UTC
*** Bug 1732070 has been marked as a duplicate of this bug. ***

Comment 33 Jakub Libosvar 2019-09-17 14:26:43 UTC
*** Bug 1714949 has been marked as a duplicate of this bug. ***

Comment 40 Jakub Libosvar 2020-02-06 14:28:22 UTC
*** Bug 1726217 has been marked as a duplicate of this bug. ***

