Bug 1818844 - ovn_controller crashes after setup is up for a day
Summary: ovn_controller crashes after setup is up for a day
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Fast Datapath
Classification: Red Hat
Component: ovn2.11
Version: FDP 20.D
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Assignee: Dumitru Ceara
QA Contact: ying xu
URL:
Whiteboard:
Depends On:
Blocks: 1812009 1819604 1822542
 
Reported: 2020-03-30 13:59 UTC by Numan Siddique
Modified: 2020-09-09 07:02 UTC
CC List: 22 users

Fixed In Version: ovn2.11-2.11.1-41.el7fdn
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1812009
: 1819604 1822542
Environment:
Last Closed: 2020-05-26 14:07:41 UTC
Target Upstream Version:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2020:2318 0 None None None 2020-05-26 14:08:02 UTC

Comment 13 ying xu 2020-04-21 09:13:07 UTC
Hi Itzik Brown,

Could you please help verify this bug in your environment?
Thanks very much!

Comment 14 Itzik Brown 2020-04-26 08:30:23 UTC
With the new images, the neutron_api container crashes after two days:

# docker ps -a |grep neutron
901af86d68b6  undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-neutron-server-ovn:20200416.1      kolla_start           2 days ago  Exited (0) 19 minutes ago         neutron_api
ff00b5cd352e  undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-neutron-server-ovn:20200416.1      /usr/bin/bootstra...  2 days ago  Exited (0) 2 days ago             neutron_db_sync
dca975009797  undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-neutron-server-ovn:20200416.1      /bin/bash -c chow...  2 days ago  Exited (0) 2 days ago             neutron_init_logs
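The container check above can be scripted when it needs repeating across hosts. The following is a minimal sketch of my own (the helper name and sample data are illustrative, not part of the report) that parses `docker ps -a`-style output and flags containers that have exited, together with their exit codes:

```python
import re

def find_exited(ps_output):
    """Return (container_name, exit_code) pairs for exited containers.

    Expects lines in `docker ps -a` format, where the status column reads
    e.g. "Exited (0) 19 minutes ago" and the container name is the last field.
    """
    results = []
    for line in ps_output.splitlines():
        m = re.search(r"Exited \((\d+)\)", line)
        if m:
            name = line.split()[-1]  # container name is the last column
            results.append((name, int(m.group(1))))
    return results

# Sample in the same shape as the `docker ps -a | grep neutron` output above.
sample = """\
901af86d68b6  img:20200416.1  kolla_start  2 days ago  Exited (0) 19 minutes ago  neutron_api
ff00b5cd352e  img:20200416.1  /usr/bin/bootstra...  2 days ago  Exited (0) 2 days ago  neutron_db_sync
"""
print(find_exited(sample))  # → [('neutron_api', 0), ('neutron_db_sync', 0)]
```

Note that `docker ps --filter status=exited` can do the filtering natively; the sketch is only useful when all you have is captured text output like the paste above.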

From neutron log:

    2020-04-26 03:43:45.119 33 DEBUG neutron.service [req-f4a8fb18-1ab1-4927-8c5d-4103a955aa17 - - - - -] calling wait on <oslo_messaging.rpc.server.RPCServer object at 0x7f79322629b0> _wait /usr/lib/python3.6/site-packages/neutron/service.py:131
    2020-04-26 03:43:45.139 33 DEBUG neutron.service [req-f4a8fb18-1ab1-4927-8c5d-4103a955aa17 - - - - -] calling wait on <oslo_messaging.rpc.server.RPCServer object at 0x7f7932262438> _wait /usr/lib/python3.6/site-packages/neutron/service.py:131
    2020-04-26 03:43:45.144 34 DEBUG oslo_concurrency.lockutils [req-f4a8fb18-1ab1-4927-8c5d-4103a955aa17 - - - - -] Acquired lock "singleton_lock" lock /usr/lib/python3.6/site-packages/oslo_concurrency/lockutils.py:265
    2020-04-26 03:43:45.144 34 DEBUG oslo_concurrency.lockutils [req-f4a8fb18-1ab1-4927-8c5d-4103a955aa17 - - - - -] Releasing lock "singleton_lock" lock /usr/lib/python3.6/site-packages/oslo_concurrency/lockutils.py:281
    2020-04-26 03:43:45.145 34 DEBUG neutron.service [req-f4a8fb18-1ab1-4927-8c5d-4103a955aa17 - - - - -] calling RpcWorker wait() _wait /usr/lib/python3.6/site-packages/neutron/service.py:128
    2020-04-26 03:43:45.145 34 DEBUG neutron.service [req-f4a8fb18-1ab1-4927-8c5d-4103a955aa17 - - - - -] calling wait on <oslo_messaging.rpc.server.RPCServer object at 0x7f79322b74e0> _wait /usr/lib/python3.6/site-packages/neutron/service.py:131
    2020-04-26 03:43:45.156 33 DEBUG neutron.service [req-f4a8fb18-1ab1-4927-8c5d-4103a955aa17 - - - - -] calling wait on <oslo_messaging.rpc.server.RPCServer object at 0x7f7932262c50> _wait /usr/lib/python3.6/site-packages/neutron/service.py:131
    2020-04-26 03:43:45.162 34 DEBUG neutron.service [req-f4a8fb18-1ab1-4927-8c5d-4103a955aa17 - - - - -] returning from RpcWorker wait() _wait /usr/lib/python3.6/site-packages/neutron/service.py:135
    2020-04-26 03:43:45.170 33 DEBUG neutron.service [req-f4a8fb18-1ab1-4927-8c5d-4103a955aa17 - - - - -] returning from RpcWorker wait() _wait /usr/lib/python3.6/site-packages/neutron/service.py:135
    2020-04-26 03:43:45.175 7 INFO oslo_service.service [-] Child 34 exited with status 0
    2020-04-26 03:43:45.184 7 INFO oslo_service.service [-] Child 33 exited with status 0
    2020-04-26 03:43:45.185 7 INFO oslo_service.service [-] Wait called after thread killed. Cleaning up.
    2020-04-26 03:43:45.185 7 DEBUG oslo_service.service [-] Stop services. stop /usr/lib/python3.6/site-packages/oslo_service/service.py:699
    2020-04-26 03:43:45.185 7 DEBUG oslo_service.service [-] Killing children. stop /usr/lib/python3.6/site-packages/oslo_service/service.py:704
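The "Wait called after thread killed" line in the log above is the oslo_service marker for the parent process beginning shutdown, so it is a convenient anchor when pinning down exactly when the service died. A small sketch (my own helper, not from the report) that pulls the timestamp of that marker out of a captured log:

```python
SHUTDOWN_MARKER = "Wait called after thread killed"

def shutdown_time(log_text):
    """Return the timestamp of the first shutdown-marker line, or None.

    oslo log lines start with "YYYY-MM-DD HH:MM:SS.mmm", so the first
    two whitespace-separated fields form the timestamp.
    """
    for line in log_text.splitlines():
        if SHUTDOWN_MARKER in line:
            return " ".join(line.split()[:2])
    return None

log = ("2020-04-26 03:43:45.185 7 INFO oslo_service.service [-] "
       "Wait called after thread killed. Cleaning up.")
print(shutdown_time(log))  # → 2020-04-26 03:43:45.185
```

Correlating that timestamp with other host logs (journal, OVN logs, OOM killer messages) is usually the quickest way to tell whether the exit was a crash or a deliberate stop, which is the question raised in the next comment.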

Comment 15 Dumitru Ceara 2020-04-27 07:49:37 UTC
(In reply to Itzik Brown from comment #14)
> With the new images after two days the neutron_api crashes
> [docker ps output and neutron log snipped; see comment #14]
Hi Itzik,

Is ovn-controller still crashing? This BZ was supposed to track fixes in ovn-controller for the segfault that was discovered in BZ 1812009.

If ovn-controller crashed, can you please attach the coredump? If not then there might also be an issue that's not related to core OVN and should be tracked through BZ 1812009.

Thanks,
Dumitru

Comment 16 Itzik Brown 2020-04-27 11:13:10 UTC
ovn-controller is not crashing.

Comment 17 Dumitru Ceara 2020-04-27 12:12:10 UTC
(In reply to Itzik Brown from comment #16)
> ovn controller is not crashing.

Thanks, moving back to ON_QA based on this.

Regards,
Dumitru

Comment 20 errata-xmlrpc 2020-05-26 14:07:41 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2318

