Bug 1980799 - [RHOSP16.1.6] Consistent nova-metadata-api errors without impact
Summary: [RHOSP16.1.6] Consistent nova-metadata-api errors without impact
Keywords:
Status: CLOSED DUPLICATE of bug 1890037
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-nova
Version: 16.1 (Train)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: ---
Assignee: OSP DFG:Compute
QA Contact: OSP DFG:Compute
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-07-09 14:55 UTC by camorris@redhat.co
Modified: 2024-10-01 18:56 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-07-13 13:58:16 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Issue Tracker | OSP-6088 | 0 | None | None | None | 2022-08-11 08:49:02 UTC

Description camorris@redhat.co 2021-07-09 14:55:49 UTC
Description of problem:
nova-metadata-api logs RabbitMQ connection errors, but with no apparent impact on the overcloud. Are they related to https://bugzilla.redhat.com/show_bug.cgi?id=1913177 ?

Version-Release number of selected component (if applicable):
OSP16.1.6

How reproducible:
Multiple times per hour

Steps to Reproduce:
Seeing this in all three environments

Actual results:
~~~
nova/nova-metadata-api.log.1:2021-07-07 08:49:47.619 21 ERROR oslo.messaging._drivers.impl_rabbit [-] Connection failed: [Errno 104] Connection reset by peer (retrying in 0 seconds): ConnectionResetError: [Errno 104] Connection reset by peer
nova/nova-metadata-api.log.1:2021-07-07 08:53:52.088 29 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer
nova/nova-metadata-api.log.1:2021-07-07 09:01:57.128 29 ERROR oslo.messaging._drivers.impl_rabbit [-] Connection failed: [Errno 104] Connection reset by peer (retrying in 0 seconds): ConnectionResetError: [Errno 104] Connection reset by peer
nova/nova-metadata-api.log.1:2021-07-07 09:08:03.305 21 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer
nova/nova-metadata-api.log.1:2021-07-07 09:10:04.573 21 ERROR oslo.messaging._drivers.impl_rabbit [-] Connection failed: [Errno 104] Connection reset by peer (retrying in 0 seconds): ConnectionResetError: [Errno 104] Connection reset by peer
nova/nova-metadata-api.log.1:2021-07-07 09:46:32.235 29 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer
nova/nova-metadata-api.log.1:2021-07-07 10:02:45.992 21 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer
nova/nova-metadata-api.log.1:2021-07-07 10:08:49.377 21 ERROR oslo.messaging._drivers.impl_rabbit [-] Connection failed: [Errno 104] Connection reset by peer (retrying in 0 seconds): ConnectionResetError: [Errno 104] Connection reset by peer
nova/nova-metadata-api.log.1:2021-07-07 10:12:53.623 21 ERROR oslo.messaging._drivers.impl_rabbit [-] Connection failed: [Errno 104] Connection reset by peer (retrying in 0 seconds): ConnectionResetError: [Errno 104] Connection reset by peer
nova/nova-metadata-api.log.1:2021-07-07 10:35:09.030 29 ERROR oslo.messaging._drivers.impl_rabbit [-] Connection failed: [Errno 104] Connection reset by peer (retrying in 0 seconds): ConnectionResetError: [Errno 104] Connection reset by peer
nova/nova-metadata-api.log.1:2021-07-07 11:33:52.701 29 ERROR oslo.messaging._drivers.impl_rabbit [-] Connection failed: [Errno 104] Connection reset by peer (retrying in 0 seconds): ConnectionResetError: [Errno 104] Connection reset by peer
nova/nova-metadata-api.log.1:2021-07-07 12:20:29.037 29 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer
nova/nova-metadata-api.log.1:2021-07-07 12:40:44.477 21 ERROR oslo.messaging._drivers.impl_rabbit [-] Connection failed: [Errno 104] Connection reset by peer (retrying in 0 seconds): ConnectionResetError: [Errno 104] Connection reset by peer
nova/nova-metadata-api.log.1:2021-07-07 12:42:50.177 29 ERROR oslo.messaging._drivers.impl_rabbit [-] Connection failed: [Errno 104] Connection reset by peer (retrying in 0 seconds): ConnectionResetError: [Errno 104] Connection reset by peer
nova/nova-metadata-api.log.1:2021-07-07 13:31:24.052 29 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer
nova/nova-metadata-api.log.1:2021-07-07 14:20:04.552 29 ERROR oslo.messaging._drivers.impl_rabbit [-] Connection failed: [Errno 104] Connection reset by peer (retrying in 0 seconds): ConnectionResetError: [Errno 104] Connection reset by peer
nova/nova-metadata-api.log.1:2021-07-07 14:24:51.283 21 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer
nova/nova-metadata-api.log.1:2021-07-07 14:24:51.485 29 ERROR oslo.messaging._drivers.impl_rabbit [-] Connection failed: [Errno 104] Connection reset by peer (retrying in 0 seconds): ConnectionResetError: [Errno 104] Connection reset by peer
nova/nova-metadata-api.log.1:2021-07-07 14:34:11.725 21 ERROR oslo.messaging._drivers.impl_rabbit [-] Connection failed: [Errno 104] Connection reset by peer (retrying in 0 seconds): ConnectionResetError: [Errno 104] Connection reset by peer
nova/nova-metadata-api.log.1:2021-07-07 14:40:16.804 29 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer
nova/nova-metadata-api.log.1:2021-07-07 15:14:43.404 21 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer
nova/nova-metadata-api.log.1:2021-07-07 15:24:51.444 21 ERROR oslo.messaging._drivers.impl_rabbit [-] Connection failed: [Errno 104] Connection reset by peer (retrying in 0 seconds): ConnectionResetError: [Errno 104] Connection reset by peer
nova/nova-metadata-api.log.1:2021-07-07 15:49:12.133 21 ERROR oslo.messaging._drivers.impl_rabbit [-] Connection failed: [Errno 104] Connection reset by peer (retrying in 0 seconds): ConnectionResetError: [Errno 104] Connection reset by peer
nova/nova-metadata-api.log.1:2021-07-07 16:01:22.093 21 ERROR oslo.messaging._drivers.impl_rabbit [-] Connection failed: [Errno 104] Connection reset by peer (retrying in 0 seconds): ConnectionResetError: [Errno 104] Connection reset by peer
nova/nova-metadata-api.log.1:2021-07-07 17:16:17.011 29 ERROR oslo.messaging._drivers.impl_rabbit [-] Connection failed: [Errno 104] Connection reset by peer (retrying in 0 seconds): ConnectionResetError: [Errno 104] Connection reset by peer
nova/nova-metadata-api.log.1:2021-07-07 17:19:51.796 29 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer
nova/nova-metadata-api.log.1:2021-07-07 17:20:19.899 21 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer
nova/nova-metadata-api.log.1:2021-07-07 17:46:40.682 21 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer
nova/nova-metadata-api.log.1:2021-07-07 17:51:11.308 21 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer
nova/nova-metadata-api.log.1:2021-07-07 18:17:04.838 29 ERROR oslo.messaging._drivers.impl_rabbit [-] Connection failed: [Errno 104] Connection reset by peer (retrying in 0 seconds): ConnectionResetError: [Errno 104] Connection reset by peer
nova/nova-metadata-api.log.1:2021-07-07 18:23:08.921 21 ERROR oslo.messaging._drivers.impl_rabbit [-] Connection failed: [Errno 104] Connection reset by peer (retrying in 0 seconds): ConnectionResetError: [Errno 104] Connection reset by peer
nova/nova-metadata-api.log.1:2021-07-07 18:27:12.145 29 ERROR oslo.messaging._drivers.impl_rabbit [-] Connection failed: [Errno 104] Connection reset by peer (retrying in 0 seconds): ConnectionResetError: [Errno 104] Connection reset by peer
nova/nova-metadata-api.log.1:2021-07-07 18:27:12.685 21 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer
nova/nova-metadata-api.log.1:2021-07-07 18:31:15.022 21 ERROR oslo.messaging._drivers.impl_rabbit [-] Connection failed: [Errno 104] Connection reset by peer (retrying in 0 seconds): ConnectionResetError: [Errno 104] Connection reset by peer
nova/nova-metadata-api.log.1:2021-07-07 18:37:20.045 29 ERROR oslo.messaging._drivers.impl_rabbit [-] Connection failed: [Errno 104] Connection reset by peer (retrying in 0 seconds): ConnectionResetError: [Errno 104] Connection reset by peer
nova/nova-metadata-api.log.1:2021-07-07 19:15:50.332 21 ERROR oslo.messaging._drivers.impl_rabbit [-] Connection failed: [Errno 104] Connection reset by peer (retrying in 0 seconds): ConnectionResetError: [Errno 104] Connection reset by peer
nova/nova-metadata-api.log.1:2021-07-07 19:23:56.820 29 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer

~~~
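
To put a number on "multiple times per hour", an excerpt like the one above can be tallied per hour and per log level. The following is only a minimal sketch: the function name `tally_resets` and the regular expression are illustrative assumptions based on the line format shown in this report, not part of nova or oslo.messaging.

```python
import re
from collections import Counter

# Assumed line format, taken from the excerpt in this report:
# <file>:<date> <time> <pid> <LEVEL> oslo.messaging._drivers.impl_rabbit [-] <message>
LINE_RE = re.compile(
    r"(?P<date>\d{4}-\d{2}-\d{2}) (?P<hour>\d{2}):\d{2}:\d{2}\.\d+ "
    r"(?P<pid>\d+) (?P<level>[A-Z]+) oslo\.messaging\._drivers\.impl_rabbit "
)


def tally_resets(lines):
    """Count 'Connection reset by peer' messages per (date, hour) and per level."""
    per_hour = Counter()
    per_level = Counter()
    for line in lines:
        if "Connection reset by peer" not in line:
            continue
        match = LINE_RE.search(line)
        if match is None:
            continue  # not an impl_rabbit line in the assumed format
        per_hour[(match["date"], match["hour"])] += 1
        per_level[match["level"]] += 1
    return per_hour, per_level


# Three lines copied from the excerpt above.
sample = [
    "nova/nova-metadata-api.log.1:2021-07-07 08:49:47.619 21 ERROR oslo.messaging._drivers.impl_rabbit [-] Connection failed: [Errno 104] Connection reset by peer (retrying in 0 seconds): ConnectionResetError: [Errno 104] Connection reset by peer",
    "nova/nova-metadata-api.log.1:2021-07-07 08:53:52.088 29 INFO oslo.messaging._drivers.impl_rabbit [-] A recoverable connection/channel error occurred, trying to reconnect: [Errno 104] Connection reset by peer",
    "nova/nova-metadata-api.log.1:2021-07-07 09:01:57.128 29 ERROR oslo.messaging._drivers.impl_rabbit [-] Connection failed: [Errno 104] Connection reset by peer (retrying in 0 seconds): ConnectionResetError: [Errno 104] Connection reset by peer",
]

per_hour, per_level = tally_resets(sample)
for (date, hour), count in sorted(per_hour.items()):
    print(f"{date} {hour}:00  {count} reset(s)")
print(dict(per_level))
```

Feeding the output of a grep over the log directory into `tally_resets` would give the same breakdown for a full day.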

Expected results:
No errors

Additional info:

Comment 2 smooney 2021-07-13 13:35:00 UTC
I have not reviewed the sosreports yet, but this is expected behavior and I suspect it's not a real bug.
Likely this is just the nova side of https://bugzilla.redhat.com/show_bug.cgi?id=1890037

The reason this happens is explained here: https://bugzilla.redhat.com/show_bug.cgi?id=1913177#c6

Effectively, the lifetime of the nova API and nova metadata API processes is managed by the WSGI server under which they are run,
in this case mod_wsgi in the Apache process in the nova-metadata-api container.

One possible way to work around this is to run the heartbeat in a real pthread rather than an eventlet green thread.

I'll bring this up in our triage call tomorrow, but I suspect we will close this as a duplicate of one of the preceding bugs.

The connection resets and reconnects are real and, unlike the heartbeat message, we should not suppress them IMO, as the logging and
connection lifetime are working as expected given the constraints imposed by running under Apache mod_wsgi.

Comment 3 smooney 2021-07-13 13:58:16 UTC
Reviewing their sosreports, I can confirm they have

#heartbeat_in_pthread=false

I'm going to close this as a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1890037

They can optionally set heartbeat_in_pthread=true in the nova.conf on their controller nodes
to escape the life cycle model of the Apache server and ensure the heartbeat is not terminated.

This will reduce or eliminate the log messages, as the connection should not reset unless
there is a temporary network issue.
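
As a rough illustration of the check and the suggested change, the sketch below reads nova.conf-style text and reports the effective value of the option. It rests on assumptions not stated in this bug: that the option sits in the `[oslo_messaging_rabbit]` section, and that a missing or commented-out entry falls back to a default of false (as the commented sample line suggests). The helper name `heartbeat_in_pthread_enabled` is hypothetical.

```python
import configparser


def heartbeat_in_pthread_enabled(conf_text, default=False):
    """Return the effective heartbeat_in_pthread value from nova.conf-style text.

    Assumption: the option belongs to [oslo_messaging_rabbit]; a commented-out
    or absent entry yields `default`.
    """
    # interpolation=None: values may contain '%'.
    # strict=False: tolerate duplicate sections/options in generated configs.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    return parser.getboolean(
        "oslo_messaging_rabbit", "heartbeat_in_pthread", fallback=default
    )


# State reported from the sosreports: the option is commented out.
as_found = """
[oslo_messaging_rabbit]
#heartbeat_in_pthread=false
"""

# State after applying the optional change suggested in this comment.
as_suggested = """
[oslo_messaging_rabbit]
heartbeat_in_pthread=true
"""

print(heartbeat_in_pthread_enabled(as_found))
print(heartbeat_in_pthread_enabled(as_suggested))
```

In a containerized deployment the file to inspect is the one the nova-metadata-api container actually reads, and the change would normally be made through the deployment tooling rather than by hand-editing, so treat the fragment as a description of the end state only.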

*** This bug has been marked as a duplicate of bug 1890037 ***

