Bug 1706456
Summary: | Reduce the log level in the nova-api for oslo.messaging warnings | |||
---|---|---|---|---|
Product: | Red Hat OpenStack | Reporter: | Damien Ciabrini <dciabrin> | |
Component: | openstack-nova | Assignee: | smooney | |
Status: | CLOSED EOL | QA Contact: | OSP DFG:Compute <osp-dfg-compute> | |
Severity: | high | Docs Contact: | ||
Priority: | high | |||
Version: | 15.0 (Stein) | CC: | apevec, astupnik, athomas, bdobreli, dasmith, dbecker, eglynn, ggrimaux, jeckersb, jhakimra, kchamart, lhh, lyarwood, mbooth, mburns, michele, morazi, mschuppe, sbauza, sgordon, smooney, stchen | |
Target Milestone: | --- | Keywords: | Patch, Triaged, ZStream | |
Target Release: | --- | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | Doc Type: | If docs needed, set a value | ||
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 1711794 1913177 2036377 | Environment: | |
Last Closed: | 2020-09-30 20:04:20 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Embargoed: | ||||
Bug Depends On: | ||||
Bug Blocks: | 1711794, 1913177, 2036377 |
Description
Damien Ciabrini
2019-05-04 23:26:26 UTC
This is an issue that is currently being investigated upstream. However, it does not functionally break the nova-api, as the heartbeats are not required for it to function correctly. The current consensus is that it would be incorrect to try to circumvent the WSGI server's thread management in order to force the server to keep the heartbeat thread alive.

While the heartbeat can stop, this does not break the underlying way that oslo.messaging works, in that it will automatically reconnect to RabbitMQ when a new API request is received.

A workaround has been found for deployments which run the nova-api under uwsgi, and we believe it will also apply to mod_wsgi. When the nova-api is run under WSGI, the WSGI server should be configured to run the WSGI app with 1 thread per interpreter process.

To address this upstream we are planning to take two approaches.

First, document the requirement to use only 1 thread per interpreter process when running the nova-api under mod_wsgi or uwsgi.

For mod_wsgi this can be done by setting threads=1 as per https://modwsgi.readthedocs.io/en/develop/configuration-directives/WSGIDaemonProcess.html and similarly for uwsgi as per https://uwsgi-docs.readthedocs.io/en/latest/Options.html#threads

In addition to this, the log level of the connection closed message will be reduced to info or debug, as this is a recoverable issue that is already handled by how oslo.messaging is designed. Disconnection is not an error and should not be reported as such in the logs.

From a deployment perspective, https://github.com/openstack/tripleo-heat-templates/blob/master/deployment/nova/nova-api-container-puppet.yaml#L209-L244 should be updated to ensure only 1 thread is used and that API parallelism is managed at the process level instead.
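As a rough illustration of the "1 thread per interpreter process" requirement discussed above, the following Python sketch checks whether a mod_wsgi WSGIDaemonProcess line explicitly sets threads=1. The helper names and the sample directive are hypothetical and are not part of nova, puppet-nova or mod_wsgi; the sketch only parses text and assumes the directive fits on one line.

```python
import shlex


def daemon_process_options(directive: str) -> dict:
    """Parse the key=value options of one WSGIDaemonProcess directive line."""
    tokens = shlex.split(directive)
    if not tokens or tokens[0] != "WSGIDaemonProcess":
        raise ValueError("not a WSGIDaemonProcess directive")
    # tokens[1] is the process group name; the rest are key=value options.
    options = {}
    for token in tokens[2:]:
        key, sep, value = token.partition("=")
        if sep:
            options[key] = value
    return options


def uses_single_thread(directive: str) -> bool:
    """Return True only if the directive explicitly sets threads=1."""
    return daemon_process_options(directive).get("threads") == "1"


# Hypothetical directive, shaped like a typical mod_wsgi daemon definition.
example = "WSGIDaemonProcess nova-api processes=4 threads=1 user=nova"
print(uses_single_thread(example))
```

A directive that omits threads= is reported as False here on purpose, since the server default would then apply and should be checked separately.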
Cloned [1] for deployment, but to note, we already have threads=1 (unless otherwise specified by an operator), as this is the default in puppet-nova [2].

10-nova_api_wsgi.conf:

# ************************************
# Vhost template in module puppetlabs-apache
# Managed by Puppet
# ************************************
#
<VirtualHost 192.168.24.1:8774>
  ServerName undercloud-0.ctlplane.localdomain

  ## Vhost docroot
  DocumentRoot "/var/www/cgi-bin/nova"

  ## Directories, there should at least be a declaration for /var/www/cgi-bin/nova
  <Directory "/var/www/cgi-bin/nova">
    Options Indexes FollowSymLinks MultiViews
    AllowOverride None
    Require all granted
  </Directory>

  ## Logging
  ErrorLog "/var/log/httpd/nova_api_wsgi_error.log"
  ServerSignature Off
  CustomLog "/var/log/httpd/nova_api_wsgi_access.log" combined
  SetEnvIf X-Forwarded-Proto https HTTPS=1

  ## WSGI configuration
  WSGIApplicationGroup %{GLOBAL}
  WSGIDaemonProcess nova-api display-name=nova_api_wsgi group=nova processes=4 threads=1 user=nova
  WSGIProcessGroup nova-api
  WSGIScriptAlias / "/var/www/cgi-bin/nova/nova-api"
</VirtualHost>

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1711794
[2] https://github.com/openstack/puppet-nova/blob/stable/stein/manifests/wsgi/apache_api.pp#L55-L57

After some discussion, the decision here was to disable oslo.messaging heartbeats entirely in nova.conf for the n-api service. This means there is no need to alter the severity of the log message, as it won't be produced. We believe our existing TCP keepalive settings will effectively serve the same purpose in any case. Disabling heartbeats in oslo.messaging is currently an experimental feature, so it's not clear that upstream would take this solution yet.

We need to test this. We believe that it should be covered by standard 'destructive' testing, but it would be good to both confirm this and give them a heads up to look out for it.
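For anyone wanting to verify the heartbeat decision above on a node, here is a minimal Python sketch that checks whether heartbeats appear to be disabled in a nova.conf. It assumes the relevant setting is heartbeat_timeout_threshold = 0 in the [oslo_messaging_rabbit] section, which oslo.messaging documents as disabling the heartbeat. The comment above does not name the exact option, so treat that as an assumption. The sample text and function name are hypothetical.

```python
import configparser

# Hypothetical nova.conf fragment. The option name is an assumption about
# how heartbeats were disabled; the bug comment does not spell it out.
SAMPLE = """
[oslo_messaging_rabbit]
heartbeat_timeout_threshold = 0
"""


def heartbeats_disabled(conf_text: str) -> bool:
    """Return True if the config text explicitly sets the threshold to 0."""
    # strict=False tolerates duplicate options; interpolation=None keeps
    # literal '%' characters in other values from raising errors.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    value = parser.get(
        "oslo_messaging_rabbit", "heartbeat_timeout_threshold", fallback=None
    )
    return value is not None and value.strip() == "0"


print(heartbeats_disabled(SAMPLE))
```

This only inspects the file contents; it does not confirm what the running nova-api process actually loaded.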
Joe, do you know which tests would cover down controllers or the rabbit service, and who is responsible for running them?

Note: this has been fixed on master and backports are in flight upstream.

Closing EOL, OSP 15 has been retired as of Sept 19, 2020.