1930806 – Tune cinder wsgi/httpd timeout

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat OpenStack
Classification:	Red Hat
Component:	openstack-tripleo-heat-templates
Sub Component:
Version:	16.1 (Train)
Hardware:	Unspecified
OS:	Unspecified
Priority:	medium
Severity:	medium
Target Milestone:	z7
Target Release:	16.1 (Train on RHEL 8.2)
Assignee:	Alan Bishop
QA Contact:	Tzach Shefi
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2021-02-19 15:50 UTC by Andreas Karis
Modified:	2024-10-01 17:32 UTC
CC List:	7 users
Fixed In Version:	openstack-tripleo-heat-templates-11.3.2-1.20210705103304.29a02c1.el8ost
Doc Type:	Enhancement
Doc Text:	This enhancement adds the new `CinderRpcResponseTimeout` and `CinderApiWsgiTimeout` parameters to support tuning RPC and API WSGI timeouts in the Block Storage service (cinder). Default timeout values might not be adequate for large deployments and in situations where transactions might be delayed due to system load. It is now possible to tune the RPC and API WSGI timeouts to prevent transactions from timing out prematurely.
Clone Of:
Environment:
Last Closed:	2021-12-09 20:18:00 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Priority	Status	Summary	Last Updated
OpenStack gerrit	784962	None	MERGED	Support configuring cinder's RPC and WSGI timeouts	2021-04-15 20:44:56 UTC
Red Hat Issue Tracker	OSP-715	None	None	None	2021-11-18 11:29:34 UTC
Red Hat Product Errata	RHBA-2021:3762	None	None	None	2021-12-09 20:18:26 UTC

Description Andreas Karis 2021-02-19 15:50:11 UTC

Description of problem:
It is currently not possible to tune the cinder wsgi/httpd timeout.

We can:
* tune haproxy timeouts
* tune cinder RPC timeouts

When detaching volumes, nova calls into haproxy, which calls into cinder-api. cinder-api then calls cinder-volume via rabbitmq. The default timeouts along this path are 1 minute for RPC, 1 minute for httpd/wsgi, and 2 minutes for haproxy.

Under heavy load, and/or depending on the backend, detach calls might take longer than 1 minute. In our customer's case, they take ca. 2 minutes.

We tuned the haproxy and RPC timeouts, but we still hit the httpd/wsgi timeout:
~~~
/var/log/containers/httpd/cinder-api/cinder_wsgi_error.log:[Wed Feb 17 12:28:45.685321 2021] [wsgi:error] [pid 10529] [client 10.133.0.136:35866] Timeout when reading response headers from daemon process 'cinder-api': /var/www/cgi-bin/cinder/cinder-api
~~~



Comment 2 Alan Bishop 2021-02-24 18:04:25 UTC

Until a proper fix is developed, here is a workaround. Add this to any environment file used for the deployment:

parameter_defaults:
  ControllerExtraConfig:
    cinder::wsgi::apache::vhost_custom_fragment: 'Timeout 300'

The value will appear in cinder's /etc/httpd/conf.d/10-cinder_wsgi.conf file (not in /etc/httpd/conf/httpd.conf).

Comment 3 Andreas Karis 2021-02-24 18:52:16 UTC

Nice, thanks! We'll try that!

Comment 4 Alan Bishop 2021-02-24 22:02:47 UTC

You can also control the RPC response timeout:

parameter_defaults:
  ControllerExtraConfig:
    cinder::rpc_response_timeout: 120
    cinder::wsgi::apache::vhost_custom_fragment: 'Timeout 300'

The upstream patch I just proposed would add the following two THT parameters:

CinderRpcResponseTimeout
CinderApiWsgiTimeout
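
For illustration, an environment file that uses these two parameters might look like the sketch below. The values simply mirror the workaround above and are assumptions, not recommendations. Both are assumed to be expressed in seconds, and the parameters only take effect on a release that includes the patch.

```yaml
# Hypothetical environment file, passed to the deployment like any other env file.
parameter_defaults:
  # RPC response timeout for cinder (assumed to be seconds; 120 mirrors the workaround).
  CinderRpcResponseTimeout: 120
  # Apache/WSGI timeout for cinder-api (assumed to be seconds; 300 mirrors the workaround).
  CinderApiWsgiTimeout: 300
```

As a rule of thumb, keeping the WSGI timeout at or above the RPC timeout should let cinder-api report an RPC timeout itself before httpd gives up on the request. The haproxy timeout in front of it would need to be at least as long for the same reason.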

Comment 15 Tzach Shefi 2021-08-01 12:32:22 UTC

Verified on:
openstack-tripleo-heat-templates-11.3.2-1.20210720153309.29a02c1.el8ost.noarch

Used this yaml:
(overcloud) [stack@undercloud-0 ~]$ cat virt/extra_templates.yaml 
parameter_defaults:
    ControllerExtraConfig:
        cinder::rpc_response_timeout: 120
        cinder::wsgi::apache::vhost_custom_fragment: Timeout 300


This resulted in an overcloud deployment with both required settings:

[root@controller-0 ~]# grep rpc_res /var/lib/config-data/puppet-generated/cinder/etc/cinder/cinder.conf 
#rpc_response_timeout = 60
rpc_response_timeout=120



[root@controller-0 ~]# cat /var/lib/config-data/puppet-generated/cinder/etc/httpd/conf.d/10-cinder_wsgi.conf 
# ************************************
# Vhost template in module puppetlabs-apache
# Managed by Puppet
...
...
  ## Custom fragment
  Timeout 300
</VirtualHost>

Good to verify.

Comment 26 errata-xmlrpc 2021-12-09 20:18:00 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 16.1.7 (Train) bug fix and enhancement advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:3762
