Bug 2100879

Summary: From time to time memcached stops processing requests and brings down OpenStack control plane
Product: Red Hat OpenStack
Reporter: David Hill <dhill>
Component: python-dogpile-cache
Assignee: Hervé Beraud <hberaud>
Status: CLOSED ERRATA
QA Contact: nlevinki <nlevinki>
Severity: high
Docs Contact:
Priority: high
Version: 16.1 (Train)
CC: apevec, aruffin, astupnik, bdobreli, camorris, dciabrin, dhill, dhruv, dsedgmen, enothen, ggrimaux, hberaud, jelynch, jhakimra, jjoyce, jmarcian, jmelvin, joflynn, jpretori, jraju, jschluet, lhh, lmiccini, mbayer, mburns, mgarciac, michal.vasko, michele, msecaur, satmakur, schhabdi, tkajinam, xili, ykulkarn, yusuf, yusufhadiwinata
Target Milestone: z9
Keywords: Triaged
Target Release: 16.1 (Train on RHEL 8.2)
Flags: hberaud: needinfo+
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: python-dogpile-cache-0.6.8-3.el8ost
Doc Type: Bug Fix
Doc Text:
Before this update, dogpile.cache did not implement support for the dead_retry and socket_timeout arguments in its memcached back end. The oslo.cache mechanism filled the arguments dictionary with values for dead_retry and socket_timeout, but dogpile.cache ignored them, so the defaults of 30 seconds for dead_retry and 3 seconds for socket_timeout always applied. When the Identity service (keystone) used dogpile.cache.memcached as its cache back end and one of the memcached instances was taken down, the memcache server objects set their deaduntil value to 30 seconds in the future. When a request reached an API server that had two memcached servers configured, one of which was unroutable, the server took approximately 15 seconds to try each of those servers in each thread it created, because it hit the three-second socket timeout every time it encountered the server that was down. By the time the user issued another request, the deaduntil value had passed and the whole cycle repeated. With this update, dogpile.cache consumes the dead_retry and socket_timeout arguments passed by oslo.cache.
Story Points: ---
Clone Of: 1893205
: 2101864
Environment:
Last Closed: 2022-12-07 20:27:07 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1893205
Bug Blocks: 2046185, 2101864, 2101865
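
The timing cycle described in the Doc Text can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the dogpile.cache or python-memcached implementation: the class `DownServerView`, the function `simulate`, the count of five per-thread client objects, and the tuned values (dead_retry=300, socket_timeout=1) are all hypothetical and chosen so that the default case reproduces the roughly 15-second delay mentioned above. Only the names dead_retry, socket_timeout, and deaduntil, and the defaults of 30 seconds and 3 seconds, come from this report.

```python
from dataclasses import dataclass


@dataclass
class DownServerView:
    """Toy model of one client's view of an unroutable memcached server.

    Hypothetical class for illustration; not the real client code.
    """
    dead_retry: float = 30.0     # seconds a failed server is skipped
    socket_timeout: float = 3.0  # seconds lost per attempt on the down server
    deaduntil: float = 0.0       # time before which the server is skipped

    def attempt(self, now: float) -> float:
        """Return the seconds spent trying the down server at time `now`."""
        if now < self.deaduntil:
            return 0.0  # still marked dead, so the client skips it
        # The attempt blocks until the socket timeout, then marks the server dead.
        self.deaduntil = now + self.socket_timeout + self.dead_retry
        return self.socket_timeout


def simulate(request_times, n_clients, dead_retry, socket_timeout):
    """Return the delay of each request, assuming every request touches
    n_clients independent client objects one after another."""
    clients = [DownServerView(dead_retry, socket_timeout) for _ in range(n_clients)]
    delays = []
    for start in request_times:
        elapsed = 0.0
        for client in clients:
            elapsed += client.attempt(start + elapsed)
        delays.append(elapsed)
    return delays


# Defaults from the report: a request after deaduntil has passed pays the full delay again.
default_delays = simulate([0, 20, 60], n_clients=5, dead_retry=30.0, socket_timeout=3.0)
# Hypothetical tuned values: the down server stays skipped for much longer.
tuned_delays = simulate([0, 20, 60], n_clients=5, dead_retry=300.0, socket_timeout=1.0)
print(default_delays, tuned_delays)
```

In this model a request that arrives while the server is still marked dead costs nothing, while a request that arrives after deaduntil has passed pays the socket timeout once per client object. That is why honoring operator-supplied dead_retry and socket_timeout values, as the fix does, shortens and spaces out the stalls.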

Comment 27 errata-xmlrpc 2022-12-07 20:27:07 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 16.1.9 bug fix and enhancement advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:8795