Description of problem:
Because of the way oslo_cache.memcache_pool implements reaping of old connections, a call to "def acquire()" can, in some cases, return a connection from the pool that is already stale (in CLOSE_WAIT state).

Version-Release number of selected component (if applicable):
RHOSP13

How reproducible:
Very hard to reproduce, but the problem is real: the master branch [3] has already merged the fix [4] for it.

Steps to Reproduce:
See [1] for details.

Actual results:
Services relying on Oslo will fail. In [1] it is VNC that fails; in my case the VM cannot boot properly:

~~~
$ openstack server show xxx
+-------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------+
| Field                               | Value                                                                                                                                         |
+-------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------+
| OS-EXT-STS:power_state              | Shutdown                                                                                                                                      |
| OS-EXT-STS:task_state               | None                                                                                                                                          |
| OS-EXT-STS:vm_state                 | error                                                                                                                                         |
| OS-SRV-USG:launched_at              | 2020-01-22T01:43:16.000000                                                                                                                    |
| OS-SRV-USG:terminated_at            | None                                                                                                                                          |
| created                             | 2020-01-22T01:41:12Z                                                                                                                          |
| fault                               | {u'message': u'Unable to get a connection from pool id 139676967376016 after 10 seconds.', u'code': 400, u'created': u'2020-03-18T19:08:11Z'} |
| status                              | ERROR                                                                                                                                         |
+-------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------+
~~~

Expected results:
Requesting a backport to RHOSP13. The queens branch [2] should have [4] backported, as it is already in the train branch [1].

[0] https://bugs.launchpad.net/oslo.cache/+bug/1775341
[1] https://github.com/openstack/oslo.cache/blob/stable/train/oslo_cache/_memcache_pool.py#L135
[2] https://github.com/openstack/oslo.cache/blob/stable/queens/oslo_cache/_memcache_pool.py#L137
[3] https://github.com/openstack/oslo.cache/blob/master/oslo_cache/_memcache_pool.py#L135
[4] https://opendev.org/openstack/oslo.cache/commit/43c6279a7eff0df7ab22155fb6c165f551cdcf8d
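
For anyone reading along, here is a minimal sketch of the general idea behind reaping in acquire(): drop connections that have sat idle past their timeout before handing one out, so the caller never receives a stale one. This is only an illustration under my own assumptions, not the oslo.cache code and not the patch in [4]. All names (SketchPool, FakeConnection, PoolItem, unused_timeout, clock) are hypothetical, and the sketch is not thread-safe.

```python
import collections
import time

# Hypothetical names throughout; this is NOT oslo.cache's implementation.
PoolItem = collections.namedtuple("PoolItem", ["expires_at", "connection"])


class FakeConnection:
    """Stand-in for a memcached client connection."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class SketchPool:
    def __init__(self, unused_timeout, clock=time.monotonic):
        self._unused_timeout = unused_timeout
        self._clock = clock  # injectable so the behaviour can be tested
        self._idle = collections.deque()  # oldest idle connection on the left

    def release(self, conn):
        # Stamp the connection with the time after which it counts as stale.
        expires_at = self._clock() + self._unused_timeout
        self._idle.append(PoolItem(expires_at, conn))

    def _drop_expired(self):
        # Expiry times grow from left to right, so stop at the first live item.
        now = self._clock()
        while self._idle and self._idle[0].expires_at < now:
            self._idle.popleft().connection.close()

    def acquire(self):
        # Reap first, so a connection idle past its timeout is never returned.
        self._drop_expired()
        if self._idle:
            return self._idle.pop().connection
        return FakeConnection()
```

With a fake clock, a connection released at t=0 with a 10 second timeout is handed back at t=5, but at t=20 it is closed and a new connection is returned instead.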
Fixed with python-oslo-cache-1.28.1-2.el7ost https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=27665209
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2719