Bug 1302413

Summary: optimize metadata service caching logic to avoid unnecessary data retrieval from conductor
Product: Red Hat OpenStack Reporter: Eoghan Glynn <eglynn>
Component: openstack-nova Assignee: Eoghan Glynn <eglynn>
Status: CLOSED EOL QA Contact: nlevinki <nlevinki>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 5.0 (RHEL 7) CC: berrange, dasmith, eglynn, kchamart, sbauza, sferdjao, sgordon, srevivo, svanders, tvignaud, vromanso
Target Milestone: --- Keywords: ZStream
Target Release: 5.0 (RHEL 7)   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-06-30 13:31:03 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:

Description Eoghan Glynn 2016-01-27 18:58:05 UTC
Currently the metadata service retrieves data unnecessarily from the conductor in at least two separate cases:

1. where the same data was already retrieved and cached by another metadata service worker running on the same node

2. when a retrieval for that same data is already in flight

This unnecessary load on the conductor could be reduced in two ways. First, use a shared cache across all workers running on the same node. Second, record when a retrieval for some URL is already in progress, so that if the same data is requested again before the initial fetch has completed, the subsequent request can simply await its arrival in the shared cache rather than independently re-fetching it in parallel.
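The in-flight part of this proposal (case 2) can be sketched roughly as follows. This is a minimal illustration and not Nova code: `SingleFlightCache`, `slow_fetch` and the key `"10.0.0.5"` are hypothetical names, the cache lives in a single process and uses plain threads, and entries never expire. Sharing the cache across worker processes (case 1) would additionally need an external store, which is not shown here.

```python
import threading
import time


class SingleFlightCache:
    """Cache in which at most one caller fetches a given key at a time."""

    def __init__(self, fetch):
        self._fetch = fetch        # hypothetical callable: key -> value
        self._lock = threading.Lock()
        self._values = {}          # completed results, no expiry in this sketch
        self._in_flight = {}       # key -> Event, set when that fetch ends

    def get(self, key):
        while True:
            with self._lock:
                if key in self._values:
                    return self._values[key]
                event = self._in_flight.get(key)
                leader = event is None
                if leader:
                    # No fetch in progress, so this caller performs it.
                    event = threading.Event()
                    self._in_flight[key] = event
            if not leader:
                # Another caller is already fetching: wait, then re-check.
                event.wait()
                continue
            try:
                value = self._fetch(key)
                with self._lock:
                    self._values[key] = value
                return value
            finally:
                # Wake the waiters whether the fetch succeeded or failed;
                # on failure one of them retries as the new leader.
                with self._lock:
                    del self._in_flight[key]
                event.set()


calls = []


def slow_fetch(key):
    # Stand-in for a slow conductor round trip (assumption).
    calls.append(key)
    time.sleep(0.1)
    return "metadata for " + key


cache = SingleFlightCache(slow_fetch)
results = []
threads = [
    threading.Thread(target=lambda: results.append(cache.get("10.0.0.5")))
    for _ in range(5)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With five concurrent requests for the same key, only the first one calls `slow_fetch`; the other four block on the event and then read the cached value.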

Comment 2 Sven Anderson 2016-02-25 16:19:54 UTC
The metadata caching is flawed in general, but there is a symptomatic fix for this related performance issue:

https://bugzilla.redhat.com/show_bug.cgi?id=1244852

upstream, which pre-fetches some of the data before caching, here:

https://github.com/openstack/nova/commit/3a761270581d1ac61a3b4669c130d211f1ad5a17#diff-969229657f01b56c336e01497df732d7R1226

and here:

https://github.com/openstack/nova/commit/cc41015d463e11ac11bbaaac0b5c441329dc5f0b#diff-567f52edc17aff6c473d69c341a4cb0cR513


Unfortunately the second change introduced the pre-fetch as a side-effect, so it cannot be backported as is.

Comment 3 Sven Anderson 2016-02-29 11:23:37 UTC
I have submitted two upstream changes that are related to this.

Disabling memcached for metadata caching:
https://review.openstack.org/#/c/285530

No parallel queries of the same data (this addresses point 2 in the description):
https://review.openstack.org/#/c/285562

Comment 5 Sven Anderson 2016-03-04 11:27:05 UTC
Clarification: the upstream changes are stop-gap fixes (and might not get accepted). The main problem of caching the DB queries instead of a whole Python object, in order to make it shareable between different processes, is still unaddressed.
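To illustrate the distinction, here is a rough sketch of caching plain query results rather than a live Python object. Everything in it is an assumption for illustration: `shared_store` is an in-memory dict standing in for a cross-process cache, and `query_instance_rows`, `get_rows` and `InstanceView` are hypothetical names, not Nova APIs. The point is only that serialized query data can be stored as bytes and rebuilt by any worker, whereas an in-process object cannot be shared directly.

```python
import json

shared_store = {}   # stand-in for a cross-process cache (assumption)
query_count = 0     # counts how often the "database" is actually hit


def query_instance_rows(instance_id):
    # Hypothetical DB query that returns plain, JSON-serializable data.
    global query_count
    query_count += 1
    return {"uuid": instance_id, "hostname": "vm-1", "keys": ["ssh-rsa AAAA"]}


def get_rows(instance_id):
    # Cache the serialized query result, not the object built from it.
    raw = shared_store.get(instance_id)
    if raw is None:
        raw = json.dumps(query_instance_rows(instance_id)).encode("utf-8")
        shared_store[instance_id] = raw
    return json.loads(raw.decode("utf-8"))


class InstanceView:
    """Per-process object rebuilt cheaply from the cached rows."""

    def __init__(self, rows):
        self.uuid = rows["uuid"]
        self.hostname = rows["hostname"]
        self.keys = list(rows["keys"])


first = InstanceView(get_rows("abc"))
second = InstanceView(get_rows("abc"))   # served from the cached bytes
```

Each worker pays only the cost of deserializing and rebuilding its own view; the query itself runs once per cached entry.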

Comment 6 Stephen Gordon 2016-06-16 17:27:10 UTC
Sven, can you help me understand what, if anything, remains that is backportable here versus needing to be addressed in a later release (Newton/Ocata)?

Comment 7 Sven Anderson 2017-01-03 16:33:11 UTC
Stephen, Diana was working on that after me. I have no idea about the current status.

Comment 8 Scott Lewis 2017-06-30 13:31:03 UTC
Red Hat OpenStack Platform version 5 is now End-of-Life, and as such will not have further updates. See https://access.redhat.com/support/policy/updates/openstack/platform/ for full support lifecycle details.