Bug 1302413 - optimize metadata service caching logic to avoid unnecessary data retrieval from conductor
Summary: optimize metadata service caching logic to avoid unnecessary data retrieval f...
Keywords:
Status: CLOSED EOL
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-nova
Version: 5.0 (RHEL 7)
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 5.0 (RHEL 7)
Assignee: Eoghan Glynn
QA Contact: nlevinki
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-01-27 18:58 UTC by Eoghan Glynn
Modified: 2019-09-09 15:17 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-06-30 13:31:03 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Launchpad 1540526 0 None None None 2016-02-25 16:25:14 UTC
Launchpad 1549814 0 None None None 2016-02-25 16:00:59 UTC
OpenStack gerrit 158066 0 None MERGED Replace conductor get_ec2_ids() with new Instance.ec2_ids attribute 2020-10-02 08:10:12 UTC
OpenStack gerrit 276861 0 None MERGED Avoid lazy-loads in metadata requests 2020-10-02 08:10:12 UTC

Description Eoghan Glynn 2016-01-27 18:58:05 UTC
Currently the metadata service retrieves data unnecessarily from the conductor in at least two separate cases:

1. where the same data was already retrieved and cached by another metadata service worker running on the same node

2. when a retrieval for that same data is already in flight

This unnecessary load on the conductor could be reduced in two ways: by using a cache shared across all workers running on the same node, and by recording when a retrieval for some URL is already in progress. If the same data is requested again before the initial fetch has completed, the subsequent request can simply await its arrival in the shared cache rather than independently re-fetching it in parallel.
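
The second idea (waiting on a fetch that is already in flight) can be sketched roughly as follows. This is a minimal, in-process illustration only and not the Nova implementation: the names CoalescingCache, _Pending and fetch, and the 15-second ttl, are hypothetical. It coalesces requests between threads of a single worker; sharing across worker processes would additionally need an external store.

```python
import threading
import time


class _Pending:
    """Bookkeeping for one fetch that is currently in flight."""

    def __init__(self):
        self.done = threading.Event()
        self.value = None
        self.error = None


class CoalescingCache:
    """In-process cache that lets concurrent requests share one fetch."""

    def __init__(self, fetch, ttl=15.0):
        self._fetch = fetch      # hypothetical callable: key -> value
        self._ttl = ttl          # assumed freshness window, in seconds
        self._lock = threading.Lock()
        self._values = {}        # key -> (expiry, value)
        self._in_flight = {}     # key -> _Pending

    def get(self, key):
        with self._lock:
            entry = self._values.get(key)
            if entry is not None and entry[0] > time.monotonic():
                return entry[1]  # fresh cached value, no fetch needed
            pending = self._in_flight.get(key)
            leader = pending is None
            if leader:
                # First requester: record that a fetch is in progress.
                pending = _Pending()
                self._in_flight[key] = pending

        if not leader:
            # Someone else is already fetching; wait for their result.
            pending.done.wait()
            if pending.error is not None:
                raise pending.error
            return pending.value

        try:
            pending.value = self._fetch(key)
            with self._lock:
                expiry = time.monotonic() + self._ttl
                self._values[key] = (expiry, pending.value)
        except Exception as exc:
            pending.error = exc  # waiters re-raise the same failure
            raise
        finally:
            with self._lock:
                del self._in_flight[key]
            pending.done.set()
        return pending.value
```

With this shape, several threads calling get() for the same key while the first fetch is still running trigger only one call to fetch; the others block on the event and reuse the result.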

Comment 2 Sven Anderson 2016-02-25 16:19:54 UTC
The metadata caching is flawed in general, but there is a symptomatic fix for this related performance issue:

https://bugzilla.redhat.com/show_bug.cgi?id=1244852

upstream, which pre-fetches some of the data before caching, here:

https://github.com/openstack/nova/commit/3a761270581d1ac61a3b4669c130d211f1ad5a17#diff-969229657f01b56c336e01497df732d7R1226

and here:

https://github.com/openstack/nova/commit/cc41015d463e11ac11bbaaac0b5c441329dc5f0b#diff-567f52edc17aff6c473d69c341a4cb0cR513


Unfortunately the second change introduced the pre-fetch as a side-effect, so it cannot be backported as is.

Comment 3 Sven Anderson 2016-02-29 11:23:37 UTC
I have submitted two upstream changes that are related to this.

Disabling memcached for metadata caching:
https://review.openstack.org/#/c/285530

No parallel queries of the same data (this addresses point 2 in the description):
https://review.openstack.org/#/c/285562

Comment 5 Sven Anderson 2016-03-04 11:27:05 UTC
Clarification: the upstream changes are stop-gap fixes (and might not get accepted). The main problem is still unaddressed: caching the DB query results instead of a whole Python object, so that the cache can be shared between different processes.
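
To illustrate the distinction, here is a rough sketch of caching raw query results in a serializable form and rebuilding the object in each process. It is an assumption-laden illustration, not Nova code: SharedStore is a stand-in for an external cache that only holds bytes, and MetadataView, load_metadata and query are hypothetical names.

```python
import json


class SharedStore:
    """Stand-in for an external cache shared by processes; bytes only."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        if not isinstance(value, bytes):
            raise TypeError("shared store accepts bytes only")
        self._data[key] = value


class MetadataView:
    """Hypothetical per-process object built from raw query results."""

    def __init__(self, raw):
        self.hostname = raw["hostname"]
        self.instance_id = raw["instance_id"]


def load_metadata(store, address, query):
    """Return a MetadataView, querying only when the store has no entry."""
    cached = store.get(address)
    if cached is None:
        raw = query(address)  # hypothetical call returning a plain dict
        store.set(address, json.dumps(raw).encode("utf-8"))
    else:
        raw = json.loads(cached.decode("utf-8"))
    # The object itself is never cached; each caller rebuilds it cheaply.
    return MetadataView(raw)
```

Because only plain, JSON-serializable data goes into the store, any process that can reach the store can reuse the entry without having to share a live Python object.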

Comment 6 Stephen Gordon 2016-06-16 17:27:10 UTC
Sven, can you help me understand what, if anything, remains that is backportable here versus needing to be addressed in a later release (Newton/Ocata)?

Comment 7 Sven Anderson 2017-01-03 16:33:11 UTC
Stephen, Diana was working on that after me. I have no idea about the current status.

Comment 8 Scott Lewis 2017-06-30 13:31:03 UTC
Red Hat OpenStack Platform version 5 is now End-of-Life, and as such will not have further updates. See https://access.redhat.com/support/policy/updates/openstack/platform/ for full support lifecycle details.

