Bug 1724167 - CLI too slow when listing servers in an environment with 1000's of images
Summary: CLI too slow when listing servers in an environment with 1000's of images
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: python-openstackclient
Version: 13.0 (Queens)
Hardware: x86_64
OS: Linux
Priority: low
Severity: low
Target Milestone: Upstream M2
Target Release: 16.0 (Train on RHEL 8.1)
Assignee: melanie witt
QA Contact: Archit Modi
URL:
Whiteboard:
Depends On:
Blocks: 1732072 1772994 1772995
Reported: 2019-06-26 11:35 UTC by kforde
Modified: 2023-10-06 18:28 UTC

Fixed In Version: python-openstackclient-4.0.0-0.20190924092455.aa64eb6.el8ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 1732072 1772994 1772995 (view as bug list)
Environment:
Last Closed: 2020-02-06 14:40:58 UTC
Target Upstream Version:
Embargoed:


Attachments
Output from openstack server list --name-lookup-one-by-one --all --debug (2.05 MB, text/plain)
2019-06-28 11:24 UTC, Maurizio Porrato
Output from pip list (2.19 KB, text/plain)
2019-06-28 11:26 UTC, Maurizio Porrato


Links
System | ID | Private | Priority | Status | Summary | Last Updated
OpenStack Storyboard | 2006063 | 0 | None | None | None | 2019-07-22 15:32:45 UTC
OpenStack gerrit | 668255 | 0 | 'None' | MERGED | Fix BFV server list handling with --name-lookup-one-by-one | 2020-09-01 22:08:30 UTC
Red Hat Issue Tracker | OSP-29363 | 0 | None | None | None | 2023-10-06 18:28:04 UTC
Red Hat Product Errata | RHEA-2020:0283 | 0 | None | None | None | 2020-02-06 14:41:43 UTC

Description kforde 2019-06-26 11:35:26 UTC

Comment 1 kforde 2019-06-26 11:52:23 UTC
(In reply to kforde from comment #0)
Description of problem:

In our environment, which has more than 3,000 images, it can take up to 60 seconds to list the servers for a particular project.
We tracked this down to the way the OpenStack Client looks up image names and flavor names.
Essentially, it queries the whole image inventory in batches of 20.


Version-Release number of selected component (if applicable):

python2-openstackclient-3.14.3-2.el7ost.noarch 

How reproducible:


Steps to Reproduce:
1. time openstack server list --project XXXX
                                                                                                                                                                                                       


Actual results:

The command takes a long time to return the information, even for a single small project.

real    0m50.869s

Expected results:

The command should take between 2 and 10 seconds to return the information.


Additional info:

We worked around this by passing the '--no-name-lookup' flag to the OpenStack Client.
This returns only image and flavor IDs and runs in about 2 seconds (real    0m2.365s).

Additionally, the Nova client returns the server list in about the same time as the OpenStack Client with the '--no-name-lookup' flag.

There is an option called '--name-lookup-one-by-one' available in version 3.19.0 of the OpenStack Client.
It returns the server list and image names much more quickly (around 10 seconds).
However, it fails with the exception "'unicode' object has no attribute 'get'" when combined with other options, for example '--project' or '--all'.

I would suggest that the default should be changed for 'server list' operations to either:

1) not include the names in the output in favor of speed or,
2) backport (and fix) '--name-lookup-one-by-one' to Queens.
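
For a rough sense of why the two lookup strategies differ so much, here is a back-of-the-envelope sketch. It assumes one HTTP request per page of 20 images and one request per distinct image referenced by the listed servers; the function names and the sample project are hypothetical and are not taken from the client code, and real timings will depend on the deployment.

```python
import math


def paged_listing_requests(total_images, page_size=20):
    """Requests needed to walk the whole image list page by page (assumed model)."""
    return math.ceil(total_images / page_size)


def one_by_one_requests(server_image_ids):
    """Requests needed when each distinct, non-empty image ID is fetched once."""
    return len({image_id for image_id in server_image_ids if image_id})


# Hypothetical small project: 5 servers that share 2 images,
# in a cloud that holds 3000 images in total.
project_image_ids = ["img-a", "img-a", "img-b", "img-b", "img-a"]

print(paged_listing_requests(3000))            # full listing in pages of 20
print(one_by_one_requests(project_image_ids))  # per-image lookups
```

Under these assumptions the full listing costs 150 requests regardless of project size, while the one-by-one lookup costs only as many requests as the project has distinct images.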

Comment 2 Julie Pichon 2019-06-26 13:02:53 UTC
Moving to DFG:Compute as the --name-lookup-one-by-one flag is specific to the "server list" command.

Comment 3 melanie witt 2019-06-28 00:25:46 UTC
(In reply to kforde from comment #1)
> (In reply to kforde from comment #0)
> Additional info:
> 
> We worked around this by supplying the '--no-name-lookup' flag to the
> OpenStack Client. 
> This will return only image and flavor IDs and runs in about 2 seconds (real
> 0m2.365s)
> 
> Additionally running with the Nova client returns servers in a similar time
> to the OpenStack Client with the '--no-name-lookup' flag.

I checked the novaclient code and unsurprisingly, its 'nova list' command does not do any image or flavor name lookup. So, the similarity with '--no-name-lookup' in openstackclient makes sense.

> There is an option called '--name-lookup-one-by-one' available in version
> 3.19.0 of the OpenStack Client. 
> It returns the server list and image names much quicker (around 10 seconds),
> however it does not work (throws exception: 'unicode' object has no
> attribute 'get') with any other options, for example, '--project', '--all'.

I just tried some examples locally: 'openstack server list --name-lookup-one-by-one --all' and 'openstack server list --name-lookup-one-by-one --project admin' with version 3.18.0 and am not able to reproduce the bug (command returns results fine). Same for version 3.19.0.

> I would suggest that the default should be changed for 'server list'
> operations to either:
> 
> 1) not include the names in the output in favor of speed or,
> 2) backport (and fix) '--name-lookup-one-by-one' to Queens.

I expect option 2 will be the more successful approach for the near term, but so far I'm unable to reproduce the bug.

Comment 4 Maurizio Porrato 2019-06-28 11:24:17 UTC
Created attachment 1585569 [details]
Output from openstack server list --name-lookup-one-by-one --all --debug

Comment 5 Maurizio Porrato 2019-06-28 11:26:42 UTC
Created attachment 1585582 [details]
Output from pip list

Comment 6 melanie witt 2019-06-28 15:13:07 UTC
Thank you for attaching the debug output. The full traceback gives us a nice hint about what's going on:

Traceback (most recent call last):
  File "/home/mporrato/.venvs/os/lib64/python3.6/site-packages/cliff/app.py", line 401, in run_subcommand
    result = cmd.run(parsed_args)
  File "/home/mporrato/.venvs/os/lib64/python3.6/site-packages/osc_lib/command/command.py", line 41, in run
    return super(Command, self).run(parsed_args)
  File "/home/mporrato/.venvs/os/lib64/python3.6/site-packages/cliff/display.py", line 116, in run
    column_names, data = self.take_action(parsed_args)
  File "/home/mporrato/.venvs/os/lib64/python3.6/site-packages/openstackclient/compute/v2/server.py", line 1325, in take_action
    (s.image.get('id') for s in data))):
  File "/home/mporrato/.venvs/os/lib64/python3.6/site-packages/openstackclient/compute/v2/server.py", line 1325, in <genexpr>
    (s.image.get('id') for s in data))):
AttributeError: 'str' object has no attribute 'get'

This shows us that the 'image' attribute on the server object is a string instead of an image object when the bug occurs. That happens when a server has been booted from a volume and thus has no image.

I tried booting a server from a volume and I'm now able to reproduce the bug. I'll begin work on a fix for upstream.
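
As an illustration only (this is not the upstream patch), the failure and one possible guard can be sketched as follows. The server objects are stand-ins built with SimpleNamespace, and the helper name image_ids_for_lookup is hypothetical; the only things taken from the traceback are the `s.image.get('id')` expression and the error message.

```python
from types import SimpleNamespace


def image_ids_for_lookup(servers):
    """Return distinct image IDs, skipping servers whose image is not a dict."""
    ids = set()
    for server in servers:
        image = getattr(server, "image", None)
        # Boot-from-volume servers report image as "" rather than a dict.
        if isinstance(image, dict) and image.get("id"):
            ids.add(image["id"])
    return ids


# Stand-in server objects: one booted from an image, one booted from a volume.
servers = [
    SimpleNamespace(name="from-image", image={"id": "img-1"}),
    SimpleNamespace(name="bfv", image=""),
]

# The unguarded expression from the traceback fails on the second server.
error = None
try:
    [s.image.get("id") for s in servers]
except AttributeError as exc:
    error = str(exc)
    print(error)

# The guarded helper simply leaves the boot-from-volume server out.
print(image_ids_for_lookup(servers))
```

Checking the type before calling .get() keeps the lookup working for mixed lists of image-backed and volume-backed servers; how the image column should be displayed for the latter is a separate decision.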

Comment 8 melanie witt 2019-06-28 15:52:35 UTC
Here's an example of the response body for the boot-from-volume server:

RESP BODY: {"servers": [{"OS-EXT-STS:task_state": null, "addresses": {}, "links": [{"href": "https://127.0.0.1/compute/v2.1/servers/a7af243e-405b-435f-ae29-5b92270cefef", "rel": "self"}, {"href": "https://127.0.0.1/compute/servers/a7af243e-405b-435f-ae29-5b92270cefef", "rel": "bookmark"}], "image": "", <=== image is the empty string
"OS-EXT-STS:vm_state": "error", "OS-EXT-SRV-ATTR:instance_name": "instance-00000004", "OS-SRV-USG:launched_at": null, "flavor": {"id": "42", "links": [{"href": "https://127.0.0.1/compute/flavors/42", "rel": "bookmark"}]}, "id": "a7af243e-405b-435f-ae29-5b92270cefef", "user_id": "d36195e9a8e54eabbb4502eecf6f2df8", "OS-DCF:diskConfig": "MANUAL", "accessIPv4": "", "accessIPv6": "", "OS-EXT-STS:power_state": 0, "OS-EXT-AZ:availability_zone": "", "config_drive": "", "status": "ERROR", "updated": "2019-06-28T14:59:34Z", "hostId": "", "OS-EXT-SRV-ATTR:host": null, "OS-SRV-USG:terminated_at": null, "key_name": null, "OS-EXT-SRV-ATTR:hypervisor_hostname": null, "name": "bfv", "created": "2019-06-28T14:59:24Z", "tenant_id": "e4dd897770e14a3ba4ccf545d21832f9", "os-extended-volumes:volumes_attached": [], "metadata": {}},

Comment 13 errata-xmlrpc 2020-02-06 14:40:58 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:0283

