Bug 1559947 - OpenShift on OpenStack - Nova has a max_limit of 1000
Summary: OpenShift on OpenStack - Nova has a max_limit of 1000
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: python-shade
Version: unspecified
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: z3
Target Release: 13.0 (Queens)
Assignee: Antoni Segura Puimedon
QA Contact: Udi Shkalim
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-03-23 15:23 UTC by Matt Bruzek
Modified: 2018-11-14 01:15 UTC
CC List: 9 users

Fixed In Version: python-shade-1.27.1-2.el7ost
Doc Type: Bug Fix
Doc Text:
When trying to get server lists, OpenStack Nova's API paging feature would return at most N elements, with N being the page size configured in Nova. This was not apparent to shade library users. Shade now deals with the paging on behalf of the users, so the users will get all the servers that they request.
Clone Of:
Environment:
Last Closed: 2018-11-14 01:14:59 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
OpenStack gerrit 555876 0 None MERGED list_servers pagination support 2020-06-04 12:41:30 UTC
OpenStack gerrit 600045 0 None MERGED list_servers pagination support 2020-06-04 12:41:30 UTC
Red Hat Product Errata RHBA-2018:3611 0 None None None 2018-11-14 01:15:45 UTC

Description Matt Bruzek 2018-03-23 15:23:24 UTC
Description of problem:

OpenStack Nova has a default max_limit of 1000 results returned from the api service. The OpenShift on OpenStack implementation uses a dynamic inventory openshift-ansible/playbooks/openstack/inventory.py which uses Python to generate a list of all the nodes in the OpenShift cluster.
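For reference, a minimal sketch (not the actual inventory.py code) of how such a dynamic inventory obtains its host list through shade; a truncated response from Nova silently truncates the Ansible inventory as well:

# Illustrative sketch only, not the real inventory.py
import shade

cloud = shade.openstack_cloud()      # credentials from clouds.yaml or OS_* env vars
servers = cloud.list_servers()       # before the fix, capped at Nova's max_limit (1000)
hosts = [server.name for server in servers]
print(hosts)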

When attempting to scale past 1000 nodes we found that inventory.py returned only 1000 hosts, and these were the most recently created nodes, so the output did not include the master, infra, or CNS nodes.

Returning only 1000 records could cause problems for a large cluster during scale-up or other Ansible operations that use the dynamic inventory file.

There are two workarounds for this issue.

1) From the command line you can use `openstack server list --limit -1`. This removes the limit and shows all the nodes in the OpenStack cluster. However, inventory.py does not use the command line and still returns only 1000 records.
2) On all controller systems you can edit /etc/nova/nova.conf, uncomment "max_limit", and set it to a value greater than 1000, then restart the Nova services on all controllers (we have 3 controllers). This also works around the 1000-record limit, but asking customers to edit a production OpenStack cluster could be an issue, and restarting services could cause downtime.

Version-Release number of selected component (if applicable): 
We noticed this in OpenShift 3.9 while using the OpenShift on OpenStack Ansible installer, but this is an OpenStack limitation.


How reproducible:
100% of the time once our OpenShift on OpenStack cluster has more than 1000 nodes.

Steps to Reproduce:
1. Deploy OpenShift on OpenStack following the method outlined here: https://github.com/openshift/openshift-ansible/tree/master/playbooks/openstack
2. Scale up the cluster to past 1000 nodes.
3. Notice that inventory.py only returns 1000 nodes.

Actual results:
The inventory was only returning 1000 hosts at a time.


Expected results:
The inventory should return all hosts in the cluster. If there is an API limit, the inventory should keep calling the API until all nodes have been retrieved and then return the full cluster listing.
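As an illustration of that expected behaviour, here is a hedged sketch of marker-based paging against the Nova API using python-novaclient (the auth values are placeholders, and the page size simply mirrors Nova's default max_limit; the actual fix may be implemented differently):

# Illustrative sketch of marker-based paging, not the shipped fix
from keystoneauth1 import loading, session
from novaclient import client

loader = loading.get_plugin_loader('password')
auth = loader.load_from_options(auth_url='http://example:5000/v3',   # placeholder credentials
                                username='admin', password='secret',
                                project_name='admin',
                                user_domain_name='Default',
                                project_domain_name='Default')
nova = client.Client('2.1', session=session.Session(auth=auth))

page_size = 1000                      # Nova's default max_limit
servers, marker = [], None
while True:
    page = nova.servers.list(limit=page_size, marker=marker)
    servers.extend(page)
    if len(page) < page_size:
        break                         # last (possibly partial) page reached
    marker = page[-1].id              # continue after the last server seen
print(len(servers))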


Additional info:

The nova limit is described here: https://docs.openstack.org/ocata/config-reference/compute/api.html

This appears to be a limit in all OpenStack versions, not just Ocata.

Let me know what other information you may need.

Comment 1 Tomas Sedovic 2018-03-23 15:37:01 UTC
We'll have to fix the inventory script to handle pagination.

We could document and advise the Nova configuration change, but I'd prefer to do that only as a last resort. This looks like something we should be able to fix on our side.

Comment 4 Tomas Sedovic 2018-07-17 12:06:20 UTC
This is addressed by the following patch in python-shade:

https://review.openstack.org/#/c/555876/

It should be fixed in version 1.28.0.
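With the fixed package, no call-site change should be needed; a plain list_servers() call is expected to return the complete list even when it exceeds Nova's max_limit (a minimal usage sketch, demonstrated for real in comment 16 below):

# Usage sketch: paging is handled inside shade in the fixed versions
import shade

cloud = shade.openstack_cloud()
print(len(cloud.list_servers()))     # all servers, not just the first page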

Comment 5 Johnny Liu 2018-08-31 08:49:47 UTC
Based on comment 4, this is an OpenStack-side bug fix, so I am moving this bug to the OpenStack component.

Comment 16 Udi Shkalim 2018-11-12 11:12:13 UTC
Verified on python2-shade-1.27.1-2.el7ost
Set max_limit to 2 in nova.conf.

(shiftstack) [cloud-user@ansible-host-0 ~]$ openstack server list
+--------------------------------------+----------------------------------+--------+----------------------------------------------------------------------+----------+---------+
| ID                                   | Name                             | Status | Networks                                                             | Image    | Flavor  |
+--------------------------------------+----------------------------------+--------+----------------------------------------------------------------------+----------+---------+
| 68345b69-09e6-4839-aaf6-1d096deafe84 | master-0.openshift.example.com   | ACTIVE | openshift-ansible-openshift.example.com-net=192.168.99.6, 10.0.0.223 | rhel-7.6 |         |
| 735bad92-7f66-4791-9a2c-7bf42433b289 | app-node-0.openshift.example.com | ACTIVE | openshift-ansible-openshift.example.com-net=192.168.99.9, 10.0.0.219 | rhel-7.6 | m1.node |
+--------------------------------------+----------------------------------+--------+----------------------------------------------------------------------+----------+---------+




With shade, more servers than the max limit (2) are returned:

(shiftstack) [cloud-user@ansible-host-0 ~]$ python -c 'import shade; cloud = shade.openstack_cloud(); print [server.name for server in cloud.list_servers()]'
[u'master-0.openshift.example.com', u'app-node-0.openshift.example.com', u'infra-node-0.openshift.example.com', u'app-node-1.openshift.example.com', u'ansible_host-0', u'openshift_dns-0']




Same for inventory.py:

(shiftstack) [cloud-user@ansible-host-0 ~]$ /usr/share/ansible/openshift-ansible/playbooks/openstack/inventory.py --list
{
    "OSEv3": {
        "hosts": [
            "app-node-0.openshift.example.com",
            "app-node-1.openshift.example.com",
            "master-0.openshift.example.com",
            "infra-node-0.openshift.example.com"
        ],
        "vars": {}
    },
    "_meta": {
        "hostvars": {
            "app-node-0.openshift.example.com": {
                "ansible_host": "10.0.0.219",
                "openshift_ip": "192.168.99.9",
                "openshift_node_group_name": "node-config-compute",
                "openshift_public_hostname": "app-node-0.openshift.example.com",
                "openshift_public_ip": "10.0.0.219",
                "private_v4": "192.168.99.9",
                "public_v4": "10.0.0.219"
            },
            "app-node-1.openshift.example.com": {
                "ansible_host": "10.0.0.210",
                "openshift_ip": "192.168.99.4",
                "openshift_node_group_name": "node-config-compute",
                "openshift_public_hostname": "app-node-1.openshift.example.com",
                "openshift_public_ip": "10.0.0.210",
                "private_v4": "192.168.99.4",
                "public_v4": "10.0.0.210"
            },
            "infra-node-0.openshift.example.com": {
                "ansible_host": "10.0.0.224",
                "openshift_ip": "192.168.99.15",
                "openshift_node_group_name": "node-config-infra",
                "openshift_public_hostname": "infra-node-0.openshift.example.com",
                "openshift_public_ip": "10.0.0.224",
                "private_v4": "192.168.99.15",
                "public_v4": "10.0.0.224"
            },
            "master-0.openshift.example.com": {
                "ansible_host": "10.0.0.223",
                "openshift_ip": "192.168.99.6",
                "openshift_node_group_name": "node-config-master",
                "openshift_public_hostname": "master-0.openshift.example.com",
                "openshift_public_ip": "10.0.0.223",
                "private_v4": "192.168.99.6",
                "public_v4": "10.0.0.223"
            }
        }
    },
    "app": {
        "hosts": [
            "app-node-0.openshift.example.com",
            "app-node-1.openshift.example.com"
        ]
    },
    "cluster_hosts": {
        "hosts": [
            "master-0.openshift.example.com",
            "app-node-0.openshift.example.com",
            "infra-node-0.openshift.example.com",
            "app-node-1.openshift.example.com"
        ]
    },
    "dns": {
        "hosts": []
    },
    "etcd": {
        "hosts": [
            "master-0.openshift.example.com"
        ]
    },
    "glusterfs": {
        "hosts": []
    },
    "infra.openshift.example.com": {
        "hosts": [
            "infra-node-0.openshift.example.com"
        ]
    },
    "infra_hosts": {
        "hosts": [
            "infra-node-0.openshift.example.com"
        ]
    },
    "lb": {
        "hosts": []
    },
    "localhost": {
        "ansible_connection": "local"
    },
    "masters": {
        "hosts": [
            "master-0.openshift.example.com"
        ]
    },
    "masters.openshift.example.com": {
        "hosts": [
            "master-0.openshift.example.com"
        ]
    },
    "nodes": {
        "hosts": [
            "app-node-0.openshift.example.com",
            "app-node-1.openshift.example.com",
            "master-0.openshift.example.com",
            "infra-node-0.openshift.example.com"
        ]
    },
    "nodes.openshift.example.com": {
        "hosts": [
            "app-node-0.openshift.example.com",
            "app-node-1.openshift.example.com"
        ]
    }
}

Comment 18 errata-xmlrpc 2018-11-14 01:14:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3611

