Bug 1573347 - get-occ-config.sh counts "wrong" cause of missing tonumber screwing up the deployment
Keywords:
Status: CLOSED EOL
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 11.0 (Ocata)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: urgent
Target Milestone: async
Target Release: 11.0 (Ocata)
Assignee: James Slagle
QA Contact: Gurenko Alex
URL:
Whiteboard:
Depends On: 1572677
Blocks:
Reported: 2018-05-01 00:07 UTC by James Slagle
Modified: 2022-03-13 14:56 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1572677
Environment:
Last Closed: 2018-06-22 12:40:07 UTC
Target Upstream Version:
Embargoed:



Description James Slagle 2018-05-01 00:07:53 UTC
+++ This bug was initially created as a clone of Bug #1572677 +++

Description of problem:
When using get-occ-config.sh on a deployment with more than 10 nodes, we noticed that it mixed up the metadata URLs. As a result, a few nodes were configured correctly, while others had the wrong IPs deployed in their configs (split-stack); for example, my_ip was wrong.


Version-Release number of selected component (if applicable):
7.0.3-22

How reproducible:
Deploy a split-stack environment with more than 10 nodes.


Steps to Reproduce:
1. Set up rhosp12
2. Deploy with more than 10 compute nodes
3. Compare the deployed config with the expected config and see the errors


Actual results:
The deployment sometimes got stuck because one node was missing (it didn't get
its config because of wrong metadata), or the deployment went fine but, instead of 11 different nodes, one node showed up twice.

Expected results:
11 different nodes deployed and configured correctly

Additional info:
I checked how and why this happens and I think I found the issue. In line
67 there is a sort_by on resource_name. If you check what it generates, you'll
notice that the sorting is *not* done as expected. This is because it sorts the
names as strings instead of as integers. The easy fix is to add a "tonumber" to
the sort_by: sort_by(.resource_name | tonumber)
That way the 11th node (its resource_name in the resource list is 10) ends up
at the very end of the list; without it, it shows up as the 3rd node (0, 1, 10, 2).

Demo:
(undercloud) [stack@director ~]$ openstack stack resource list 55a9e155-edab-4734-a655-523339d0c8ac -c resource_name -c physical_resource_id
+---------------+--------------------------------------+
| resource_name | physical_resource_id                 |
+---------------+--------------------------------------+
| 10            | e4ee76de-9794-4bf8-81d5-3fd5c54fc33e |
| 1             | 46f3d8f0-8d5d-47a7-ad59-a2487ce55fb5 |
| 0             | ddf8db32-4bf3-4635-9020-28b07c21d66a |
| 3             | 48745fa1-f5e2-41ca-8a3e-e9da4cf2bf4e |
| 2             | f4a586f8-e93a-49aa-bf88-97436da05cb6 |
| 5             | 137a14db-a641-4782-b5c0-5f554cf76f46 |
| 4             | afe87089-683e-4c8d-bf43-92a11c0b1f96 |
| 7             | bc84c066-827a-4bfe-b830-3532ffd24875 |
| 6             | f80f41b9-f89b-443e-b0bf-ee99bea43654 |
| 9             | c0df0d59-9c44-4712-abd2-54f05cdd59c2 |
| 8             | da466290-adac-4c53-b930-d42d634c8436 |
+---------------+--------------------------------------+
(undercloud) [stack@director ~]$ openstack stack resource list 55a9e155-edab-4734-a655-523339d0c8ac -c resource_name -c physical_resource_id -f json | jq -r "sort_by(.resource_name) | .[] | .physical_resource_id"
ddf8db32-4bf3-4635-9020-28b07c21d66a
46f3d8f0-8d5d-47a7-ad59-a2487ce55fb5
e4ee76de-9794-4bf8-81d5-3fd5c54fc33e
f4a586f8-e93a-49aa-bf88-97436da05cb6
48745fa1-f5e2-41ca-8a3e-e9da4cf2bf4e
afe87089-683e-4c8d-bf43-92a11c0b1f96
137a14db-a641-4782-b5c0-5f554cf76f46
f80f41b9-f89b-443e-b0bf-ee99bea43654
bc84c066-827a-4bfe-b830-3532ffd24875
da466290-adac-4c53-b930-d42d634c8436
c0df0d59-9c44-4712-abd2-54f05cdd59c2
(undercloud) [stack@director ~]$ openstack stack resource list 55a9e155-edab-4734-a655-523339d0c8ac -c resource_name -c physical_resource_id -f json | jq -r "sort_by(.resource_name | tonumber) | .[] | .physical_resource_id"
ddf8db32-4bf3-4635-9020-28b07c21d66a
46f3d8f0-8d5d-47a7-ad59-a2487ce55fb5
f4a586f8-e93a-49aa-bf88-97436da05cb6
48745fa1-f5e2-41ca-8a3e-e9da4cf2bf4e
afe87089-683e-4c8d-bf43-92a11c0b1f96
137a14db-a641-4782-b5c0-5f554cf76f46
f80f41b9-f89b-443e-b0bf-ee99bea43654
bc84c066-827a-4bfe-b830-3532ffd24875
da466290-adac-4c53-b930-d42d634c8436
c0df0d59-9c44-4712-abd2-54f05cdd59c2
e4ee76de-9794-4bf8-81d5-3fd5c54fc33e

So you can clearly see the difference with and without "tonumber".
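
The same effect can be reproduced without a Heat stack. The following is only a minimal sketch, assuming jq is installed: the JSON array and the id-N values are made-up stand-ins for the `openstack stack resource list -f json` output, and the variable names are hypothetical. It is not the code from get-occ-config.sh, just the two sort_by variants from the demo above applied to inline data.

```shell
#!/bin/sh
# Made-up sample data: resource_name values are JSON strings, as in the
# demo output above; the physical_resource_id values are placeholders.
resources='[
  {"resource_name": "10", "physical_resource_id": "id-10"},
  {"resource_name": "1",  "physical_resource_id": "id-1"},
  {"resource_name": "0",  "physical_resource_id": "id-0"},
  {"resource_name": "2",  "physical_resource_id": "id-2"}
]'

# String sort: "10" compares character by character and lands before "2".
string_order=$(printf '%s' "$resources" |
  jq -r 'sort_by(.resource_name) | map(.physical_resource_id) | join(" ")')

# Numeric sort: tonumber converts each name first, so 10 comes last.
numeric_order=$(printf '%s' "$resources" |
  jq -r 'sort_by(.resource_name | tonumber) | map(.physical_resource_id) | join(" ")')

echo "string sort:  $string_order"
echo "numeric sort: $numeric_order"
```

With fewer than 11 nodes the names are all single digits, so both variants give the same order, which would explain why the problem only shows up on larger deployments.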

Comment 1 Scott Lewis 2018-06-22 12:40:07 UTC
OSP11 is now retired, see details at https://access.redhat.com/errata/product/191/ver=11/rhel---7/x86_64/RHBA-2018:1828

