Bug 1573347

Summary: get-occ-config.sh counts "wrong" cause of missing tonumber screwing up the deployment
Product: Red Hat OpenStack
Reporter: James Slagle <jslagle>
Component: openstack-tripleo-heat-templates
Assignee: James Slagle <jslagle>
Status: CLOSED EOL
QA Contact: Gurenko Alex <agurenko>
Severity: urgent
Priority: high
Version: 11.0 (Ocata)
CC: agurenko, aschultz, bhaubeck, mburns, ohochman, slinaber, svmichel
Target Milestone: async
Keywords: Triaged, ZStream
Target Release: 11.0 (Ocata)
Hardware: Unspecified
OS: Unspecified
Clone Of: 1572677
Last Closed: 2018-06-22 12:40:07 UTC
Type: Bug
Bug Depends On: 1572677

Description James Slagle 2018-05-01 00:07:53 UTC
+++ This bug was initially created as a clone of Bug #1572677 +++

Description of problem:
When using get-occ-config.sh on a split-stack deployment with more than 10 nodes, we noticed that it mixed up the metadata URLs. A few nodes were configured correctly, but others had the wrong IPs deployed in their configs; for example, my_ip was wrong.


Version-Release number of selected component (if applicable):
7.0.3-22

How reproducible:
Deploy a split-stack environment with more than 10 nodes.


Steps to Reproduce:
1. Set up RHOSP 12.
2. Deploy with more than 10 compute nodes.
3. Compare the deployed config with the expected config and note the errors.


Actual results:
The deployment sometimes got stuck because one node was missing (it did not get
its config because of wrong metadata), or the deployment completed but, instead of 11 different nodes, one node appeared twice.

Expected results:
11 different nodes deployed and configured correctly

Additional info:
I checked how and why this would happen and I think I found the issue. On line
67 there is a sort_by on resource_name. If you check what it generates, you'll
notice that the sorting is *not* done as expected, because it sorts the names
as strings rather than as integers. The easy fix is to add "tonumber" to
the sort_by: sort_by(.resource_name | tonumber)
That way node 11 (resource name 10 in the resource list) ends up at the very
end of the list; otherwise it shows up as the 3rd node (0, 1, 10, 2).
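
The ordering problem can be reproduced without an OpenStack environment. The following is a minimal Python sketch, used here only as a stand-in for the jq filters: sorting the names as strings corresponds to sort_by(.resource_name), and sorting with an integer key corresponds to sort_by(.resource_name | tonumber). The variable names are illustrative and do not come from get-occ-config.sh.

```python
# Resource names as they appear in the resource list: strings, not integers.
resource_names = [str(i) for i in range(11)]  # "0" through "10"

# Stand-in for sort_by(.resource_name): plain string comparison,
# so "10" sorts before "2" because "1" < "2".
as_strings = sorted(resource_names)

# Stand-in for sort_by(.resource_name | tonumber): compare numeric values.
as_numbers = sorted(resource_names, key=int)

print(as_strings)
print(as_numbers)
```

With 10 or fewer nodes the names are all single digits, so both orderings agree and the problem stays hidden.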

Demo:
(undercloud) [stack@director ~]$ openstack stack resource list 55a9e155-edab-4734-a655-523339d0c8ac -c resource_name -c physical_resource_id
+---------------+--------------------------------------+
| resource_name | physical_resource_id                 |
+---------------+--------------------------------------+
| 10            | e4ee76de-9794-4bf8-81d5-3fd5c54fc33e |
| 1             | 46f3d8f0-8d5d-47a7-ad59-a2487ce55fb5 |
| 0             | ddf8db32-4bf3-4635-9020-28b07c21d66a |
| 3             | 48745fa1-f5e2-41ca-8a3e-e9da4cf2bf4e |
| 2             | f4a586f8-e93a-49aa-bf88-97436da05cb6 |
| 5             | 137a14db-a641-4782-b5c0-5f554cf76f46 |
| 4             | afe87089-683e-4c8d-bf43-92a11c0b1f96 |
| 7             | bc84c066-827a-4bfe-b830-3532ffd24875 |
| 6             | f80f41b9-f89b-443e-b0bf-ee99bea43654 |
| 9             | c0df0d59-9c44-4712-abd2-54f05cdd59c2 |
| 8             | da466290-adac-4c53-b930-d42d634c8436 |
+---------------+--------------------------------------+
(undercloud) [stack@director ~]$ openstack stack resource list 55a9e155-edab-4734-a655-523339d0c8ac -c resource_name -c physical_resource_id -f json | jq -r "sort_by(.resource_name) | .[] | .physical_resource_id"
ddf8db32-4bf3-4635-9020-28b07c21d66a
46f3d8f0-8d5d-47a7-ad59-a2487ce55fb5
e4ee76de-9794-4bf8-81d5-3fd5c54fc33e
f4a586f8-e93a-49aa-bf88-97436da05cb6
48745fa1-f5e2-41ca-8a3e-e9da4cf2bf4e
afe87089-683e-4c8d-bf43-92a11c0b1f96
137a14db-a641-4782-b5c0-5f554cf76f46
f80f41b9-f89b-443e-b0bf-ee99bea43654
bc84c066-827a-4bfe-b830-3532ffd24875
da466290-adac-4c53-b930-d42d634c8436
c0df0d59-9c44-4712-abd2-54f05cdd59c2
(undercloud) [stack@director ~]$ openstack stack resource list 55a9e155-edab-4734-a655-523339d0c8ac -c resource_name -c physical_resource_id -f json | jq -r "sort_by(.resource_name | tonumber) | .[] | .physical_resource_id"
ddf8db32-4bf3-4635-9020-28b07c21d66a
46f3d8f0-8d5d-47a7-ad59-a2487ce55fb5
f4a586f8-e93a-49aa-bf88-97436da05cb6
48745fa1-f5e2-41ca-8a3e-e9da4cf2bf4e
afe87089-683e-4c8d-bf43-92a11c0b1f96
137a14db-a641-4782-b5c0-5f554cf76f46
f80f41b9-f89b-443e-b0bf-ee99bea43654
bc84c066-827a-4bfe-b830-3532ffd24875
da466290-adac-4c53-b930-d42d634c8436
c0df0d59-9c44-4712-abd2-54f05cdd59c2
e4ee76de-9794-4bf8-81d5-3fd5c54fc33e

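So you can clearly see the difference with and without "tonumber": without it, e4ee76de-9794-4bf8-81d5-3fd5c54fc33e (resource_name 10) is third in the list; with it, it is last.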
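
To see how many nodes are affected, here is a rough Python sketch that uses the resource names and IDs from the table in the demo. It assumes that whatever consumes the list takes the i-th ID for the i-th node; that positional pairing is an assumption made for illustration, not something confirmed from the script itself.

```python
# resource_name -> physical_resource_id, copied from the table in the demo.
resources = {
    "10": "e4ee76de-9794-4bf8-81d5-3fd5c54fc33e",
    "1": "46f3d8f0-8d5d-47a7-ad59-a2487ce55fb5",
    "0": "ddf8db32-4bf3-4635-9020-28b07c21d66a",
    "3": "48745fa1-f5e2-41ca-8a3e-e9da4cf2bf4e",
    "2": "f4a586f8-e93a-49aa-bf88-97436da05cb6",
    "5": "137a14db-a641-4782-b5c0-5f554cf76f46",
    "4": "afe87089-683e-4c8d-bf43-92a11c0b1f96",
    "7": "bc84c066-827a-4bfe-b830-3532ffd24875",
    "6": "f80f41b9-f89b-443e-b0bf-ee99bea43654",
    "9": "c0df0d59-9c44-4712-abd2-54f05cdd59c2",
    "8": "da466290-adac-4c53-b930-d42d634c8436",
}


def ordered_ids(key):
    """Return the physical IDs ordered by resource name under the given sort key."""
    return [resources[name] for name in sorted(resources, key=key)]


string_order = ordered_ids(str)   # stand-in for sort_by(.resource_name)
numeric_order = ordered_ids(int)  # stand-in for sort_by(.resource_name | tonumber)

# Assumed consumer: node index i is paired with the i-th ID in the list.
mismatched = [
    i for i, (s, n) in enumerate(zip(string_order, numeric_order)) if s != n
]
print(mismatched)
```

Under that assumption only indexes 0 and 1 receive the same ID in both orderings; every index from 2 onward is shifted by one, which is consistent with a few nodes being configured correctly and the rest getting wrong values.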

Comment 1 Scott Lewis 2018-06-22 12:40:07 UTC
OSP11 is now retired, see details at https://access.redhat.com/errata/product/191/ver=11/rhel---7/x86_64/RHBA-2018:1828