Bug 1296601

Summary: rhel-osp-director: 7.2 - deployed HA overcloud with 2 computes, both are named overcloud-compute-0
Product: Red Hat OpenStack Reporter: Alexander Chuzhoy <sasha>
Component: rhosp-director Assignee: Lucas Alvares Gomes <lmartins>
Status: CLOSED CURRENTRELEASE QA Contact: yeylon <yeylon>
Severity: high Docs Contact:
Priority: urgent    
Version: 7.0 (Kilo) CC: athomas, dtantsur, jslagle, kbasil, mburns, mcornea, rhel-osp-director-maint, sasha, srevivo
Target Milestone: ga Keywords: Triaged
Target Release: 8.0 (Liberty)   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-03-24 11:38:48 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Alexander Chuzhoy 2016-01-07 16:00:08 UTC
rhel-osp-director: 7.2 - deployed HA overcloud with 2 computes, both are named overcloud-compute-0


Environment:
openstack-tripleo-heat-templates-0.8.6-94.el7ost.noarch
openstack-tripleo-image-elements-0.9.6-10.el7ost.noarch   
openstack-tripleo-0.0.7-0.1.1664e566.el7ost.noarch                          
instack-undercloud-2.1.2-36.el7ost.noarch           
openstack-tripleo-common-0.0.1.dev6-5.git49b57eb.el7ost.noarch            
openstack-tripleo-puppet-elements-0.0.1-5.el7ost.noarch  

Steps to reproduce:
1. Deploy a 7.2 HA overcloud without network isolation.
This was the deployment command:
openstack overcloud deploy --templates --control-scale 3 --compute-scale 1 --ntp-server x.x.x.x --timeout 90
2. Source the stackrc file and run "nova list".


Result:
+--------------------------------------+------------------------+--------+------------+-------------+-----------------------+
| ID                                   | Name                   | Status | Task State | Power State | Networks              |
+--------------------------------------+------------------------+--------+------------+-------------+-----------------------+
| 70dfd35e-20e9-491a-9c11-b105282bb5cd | overcloud-compute-0    | ACTIVE | -          | Running     | ctlplane=192.168.0.12 |
| 836cbfa0-a1c4-4fcc-9da6-d2bb0ed54bb9 | overcloud-compute-0    | ACTIVE | -          | Running     | ctlplane=192.168.0.15 |
| a4bdec2b-b33f-4acf-bf33-7f2e15858648 | overcloud-controller-0 | ACTIVE | -          | Running     | ctlplane=192.168.0.16 |
| fa7f5db3-7209-4bfe-b7b4-ec9b3c419bd7 | overcloud-controller-1 | ACTIVE | -          | Running     | ctlplane=192.168.0.17 |
| 4df9383f-4987-4a21-b505-de27d7aa881d | overcloud-controller-2 | ACTIVE | -          | Running     | ctlplane=192.168.0.18 |
+--------------------------------------+------------------------+--------+------------+-------------+-----------------------+


When I log in to each compute node and run "hostname", both report the same name: "overcloud-compute-0".

The issue seems to be intermittent.


Expected result:
A different name for each compute node.
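
The duplicate can also be detected from the "nova list" output without logging in to the nodes. A minimal Python sketch, assuming the table text is passed in as a string; the function names parse_names and find_duplicate_names are hypothetical and not part of any OpenStack tool, and the sample table is the output above with the last three columns trimmed:

```python
from collections import Counter


def parse_names(table_text):
    """Return the Name column from the ASCII table printed by "nova list"."""
    names = []
    for line in table_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip "+---+" border lines and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) < 2 or cells[0] == "ID":
            continue  # skip the header row
        names.append(cells[1])
    return names


def find_duplicate_names(table_text):
    """Return the sorted list of names that appear on more than one row."""
    counts = Counter(parse_names(table_text))
    return sorted(name for name, count in counts.items() if count > 1)


# Rows copied from the report; Task State, Power State and Networks omitted.
SAMPLE = """\
+--------------------------------------+------------------------+--------+
| ID                                   | Name                   | Status |
+--------------------------------------+------------------------+--------+
| 70dfd35e-20e9-491a-9c11-b105282bb5cd | overcloud-compute-0    | ACTIVE |
| 836cbfa0-a1c4-4fcc-9da6-d2bb0ed54bb9 | overcloud-compute-0    | ACTIVE |
| a4bdec2b-b33f-4acf-bf33-7f2e15858648 | overcloud-controller-0 | ACTIVE |
| fa7f5db3-7209-4bfe-b7b4-ec9b3c419bd7 | overcloud-controller-1 | ACTIVE |
| 4df9383f-4987-4a21-b505-de27d7aa881d | overcloud-controller-2 | ACTIVE |
+--------------------------------------+------------------------+--------+
"""

if __name__ == "__main__":
    print(find_duplicate_names(SAMPLE))
```

An empty list would mean every node has a unique name.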

Comment 4 Hugh Brock 2016-02-05 15:45:14 UTC
Any possibility this is a dup of https://bugzilla.redhat.com/show_bug.cgi?id=1269919 ?

Assigning to dtantsur...

Comment 5 Hugh Brock 2016-02-05 15:45:40 UTC
Any possibility this is a dup of https://bugzilla.redhat.com/show_bug.cgi?id=1269919 ?

Assigning to dtantsur...

Comment 6 Dmitry Tantsur 2016-02-08 10:33:01 UTC
Looks like a duplicate, but it is hard to tell without more information.

Comment 7 James Slagle 2016-02-17 17:33:14 UTC
Dmitry, is there something we can ask Sasha to check? What more information is needed to determine whether this is a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1269919 or not?

Comment 8 Dmitry Tantsur 2016-02-17 17:40:04 UTC
Please see that bug: you have to check how many config drive partitions you end up with.

Comment 9 Lucas Alvares Gomes 2016-03-22 16:23:32 UTC
Hi,

> Steps to reproduce:
> 1. deploy 7.2 HA overcloud without network isolation.
> This was the deployment command:
> openstack overcloud deploy --templates --control-scale 3 --compute-scale 1
> --ntp-server x.x.x.x --timeout 90
> 2. source the stackrc file and run "nova list"
> 

The deploy command includes --control-scale 3 --compute-scale 1 (4 nodes), but 5 nodes got deployed, which looks strange.

It seems to me that this is either a problem with the deploy CLI, or one overcloud-compute-0 was already there before the deploy command ran.
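
To spell out the mismatch, here is a minimal Python sketch. The helper name expected_node_count is hypothetical, and it assumes no roles other than controller and compute were scaled above zero; the numbers come from the quoted deploy command and from the "nova list" output in the description:

```python
def expected_node_count(control_scale, compute_scale):
    """Nodes requested through --control-scale and --compute-scale."""
    return control_scale + compute_scale


requested = expected_node_count(control_scale=3, compute_scale=1)
listed = 5  # rows in the "nova list" output from the description
surplus = listed - requested

if __name__ == "__main__":
    print(f"requested={requested} listed={listed} surplus={surplus}")
```

A non-zero surplus means at least one listed instance was not created by this deploy command as written.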

Can you run the following commands please:

$ ironic node-show 70dfd35e-20e9-491a-9c11-b105282bb5cd --instance | grep provision_updated_at

$ ironic node-show 836cbfa0-a1c4-4fcc-9da6-d2bb0ed54bb9 --instance | grep provision_updated_at

To check when the compute nodes were provisioned.
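
Once both values are available, they can be compared by hand or with a small script. A minimal Python sketch, assuming the timestamps are ISO 8601 strings with a UTC offset; the helper name provisioned_apart is hypothetical, the 90-minute threshold is an assumption borrowed from the --timeout 90 in the deploy command, and the timestamps in the usage lines are placeholders, not data from this deployment:

```python
from datetime import datetime, timedelta


def provisioned_apart(ts_a, ts_b, threshold=timedelta(minutes=90)):
    """Return True if the two provisioning times differ by more than threshold.

    Both strings must use the same form (both with or both without an offset),
    otherwise the subtraction raises TypeError.
    """
    a = datetime.fromisoformat(ts_a)
    b = datetime.fromisoformat(ts_b)
    return abs(a - b) > threshold


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(provisioned_apart("2000-01-01T00:00:00+00:00",
                            "2000-01-02T00:00:00+00:00"))
    print(provisioned_apart("2000-01-01T00:00:00+00:00",
                            "2000-01-01T00:30:00+00:00"))
```

If the two compute nodes were provisioned further apart than a single deploy run allows, that would support the theory that one of them existed before the deploy command ran.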

Comment 10 Alexander Chuzhoy 2016-03-22 18:01:27 UTC
I have not been able to reproduce it for a while (on 7.3).

Comment 11 Angus Thomas 2016-03-24 11:38:48 UTC
Please reopen this bug if it can be reproduced.