Bug 1377256

Summary: RHOS9 Overcloud update stalls indefinitely and fails due to update Identifier not set.
Product: Red Hat OpenStack
Reporter: Navneet Krishnan <nkrishna>
Component: os-collect-config
Assignee: Ben Nemec <bnemec>
Status: CLOSED WORKSFORME
QA Contact: Omri Hochman <ohochman>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 9.0 (Mitaka)
CC: achernet, apevec, djuran, dmacpher, jjoyce, jstransk, lhh, mandreou, mburns, mlammon, morazi, nkrishna, nlevinki, rhel-osp-director-maint, sbaker, sclewis, srevivo, tvignaud
Target Milestone: async
Keywords: Reopened, Triaged
Target Release: 9.0 (Mitaka)
Hardware: Unspecified
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1350489
Environment:
Last Closed: 2016-10-28 05:46:21 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Bug Depends On: 1350489
Bug Blocks: 1333977

Comment 1 Navneet Krishnan 2016-09-19 10:24:55 UTC
Description of problem:

Step 1 of the controller upgrade fails when upgrading from OSP 8 to OSP 9 because os-collect-config is restarted during yum update. As a result, the rest of the upgrade step 1 script never runs, and the success of step 1 is never reported to Heat. Heat waits and eventually times out.

On the undercloud:  

$ for i in `heat deployment-list | grep 2016 | grep -v COMPLETE | cut -f2 -d\|`; do heat deployment-show $i; done

-------------------------------------------------------------------------------
{
  "status": "IN_PROGRESS",
  "server_id": "59fcf4cc-4442-4c53-a94d-9b633568a391",
  "config_id": "a1e37275-e311-460c-b76c-5f1bd128c1ce",
  "output_values": {
    "deploy_stdout": "Started yum_update.sh on server 59fcf4cc-4442-4c53-a94d-9b633568a391 at Wed Jul 13 22:12:09 EDT 2016\nNot running due to unset update_identifier\n",
    "deploy_stderr": "",
    "update_managed_packages": "false",
    "deploy_status_code": 0
  },
  "creation_time": "2016-07-14T02:06:20",
  "updated_time": "2016-09-05T03:02:43",
  "input_values": {
    "update_identifier": ""
  },
  "action": "UPDATE",
  "status_reason": "Deploy data available",
  "id": "b1255ed1-ddc8-4cff-a669-fe6b00fa21d8"
}
WARNING (shell) "heat deployment-show" is deprecated, please use "openstack software deployment show" instead
{
  "status": "IN_PROGRESS",
  "server_id": "f1e66c29-4d4c-4f50-9a60-505de6df8668",
  "config_id": "3ad35b52-29a3-4b33-9d81-db180fbedf1d",
  "output_values": {
    "deploy_stdout": "Started yum_update.sh on server f1e66c29-4d4c-4f50-9a60-505de6df8668 at Wed Jul 13 22:11:56 EDT 2016\nNot running due to unset update_identifier\n",
    "deploy_stderr": "",
    "update_managed_packages": "false",
    "deploy_status_code": 0
  },
  "creation_time": "2016-07-14T02:06:23",
  "updated_time": "2016-09-05T03:04:05",
  "input_values": {
    "update_identifier": ""
  },

----
.......
--------------------------------------------------------------------------------
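The output above can be scanned for the skip message rather than read by eye. The following is a minimal sketch, assuming the JSON printed by heat deployment-show is piped in on stdin; the function names update_was_skipped and classify_deployment are hypothetical helpers, not part of heat.

```shell
# Hypothetical helper (not part of heat): reads one deployment's JSON on stdin
# and succeeds if yum_update.sh reported that it skipped the update.
update_was_skipped() {
    grep -q 'Not running due to unset update_identifier'
}

# Hypothetical helper: print a one-line verdict for the deployment ID in $1,
# based on the deployment JSON read from stdin.
classify_deployment() {
    if update_was_skipped; then
        echo "$1: skipped (update_identifier unset)"
    else
        echo "$1: no skip message found"
    fi
}
```

Inside the loop shown earlier, this could be used as: heat deployment-show $i | classify_deployment $i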

os-collect-config versions on the undercloud, before and after its upgrade:

original: os-collect-config-0.1.37-2.el7ost.noarch
updated:  os-collect-config-0.1.37-6.el7ost.noarch

os-collect-config version on overcloud at the time of upgrade:
os-collect-config-0.1.37-2.el7ost.noarch

After manual registration of overcloud nodes to osp8 repos:

os-collect-config-0.1.37-6.el7ost.noarch
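A node can be checked against the newer build listed above before starting the upgrade. This is a rough sketch only: version_at_least and check_os_collect_config are hypothetical names, and the comparison relies on GNU sort -V, which approximates RPM's version ordering but is not identical to it.

```shell
# Hypothetical helper: succeeds if version-release string $1 sorts at or
# above $2 under GNU "sort -V" (an approximation of RPM ordering).
version_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical wrapper: query the installed os-collect-config and compare it
# with the build the reporter upgraded to (0.1.37-6.el7ost).
check_os_collect_config() {
    installed=$(rpm -q --queryformat '%{VERSION}-%{RELEASE}' os-collect-config) || return 2
    if version_at_least "$installed" "0.1.37-6.el7ost"; then
        echo "os-collect-config $installed: at or above 0.1.37-6.el7ost"
    else
        echo "os-collect-config $installed: older than 0.1.37-6.el7ost"
        return 1
    fi
}
```

Running check_os_collect_config on each overcloud node would show whether it still carries the older 0.1.37-2 build.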

Comment 3 Navneet Krishnan 2016-09-30 09:52:05 UTC
The overcloud nodes were registered to the RHOS9 repository set. However, only os-collect-config was upgraded, to os-collect-config-0.1.37-6.el7ost.noarch. After that, overcloud upgrade step 1 was carried out and succeeded.

It would be better for the rhel-7-server-openstack-8-rpms repository to ship os-collect-config-0.1.37-6 so that this issue is avoided.
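The workaround described in this comment (upgrading only os-collect-config before running step 1) could be sketched as a dry run that prints one command per node. The function name print_occ_update_commands is hypothetical, and SSH access as the heat-admin user is an assumption that this report does not state; the node addresses are supplied by the caller.

```shell
# Hypothetical helper: print (do not run) the command that would update only
# os-collect-config on each overcloud node address passed as an argument.
# Assumption: the nodes are reachable over SSH as heat-admin.
print_occ_update_commands() {
    for node in "$@"; do
        echo "ssh heat-admin@${node} sudo yum -y update os-collect-config"
    done
}
```

Printing the commands first allows them to be reviewed before any node is touched; the node must already be registered to a repository that provides the newer package.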