Bug 1351712

Summary: rhel-osp-director: 8.0 -> 9.0 upgrade that follows successful 8.0GA->8.0async update: "openstack undercloud upgrade" exits with error.
Product: Red Hat OpenStack Reporter: Alexander Chuzhoy <sasha>
Component: rhosp-director Assignee: Marios Andreou <mandreou>
Status: CLOSED NOTABUG QA Contact: Omri Hochman <ohochman>
Severity: high Docs Contact:
Priority: high    
Version: 9.0 (Mitaka) CC: dbecker, jason.dobies, jcoufal, lbezdick, mburns, morazi, rhel-osp-director-maint, sasha, tvignaud
Target Milestone: ga Keywords: Triaged
Target Release: 9.0 (Mitaka)   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: related to https://bugzilla.redhat.com/show_bug.cgi?id=1353346
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-07-29 14:33:57 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1333977    
Attachments:
Description Flags
install-undercloud.log none

Description Alexander Chuzhoy 2016-06-30 15:51:47 UTC
rhel-osp-director: 8.0 -> 9.0 upgrade that follows successful 8.0GA->8.0async update: "openstack undercloud upgrade" exits with error.


Environment:
openstack-tripleo-heat-templates-2.0.0-12.el7ost.noarch
openstack-puppet-modules-8.1.2-1.el7ost.noarch
instack-undercloud-4.0.0-5.el7ost.noarch
openstack-tripleo-heat-templates-kilo-2.0.0-12.el7ost.noarch
openstack-tripleo-heat-templates-liberty-2.0.0-12.el7ost.noarch


Steps to reproduce:
1. Deploy 8.0GA
2. Update to 8.0async
3. Attempt to upgrade the undercloud to 9.0:
   openstack undercloud upgrade (after pointing the repos to the latest puddle)

Result:
+ echo 'puppet apply exited with exit code 6'
puppet apply exited with exit code 6
+ '[' 6 '!=' 2 -a 6 '!=' 0 ']'
+ exit 6
[2016-06-30 11:40:20,799] (os-refresh-config) [ERROR] during configure phase. [Command '['dib-run-parts', '/usr/libexec/os-refresh-config/configure.d']' returned non-zero exit status 6]

[2016-06-30 11:40:20,800] (os-refresh-config) [ERROR] Aborting...
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py", line 845, in install
    _run_orc(instack_env)
  File "/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py", line 735, in _run_orc
    _run_live_command(args, instack_env, 'os-refresh-config')
  File "/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py", line 406, in _run_live_command
    raise RuntimeError('%s failed. See log for details.' % name)
RuntimeError: os-refresh-config failed. See log for details.


Expected result:
Successful completion of the "openstack undercloud upgrade" command.
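
For context on the exit code in the trace above: the check `'[' 6 '!=' 2 -a 6 '!=' 0 ']'` matches the semantics of `puppet apply --detailed-exitcodes`, where 0 means no changes, 2 means changes were applied, 4 means some resources failed, and 6 means changes were applied and some resources failed. The trace does not show the puppet command line, so the use of that flag is an assumption, and the helper names below are hypothetical. A minimal sketch:

```shell
#!/bin/bash
# Hypothetical helper: describe a `puppet apply --detailed-exitcodes` status.
# Assumes the flag was in use; the trace does not show the puppet command line.
describe_puppet_rc() {
    case "$1" in
        0) echo "success: no changes" ;;
        2) echo "success: changes applied" ;;
        4) echo "failure: some resources failed" ;;
        6) echo "failure: changes applied and some resources failed" ;;
        *) echo "failure: run did not complete (exit code $1)" ;;
    esac
}

# Same acceptance rule as the check in the trace: only 0 and 2 count as success.
puppet_rc_ok() {
    [ "$1" -eq 0 ] || [ "$1" -eq 2 ]
}
```

Read this way, exit code 6 says that puppet did apply some changes but at least one resource failed, which is why the configure phase aborts.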

Comment 2 Alexander Chuzhoy 2016-06-30 15:53:21 UTC
Created attachment 1174667
install-undercloud.log

Comment 3 Alexander Chuzhoy 2016-06-30 17:55:46 UTC
Didn't reproduce on:
1. Deploy 8.0async
2. Update undercloud to 9.0

Comment 4 Lukas Bezdicka 2016-07-08 14:52:33 UTC
2016-06-30 11:24:43,003 INFO: Error: /Stage[main]/Apache::Service/Service[httpd]: Failed to call refresh: Could not restart Service[httpd]: Execution of '/bin/systemctl restart httpd' returned 1: Job for httpd.service failed because the control process exited with error code. See "systemctl status httpd.service" and "journalctl -xe" for details.

Could you look into what happened to httpd? Can you provide the httpd config file and logs?
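
Puppet colors its output with ANSI escape sequences, which makes the failing resource hard to spot in the raw install log. A rough sketch for pulling out the "Error:" lines with the color codes stripped follows; the function name is hypothetical and the log path is whatever copy of install-undercloud.log is at hand:

```shell
#!/bin/bash
# Hypothetical helper: print puppet "Error:" lines from a log file,
# with ANSI color sequences (ESC [ ... m) removed.
puppet_errors() {
    esc=$(printf '\033')
    sed "s/${esc}\[[0-9;]*m//g" "$1" | grep 'Error:'
}
```

For example, `puppet_errors install-undercloud.log` lists only the error lines, such as the httpd restart failure quoted in this comment.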

Comment 5 Marios Andreou 2016-07-22 13:52:11 UTC
I think bug 1351712 and bug 1353346 are related: both are about a failed undercloud upgrade for 8..9 and, as far as I can see, both have the same root symptom, namely that httpd fails to come up on the undercloud during the upgrade. I think it makes sense to keep both, as the issue manifests in slightly different circumstances: update 8 to the latest 8 and then upgrade to 9, or upgrade 7..8 and then do the 8..9 upgrade.

I see from the logs/description that the root cause is httpd not coming up as part of the upgrade. I can't see enough information to say why, either in the install-undercloud.log from bug 1351712 or in the description of bug 1353346. Basically we need the httpd logs. I think the rest of the errors from the trace (e.g. the keystone-related errors) are a consequence of httpd not starting.

Can we please have the httpd logs from the undercloud when this happens? Another thought: I suspect that stopping all undercloud services, as at https://review.openstack.org/#/c/331804/, before invoking "openstack undercloud upgrade" might solve this problem. You could try this if you can reproduce on an environment. Otherwise this needs logs/more info.

There could still be a different root cause, but if the service stop before the upgrade works, we can land that to unblock us on these two bugs. For clarity, stop the services before "openstack undercloud upgrade" (this has been my workflow for all 8..9 upgrade testing for the undercloud):

        sudo rm -rf /etc/yum.repos.d/*
        sudo rhos-release 9-director -d
        sudo rhos-release 9 -d
        sudo yum clean all && sudo yum clean metadata && sudo yum clean dbcache && sudo yum makecache
        sudo yum -y update
        sudo systemctl stop 'openstack-*'
        sudo systemctl stop 'neutron-*'
        openstack undercloud upgrade

thanks, marios
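
The httpd information requested above could be gathered along these lines. This is only a sketch: the function names and the default output directory are hypothetical, `/var/log/httpd` is assumed to be the log location (the RHEL default), and `httpd -t` is included as a config syntax check.

```shell
#!/bin/bash
# Sketch: collect httpd diagnostics on the undercloud after a failed upgrade.
# Function names and the default output directory are hypothetical.

httpd_diag_commands() {
    # One command per line. The first is the one the puppet error suggests.
    cat <<'EOF'
systemctl status httpd.service --no-pager
journalctl -u httpd.service --no-pager
httpd -t
EOF
}

collect_httpd_diag() {
    outdir="${1:-./httpd-diag}"
    mkdir -p "$outdir"
    n=0
    httpd_diag_commands | while IFS= read -r cmd; do
        n=$((n + 1))
        # Keep going if a command fails; its output is still worth saving.
        # stdin is redirected so the command cannot consume the command list.
        { echo "\$ $cmd"; sudo $cmd 2>&1 < /dev/null || true; } > "$outdir/$n.txt"
    done
    # Assumed log location: /var/log/httpd (RHEL default).
    sudo cp -r /var/log/httpd "$outdir/" 2>/dev/null || true
}
```

Running `collect_httpd_diag` right after the failed "openstack undercloud upgrade" leaves one text file per command plus a copy of the httpd logs, which can then be attached to the bug.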

Comment 6 Alexander Chuzhoy 2016-07-28 18:38:41 UTC
Environment:
openstack-tripleo-heat-templates-kilo-2.0.0-18.el7ost.noarch
instack-undercloud-4.0.0-8.el7ost.noarch
openstack-tripleo-heat-templates-2.0.0-18.el7ost.noarch
openstack-tripleo-heat-templates-liberty-2.0.0-18.el7ost.noarch
openstack-puppet-modules-8.1.5-1.el7ost.noarch


Didn't reproduce the issue.

Comment 7 Jay Dobies 2016-07-29 14:33:57 UTC
Closing as not a bug based on sasha's last comment.