Bug 1294098

Summary: rhel-osp-director: 7.2-> 8.0 upgrade after yum update ran 'openstack undercloud install' failed with error.
Product: Red Hat OpenStack Reporter: Alexander Chuzhoy <sasha>
Component: python-tripleoclient Assignee: Mike Burns <mburns>
Status: CLOSED ERRATA QA Contact: Alexander Chuzhoy <sasha>
Severity: high Docs Contact:
Priority: urgent    
Version: 8.0 (Liberty) CC: brad, clincoln, hbrock, jcoufal, jslagle, mandreou, mbultel, mburns, mcornea, ohochman, rhel-osp-director-maint, sasha
Target Milestone: ga   
Target Release: 8.0 (Liberty)   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: python-tripleoclient-0.0.11-6.el7ost openstack-ironic-inspector-2.2.2-2.el7ost instack-undercloud-2.2.2-3.el7ost Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-04-07 21:44:03 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Alexander Chuzhoy 2015-12-24 17:30:32 UTC
rhel-osp-director: 7.2-> 8.0 upgrade after yum update ran 'openstack undercloud install' failed with error.

Environment:
instack-undercloud-2.2.0-1.el7ost.noarch
python-rdomanager-oscplugin-0.0.10-22.el7ost.noarch

No python-tripleoclient RPM installed.

Steps to reproduce:
1. Deploy overcloud 7.2
2. Update the yum repos on the undercloud node to point to 8.0
3. Run 'yum update' and reboot the node.
4. Run 'openstack undercloud install'


Result:

                      
+ echo dib-run-parts Thu Dec 24 12:18:00 EST 2015 Running /tmp/tmp6Owi2g/pre-install.d/01-persistent-journal
dib-run-parts Thu Dec 24 12:18:00 EST 2015 Running /tmp/tmp6Owi2g/pre-install.d/01-persistent-journal
+ target_tag=01-persistent-journal
+ date +%s.%N
+ /tmp/tmp6Owi2g/pre-install.d/01-persistent-journal
Job for systemd-journald.service failed because a fatal signal was delivered to the control process. See "systemctl status systemd-journald.service" and "journalctl -xe" for details.
INFO: 2015-12-24 12:18:16,827 -- ############### End stdout/stderr logging ###############
ERROR: 2015-12-24 12:18:16,827 --     Hook FAILED.
ERROR: 2015-12-24 12:18:16,827 -- Failed running command ['dib-run-parts', u'/tmp/tmp6Owi2g/pre-install.d']
  File "/usr/lib/python2.7/site-packages/instack/main.py", line 163, in main
    em.run()
  File "/usr/lib/python2.7/site-packages/instack/runner.py", line 79, in run
    self.run_hook(hook)
  File "/usr/lib/python2.7/site-packages/instack/runner.py", line 174, in run_hook
    raise Exception("Failed running command %s" % command)
ERROR: 2015-12-24 12:18:16,847 -- None
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py", line 555, in install
    _run_instack(instack_env)
  File "/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py", line 487, in _run_instack
    _run_live_command(args, instack_env, 'instack')
  File "/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py", line 325, in _run_live_command
    raise RuntimeError('%s failed. See log for details.' % name)
RuntimeError: instack failed. See log for details.
Command 'instack-install-undercloud' returned non-zero exit status 1



Expected result:
Successfully completed run of 'openstack undercloud install'

Comment 2 chris alfonso 2016-01-04 17:40:29 UTC
If you manually install python-tripleoclient, does it work? If so, we can chalk this up to a packaging fix. Please let us know.

Comment 3 Mike Burns 2016-01-04 17:46:50 UTC
I'm fixing the obsoletes rule in the packaging.  It was not obsoleting the latest versions of python-rdomanager-oscplugin.

Comment 4 Alexander Chuzhoy 2016-01-04 23:25:45 UTC
Unable to install the package:

[stack@instack ~]$ sudo yum install python-tripleoclient    
Loaded plugins: search-disabled-repos       
Resolving Dependencies
--> Running transaction check
---> Package python-rdomanager-oscplugin.noarch 0:0.0.10-22.el7ost will be obsoleted
---> Package python-tripleoclient.noarch 0:0.0.11-6.el7ost will be obsoleting
--> Processing Dependency: tripleo-common for package: python-tripleoclient-0.0.11-6.el7ost.noarch
--> Processing Dependency: sos for package: python-tripleoclient-0.0.11-6.el7ost.noarch
--> Processing Dependency: python-ironic-inspector-client for package: python-tripleoclient-0.0.11-6.el7ost.noarch
--> Running transaction check
---> Package openstack-tripleo-common.noarch 0:0.0.1.dev6-5.git49b57eb.el7ost will be updated
---> Package openstack-tripleo-common.noarch 0:0.0.2-5.el7ost will be an update
---> Package python-ironic-inspector-client.noarch 0:1.2.0-5.el7ost will be installed
--> Processing Dependency: python-oslo-utils >= 2.0.0 for package: python-ironic-inspector-client-1.2.0-5.el7ost.noarch
--> Processing Dependency: python-openstackclient >= 1.5.0 for package: python-ironic-inspector-client-1.2.0-5.el7ost.noarch
--> Processing Dependency: python-cliff >= 1.14.0 for package: python-ironic-inspector-client-1.2.0-5.el7ost.noarch
---> Package sos.noarch 0:3.2-36.el7ost.1 will be installed
--> Running transaction check
---> Package python-cliff.noarch 0:1.10.0-2.el7ost will be updated
---> Package python-cliff.noarch 0:1.15.0-1.el7ost will be an update
--> Processing Dependency: python-stevedore >= 1.5.0 for package: python-cliff-1.15.0-1.el7ost.noarch
--> Processing Dependency: python-unicodecsv for package: python-cliff-1.15.0-1.el7ost.noarch
---> Package python-openstackclient.noarch 0:1.0.3-3.el7ost will be updated
---> Package python-openstackclient.noarch 0:1.7.1-1.el7ost will be an update
--> Processing Dependency: python-os-client-config for package: python-openstackclient-1.7.1-1.el7ost.noarch
--> Processing Dependency: python-cliff-tablib for package: python-openstackclient-1.7.1-1.el7ost.noarch
---> Package python-oslo-utils.noarch 0:1.4.0-1.el7ost will be updated
---> Package python-oslo-utils.noarch 0:2.5.0-1.1.el7ost will be an update
--> Processing Dependency: python-monotonic for package: python-oslo-utils-2.5.0-1.1.el7ost.noarch
--> Running transaction check
---> Package python-cliff-tablib.noarch 0:1.1-3.el7ost will be installed
--> Processing Dependency: python-tablib for package: python-cliff-tablib-1.1-3.el7ost.noarch
---> Package python-monotonic.noarch 0:0.3-1.el7ost will be installed
---> Package python-os-client-config.noarch 0:1.7.4-1.3.el7ost will be installed
--> Processing Dependency: python-appdirs for package: python-os-client-config-1.7.4-1.3.el7ost.noarch
---> Package python-stevedore.noarch 0:1.3.0-1.1.el7ost will be updated
---> Package python-stevedore.noarch 0:1.8.0-1.el7ost will be an update
---> Package python-unicodecsv.noarch 0:0.14.1-2.el7ost will be installed
--> Running transaction check
---> Package python-appdirs.noarch 0:1.4.0-2.1.el7ost will be installed
---> Package python-tablib.noarch 0:0.10.0-1.el7ost will be installed
--> Processing Conflict: python-ironic-inspector-client-1.2.0-5.el7ost.noarch conflicts python-ironic-discoverd
--> Finished Dependency Resolution
Error: python-ironic-inspector-client conflicts with python-ironic-discoverd-1.1.0-8.el7ost.noarch
 You could try using --skip-broken to work around the problem
 You could try running: rpm -Va --nofiles --nodigest

Comment 5 Mike Burns 2016-01-05 14:39:53 UTC
Ok, this is a different obsoletes problem.  Fixed in openstack-ironic-inspector-2.2.2-2.el7ost

Comment 7 Mike Burns 2016-01-05 18:23:59 UTC
I've reproduced this as well.  All dependency issues are resolved, but there is something in DIB or instack-undercloud that is causing os-refresh-config to try to run old 7.0 config.  It seems something in the installation process is copying/building the elements into a different location (/usr/libexec/os-refresh-config) but isn't cleaning up old content from there.  

Observations:

There are files like 100-tuskar-api and 99-restart-discovery which only apply to OSP 7 and not OSP 8.
Other newer files like 99-restart-inspector are missing from the same location as above.
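
A minimal sketch of how the two observations above could be checked on a node. The helper name audit_hooks and the two name sets are hypothetical and only contain the file names mentioned in this comment; they are not an exhaustive list, and the subdirectory used in the demo is an assumption about where the hooks sit.

```python
import os
import shutil
import tempfile

# File names taken from the observations above. Illustrative only, not exhaustive.
STALE_HOOKS = {'100-tuskar-api', '99-restart-discovery'}   # OSP 7 only
EXPECTED_HOOKS = {'99-restart-inspector'}                  # should exist on OSP 8


def audit_hooks(root):
    """Return (stale_found, expected_missing) for the hook tree under root.

    Hypothetical helper: walks every subdirectory of root and compares the
    file names it finds against the two sets above.
    """
    present = set()
    for _dirpath, _dirnames, filenames in os.walk(root):
        present.update(filenames)
    return sorted(STALE_HOOKS & present), sorted(EXPECTED_HOOKS - present)


def demo():
    """Build a throwaway tree that mimics the broken state and audit it."""
    root = tempfile.mkdtemp()
    try:
        # Assumption: hooks live in a phase directory such as configure.d.
        phase_dir = os.path.join(root, 'configure.d')
        os.mkdir(phase_dir)
        for name in ('100-tuskar-api', '99-restart-discovery'):
            open(os.path.join(phase_dir, name), 'w').close()
        return audit_hooks(root)
    finally:
        shutil.rmtree(root, ignore_errors=True)


if __name__ == '__main__':
    print(demo())
```

On a real undercloud the same function could be pointed at /usr/libexec/os-refresh-config; any non-empty first list would suggest leftover OSP 7 content.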

Comment 8 James Slagle 2016-01-05 18:31:56 UTC
part of the issue is definitely what mike has pointed out in https://bugzilla.redhat.com/show_bug.cgi?id=1294098#c7

we'll have to fix that with a patch to clean up /usr/libexec/os-refresh-config at the beginning of each install.

besides that though, why are you rebooting after the yum update? i think that's going to be problematic and cause issues since systemd is going to try and start the services on the boot, but they're going to have old osp-7 configs since the installer hasn't been rerun yet. that reboot step should be removed from the instructions unless there's some other reason it's there.

Comment 9 James Slagle 2016-01-05 18:34:17 UTC
as a workaround, you can try 'sudo rm -rf /usr/libexec/os-refresh-config/*' right before re-running 'openstack undercloud install'

Comment 10 Alexander Chuzhoy 2016-01-05 21:30:22 UTC
The workaround in comment #9 worked.

#############################################################################
Undercloud install complete.

The file containing this installation's passwords is at
/home/stack/undercloud-passwords.conf.

There is also a stackrc file at /home/stack/stackrc.

These files are needed to interact with the OpenStack services, and should be
secured.

###########################################################################

Comment 11 Brad P. Crochet 2016-01-11 16:42:12 UTC
(In reply to James Slagle from comment #8)
> part of the issue is definitely what mike has pointed out in
> https://bugzilla.redhat.com/show_bug.cgi?id=1294098#c7
> 
> we'll have to fix that with a patch to clean up
> /usr/libexec/os-refresh-config at the beginning of each install.
> 
> besides that though, why are you rebooting after the yum update? i think
> that's going to be problematic and cause issues since systemd is going to
> try and start the services on the boot, but they're going to have old osp-7
> configs since the installer hasn't been rerun yet. that reboot step should
> be removed from the instructions unless there's some other reason it's there.

I can confirm that the reboot causes problems. With the reboot, the ensuing undercloud install run fails. Without reboot, it completes without error.

Comment 12 mathieu bultel 2016-01-14 12:03:20 UTC
After the reboot, I got this error :

Error: Could not find resource 'Keystone_domain[heat_domain]' for relationship from 'Class[Keystone::Roles::Admin]' on node instack
Error: Could not find resource 'Keystone_domain[heat_domain]' for relationship from 'Class[Keystone::Roles::Admin]' on node instack
+ rc=1
+ set -e
+ echo 'puppet apply exited with exit code 1'
puppet apply exited with exit code 1
+ '[' 1 '!=' 2 -a 1 '!=' 0 ']'
+ exit 1
[2016-01-14 07:01:12,765] (os-refresh-config) [ERROR] during configure phase. [Command '['dib-run-parts', '/usr/libexec/os-refresh-config/configure.d']' returned non-zero exit status 1]

[2016-01-14 07:01:12,765] (os-refresh-config) [ERROR] Aborting...
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py", line 562, in install
    _run_orc(instack_env)
  File "/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py", line 494, in _run_orc
    _run_live_command(args, instack_env, 'os-refresh-config')
  File "/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py", line 325, in _run_live_command
    raise RuntimeError('%s failed. See log for details.' % name)
RuntimeError: os-refresh-config failed. See log for details.
Command 'instack-install-undercloud' returned non-zero exit status 1

Comment 13 Marius Cornea 2016-01-14 12:08:24 UTC
(In reply to mathieu bultel from comment #12)
> After the reboot, I got this error :
> 
> Error: Could not find resource 'Keystone_domain[heat_domain]' for
> relationship from 'Class[Keystone::Roles::Admin]' on node instack
> Error: Could not find resource 'Keystone_domain[heat_domain]' for
> relationship from 'Class[Keystone::Roles::Admin]' on node instack
> + rc=1
> + set -e
> + echo 'puppet apply exited with exit code 1'
> puppet apply exited with exit code 1
> + '[' 1 '!=' 2 -a 1 '!=' 0 ']'
> + exit 1
> [2016-01-14 07:01:12,765] (os-refresh-config) [ERROR] during configure
> phase. [Command '['dib-run-parts',
> '/usr/libexec/os-refresh-config/configure.d']' returned non-zero exit status
> 1]
> 
> [2016-01-14 07:01:12,765] (os-refresh-config) [ERROR] Aborting...
> Traceback (most recent call last):
>   File "<string>", line 1, in <module>
>   File "/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py",
> line 562, in install
>     _run_orc(instack_env)
>   File "/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py",
> line 494, in _run_orc
>     _run_live_command(args, instack_env, 'os-refresh-config')
>   File "/usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py",
> line 325, in _run_live_command
>     raise RuntimeError('%s failed. See log for details.' % name)
> RuntimeError: os-refresh-config failed. See log for details.
> Command 'instack-install-undercloud' returned non-zero exit status 1

This looks like the error described in BZ#1298189

Comment 14 Marios Andreou 2016-02-19 14:54:33 UTC
So I just hit this in an environment where the fix from https://code.engineering.redhat.com/gerrit/66946 "Clean out os-refresh-config on every run" is definitely present. Before the yum update I had instack-undercloud-2.1.2-39.el7ost.noarch and the update delivered instack-undercloud-2.2.2-2.el7ost.noarch

I also sanity checked the actual contents of /usr/lib/python2.7/site-packages/instack_undercloud/undercloud.py and see the new _clean_os_refresh_config() code there. The only thing I can think of is if python used the existing undercloud.pyc rather than the new code when the "openstack undercloud install" is executed. I had to manually run the fix (i.e. remove os-refresh-config/* like in comment #9) to get the undercloud install to complete.

As we discussed with mburns and bra call just now, we could include this into a single "undercloud update" command; we already want to keep the yum update and subsequent undercloud install as an atomic unit. After you've updated the undercloud, some services are down (in my environment the ceilometer services, as in BZ #1293979), but they are started again after the undercloud install has completed.

Comment 15 Marios Andreou 2016-02-19 14:56:16 UTC
> As we discussed with mburns and bra call just now, we could include this

sorry keyboard fail..." mburns and thrash (Brad) just now, we could include"

Comment 16 Brad P. Crochet 2016-02-19 16:04:39 UTC
Fix in https://review.openstack.org/#/c/282359/

The subprocess.check_output call does not expand wildcards unless shell=True is passed. Using shell=True is undesirable, so the fix expands the pattern with the glob module instead.
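
A minimal sketch of the behaviour described above, run against a temporary directory rather than /usr/libexec/os-refresh-config. The exact command used in the earlier patch is not shown in this bug, so the 'rm -rf' argument list here is an assumption, and clean_directory_contents is a hypothetical helper, not the code from the upstream review.

```python
import glob
import os
import shutil
import subprocess
import tempfile


def clean_directory_contents(directory):
    """Remove everything inside directory but keep the directory itself.

    Hypothetical helper. glob.glob expands the wildcard inside Python, so no
    shell is needed. Like the shell's '*', the pattern skips dotfiles.
    """
    removed = []
    for path in glob.glob(os.path.join(directory, '*')):
        if os.path.isdir(path) and not os.path.islink(path):
            shutil.rmtree(path)
        else:
            os.remove(path)
        removed.append(os.path.basename(path))
    return sorted(removed)


def demo():
    workdir = tempfile.mkdtemp()
    try:
        hook_dir = os.path.join(workdir, 'configure.d')
        os.mkdir(hook_dir)
        open(os.path.join(hook_dir, '100-tuskar-api'), 'w').close()

        # Assumed shape of the earlier call: with an argument list and no
        # shell, '*' reaches rm as a literal file name. 'rm -f' ignores the
        # missing file, so the call succeeds and removes nothing.
        subprocess.check_output(['rm', '-rf', os.path.join(workdir, '*')])
        survived = os.path.isdir(hook_dir)

        # Expanding the pattern with glob actually empties the directory.
        removed = clean_directory_contents(workdir)
        return survived, removed, os.listdir(workdir)
    finally:
        shutil.rmtree(workdir, ignore_errors=True)


if __name__ == '__main__':
    print(demo())
```

The silent success of the first call would also explain comment 14: the cleanup code was present and ran without error, yet the old content stayed in place.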

Comment 18 Alexander Chuzhoy 2016-02-26 00:40:41 UTC
Verified:
Environment:
instack-undercloud-2.2.3-1.el7ost.noarch
python-tripleoclient-0.1.1-4.el7ost.noarch
openstack-ironic-inspector-2.2.4-2.el7ost.noarch



Successfully ran "openstack undercloud install" after yum update on 7.2 to 8.0.

Comment 20 errata-xmlrpc 2016-04-07 21:44:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-0604.html