Bug 1382932 - [ocp-on-osp] Heat stack got stuck when passing openstack-9 and openstack-9-director repos into extra_repository_urls
Summary: [ocp-on-osp] Heat stack got stuck when passing openstack-9 and openstack-9-di...
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.3.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Scott Dodson
QA Contact: Gan Huang
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-10-08 13:33 UTC by Gan Huang
Modified: 2017-11-23 08:21 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-11-23 08:21:24 UTC
Target Upstream Version:


Description Gan Huang 2016-10-08 13:33:06 UTC
Description of problem:
When the openstack-9 and openstack-9-director repos are enabled, the Heat stack gets stuck because the os-collect-config service fails to start.

1. Check the logs on bastion node.
$ sudo systemctl status os-collect-config -l
Oct 08 09:02:41 ghuang-new-repo-bastion.example.com os-collect-config[15483]: Traceback (most recent call last):
Oct 08 09:02:41 ghuang-new-repo-bastion.example.com os-collect-config[15483]: File "/usr/bin/os-collect-config", line 6, in
Oct 08 09:02:41 ghuang-new-repo-bastion.example.com os-collect-config[15483]: from os_collect_config.collect import __main_
Oct 08 09:02:41 ghuang-new-repo-bastion.example.com os-collect-config[15483]: File "/usr/lib/python2.7/site-packages/os_col
Oct 08 09:02:41 ghuang-new-repo-bastion.example.com os-collect-config[15483]: from oslo_log import log
Oct 08 09:02:41 ghuang-new-repo-bastion.example.com os-collect-config[15483]: ImportError: No module named oslo_log

2. After installing package python-oslo-log, os-collect-config restarted successfully.
$ sudo yum install python-oslo-log
$ sudo systemctl restart os-collect-config
$ sudo systemctl status os-collect-config
● os-collect-config.service - Collect metadata and run hook commands.
   Loaded: loaded (/usr/lib/systemd/system/os-collect-config.service; enabled; vendor preset: disabled)
   Active: active (running) since Sat 2016-10-08 09:10:16 EDT; 28s ago
 Main PID: 15509 (os-collect-conf)
   CGroup: /system.slice/os-collect-config.service
           └─15509 /usr/bin/python2 /usr/bin/os-collect-config

Oct 08 09:10:29 ghuang-new-repo-bastion.example.com os-collect-config[15509]: INFO:os-refresh-config:Completed phase m...on
Oct 08 09:10:31 ghuang-new-repo-bastion.example.com os-collect-config[15509]: /var/lib/os-collect-config/local-data no...ng
Oct 08 09:10:31 ghuang-new-repo-bastion.example.com os-collect-config[15509]: No local metadata found (['/var/lib/os-c...])
Oct 08 09:10:34 ghuang-new-repo-bastion.example.com os-collect-config[15509]: /var/lib/os-collect-config/local-data no...ng
Oct 08 09:10:34 ghuang-new-repo-bastion.example.com os-collect-config[15509]: No local metadata found (['/var/lib/os-c...])
Oct 08 09:10:37 ghuang-new-repo-bastion.example.com systemd[1]: Started Collect metadata and run hook commands..
Oct 08 09:10:39 ghuang-new-repo-bastion.example.com os-collect-config[15509]: /var/lib/os-collect-config/local-data no...ng
Oct 08 09:10:39 ghuang-new-repo-bastion.example.com os-collect-config[15509]: No local metadata found (['/var/lib/os-c...])
Oct 08 09:10:44 ghuang-new-repo-bastion.example.com os-collect-config[15509]: /var/lib/os-collect-config/local-data no...ng
Oct 08 09:10:44 ghuang-new-repo-bastion.example.com os-collect-config[15509]: No local metadata found (['/var/lib/os-c...])
Hint: Some lines were ellipsized, use -l to show in full.


Version-Release number of selected component (if applicable):
openshift-on-openstack v0.9.1
os-collect-config-0.1.37-6.el7ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Create a Heat stack, passing the openstack-9 and openstack-9-director repos in the extra_repository_urls parameter


Actual results:
The stack got stuck at resource "bastion_host"

Expected results:
The Heat stack succeeds when the openstack-9 and openstack-9-director repos are enabled.

Additional info:

Comment 1 Jan Provaznik 2016-10-11 07:01:54 UTC
This is caused by OCP3.3 bug https://bugzilla.redhat.com/show_bug.cgi?id=1332432

A workaround for openshift-on-openstack is merged now:
https://github.com/redhat-openstack/openshift-on-openstack/commit/2cc2e9fb8e176fff96b6075df2635a31ff2f4684

Comment 2 Jan Provaznik 2016-10-11 07:04:27 UTC
(In reply to Jan Provaznik from comment #1)
> This is caused by OCP3.3 bug
> https://bugzilla.redhat.com/show_bug.cgi?id=1332432
> 
> A workaround for openshift-on-openstack is merged now:
> https://github.com/redhat-openstack/openshift-on-openstack/commit/
> 2cc2e9fb8e176fff96b6075df2635a31ff2f4684

Ignore the above, I commented on the wrong BZ :/

Comment 3 Jan Provaznik 2016-10-11 07:26:50 UTC
Yes, this is a special case:
- os-collect-config is installed and enabled properly (which is what we check)
- os-collect-config crashes on start - there is a missing package dependency on python-oslo-log so it fails with "ImportError: No module named oslo_log"
- when os-collect-config is not running, then on-delete actions can't be executed

The solution might be:
- to fix the OSP9 issue, also install python-oslo-log
- call notify_failure if the https://github.com/redhat-openstack/openshift-on-openstack/blob/master/collect-config-setup/fragments/configure_config_agent.sh script fails
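
The second suggestion could be sketched roughly as below. This is an illustration only, not the change that was merged: run_and_report and the on_failure callback are hypothetical names, and on_failure merely stands in for whatever actually signals Heat (such as notify_failure).

```python
import subprocess
import sys


def run_and_report(cmd, on_failure):
    """Run cmd and call on_failure(returncode) if it exits non-zero.

    on_failure is a hypothetical stand-in for the real failure
    notification; this sketch does not know how Heat is signalled.
    """
    returncode = subprocess.call(cmd)
    if returncode != 0:
        on_failure(returncode)
    return returncode


def log_failure(returncode):
    """Example callback: only writes the exit code to stderr."""
    sys.stderr.write("setup step failed with exit code %d\n" % returncode)
```

Wrapping a second step such as ["systemctl", "is-active", "--quiet", "os-collect-config"] in the same way would also catch the case described here, where the service is enabled but crashes on start.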

Comment 4 Sylvain Baubeau 2016-10-11 14:28:31 UTC
Fixed by https://github.com/redhat-openstack/openshift-on-openstack/pull/270

Comment 5 Jan Provaznik 2016-10-13 06:33:37 UTC
Fixed in 0.9.2

Comment 7 Jan Provaznik 2016-10-14 07:13:23 UTC
The message above is a warning; according to the logs, os-collect-config is running fine on the machine. The stack probably failed for a different reason. Can you please check the Heat events and resources? They should give more details of the failure (here is a debugging howto: https://github.com/redhat-openstack/openshift-on-openstack/blob/master/README_debugging.adoc)

Comment 8 Wenkai Shi 2016-10-14 10:27:51 UTC
When using the openstack-9 repos with openshift-on-openstack-0.9.2, the installer keeps re-executing and stalls here until it times out, and the stack then fails.

Version-Release number of selected component:
openshift-on-openstack-0.9.2-1.el7.centos.noarch
os-collect-config-0.1.37-6.el7ost.noarch
python-oslo-log-3.2.0-1.el7ost.noarch

[root@weshi-neutron-bastion ~]# systemctl status os-collect-config
...
Oct 14 05:56:50 weshi-neutron-bastion.example.com os-collect-config[9821]: No local metadata found (['/var/lib/os-collect-config/local-data'])
Oct 14 05:56:50 weshi-neutron-bastion.example.com os-collect-config[9821]: No auth_url configured.
Oct 14 05:56:50 weshi-neutron-bastion.example.com os-collect-config[9821]: Traceback (most recent call last):
Oct 14 05:56:50 weshi-neutron-bastion.example.com os-collect-config[9821]: File "/usr/bin/os-refresh-config", line 6, in <module>
Oct 14 05:56:50 weshi-neutron-bastion.example.com os-collect-config[9821]: from os_refresh_config.os_refresh_config import main
Oct 14 05:56:50 weshi-neutron-bastion.example.com os-collect-config[9821]: File "/usr/lib/python2.7/site-packages/os_refresh_config/os_refresh_config.py", line 26, in <module>
Oct 14 05:56:50 weshi-neutron-bastion.example.com os-collect-config[9821]: import psutil
Oct 14 05:56:50 weshi-neutron-bastion.example.com os-collect-config[9821]: ImportError: No module named psutil
Oct 14 05:56:50 weshi-neutron-bastion.example.com os-collect-config[9821]: Command failed, will not cache new data. Command 'os-refresh-config' returned non-zero exit status 1
Oct 14 05:56:50 weshi-neutron-bastion.example.com os-collect-config[9821]: Sleeping 1.00 seconds before re-exec.
...

[root@weshi-neutron-bastion ~]# os-refresh-config
Traceback (most recent call last):
  File "/usr/bin/os-refresh-config", line 6, in <module>
    from os_refresh_config.os_refresh_config import main
  File "/usr/lib/python2.7/site-packages/os_refresh_config/os_refresh_config.py", line 26, in <module>
    import psutil
ImportError: No module named psutil


[root@weshi-neutron-bastion ~]# yum install python-psutil
[root@weshi-neutron-bastion ~]# os-refresh-config
[root@weshi-neutron-bastion ~]# sudo systemctl restart os-collect-config
[root@weshi-neutron-bastion ~]# sudo systemctl status os-collect-config
● os-collect-config.service - Collect metadata and run hook commands.
   Loaded: loaded (/usr/lib/systemd/system/os-collect-config.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2016-10-14 06:12:00 EDT; 12s ago
 Main PID: 6913 (os-collect-conf)
   CGroup: /system.slice/os-collect-config.service
           └─6913 /usr/bin/python2 /usr/bin/os-collect-config

Oct 14 06:12:02 weshi-neutron-bastion.example.com os-collect-config[6913]: No local metadata found (['/var/lib/os-collect-config/local-data'])
Oct 14 06:12:02 weshi-neutron-bastion.example.com os-collect-config[6913]: No auth_url configured.
Oct 14 06:12:04 weshi-neutron-bastion.example.com os-collect-config[6913]: Source [request] Unavailable.
Oct 14 06:12:04 weshi-neutron-bastion.example.com os-collect-config[6913]: /var/lib/os-collect-config/local-data not found. Skipping
Oct 14 06:12:04 weshi-neutron-bastion.example.com os-collect-config[6913]: No local metadata found (['/var/lib/os-collect-config/local-data'])
Oct 14 06:12:04 weshi-neutron-bastion.example.com os-collect-config[6913]: No auth_url configured.
Oct 14 06:12:08 weshi-neutron-bastion.example.com os-collect-config[6913]: Source [request] Unavailable.
Oct 14 06:12:08 weshi-neutron-bastion.example.com os-collect-config[6913]: /var/lib/os-collect-config/local-data not found. Skipping
Oct 14 06:12:08 weshi-neutron-bastion.example.com os-collect-config[6913]: No local metadata found (['/var/lib/os-collect-config/local-data'])
Oct 14 06:12:08 weshi-neutron-bastion.example.com os-collect-config[6913]: No auth_url configured.
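
Both failures in this bug (oslo_log in the description, psutil here) are Python modules missing at import time. A minimal sketch of a pre-flight check follows, assuming it runs under the same interpreter as os-collect-config. The function name missing_modules is hypothetical and not part of any package mentioned in this bug.

```python
import importlib


def missing_modules(names):
    """Return the names from the given list that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            # The module is absent (or one of its own imports is).
            missing.append(name)
    return missing
```

Calling missing_modules(["oslo_log", "psutil"]) on the bastion would list whichever of the two still needs its package (python-oslo-log or python-psutil, per the yum commands in this bug).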

Comment 9 Gan Huang 2016-10-17 06:20:36 UTC
Yeah, os-collect-config looks like it is running fine, but the config file "/etc/os-collect-config.conf" was empty. That is probably related.

Event-list
http://pastebin.test.redhat.com/421042

Resource-list
http://pastebin.test.redhat.com/421043
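
The empty-config observation above can be turned into a quick check. This is a sketch under the assumption that a missing file and a whitespace-only file should both count as "empty"; config_is_empty is a hypothetical helper name.

```python
import os


def config_is_empty(path):
    """Return True if path is missing or holds only whitespace."""
    if not os.path.isfile(path):
        return True
    with open(path, "rb") as handle:
        return handle.read().strip() == b""
```

For example, config_is_empty("/etc/os-collect-config.conf") would be expected to return True in the state described in this comment.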

Comment 10 Wenkai Shi 2016-11-01 04:48:29 UTC
Verified with openshift-on-openstack-0.9.5-1.el7.centos.noarch. The os-collect-config service is running well.

[root@weshi-osp9repo-bastion ~]# systemctl status os-collect-config
● os-collect-config.service - Collect metadata and run hook commands.
   Loaded: loaded (/usr/lib/systemd/system/os-collect-config.service; enabled; vendor preset: disabled)
   Active: active (running) since Mon 2016-10-31 23:30:52 EDT; 1h 13min ago
...

