Bug 1611994 - nova_compute container stuck in restarting state after instance-ha deployment
Summary: nova_compute container stuck in restarting state after instance-ha deployment
Keywords:
Status: CLOSED DUPLICATE of bug 1612088
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 13.0 (Queens)
Hardware: x86_64
OS: Linux
urgent
urgent
Target Milestone: ---
: ---
Assignee: Emilien Macchi
QA Contact: Gurenko Alex
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2018-08-03 08:14 UTC by MD Sufiyan
Modified: 2018-08-04 15:52 UTC (History)
3 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-08-04 15:52:00 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
nova_compute logs (97.33 KB, text/plain)
2018-08-03 08:14 UTC, MD Sufiyan
no flags Details


Links
System ID Private Priority Status Summary Last Updated
Launchpad 1785245 0 None None None 2018-08-04 15:33:32 UTC
OpenStack gerrit 588552 0 None MERGED Revert shebang change for InstanceHA startup script 2020-11-15 01:15:09 UTC
OpenStack gerrit 588880 0 None MERGED Revert shebang change for InstanceHA startup script 2020-11-15 01:15:09 UTC

Description MD Sufiyan 2018-08-03 08:14:36 UTC
Created attachment 1472906 [details]
nova_compute logs

Description of problem:

I have followed doc[1] for instance-ha deployment and deployed the overcloud successfully, however noticed nova_compute container stuck in restarting state due to which I'm not able to spawn any instance.

~~~
[root@compute-1 ~]# docker ps | grep -i nova_compute
580317c906ce        192.168.24.1:8787/rhosp13/openstack-nova-compute:2018-07-30.2                "kolla_start"       2 days ago          Restarting (127) 13 hours ago                       nova_compute
[root@compute-1 ~]# 
~~~


[1] https://docs.openstack.org/tripleo-docs/latest/install/advanced_deployment/instance_ha.html

Version-Release number of selected component (if applicable):
OSP13

How reproducible:
Everytime

Steps to Reproduce:
==================

1) Create Composite role for compute,controller & CephStorage and the below registry in roles.yaml for additional instance-ha configuration under compute role

~~
    - OS::TripleO::Services::ComputeInstanceHA
    - OS::TripleO::Services::PacemakerRemote
~~

2) Create fencing.yaml 

~~
openstack overcloud generate fencing -a reboot --ipmi-lanplus --ipmi-level administrator --output fencing.yml instackenv.json
~~

3) Create compute-instanceha.yaml for enbling instance-ha

~~
resource_registry:
  OS::TripleO::Services::ComputeInstanceHA: /usr/share/openstack-tripleo-heat-templates/puppet/services/pacemaker/compute-instanceha.yaml

parameter_defaults:
  EnableInstanceHA: true
~~

4) Create a overcloud deployment script 

~~
openstack overcloud deploy \
--timeout 100 \
--templates /usr/share/openstack-tripleo-heat-templates \
--stack overcloud \
--libvirt-type kvm \
--ntp-server clock.redhat.com \
-r /home/stack/virt/roles_data.yaml -e /home/stack/virt/compute-instanceha.yaml -e /home/stack/virt/fencing.yaml \
-e /home/stack/virt/internal.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
-e /home/stack/virt/network/network-environment.yaml \
-e /home/stack/virt/hostnames.yml \
-e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml \
-e /home/stack/virt/debug.yaml \
-e /home/stack/virt/nodes_data.yaml \
-e /home/stack/virt/docker-images.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/compute-instanceha.yaml \
--log-file overcloud_deployment_49.log
~~

5) overcloud deployment was successful, please refer the link for deployment result. However, nova_compute container remain stuck in restarting.

http://pastebin.test.redhat.com/626827

Expected result
===============

Nova_compute container should be up and remain healthy after deployment 

Actual result
=============

nova_compute container stuck in restarting due to which not able to spawn any instance

Additional info:-

[root@compute-1 ~]# docker logs nova_compute | tail
++ [[ ! -d /var/log/kolla/nova ]]
+++ stat -c %a /var/log/kolla/nova
++ [[ 2755 != \7\5\5 ]]
++ chmod 755 /var/log/kolla/nova
++ . /usr/local/bin/kolla_nova_extend_start
+++ [[ ! -d /var/lib/nova/instances ]]
+ echo 'Running command: '\''/var/lib/nova/instanceha/check-run-nova-compute '\'''
Running command: '/var/lib/nova/instanceha/check-run-nova-compute '
+ exec /var/lib/nova/instanceha/check-run-nova-compute
/usr/bin/env: python -utt: No such file or directory
[root@compute-1 ~]#

Comment 2 Michele Baldessari 2018-08-04 15:52:00 UTC

*** This bug has been marked as a duplicate of bug 1612088 ***


Note You need to log in before you can comment on or make changes to this bug.