Created attachment 1472906
nova_compute logs

Description of problem:

I followed doc [1] for an instance-ha deployment and deployed the overcloud successfully. However, the nova_compute container is stuck in a restarting state, so I am not able to spawn any instance.

~~~
[root@compute-1 ~]# docker ps | grep -i nova_compute
580317c906ce        192.168.24.1:8787/rhosp13/openstack-nova-compute:2018-07-30.2   "kolla_start"   2 days ago   Restarting (127) 13 hours ago   nova_compute
[root@compute-1 ~]#
~~~

[1] https://docs.openstack.org/tripleo-docs/latest/install/advanced_deployment/instance_ha.html

Version-Release number of selected component (if applicable):
OSP13

How reproducible:
Every time

Steps to Reproduce:
==================

1) Create composite roles for Compute, Controller & CephStorage and add the entries below to roles.yaml, under the Compute role, for the additional instance-ha configuration

~~~
- OS::TripleO::Services::ComputeInstanceHA
- OS::TripleO::Services::PacemakerRemote
~~~

2) Create fencing.yaml

~~~
openstack overcloud generate fencing -a reboot --ipmi-lanplus --ipmi-level administrator --output fencing.yml instackenv.json
~~~

3) Create compute-instanceha.yaml for enabling instance-ha

~~~
resource_registry:
  OS::TripleO::Services::ComputeInstanceHA: /usr/share/openstack-tripleo-heat-templates/puppet/services/pacemaker/compute-instanceha.yaml
parameter_defaults:
  EnableInstanceHA: true
~~~

4) Create an overcloud deployment script

~~~
openstack overcloud deploy \
--timeout 100 \
--templates /usr/share/openstack-tripleo-heat-templates \
--stack overcloud \
--libvirt-type kvm \
--ntp-server clock.redhat.com \
-r /home/stack/virt/roles_data.yaml \
-e /home/stack/virt/compute-instanceha.yaml \
-e /home/stack/virt/fencing.yaml \
-e /home/stack/virt/internal.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
-e /home/stack/virt/network/network-environment.yaml \
-e /home/stack/virt/hostnames.yml \
-e /usr/share/openstack-tripleo-heat-templates/environments/ceph-ansible/ceph-ansible.yaml \
-e /home/stack/virt/debug.yaml \
-e /home/stack/virt/nodes_data.yaml \
-e /home/stack/virt/docker-images.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/compute-instanceha.yaml \
--log-file overcloud_deployment_49.log
~~~

5) The overcloud deployment was successful; please refer to the link below for the deployment result. However, the nova_compute container remains stuck in restarting.

http://pastebin.test.redhat.com/626827

Expected result
===============
The nova_compute container should be up and remain healthy after deployment.

Actual result
=============
The nova_compute container is stuck in restarting, so no instance can be spawned.

Additional info:

[root@compute-1 ~]# docker logs nova_compute | tail
++ [[ ! -d /var/log/kolla/nova ]]
+++ stat -c %a /var/log/kolla/nova
++ [[ 2755 != \7\5\5 ]]
++ chmod 755 /var/log/kolla/nova
++ . /usr/local/bin/kolla_nova_extend_start
+++ [[ ! -d /var/lib/nova/instances ]]
+ echo 'Running command: '\''/var/lib/nova/instanceha/check-run-nova-compute '\'''
Running command: '/var/lib/nova/instanceha/check-run-nova-compute '
+ exec /var/lib/nova/instanceha/check-run-nova-compute
/usr/bin/env: python -utt: No such file or directory
[root@compute-1 ~]#
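
A note on the last log line, offered as a likely reading rather than a confirmed root cause: `/usr/bin/env: python -utt: No such file or directory` is what env prints when it is asked to run a program literally named "python -utt". That would happen if check-run-nova-compute begins with `#!/usr/bin/env python -utt` (an assumption inferred from the error; the script itself is not shown in this report), because Linux passes everything after the interpreter path on a `#!` line as one single argument. env exits with status 127 when it cannot find the command, which matches the `Restarting (127)` state in the docker ps output. The sketch below only mimics that splitting rule; `split_shebang` is a hypothetical helper name, not part of any tool.

```shell
#!/bin/sh
# Sketch: mimic how Linux splits a "#!" line into the interpreter path plus
# ONE optional argument (the remainder is not split again on spaces).
# split_shebang is a hypothetical helper used only for illustration.
split_shebang() {
    # ${1#??} drops the leading "#!"; read then takes the first word as the
    # interpreter and leaves the rest of the line in a single variable.
    read -r interp arg <<EOF
${1#??}
EOF
    printf 'interpreter=%s\nargument=%s\n' "$interp" "$arg"
}

# Assumed shebang, inferred from the error message in the log.
split_shebang '#!/usr/bin/env python -utt'
```

The output shows `interpreter=/usr/bin/env` and `argument=python -utt`, so env would search PATH for a single file called "python -utt" instead of running python with the `-utt` options.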
*** This bug has been marked as a duplicate of bug 1612088 ***