Bug 1612088

Summary: [OSP13] Instance HA is broken due to nova-compute container always restarting
Product: Red Hat OpenStack
Reporter: Michele Baldessari <michele>
Component: openstack-tripleo-heat-templates
Assignee: Michele Baldessari <michele>
Status: CLOSED ERRATA
QA Contact: pkomarov
Severity: urgent
Docs Contact:
Priority: urgent
Version: 13.0 (Queens)
CC: dciabrin, jschluet, mburns, msufiyan
Target Milestone: z2
Keywords: Triaged, ZStream
Target Release: 13.0 (Queens)
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version: openstack-tripleo-heat-templates-8.0.4-15.el7ost
Doc Type: No Doc Update
Doc Text: No doc needed as this was an interim regression that should not hit customers
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-08-29 16:39:25 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Michele Baldessari 2018-08-03 12:52:34 UTC
Description of problem:
Via change Id915ded03ae5a471ffa2dca13e2da90021279f63 we did the following:
--- a/extraconfig/tasks/instanceha/check-run-nova-compute
+++ b/extraconfig/tasks/instanceha/check-run-nova-compute
@@ -1,4 +1,4 @@
-#!/bin/python -utt
+#!/usr/bin/env python -utt

 import os
 import sys

The problem is that the above script gets invoked via 'exec' in kolla here:
https://github.com/openstack/tripleo-heat-templates/blob/master/docker/services/nova-compute.yaml#L161

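kolla's start.sh invokes the script in the container via 'exec'.
On Linux, everything after the interpreter path on a shebang line is passed as a single argument, so /usr/bin/env is asked to run a program literally named 'python -utt' and errors out as follows: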
+ echo 'Running command: '\''/var/lib/nova/instanceha/check-run-nova-compute '\'''
+ exec /var/lib/nova/instanceha/check-run-nova-compute
Running command: '/var/lib/nova/instanceha/check-run-nova-compute '
/usr/bin/env: python -utt: No such file or directory
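
To illustrate the failure, here is a minimal sketch of how Linux splits a shebang line into an interpreter and at most one argument. The helper names split_shebang and build_argv are hypothetical and only model the behaviour; they are not part of kolla or tripleo-heat-templates.

```python
def split_shebang(line):
    """Split a shebang line roughly the way Linux does.

    Returns (interpreter, argument), where argument is the whole
    remainder of the line as ONE string, or None if there is none.
    """
    if not line.startswith("#!"):
        raise ValueError("not a shebang line")
    rest = line[2:].strip()
    parts = rest.split(None, 1)  # split only once: interpreter, then the rest
    if not parts:
        raise ValueError("no interpreter given")
    interpreter = parts[0]
    argument = parts[1] if len(parts) > 1 else None
    return interpreter, argument


def build_argv(line, script_path):
    """Build the argument vector that the interpreter would receive."""
    interpreter, argument = split_shebang(line)
    argv = [interpreter]
    if argument is not None:
        argv.append(argument)  # never split further
    argv.append(script_path)
    return argv


script = "/var/lib/nova/instanceha/check-run-nova-compute"
old = build_argv("#!/bin/python -utt", script)
new = build_argv("#!/usr/bin/env python -utt", script)
```

With the old shebang, python receives '-utt' as an option. With the new one, env receives the single string 'python -utt' as the program name, which matches the "No such file or directory" error above.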

Comment 1 Michele Baldessari 2018-08-04 15:52:00 UTC
*** Bug 1611994 has been marked as a duplicate of this bug. ***

Comment 8 Jon Schlueter 2018-08-07 05:15:57 UTC
Patch fails repoclosure yum install /bin/python not available

Comment 9 Michele Baldessari 2018-08-07 06:00:56 UTC
(In reply to Jon Schlueter from comment #8)
> Patch fails repoclosure yum install /bin/python not available

Hi Jon,

can you clarify the above a bit? We simply reverted the broken bits to the tested state we had before (aka /bin/python). Also this code runs only inside the nova_compute container and things seem okay there?
| 37f0e3df-9235-4feb-9f84-4867e15eeef8 | compute-0    | ACTIVE | -          | Running     | ctlplane=192.168.24.12 |

(undercloud) [stack@undercloud-0 ~]$ ssh -l heat-admin 192.168.24.12 "sudo docker exec nova_compute sh -c 'ls -l /bin/python'"
Warning: Permanently added '192.168.24.12' (ECDSA) to the list of known hosts.
lrwxrwxrwx. 1 root root 7 Jul 14 14:23 /bin/python -> python2

What are we missing?

thanks,
Michele

Comment 10 Jon Schlueter 2018-08-07 14:18:50 UTC
from IRC discussions, recapping

yum install /bin/python ==> gives nothing
yum install /usr/bin/python ==> works

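Something in rpm building scans the packaged scripts for shebang lines and adds each interpreter path as a Requires on the built rpm, so a '#!/bin/python' shebang makes the package require /bin/python, which yum cannot resolve.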
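
As a rough sketch of the idea recapped above (not rpm's actual implementation), the following collects the interpreter paths named on shebang lines. The function name shebang_interpreters and the name-to-content input mapping are assumptions made for this example.

```python
def shebang_interpreters(scripts):
    """Return the sorted interpreter paths found on shebang lines.

    `scripts` maps a file name to its text content (hypothetical input
    shape chosen to keep the example self-contained).
    """
    found = set()
    for text in scripts.values():
        first_line = text.split("\n", 1)[0]
        if first_line.startswith("#!"):
            parts = first_line[2:].split()
            if parts:
                found.add(parts[0])  # only the interpreter path matters here
    return sorted(found)


requires = shebang_interpreters({
    "check-run-nova-compute": "#!/bin/python -utt\n\nimport os\nimport sys\n",
    "plain-data-file": "no shebang here\n",
})
```

Each path returned this way would end up as a dependency that repoclosure has to resolve, which is why /bin/python versus /usr/bin/python matters for the package even though both exist inside the container.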

Comment 16 errata-xmlrpc 2018-08-29 16:39:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2574