Bug 1873470 - Pre-upgrade validations fail because of missing python3 command in overcloud nodes
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: python-tripleoclient
Version: 16.1 (Train)
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: z3
Target Release: 16.1 (Train on RHEL 8.2)
Assignee: mathieu bultel
QA Contact: David Rosenfeld
URL:
Whiteboard:
Duplicates: 1894000
Depends On:
Blocks:
 
Reported: 2020-08-28 12:32 UTC by Takashi Kajinami
Modified: 2024-03-25 16:23 UTC (History)
CC: 19 users

Fixed In Version: python-tripleoclient-12.3.2-1.20200914164930.el8ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-12-15 18:36:32 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
OpenStack gerrit 749735 0 None MERGED Make python interpreter option for ansible validation run 2021-02-19 14:20:13 UTC
Red Hat Product Errata RHEA-2020:5413 0 None None None 2020-12-15 18:37:03 UTC

Description Takashi Kajinami 2020-08-28 12:32:43 UTC
Description of problem:

The following failures are observed during the pre-upgrade validation[1] prior to the upgrade of overcloud nodes from 13 to 16.1.
 [1] https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.1/html-single/framework_for_upgrades_13_to_16.1/index#validating-the-pre-upgrade-requirements

~~~
(undercloud) [stack@undercloud-0 ~]$ openstack tripleo validator run --group pre-upgrade
...
+--------------------------------------+-------------------------------------+--------+-----------------------+----------------------------------------------------------------------------+---------------------+-------------+
| UUID                                 | Validations                         | Status | Host Group(s)         | Status by Host                                                             | Unreachable Host(s) | Duration    |
+--------------------------------------+-------------------------------------+--------+-----------------------+----------------------------------------------------------------------------+---------------------+-------------+
| 525400df-30c2-0b1c-bbbc-00000000000b | openstack-endpoints                 | PASSED | undercloud            | undercloud                                                                 |                     | 0:00:04.331 |
| 525400df-30c2-26eb-509b-00000000000b | image-serve                         | PASSED | undercloud            | undercloud                                                                 |                     | 0:00:02.268 |
| 525400df-30c2-3d3e-7a30-00000000000b | service-status                      | FAILED | undercloud, overcloud | compute-0, compute-1, controller-0, controller-1, controller-2, undercloud |                     | 0:00:00.672 |
| 525400df-30c2-6fb6-9d16-00000000000b | containerized-undercloud-docker     | PASSED | undercloud            | undercloud                                                                 |                     | 0:00:00.798 |
| 525400df-30c2-78ff-0380-00000000000b | container-status                    | FAILED | undercloud, overcloud | compute-0, compute-1, controller-0, controller-1, controller-2, undercloud |                     | 0:00:02.809 |
| 525400df-30c2-7a76-0b50-00000000000b | undercloud-disk-space-pre-upgrade   | PASSED | undercloud            | undercloud                                                                 |                     | 0:00:02.296 |
| 525400df-30c2-7db9-49f5-00000000000b | ironic-boot-configuration           | PASSED | undercloud            | undercloud                                                                 |                     | 0:00:01.341 |
| 525400df-30c2-8b5b-c301-00000000000b | undercloud-heat-purge-deleted       | PASSED | undercloud            | undercloud                                                                 |                     | 0:00:01.889 |
| 525400df-30c2-a0c8-37dc-00000000000b | collect-flavors-and-verify-profiles | FAILED | undercloud            | undercloud                                                                 |                     | 0:00:02.143 |
| 525400df-30c2-b855-8a1d-00000000000b | check-ftype                         | FAILED | undercloud, overcloud | compute-0, compute-1, controller-0, controller-1, controller-2, undercloud |                     | 0:00:00.683 |
| 525400df-30c2-b956-c41c-00000000000b | undercloud-ram                      | PASSED | undercloud            | undercloud                                                                 |                     | 0:00:01.967 |
| 525400df-30c2-bef4-d515-00000000000b | undercloud-service-status           | PASSED | undercloud            | undercloud                                                                 |                     | 0:00:01.817 |
| 525400df-30c2-cf23-c1d9-00000000000b | repos                               | FAILED | undercloud, overcloud | compute-0, compute-1, controller-0, controller-1, controller-2, undercloud |                     | 0:00:08.172 |
| 525400df-30c2-d264-5e62-00000000000b | check-latest-packages-version       | PASSED | undercloud            | undercloud                                                                 |                     | 0:01:47.883 |
| 525400df-30c2-df65-a431-00000000000b | nova-status                         | FAILED | nova_api              | controller-0, controller-1, controller-2                                   |                     | 0:00:00.601 |
| 525400df-30c2-e1f7-a587-00000000000b | validate-selinux                    | FAILED | all                   | compute-0, compute-1, controller-0, controller-1, controller-2, undercloud |                     | 0:00:02.700 |
| 525400df-30c2-e264-5510-00000000000b | node-health                         | PASSED | undercloud            | undercloud                                                                 |                     | 0:00:03.364 |
| 525400df-30c2-e8ae-a513-00000000000b | stack-health                        | PASSED | undercloud            | undercloud                                                                 |                     | 0:00:02.420 |
+--------------------------------------+-------------------------------------+--------+-----------------------+----------------------------------------------------------------------------+---------------------+-------------+
~~~

Among these failures, all of the failures on overcloud nodes were caused
by the missing python3 command.

~~~
(undercloud) [stack@undercloud-0 ~]$ openstack tripleo validator show run 525400df-30c2-3d3e-7a30-00000000000b
{
    "task": {
        "hosts": {
            "compute-0": {
                "_ansible_no_log": false,
                "action": "command",
                "changed": false,
                "failed": true,
                "module_stderr": "Shared connection to 192.168.24.37 closed.\r\n",
                "module_stdout": "/bin/sh: /usr/bin/python3: No such file or directory\r\n",
                "msg": "The module failed to execute correctly, you probably need to set the interpreter.\nSee stdout/stderr for the exact error",
                "rc": 127
            }
        },
        "name": "get failed systemd units",
        "status": "FAILED"
    }
}
...
~~~

These errors are understandable, given that the overcloud nodes still have OSP13 installed
and do not require python3.
We need some handling in tripleo-validations or in the documentation to avoid these false errors.

Version-Release number of selected component (if applicable):
RHOSP13z12
~~~
ansible-tripleo-ipsec-8.1.1-0.20190513184007.7eb892c.el7ost.noarch
openstack-tripleo-common-8.7.1-20.el7ost.noarch
openstack-tripleo-common-containers-8.7.1-20.el7ost.noarch
openstack-tripleo-heat-templates-8.4.1-58.1.el7ost.noarch
openstack-tripleo-image-elements-8.0.3-1.el7ost.noarch
openstack-tripleo-puppet-elements-8.1.1-2.el7ost.noarch
openstack-tripleo-ui-8.3.2-3.el7ost.noarch
openstack-tripleo-validations-8.5.0-4.el7ost.noarch
puppet-tripleo-8.5.1-14.el7ost.noarch
python-tripleoclient-9.3.1-7.el7ost.noarch
~~~



How reproducible:
Always

Steps to Reproduce:
1. Run validation according to the documentation[1]

Actual results:
The pre-upgrade validation reports failures because of the missing python3 command.

Expected results:
The pre-upgrade validation reports no failures caused by missing python3.

Additional info:

Comment 1 Jose Luis Franco 2020-09-01 05:28:19 UTC
Moving this BZ back to DFG:DF, as this is a pure Validations Framework issue. My guess is that having the undercloud on RHEL 8 with OSP16.1 (python3) and the overcloud nodes on RHEL 7 with OSP13 (no python3) causes the issue. The Framework will probably need to set ansible_python_interpreter to /usr/libexec/platform-python (which is present on both RHEL 7 and RHEL 8) or add some logic to detect the right python binary on the target system:
https://github.com/redhat-openstack/infrared/blob/c2f6cb0b793c12a5f072ef5c2f29dc98e3ff0aeb/plugins/tripleo-undercloud/update_inventory.yml#L28-L45

Something like what is done there: it relies on the raw module (which does not use Python underneath) to detect the binary on the target system and then sets it up.

Comment 2 Cédric Jeanneret 2020-09-01 06:28:14 UTC
I have to check, but IIRC the tripleo-ansible-inventory script takes some options, among them the python interpreter. Maybe we can tweak it a bit.

Comment 3 Jose Luis Franco 2020-09-01 14:03:51 UTC
I can see that the OSP16.1 undercloud has Ansible 2.9, so maybe it is just a matter of changing these Ansible options: https://docs.ansible.com/ansible/latest/reference_appendices/interpreter_discovery.html
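
For reference, those interpreter-discovery settings can be pinned without touching any playbook. A minimal sketch, assuming Ansible 2.8+ (which provides the `interpreter_python` option and its `ANSIBLE_PYTHON_INTERPRETER` environment variable); the ansible.cfg location here is illustrative:

```shell
# Pin the interpreter for every host via ansible.cfg (written to the
# current directory purely for illustration):
cat >> ansible.cfg <<'EOF'
[defaults]
interpreter_python = /usr/libexec/platform-python
EOF

# Equivalent one-off override through the environment:
ANSIBLE_PYTHON_INTERPRETER=/usr/libexec/platform-python
export ANSIBLE_PYTHON_INTERPRETER
```

Note that `interpreter_python` also accepts `auto` discovery modes that probe the target themselves.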

Comment 4 Jose Luis Franco 2020-09-04 06:49:11 UTC
Running the validation with Mathieu's patch worked:

openstack tripleo validator run --debug --plan qe-Cloud-0 --validation check-rhsm-version --python-interpreter /usr/libexec/platform-python

(undercloud) [stack@undercloud-0 ~]$ openstack tripleo validator show run 5254007e-7d72-bcbc-d185-00000000000b                                                              
{
    "task": {
        "hosts": {
            "compute-0": {
                "_ansible_no_log": false,
                "action": "fail",
                "changed": false,
                "failed": true,
                "msg": "8.2 does not match configured rhsm_version Release not set"
            }
        },
        "name": "Check RHSM version",
        "status": "FAILED"
    }
}
{
    "task": {
        "hosts": {
            "compute-1": {
                "_ansible_no_log": false,
                "action": "fail",
                "changed": false,
                "failed": true,
                "msg": "8.2 does not match configured rhsm_version Release not set"
            }
        },
        "name": "Check RHSM version",
        "status": "FAILED"
    }
}
{                                                                                                                                                                    
    "task": {
        "hosts": {
            "controller-0": {
                "_ansible_no_log": false,                                                                                                                                   
                "action": "fail",
                "changed": false,                                                                                                                                           
                "failed": true,
                "msg": "8.2 does not match configured rhsm_version Release not set"
            }
        },
        "name": "Check RHSM version",
        "status": "FAILED"
    }
}
{
    "task": {
        "hosts": {
            "controller-1": {
                "_ansible_no_log": false,
                "action": "fail",
                "changed": false,
                "failed": true,
                "msg": "8.2 does not match configured rhsm_version Release not set"
            }
        },
        "name": "Check RHSM version",
        "status": "FAILED"
    }
}
{
    "task": {
        "hosts": {
            "controller-2": {
                "_ansible_no_log": false,
                "action": "fail",
                "changed": false,
                "failed": true,
                "msg": "8.2 does not match configured rhsm_version Release not set"
            }
        },
        "name": "Check RHSM version",
        "status": "FAILED"
    }
}


Whereas, if I run it without the parameter, I get:

(undercloud) [stack@undercloud-0 ~]$ openstack tripleo validator show run 5254007e-7d72-e718-2945-00000000000b                                                               
{                                                                                                                                                                            
    "task": {                                                                                                                                                                
        "hosts": {                                                                                                                                                           
            "compute-0": {                                                                                                                                                   
                "_ansible_no_log": false,                                                                                                                                    
                "action": "command",                                                                                                                                         
                "changed": false,                                                                                                                                            
                "failed": true,                                                                                                                                              
                "module_stderr": "Shared connection to 192.168.24.51 closed.\r\n",                                                                                           
                "module_stdout": "/bin/sh: /usr/bin/python3: No such file or directory\r\n",                                                                                 
                "msg": "The module failed to execute correctly, you probably need to set the interpreter.\nSee stdout/stderr for the exact error",                           
                "rc": 127                                                                                                                                                    
            }                                                                                                                                                                
        },                                                                                                                                                                   
        "name": "Retrieve RHSM version",                                                                                                                                     
        "status": "FAILED"                                                                                                                                                   
    }
}
{
    "task": {
        "hosts": {
            "compute-1": {
                "_ansible_no_log": false,
                "action": "command",
                "changed": false,
                "failed": true,
                "module_stderr": "Shared connection to 192.168.24.38 closed.\r\n",
                "module_stdout": "/bin/sh: /usr/bin/python3: No such file or directory\r\n",
                "msg": "The module failed to execute correctly, you probably need to set the interpreter.\nSee stdout/stderr for the exact error",
                "rc": 127
            }
        },
        "name": "Retrieve RHSM version",
        "status": "FAILED"
    }
}
{                                                                                                                                                                  
    "task": {
        "hosts": {
            "controller-0": {
                "_ansible_no_log": false,
                "action": "command",
                "changed": false,
                "failed": true,
                "module_stderr": "Shared connection to 192.168.24.16 closed.\r\n",
                "module_stdout": "/bin/sh: /usr/bin/python3: No such file or directory\r\n",
                "msg": "The module failed to execute correctly, you probably need to set the interpreter.\nSee stdout/stderr for the exact error",
                "rc": 127
            }
        },
        "name": "Retrieve RHSM version",
        "status": "FAILED"
    }
}
{
    "task": {
        "hosts": {
            "controller-1": {
                "_ansible_no_log": false,
                "action": "command",
                "changed": false,
                "failed": true,
                "module_stderr": "Shared connection to 192.168.24.6 closed.\r\n",
                "module_stdout": "/bin/sh: /usr/bin/python3: No such file or directory\r\n",
                "msg": "The module failed to execute correctly, you probably need to set the interpreter.\nSee stdout/stderr for the exact error",
                "rc": 127
            }
        },
        "name": "Retrieve RHSM version",
        "status": "FAILED"
    }
}
{
    "task": {
        "hosts": {
            "controller-2": {
                "_ansible_no_log": false,
                "action": "command",
                "changed": false,
                "failed": true,
                "module_stderr": "Shared connection to 192.168.24.14 closed.\r\n",
                "module_stdout": "/bin/sh: /usr/bin/python3: No such file or directory\r\n",
                "msg": "The module failed to execute correctly, you probably need to set the interpreter.\nSee stdout/stderr for the exact error",
                "rc": 127
            }
        },
        "name": "Retrieve RHSM version",
        "status": "FAILED"
    }
}
sys:1: ResourceWarning: unclosed <ssl.SSLSocket fd=4, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=6, laddr=('192.168.24.2', 37782), raddr=('192.168.24.2', 13000)>


The only complaint is having to pass an extra parameter for every validation run in this type of situation (different RHEL versions between undercloud and overcloud nodes). It would be nicer if the code automatically realized that it has to use /usr/libexec/platform-python.

Comment 7 Jose Luis Franco 2020-11-03 15:20:41 UTC
*** Bug 1894000 has been marked as a duplicate of this bug. ***

Comment 21 errata-xmlrpc 2020-12-15 18:36:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 16.1.3 bug fix and enhancement advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2020:5413

Comment 22 Takashi Kajinami 2020-12-17 03:53:53 UTC
I think we should also update the documentation to use the new option.
I opened another bug for the documentation update:
 https://bugzilla.redhat.com/show_bug.cgi?id=1908569

