Bug 1758024
Summary: | Long running Ansible tasks timeout and abort for RHV-H hosts with STIG/Security Profiles applied | ||
---|---|---|---|
Product: | Red Hat Enterprise Virtualization Manager | Reporter: | John Call <jcall> |
Component: | ovirt-engine | Assignee: | Dana <delfassy> |
Status: | CLOSED ERRATA | QA Contact: | Qin Yuan <qiyuan> |
Severity: | medium | Docs Contact: | |
Priority: | medium | ||
Version: | unspecified | CC: | mavital, mperina, mtessun |
Target Milestone: | ovirt-4.4.2 | Keywords: | ZStream |
Target Release: | 4.4.2 | ||
Hardware: | Unspecified | ||
OS: | Unspecified | ||
Whiteboard: | |||
Fixed In Version: | rhv-4.4.2-2 ovirt-engine-4.4.2.1 | Doc Type: | No Doc Update |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2020-09-23 16:11:04 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | Infra | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
John Call
2019-10-03 03:49:02 UTC
Here are the RHVM (ovirt-engine) logs that show the disconnect after 5 minutes waiting for yum to complete (which takes ~8 minutes on these hosts), followed by the `yum history` from the affected host. [root@rhvm host-deploy]# cat ovirt-host-mgmt-ansible-20190930160816-rhv1.cluster.net-bddbd03f-3a7f-4e6c-b79d-d4651f411d0f.log [[[[ output from successful tasks omitted]]]] 2019-09-30 16:08:16,580 p=29853 u=ovirt | Using /usr/share/ovirt-engine/playbooks/ansible.cfg as config file 2019-09-30 16:08:17,459 passlib.registry registered 'md5_crypt' handler: <class 'passlib.handlers.md5_crypt.md5_crypt'> 2019-09-30 16:08:17,466 p=29853 u=ovirt | PLAY [all] ********************************************************************* 2019-09-30 16:08:17,476 p=29853 u=ovirt | TASK [Gathering Facts] ********************************************************* 2019-09-30 16:08:22,727 p=29853 u=ovirt | ok: [rhv1.cluster.net] 2019-09-30 16:08:22,763 p=29853 u=ovirt | TASK [Check host certificate] ************************************************** 2019-09-30 16:08:25,915 p=29853 u=ovirt | changed: [rhv1.cluster.net] => (item={u'ca': u'/etc/pki/vdsm/certs/cacert.pem', u'cert': u'/etc/pki/vdsm/certs/vdsmcert.pem'}) => { 2019-09-30 16:08:28,919 p=29853 u=ovirt | changed: [rhv1.cluster.net] => (item={u'ca': u'/etc/pki/libvirt/../CA/cacert.pem', u'cert': u'/etc/pki/libvirt/clientcert.pem'}) => { 2019-09-30 16:08:31,950 p=29853 u=ovirt | changed: [rhv1.cluster.net] => (item={u'ca': u'/etc/pki/vdsm/libvirt-spice/ca-cert.pem', u'cert': u'/etc/pki/vdsm/libvirt-spice/server-cert.pem'}) => { 2019-09-30 16:08:31,990 p=29853 u=ovirt | TASK [Set enrollment variable] ************************************************* 2019-09-30 16:08:32,009 p=29853 u=ovirt | ok: [rhv1.cluster.net] => { 2019-09-30 16:08:32,045 p=29853 u=ovirt | TASK [ovirt-host-enroll-certificate : include_tasks] *************************** 2019-09-30 16:08:32,058 p=29853 u=ovirt | skipping: [rhv1.cluster.net] => (item=loadconfig) => { 2019-09-30 16:08:32,062 p=29853 u=ovirt | skipping: [rhv1.cluster.net] => (item=vdsm) => { 2019-09-30 16:08:32,065 p=29853 u=ovirt | skipping: [rhv1.cluster.net] => (item=vmconsole) => { 2019-09-30 16:08:32,100 p=29853 u=ovirt | TASK [ovirt-host-upgrade : Install ovirt-host package if it isn't installed] *** 2019-09-30 16:08:45,971 p=29853 u=ovirt | ok: [rhv1.cluster.net] => { "changed": false, "changes": { "installed": [], "updated": [] }, "obsoletes": { "redhat-virtualization-host-image-update-placeholder": { "dist": "noarch", "repo": "installed", "version": "4.3.5-3.el7ev" } }, "rc": 0, "results": [ "All packages providing ovirt-host are up to date", "" ] } 2019-09-30 16:08:46,004 p=29853 u=ovirt | TASK [ovirt-host-upgrade : Update system] ************************************** 2019-09-30 16:13:48,011 p=29853 u=ovirt | fatal: [rhv1.cluster.net]: UNREACHABLE! => { "changed": false, "unreachable": true } MSG: Failed to connect to the host via ssh: FIPS mode initialized Shared connection to rhv1.cluster.net closed. 2019-09-30 16:13:48,012 p=29853 u=ovirt | PLAY RECAP ********************************************************************* 2019-09-30 16:13:48,012 p=29853 u=ovirt | rhv1.cluster.net : ok=4 changed=1 unreachable=1 failed=0 skipped=1 rescued=0 ignored=0 [root@rhvm host-deploy]# And here are the yum history info logs from the host where the update failed/aborted [root@rhv1 ~]# yum history Loaded plugins: enabled_repos_upload, imgbased-persist, package_upload, product-id, search-disabled-repos, subscription-manager, : vdsmupgrade, versionlock ID | Login user | Date and time | Action(s) | Altered ------------------------------------------------------------------------------- 4 | root <root> | 2019-09-30 16:09 | Obsoleting | 2 ** 3 | root <root> | 2019-09-30 15:02 | Erase | 1 2 | root <root> | 2019-09-30 13:47 | Install | 1 1 | System <unset> | 2019-09-30 11:59 | Install | 11 history list Uploading Enabled Repositories Report Loaded plugins: product-id, subscription-manager, versionlock Unable to upload Enabled Repositories Report [root@rhv1 ~]# yum history info 4 Loaded plugins: enabled_repos_upload, imgbased-persist, package_upload, product-id, search-disabled-repos, subscription-manager, : vdsmupgrade, versionlock Transaction ID : 4 Begin time : Mon Sep 30 16:09:40 2019 Begin rpmdb : 899:25e583f565e071183b4b6d2b29f8489f6e5463d1 User : root <root> Return-Code : ** Aborted ** ######################## ABORTED ######################### Command Line : -d 2 -y update Transaction performed with: Installed rpm-4.11.3-40.el7.x86_64 installed Installed subscription-manager-1.24.13-1.el7.x86_64 installed Installed yum-3.4.3-163.el7.noarch installed Installed yum-plugin-versionlock-1.1.31-52.el7.noarch installed Packages Altered: ** Obsoleting redhat-virtualization-host-image-update-4.3.5-20190920.0.el7_7.noarch @rhel-7-server-rhvh-4-rpms Excluding 5 updates due to versionlock (use "yum versionlock status" to show them) ** Obsoleted redhat-virtualization-host-image-update-placeholder-4.3.5-3.el7ev.noarch ? Scriptlet output: 1 warning: %post(redhat-virtualization-host-image-update-4.3.5-20190920.0.el7_7.noarch) scriptlet failed, signal 1 history info Uploading Enabled Repositories Report Loaded plugins: product-id, subscription-manager, versionlock Unable to upload Enabled Repositories Report [root@rhv1 ~]# Verified with: ovirt-engine-4.4.2.2-0.3.el8ev.noarch Steps: 1. Test with VPP security profile selected (Use RHVH 4.3 in this test, because RHVH 4.4 can't be added to engine when VPP is selected, see bug 1833254) 1) install redhat-virtualization-host-4.3.10-20200513.0.el7_8 host, select VPP security profile during installation 2) create 4.3 data center, cluster on engine, and add rhvh 4.3 host to engine 3) create a repo providing redhat-virtualization-host-4.3.10-20200615.0.el7_8 update 4) upgrade rhvh 4.3 host on engine 2. Test with /etc/ssh/sshd_config configured 1) install redhat-virtualization-host-4.4.1-20200722.0.el8_2 host 2) add rhvh 4.4 host to engine 3) modify /etc/ssh/sshd_config and restart sshd.service on host: ClientAliveInterval 300 ClientAliveCountMax 0 4) create a repo providing redhat-virtualization-host-4.4.2-20200812.3.el8_2 update 5) upgrade rhvh 4.4 on engine Results: 1. When VPP security profile was selected during host installation, upgrade packages task succeeded: 2020-08-19 10:45:33 IDT - TASK [ovirt-host-upgrade : Upgrade packages] *********************************** 2020-08-19 10:45:33 IDT - 2020-08-19 10:45:33 IDT - { "status" : "OK", "msg" : "", "data" : { "event_data" : { "playbook" : "ovirt-host-upgrade.yml", "playbook_uuid" : "f22e9551-699e-43ec-a56e-fb296996f8e2", "play" : "all", "play_uuid" : "001a4a16-1064-9792-4690-000000000009", "play_pattern" : "all", "task" : "Upgrade packages", "task_uuid" : "001a4a16-1064-9792-4690-000000000034", "task_action" : "yum", "task_args" : "", "task_path" : "/usr/share/ovirt-engine/ansible-runner-service-project/project/roles/ovirt-host-upgrade/tasks/main.yml:30", "role" : "ovirt-host-upgrade", "host" : "10.73.131.97", "remote_addr" : "10.73.131.97", "res" : { "results" : [ { "changed" : true, "results" : [ "Installed:\n redhat-virtualization-host-image-update.noarch 0:4.3.10-20200615.0.el7_8 \n\nReplaced:\n redhat-virtualization-host-image-update-placeholder.noarch 0:4.3.10-1.el7ev \n\nComplete!\nUploading Enabled Repositories Report\n" ], "start" : "2020-08-19T07:34:50.825566", "end" : "2020-08-19T07:45:32.197702", 2. When /etc/ssh/sshd_config was configured, upgrade packages task succeeded: 2020-08-19 16:37:29 IDT - TASK [ovirt-host-upgrade : Upgrade packages] *********************************** 2020-08-19 16:46:25 IDT - { "status" : "OK", "data" : { "event_data" : { "playbook" : "ovirt-host-upgrade.yml", "playbook_uuid" : "44f35c32-1975-4e6b-b27d-be87eabc0936", "play" : "all", "play_uuid" : "001a4a16-1064-115d-9337-00000000000a", "play_pattern" : "all", "task" : "Upgrade packages", "task_uuid" : "001a4a16-1064-115d-9337-000000000035", "task_action" : "yum", "task_args" : "", "task_path" : "/usr/share/ovirt-engine/ansible-runner-service-project/project/roles/ovirt-host-upgrade/tasks/main.yml:30", "role" : "ovirt-host-upgrade", "host" : "10.73.73.13", "remote_addr" : "10.73.73.13", "res" : { "results" : [ { "msg" : "", "changed" : true, "results" : [ "Installed: redhat-virtualization-host-image-update-4.4.2-20200812.3.el8_2.noarch", "Removed: redhat-virtualization-host-image-update-placeholder-4.4.1-1.el8ev.noarch" ], "start" : "2020-08-19T13:37:29.373697", "end" : "2020-08-19T13:46:24.113720", "duration" : 534.740023, Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: Red Hat Virtualization security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2020:3807 |