Bug 1528974

Summary: host-upgrade fails after upgrading from ovirt 4.1
Product: [oVirt] ovirt-engine
Reporter: Arik <ahadas>
Component: General
Assignee: Ondra Machacek <omachace>
Status: CLOSED DUPLICATE
QA Contact: Pavel Stehlik <pstehlik>
Severity: medium
Docs Contact:
Priority: unspecified
Version: 4.2.0.2
CC: ahadas, bugs, contact, jas, omachace, petr.istenik
Target Milestone: ovirt-4.2.1
Flags: rule-engine: ovirt-4.2+
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-01-09 15:38:49 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Infra
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Arik 2017-12-25 13:08:38 UTC
The "host update" function fails. All the hosts have been updated to the latest packages with yum and rebooted, but the host update check still fails.

This is what I see:

2017-12-23 19:11:36,479-05 INFO [org.ovirt.engine.core.bll.hostdeploy.HostUpgradeCheckCommand]
(default task-156)
[ae11a704-3b40-45d3-9850-932f6ed91ed9] Running command:
HostUpgradeCheckCommand internal: false. Entities affected :  ID:
45f8b331-842e-48e7-9df8-56adddb93836 Type: VDSAction group
EDIT_HOST_CONFIGURATION with role type ADMIN
2017-12-23 19:11:36,496-05 INFO [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(default task-156) [] EVENT_ID:
HOST_AVAILABLE_UPDATES_STARTED(884), Started to check for available
updates on host virt1.
2017-12-23 19:11:36,500-05 INFO [org.ovirt.engine.core.bll.hostdeploy.HostUpgradeCheckInternalCommand]
(EE-ManagedThreadFactory-commandCoordinator-Thread-7)
[ae11a704-3b40-45d3-9850-932f6ed91ed9] Running command:
HostUpgradeCheckInternalCommand internal: true. Entities affected : ID:
45f8b331-842e-48e7-9df8-56adddb93836 Type: VDS
2017-12-23 19:11:36,504-05 INFO [org.ovirt.engine.core.common.utils.ansible.AnsibleExecutor]
(EE-ManagedThreadFactory-commandCoordinator-Thread-7)
[ae11a704-3b40-45d3-9850-932f6ed91ed9] Executing Ansible command:
ANSIBLE_STDOUT_CALLBACK=hostupgradeplugin [/usr/bin/ansible-playbook,
--check, --private-key=/etc/pki/ovirt-engine/keys/engine_id_rsa,
--inventory=/tmp/ansible-inventory1039100972039373314,
/usr/share/ovirt-engine/playbooks/ovirt-host-upgrade.yml] [Logfile: null]
2017-12-23 19:11:37,897-05 INFO [org.ovirt.engine.core.common.utils.ansible.AnsibleExecutor]
(EE-ManagedThreadFactory-commandCoordinator-Thread-7)
[ae11a704-3b40-45d3-9850-932f6ed91ed9] Ansible playbook command has
exited with value: 4
2017-12-23 19:11:37,897-05 ERROR [org.ovirt.engine.core.bll.host.HostUpgradeManager]
(EE-ManagedThreadFactory-commandCoordinator-Thread-7)
[ae11a704-3b40-45d3-9850-932f6ed91ed9] Failed to run check-update of host
'virt1-mgmt'.
2017-12-23 19:11:37,897-05 ERROR [org.ovirt.engine.core.bll.hostdeploy.HostUpdatesChecker]
(EE-ManagedThreadFactory-commandCoordinator-Thread-7)
[ae11a704-3b40-45d3-9850-932f6ed91ed9] Failed to check if updates are
available for host 'virt1' with error message 'Failed to run check-update
of host 'virt1-mgmt'.'
2017-12-23 19:11:37,904-05 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
(EE-ManagedThreadFactory-commandCoordinator-Thread-7)
[ae11a704-3b40-45d3-9850-932f6ed91ed9] EVENT_ID:
HOST_AVAILABLE_UPDATES_FAILED(839), Failed to check for available updates
on host virt1 with message 'Failed to run check-update of host
'virt1-mgmt'.'.

The logs are available at bz 1528868.
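
For readers triaging the log above: the engine only reports that ansible-playbook "exited with value: 4". The sketch below maps that number to a meaning. The table reflects the RUN_* constants in Ansible's TaskQueueManager as commonly documented; treat the values as an assumption to verify against the Ansible version installed on the engine. The function name describe_exit_code is hypothetical and is not part of ovirt-engine or Ansible.

```python
# Meanings of ansible-playbook exit codes (assumed from Ansible's RUN_* constants;
# verify against the Ansible release actually installed on the engine).
ANSIBLE_EXIT_CODES = {
    0: "ok",
    1: "error",
    2: "one or more hosts failed",
    4: "one or more hosts unreachable",
    8: "play aborted (failed break play)",
    255: "unknown error",
}


def describe_exit_code(code: int) -> str:
    """Return a human-readable meaning for an ansible-playbook exit code."""
    return ANSIBLE_EXIT_CODES.get(code, f"unrecognized exit code {code}")


if __name__ == "__main__":
    # The value reported by AnsibleExecutor in the log above.
    print(describe_exit_code(4))
```

If that mapping holds, exit value 4 points at a connection-level problem (the engine could not reach the host over SSH) rather than at a task failing inside the playbook.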

Comment 1 Yaniv Kaul 2017-12-26 07:39:21 UTC
Can we get the Ansible logs as well?

Comment 2 Arik 2017-12-26 08:05:40 UTC
(In reply to Yaniv Kaul from comment #1)
> Can we get the Ansible logs as well?

IIUC Ansible logs are disabled in this flow [1]. I don't know whether that's because of the high frequency of this operation (checking for upgrades) or because it simply has not yet been adjusted to generate logs and report stdout to the engine simultaneously [2].

[1] https://github.com/oVirt/ovirt-engine/blob/master/backend/manager/modules/bll/src/main/java/org/ovirt/engine/core/bll/host/HostUpgradeManager.java#L49
[2] https://gerrit.ovirt.org/#/c/85099/

Comment 3 jas 2017-12-29 19:26:25 UTC
If there is a way for me to enable the logs and run the update so that you can get further details on the problem, let me know.

Comment 4 Ondra Machacek 2017-12-30 13:48:05 UTC
I think it's the same issue as in: https://bugzilla.redhat.com/show_bug.cgi?id=1529851#c5

We need to figure out why it's happening. Can you please try to execute the following command:

/usr/bin/ansible-playbook
--check --private-key=/etc/pki/ovirt-engine/keys/engine_id_rsa
--inventory=IP_OF_YOUR_HOST,
/usr/share/ovirt-engine/playbooks/ovirt-host-upgrade.yml


Please note that the comma after IP_OF_YOUR_HOST is required; it's not a typo.

Can you please share the output of that command and also the content of this file: /usr/share/ovirt-engine/playbooks/ovirt-host-upgrade.yml ?
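
A minimal sketch of how the command above can be assembled programmatically, mainly to show why the trailing comma matters: with the comma, Ansible treats the --inventory value as an inline host list; without it, the value is interpreted as an inventory source such as a file path. The helper name build_check_command is hypothetical; the paths and flags are the ones quoted in this comment.

```python
import shlex
from typing import List

PLAYBOOK = "/usr/share/ovirt-engine/playbooks/ovirt-host-upgrade.yml"
PRIVATE_KEY = "/etc/pki/ovirt-engine/keys/engine_id_rsa"


def build_check_command(host: str) -> List[str]:
    """Return the argv for a dry run (--check) of the host-upgrade playbook."""
    # Ensure exactly one trailing comma so the value is read as an inline
    # host list rather than as a path to an inventory file.
    inventory = host.rstrip(",") + ","
    return [
        "/usr/bin/ansible-playbook",
        "--check",
        f"--private-key={PRIVATE_KEY}",
        f"--inventory={inventory}",
        PLAYBOOK,
    ]


if __name__ == "__main__":
    # Print the command in copy-pasteable form; replace the placeholder first.
    print(shlex.join(build_check_command("IP_OF_YOUR_HOST")))
```

The list can be passed to subprocess.run as-is when executed on the engine machine; the return code of that call is the same value the engine logs as "exited with value".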

Comment 5 jas 2017-12-30 15:55:29 UTC
Sure!

# /usr/bin/ansible-playbook --check --private-key=/etc/pki/ovirt-engine/keys/engine_id_rsa --inventory=172.16.2.34, /usr/share/ovirt-engine/playbooks/ovirt-host-upgrade.yml

And I get:

PLAY [all] *********************************************************************

TASK [ovirt-host-upgrade : Install ovirt-host package if it isn't installed] ***
changed: [172.16.2.34]

TASK [ovirt-host-upgrade : Update system] **************************************
ok: [172.16.2.34]

PLAY RECAP *********************************************************************
172.16.2.34                : ok=2    changed=1    unreachable=0    failed=0

What does this do exactly? Run the upgrade task?

It looks like it runs successfully?

/usr/share/ovirt-engine/playbooks/ovirt-host-upgrade.yml is:

- hosts: all
  remote_user: root
  gather_facts: false

  roles:
    - ovirt-host-upgrade

Comment 6 Ondra Machacek 2018-01-02 11:36:34 UTC
Yes, all looks good. This playbook checks whether the host has available updates.

So I think we have incorrect permissions or ownership set on /etc/pki/ovirt-engine/keys/engine_id_rsa. Can you please share the output of:

 # ls -l /etc/pki/ovirt-engine/keys/engine_id_rsa

Thanks!

Comment 7 jas 2018-01-02 13:06:12 UTC
It is mode 600, user ovirt and group ovirt.
I tried changing it to 644 and retrying from the GUI, but no luck.
The "failed to check for updates" error still appears.
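
For anyone repeating the permission check from comment 6, here is a rough sketch that reports the same facts as ls -l in a form that is easy to paste into a bug. The function name describe_key_file is hypothetical. Note that OpenSSH normally refuses private keys that are readable by group or others, so loosening the mode to 644 is unlikely to help; 600 is the expected mode.

```python
import os
import stat


def describe_key_file(path: str) -> dict:
    """Report mode and numeric owner/group of a private key file."""
    st = os.stat(path)
    mode = stat.S_IMODE(st.st_mode)  # permission bits only, e.g. 0o600
    return {
        "mode": oct(mode),
        "uid": st.st_uid,
        "gid": st.st_gid,
        # True if group or others have any access; SSH clients usually
        # reject such private keys as too open.
        "group_or_other_access": bool(mode & (stat.S_IRWXG | stat.S_IRWXO)),
    }


if __name__ == "__main__":
    key = "/etc/pki/ovirt-engine/keys/engine_id_rsa"
    if os.path.exists(key):
        print(describe_key_file(key))
```

The uid and gid are numeric; compare them with the output of id ovirt on the engine to confirm ownership.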

Comment 8 Ondra Machacek 2018-01-09 11:34:58 UTC
So the problem is ipa-client. If it's not a must on your ovirt-engine machine, you can uninstall it to work around the issue. I will try to find the root cause of the conflict.

Comment 9 jas 2018-01-09 11:39:39 UTC
Unfortunately, ipa-client is necessary on the engine because that's how my colleagues and I access the server. It's good to know that you've found the problem, though.

Comment 10 Ondra Machacek 2018-01-09 15:38:49 UTC

*** This bug has been marked as a duplicate of bug 1529851 ***