Bug 1472812
Summary: | [3.6.z-async] Running engine-host-update.py does not work with RHEVH hosts | | |
---|---|---|---|
Product: | Red Hat Enterprise Virtualization Manager | Reporter: | Lukas Svaty <lsvaty> |
Component: | ovirt-engine | Assignee: | Lev Veyde <lveyde> |
Status: | CLOSED ERRATA | QA Contact: | Jiri Belka <jbelka> |
Severity: | medium | Docs Contact: | |
Priority: | medium | | |
Version: | 3.6.10 | CC: | amarchuk, apinnick, bugs, cshao, danken, jbelka, lsurette, lsvaty, mkalinin, rbalakri, Rhev-m-bugs, srevivo, ykaul, ylavi |
Target Milestone: | ovirt-3.6.z-async | Keywords: | ZStream |
Target Release: | --- | | |
Hardware: | All | | |
OS: | All | | |
Whiteboard: | | | |
Fixed In Version: | | Doc Type: | If docs needed, set a value |
Doc Text: | | Story Points: | --- |
Clone Of: | | | |
: | 1503447 | Environment: | |
Last Closed: | 2017-11-27 13:34:01 UTC | Type: | Bug |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | | Category: | --- |
oVirt Team: | Integration | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | | | |
Bug Depends On: | | | |
Bug Blocks: | 1503447 | | |
Description
Lukas Svaty
2017-07-19 12:44:00 UTC
Which version of the RHVH host are you trying to update?

It was an upgrade from 4.1.3 to the 4.1.4 candidate. However, the problem is not in the host but in the utility. The current flow is:
1. Deactivate host
2. Reinstall host
3. Activate host

Reinstall host uses this code:

    host.install(
        ovirtsdk.xml.params.Action(
            ssh=ovirtsdk.xml.params.SSH(
                authentication_method='publickey'
            ),
            host=ovirtsdk.xml.params.Host(override_iptables=True),
        )
    )

My wild guess is that this method reinstalls the current image:
a) for rhel hosts, it installs the packages and redeploys vdsm/libvirt..., which is correct
b) for rhevh, I believe it just reinstalls the current image, even if a new image is available.
For a new image to be installed, upgrade should be used, not install/reinstall.

Do we want to fix this? When is the Ansible to do this planned to be released?

(In reply to Yaniv Lavi (Dary) from comment #3)
> Do we want to fix this?

I do. Most of our install base is using RHVH.

> When is the Ansible to do this planned to be released?

4.1.6, according to Bug 1473535

Making this work with the vintage node is more critical than RHVH.

(In reply to Yaniv Lavi (Dary) from comment #5)
> Making this work with the vintage node is more critical than RHVH.

I talked to mperina@ and he clarified how 'install' works.

1. first, the 'host.install' action checks what the host type is:
- if EL host, it _only_ installs packages which are defined in host-deploy code
- if node (and legacy?), it supposes it has all packages available
2. even "installing" packages on an EL host does _NOT_ update all packages. These are defined in the DB (PackageNamesForCheckUpdate), thus a simple 'host.install' won't update all packages in PackageNamesForCheckUpdate.
3. for node/ngn, I suppose 'host.install' does not touch any packaging, and upgrade-manager updates 'ovirt-node-ng-image-update' ('OvirtNodePackageNamesForCheckUpdate' in DB) to update node/ngn.
Thus, to update node/ngn (and to correct vds_type if it has been wrong), it would need to reinstall ('host.install') and tell upgrade-manager to upgrade it as well.

Hi Lukas,
Can you please test my latest patch to see if that solves the issue for you?

Moving needinfo to Jirka

- do not reinstall rhevh (legacy); it causes confusing messages in engine events. rhevh (legacy) is distributed as an iso, so please move 'if vdsType in ('rhev-h', 'RHEV_H'):' a little bit up

Host dell-r210ii-13 installation in progress . Failed to install fluentd packages.Please check the log for details.
Host dell-r210ii-13 installation in progress . Vintage node, skipping kernel arguments..
Host dell-r210ii-13 installation in progress . Cannot validate host name settings, reason: resolved host does not match any of the local addresses.

...
Performing RHEVH (Legacy) upgrade...
Installing........................
Rebooting............................................................*.*.*.*.*Error: RuntimeError('Unable to complete the reinstall operational, host is in mode: non_responsive',)

imo it should make another attempt to get the host status; quite often the host stays in a non-responsive state for a little while after an upgrade

this works incorrectly for RHVH (ngn, aka ovirt-node).

it does host.install, which is useless and does not update anything.

    if vdsType in ('rhev-h', 'RHEV_H'):
        print('Performing RHEVH (Legacy) upgrade...')
        upgradeRHevhhost(api, name)
        verifyHost(api, name)

but...

Processing Host: dell-r210ii-04
Type: ovirt_node
Moving host to the maintenance.
Host moved to maintenance.
Installing........
^^^ why?

Installed.
^^^ not upgraded anything!

Activating host..
Host activated.
Requering the host type, type: ovirt_node
Verifying that host stays up..................
Verified.
Closed connection.

#6 was ignored here.

summary:
- for rhel7 hosts, host.install is ok
- for rhevh legacy, host.upgrade should be used, no host.install at all!
- for rhvh/ovirt-node (ngn), host.upgrade, probably this time without an iso

current implementation is bogus.

Fixed the issues for RHEVH legacy, working to fix the issues for the oVirt NGN as well.

This helper script needs to be updated in the KBase, but doesn't require a backport to the 3.6.z codebase.

(In reply to Lev Veyde from comment #12)
> Fixed the issues for RHEVH legacy, working to fix the issues for the oVirt
> NGN as well.

Works fine now for RHEVH-legacy.

(In reply to Jiri Belka from comment #11)
> this works incorrectly for RHVH (ngn, aka ovirt-node).
>
> it does host.install, which is useless and does not update anything.
>
>     if vdsType in ('rhev-h', 'RHEV_H'):
>         print('Performing RHEVH (Legacy) upgrade...')
>         upgradeRHevhhost(api, name)
>         verifyHost(api, name)
>
> but...
>
> Processing Host: dell-r210ii-04
> Type: ovirt_node

^^^ this was obviously tested with NGN on 4.1

> Moving host to the maintenance.
> Host moved to maintenance.
> Installing........
> ^^^ why?
>
> Installed.
> ^^^ not upgraded anything!
>
> Activating host..
> Host activated.
> Requering the host type, type: ovirt_node
> Verifying that host stays up..................
> Verified.
> Closed connection.
>
> [...]
> - for rhvh/ovirt-node (ngn), host.upgrade, probably this time without an iso

IMO, we cannot do anything with NGN on a 3.6 engine, as the 3.6 engine does _NOT_ know anything about NGN. See below on a 3.6 engine with 3.6 NGN:

...
Cluster Default contains the following hosts: ['10-37-137-130']
Processing Host: 10-37-137-130
Type: rhel
...

The type is 'rhel' as the 3.6 engine does not know NGN as a vds_type at all.

engine=# select vds_name,vds_type,pretty_name from vds;
   vds_name    | vds_type |               pretty_name
---------------+----------+-----------------------------------------
 10-37-137-130 |        0 | Red Hat Virtualization Host 3.6 (el7.3)

I'm not sure if we should care about NGN on a 3.6 engine (and this script is not to be used on 4.x, as we have ovirt ansible roles there).
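For illustration only, the per-type handling proposed in the summary earlier in this thread (host.install for rhel, host.upgrade for rhevh legacy, host.upgrade without an iso for ngn) could be sketched as a small lookup. The function name and the returned action labels are hypothetical and are not taken from engine-host-update.py; only the type strings ('rhel', 'rhev-h', 'RHEV_H', 'ovirt_node') appear in this thread.

```python
# Hypothetical sketch, not the code of engine-host-update.py.
# Type strings come from the output quoted in this thread; the action
# labels returned here are made up for illustration.
LEGACY_TYPES = ('rhev-h', 'RHEV_H')  # checked by the script excerpt above
NGN_TYPES = ('ovirt_node',)          # reported for NGN in the 4.1 test run


def choose_action(vds_type):
    """Return a label for the action the summary recommends for a host type."""
    if vds_type in LEGACY_TYPES:
        # rhevh legacy is distributed as an iso: upgrade, never install
        return 'upgrade-with-iso'
    if vds_type in NGN_TYPES:
        # ngn: upgrade, probably without an iso
        return 'upgrade'
    # rhel hosts: host.install is ok
    return 'install'


if __name__ == '__main__':
    for t in ('rhel', 'rhev-h', 'RHEV_H', 'ovirt_node'):
        print('%s -> %s' % (t, choose_action(t)))
```

Note that with such a lookup an NGN host on a 3.6 engine, which is reported as 'rhel', would still end up in the install branch.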
A problem is that we have no way to distinguish RHEL and NGN via the ovirt API. We can do that only from the DB.

Thus, there should either be a decision to convert this script to run _only_ on the engine VM and to use a DB query to add support for NGN, or not to care about NGN on 3.6 at all and document this behavior.

(In reply to Jiri Belka from comment #15)
>
> Thus, there should either be a decision to convert this script to run
> _only_ on the engine VM and to use a DB query to add support for NGN, or not to
> care about NGN on 3.6 at all and document this behavior.

We do not care about 3.6 NGN, please document.

ok, rhevm-backend-3.6.12.2-0.1.el6.noarch

...
Processing Host: dell-r210ii-13.example.com
Type: rhev-h
Performing oVirt Node/RHEVH (Legacy) upgrade...
Installing..........................
Rebooting...........................................................*.*.*.*.*.*.
Installed.
Verifying that host stays up..................
Verified.
Closed connection.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2017:3262
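As a footnote to the earlier suggestion that the script should make another attempt to get the host status after the reboot: a minimal sketch of such a retry loop is shown below. It is not the fix that shipped. The function name, the `get_status` callable and the default attempt count and delay are hypothetical, and the 'up' status string is an assumption; only 'non_responsive' appears in the error quoted in this thread.

```python
import time


def wait_until_up(get_status, attempts=10, delay=5.0, sleep=time.sleep):
    """Poll get_status() until it returns 'up' or the attempts run out.

    get_status is any callable returning the current host status string
    (hypothetical; in the real script it would wrap an API lookup).
    A transient 'non_responsive' right after the reboot is tolerated.
    """
    last = None
    for attempt in range(attempts):
        last = get_status()
        if last == 'up':  # assumed name of the healthy status
            return True
        if attempt < attempts - 1:
            sleep(delay)  # give the host time to come back
    raise RuntimeError('host did not come up, last status: %s' % last)


if __name__ == '__main__':
    # Simulated host that is non-responsive twice before it comes up.
    fake = iter(['non_responsive', 'non_responsive', 'up'])
    print(wait_until_up(lambda: next(fake), delay=0))
```

Passing the status getter and the sleep function in as arguments keeps the loop testable without a running engine.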