Bug 1472812 - [3.6.z-async] Running engine-host-update.py does not work with RHEVH hosts [NEEDINFO]
Status: VERIFIED
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.6.10
Hardware: All
OS: All
Priority: medium
Severity: medium
Target Milestone: ovirt-3.6.z-async
Target Release: ---
Assigned To: Lev Veyde
QA Contact: Jiri Belka
Keywords: ZStream
Depends On:
Blocks: 1503447
 
Reported: 2017-07-19 08:44 EDT by Lukas Svaty
Modified: 2017-11-17 12:21 EST

See Also:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Clones: 1503447
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Integration
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
mkalinin: needinfo? (amarchuk)
ykaul: needinfo? (amarchuk)




External Trackers
Tracker      | ID    | Priority         | Status | Summary                                                     | Last Updated
oVirt gerrit | 81687 | master           | POST   | packaging: Update engine-host-update to support oVirt Node  | 2017-11-01 10:21 EDT
oVirt gerrit | 83646 | ovirt-engine-3.6 | MERGED | packaging: Add engine-host-update w/ support for oVirt Node | 2017-11-08 07:14 EST

Description Lukas Svaty 2017-07-19 08:44:00 EDT
Description of problem:
Running engine-host-update.py against RHVH hosts reinstalls the host but does not upgrade it.

Version-Release number of selected component (if applicable):
ovirt-engine-4.1.4.2-0.1.el7.noarch

How reproducible:
100%

Steps to Reproduce:
1. Run `./engine-host-update.py --insecure --engine=localhost --username=admin@internal --password=mypass --host=rhevh`

Actual results:
RHVH host is reinstalled. Updates still available.

Expected results:
The host is upgraded properly.
Comment 1 Yaniv Lavi 2017-07-31 05:11:17 EDT
Which version of the RHVH host are you trying to update?
Comment 2 Lukas Svaty 2017-07-31 05:47:11 EDT
It was an upgrade from 4.1.3 to a 4.1.4 candidate.
However, the problem is not in the host but in the utility. The current flow is:

1. Deactivate host
2. Reinstall host
3. Activate host

Reinstall host uses this code:

host.install(
    ovirtsdk.xml.params.Action(
        ssh=ovirtsdk.xml.params.SSH(
            authentication_method='publickey'
        ),
        host=ovirtsdk.xml.params.Host(override_iptables=True),
    )
)

My wild guess:
This method behaves differently depending on the host type:
a) for rhel hosts, it installs the packages and redeploys vdsm/libvirt..., which is correct
b) for rhevh, I believe it just reinstalls the current image, even if a new image is available. For the new image to be installed, upgrade should be used, not install/reinstall.
Comment 3 Yaniv Lavi 2017-08-07 05:06:07 EDT
Do we want to fix this?
When is the Ansible role for this planned to be released?
Comment 4 Dan Kenigsberg 2017-08-21 08:32:58 EDT
(In reply to Yaniv Lavi (Dary) from comment #3)
> Do we want to fix this?

I do. Most of our install base is using RHVH.

> When is the Ansible to do this planned to be released?

4.1.6, according to Bug 1473535
Comment 5 Yaniv Lavi 2017-08-30 05:11:08 EDT
Making this work with the vintage node is more critical than making it work with RHVH.
Comment 6 Jiri Belka 2017-08-31 05:39:42 EDT
(In reply to Yaniv Lavi (Dary) from comment #5)
> Making this work with the vintage node is more critical, than RHVH.

I talked to mperina@ and he clarified how 'install' works.

1. The 'host.install' action first checks the host type:
   - for an EL host, it _only_ installs the packages that are defined in the
     host-deploy code
   - for a node (and legacy?), it assumes all packages are already available

2. Even "installing" packages on an EL host does _NOT_ update all packages. The
   packages to check for updates are defined in the DB
   (PackageNamesForCheckUpdate), so a plain 'host.install' won't update all of
   the packages listed there.

3. For node/ngn, I suppose 'host.install' does not touch any packaging, and
   upgrade-manager updates 'ovirt-node-ng-image-update'
   ('OvirtNodePackageNamesForCheckUpdate' in the DB) to update node/ngn.

Thus, to update node/ngn (and to correct vds_type if it has been wrong), the script would need to reinstall ('host.install') and also tell upgrade-manager to upgrade the host.
Comment 7 Lev Veyde 2017-09-26 10:34:06 EDT
Hi Lukas,

Can you please test my latest patch to see if that solves the issue for you?
Comment 8 Lukas Svaty 2017-09-27 03:58:57 EDT
Moving needinfo to Jirka
Comment 9 Jiri Belka 2017-10-03 11:25:09 EDT
- Do not reinstall rhevh (legacy); it causes confusing messages in the engine events. rhevh (legacy) is distributed as an ISO, so please move the 'if vdsType in ('rhev-h', 'RHEV_H'):' check a little bit up.

Host dell-r210ii-13 installation in progress . Failed to install fluentd packages.Please check the log for details.

Host dell-r210ii-13 installation in progress . Vintage node, skipping kernel arguments..

Host dell-r210ii-13 installation in progress . Cannot validate host name settings, reason: resolved host does not match any of the local addresses.
Comment 10 Jiri Belka 2017-10-03 11:32:29 EDT
...
Performing RHEVH (Legacy) upgrade...
	Installing........................
	Rebooting............................................................*.*.*.*.*Error: RuntimeError('Unable to complete the reinstall operational, host is in mode: non_responsive',)

IMO it should make another attempt to get the host status; quite often the host is briefly in a non-responsive state right after an upgrade.
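
A rough sketch of that retry idea, for illustration only. `wait_until_up` and its parameters are hypothetical names, not taken from engine-host-update.py, and the status string 'up' is an assumption about what the engine reports for a healthy host; 'non_responsive' is the state from the error above. The caller supplies `get_status`, so nothing here depends on the SDK.

```python
import time


def wait_until_up(get_status, attempts=5, delay=10.0, sleep=time.sleep):
    """Poll get_status() until it returns 'up', tolerating transient states.

    Hypothetical helper: returns True once 'up' is seen, or False when all
    attempts are used up without seeing it.
    """
    for attempt in range(attempts):
        if get_status() == "up":  # 'up' is an assumed status string
            return True
        # Any other state (e.g. 'non_responsive') is treated as possibly
        # transient; wait before asking again, but not after the last try.
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

With something like this, the script would only raise the RuntimeError after `wait_until_up` returns False, instead of failing on the first non-responsive reading.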
Comment 11 Jiri Belka 2017-10-04 04:05:44 EDT
This works incorrectly for RHVH (NGN, aka ovirt-node).

It does host.install, which is useless and does not update anything.

            if vdsType in ('rhev-h', 'RHEV_H'):
                print('Performing RHEVH (Legacy) upgrade...')
                upgradeRHevhhost(api, name)
            verifyHost(api, name)

but...

Processing Host: dell-r210ii-04
Type: ovirt_node
        Moving host to the maintenance.
        Host moved to maintenance.
        Installing........
        ^^^ why?

        Installed.
        ^^^ not upgraded anything!

        Activating host..
        Host activated.
Requering the host type, type: ovirt_node
        Verifying that host stays up..................
        Verified.
Closed connection.

Comment #6 was ignored here. Summary:

- for rhel7 hosts, host.install is ok
- for rhevh legacy, host.upgrade should be used, no host.install at all!
- for rhvh/ovirt-node (ngn), host.upgrade should probably be used too, this time without an ISO

The current implementation is bogus.
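
The per-type summary above could be expressed as a small dispatch function. This is only a sketch: `choose_action` is a hypothetical name that does not exist in the script, and the type strings are the ones printed in the logs in this bug ('rhel', 'rhev-h'/'RHEV_H', 'ovirt_node'). The NGN branch reflects the "probably" above and is not a confirmed behavior.

```python
def choose_action(vds_type):
    """Return (action, needs_iso) for a host type string.

    Hypothetical helper mirroring the summary in this comment; it does not
    call the engine, it only decides which operation should be used.
    """
    if vds_type in ("rhev-h", "RHEV_H"):
        # Legacy RHEVH: upgrade from an ISO, never host.install.
        return ("upgrade", True)
    if vds_type == "ovirt_node":
        # NGN: probably an upgrade as well, this time without an ISO.
        return ("upgrade", False)
    if vds_type == "rhel":
        # Plain EL host: host.install is fine.
        return ("install", False)
    raise ValueError("unknown host type: %r" % (vds_type,))
```

Raising on an unknown type, rather than defaulting to install, would avoid silently reinstalling a host whose type the script does not understand.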
Comment 12 Lev Veyde 2017-10-16 13:18:12 EDT
Fixed the issues for RHEVH legacy, working to fix the issues for the oVirt NGN as well.
Comment 13 Yaniv Lavi 2017-10-18 04:48:56 EDT
This helper script needs to be updated in the KBase, but it doesn't require a backport to the 3.6.z codebase.
Comment 14 Jiri Belka 2017-10-23 03:38:04 EDT
(In reply to Lev Veyde from comment #12)
> Fixed the issues for RHEVH legacy, working to fix the issues for the oVirt
> NGN as well.

Works fine now for RHEVH-legacy.
Comment 15 Jiri Belka 2017-10-23 05:13:18 EDT
(In reply to Jiri Belka from comment #11)
> this works incorrectly for RHVH (ngn or aka ovirt-node).
> 
> it does host.install, that's useless and does not update anything.
> 
> 508            if vdsType in ('rhev-h', 'RHEV_H'):
> 509                print('Performing RHEVH (Legacy) upgrade...')
> 510                upgradeRHevhhost(api, name)
> 511            verifyHost(api, name)
> 
> but...
> 
> Processing Host: dell-r210ii-04
> Type: ovirt_node

        ^^^ this was obviously tested with NGN on 4.1

>         Moving host to the maintenance.
>         Host moved to maintenance.
>         Installing........
>         ^^^ why?
> 
>         Installed.
>         ^^^ not upgraded anyting!
> 
>         Activating host..
>         Host activated.
> Requering the host type, type: ovirt_node
>         Verifying that host stays up..................
>         Verified.
> Closed connection.
> 
> [...]
> - for rhvh/ovirt-node (ngn), host.upgrade probably with this time without iso

IMO, we cannot do anything with NGN on 3.6 engine as 3.6 engine does _NOT_ know anything about NGN. See below on 3.6 engine with 3.6 NGN:

...
Cluster Default contains the following hosts: ['10-37-137-130']
Processing Host: 10-37-137-130
Type: rhel
...

The type is 'rhel' as 3.6 engine does not know NGN as vds_type at all.

engine=# select vds_name,vds_type,pretty_name from vds;
   vds_name    | vds_type |               pretty_name               
---------------+----------+-----------------------------------------
 10-37-137-130 |        0 | Red Hat Virtualization Host 3.6 (el7.3)

I'm not sure whether we should care about NGN on a 3.6 engine (and this script is not meant to be used on 4.x, as we have the oVirt Ansible roles there). The problem is that we have no way to distinguish RHEL from NGN via the oVirt API; we can do that only from the DB.

Thus, there should be a decision either to convert this script to run _only_ on the engine VM and use a DB query to add support for NGN, or not to care about NGN on 3.6 at all and document this behavior.
Comment 16 Yaniv Lavi 2017-10-23 08:32:27 EDT
(In reply to Jiri Belka from comment #15)
> 
> Thus, there should either be a decission to convert this script to run
> _only_ on engine VM and to use DB query to add support for NGN, or not to
> care about NGN on 3.6 at all and document this behavior.

We do not care about 3.6 NGN, please document.
Comment 18 Jiri Belka 2017-11-13 10:40:42 EST
ok, rhevm-backend-3.6.12.2-0.1.el6.noarch

...
Processing Host: dell-r210ii-13.example.com
Type: rhev-h
        Performing oVirt Node/RHEVH (Legacy) upgrade...
        Installing..........................
        Rebooting...........................................................*.*.*.*.*.*.
        Installed.
        Verifying that host stays up..................
        Verified.
Closed connection.
