Bug 1443508 - [RHVH3.6] Failed to add host to rhv-m 4.1 - Package collectd-disk cannot be found
Summary: [RHVH3.6] Failed to add host to rhv-m 4.1 - Package collectd-disk cannot be found
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-host-deploy
Classification: oVirt
Component: Plugins.General
Version: 1.6.1
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ovirt-4.1.2
Target Release: 1.6.5
Assignee: Yedidyah Bar David
QA Contact: Lukas Svaty
URL:
Whiteboard:
Duplicates: 1438005 1444450 (view as bug list)
Depends On:
Blocks: 1440111 1447670 1448370
 
Reported: 2017-04-19 11:53 UTC by Michael Burman
Modified: 2017-06-21 13:00 UTC
CC List: 13 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2017-05-23 08:19:14 UTC
oVirt Team: Integration
Embargoed:
rule-engine: ovirt-4.1+
rule-engine: blocker+
lsvaty: testing_ack+


Attachments
engine logs (873.70 KB, application/x-gzip)
2017-05-08 06:40 UTC, Michael Burman


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 75989 0 master MERGED metrics: Make collectd/fluentd packages optional 2017-04-27 06:02:49 UTC
oVirt gerrit 76098 0 ovirt-host-deploy-1.6 MERGED metrics: Make collectd/fluentd packages optional 2017-04-27 10:04:40 UTC
oVirt gerrit 76570 0 master MERGED metrics: Prevent errors on failed collectd/fluentd installation 2017-05-10 15:40:37 UTC
oVirt gerrit 76676 0 ovirt-host-deploy-1.6 MERGED metrics: Prevent errors on failed collectd/fluentd installation 2017-05-10 15:51:04 UTC

Description Michael Burman 2017-04-19 11:53:00 UTC
Description of problem:
[RHVH3.6] Failed to add host to rhv-m 4.1 - Package collectd-disk cannot be found

Unable to add a rhv-h 3.6 host to a 3.6 cluster in rhv-m 4.1

Version-Release number of selected component (if applicable):
4.1.1.8-0.1.el7
rhvh-3.6-0.20170413.0+1
vdsm-4.17.39-1.el7ev.noarch

How reproducible:
100%

Steps to Reproduce:
1. Try to add latest rhv-h 3.6 to cluster 3.6 in rhv-m 4.1

Actual results:
Package collectd-disk cannot be found

Expected results:
The host should be added successfully

Comment 2 Martin Perina 2017-04-19 15:41:39 UTC
There's no reason to use the old 3.6 RHVH when the new 4.1 is available, so closing as WONTFIX

Comment 3 Dan Kenigsberg 2017-04-20 09:40:21 UTC
In my opinion, as expressed also in the independently-opened bug 1438347 and bug 1433434, we do have a reason to add 3.6 hosts to a 4.1 manager.

Conservative customers who have big 3.6 clusters would like to upgrade their manager to 4.1 and test it with a small 4.1 cluster, while keeping their 3.6 cluster in full operation. Telling them that they will not be able to extend their 3.6 cluster once engine-4.1 is installed may deter them from testing 4.1.

Comment 7 Dan Kenigsberg 2017-04-20 14:49:43 UTC
https://gerrit.ovirt.org/#/c/71539/7/src/plugins/ovirt-host-deploy/vdsm/packages.py@182 has added version-related logic to ovirt-host-deploy. Something similar can be done to pull collectd-disk only when needed by the cluster.
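
A hedged sketch of this idea in Python - the helper names and the 4.19 threshold are illustrative assumptions, not the actual ovirt-host-deploy code:

    import subprocess
    from distutils.version import LooseVersion

    def installed_vdsm_version():
        # Query rpm for the installed vdsm version; return None if it is
        # not installed.
        try:
            out = subprocess.check_output(
                ['rpm', '-q', '--queryformat', '%{VERSION}', 'vdsm'])
            return out.decode('utf-8').strip()
        except subprocess.CalledProcessError:
            return None

    def metrics_packages():
        # Request collectd-disk only on hosts whose vdsm is new enough to
        # have it available (assumed here to be vdsm >= 4.19).
        version = installed_vdsm_version()
        if version and LooseVersion(version) >= LooseVersion('4.19'):
            return ['collectd', 'collectd-disk']
        return []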

Comment 9 Yedidyah Bar David 2017-04-25 08:53:35 UTC
(In reply to Dan Kenigsberg from comment #7)
> https://gerrit.ovirt.org/#/c/71539/7/src/plugins/ovirt-host-deploy/vdsm/
> packages.py@182 has added version-related logic to ovirt-host-deploy.
> Something similar can be done to pull collectd-disk only when needed by the
> cluster.

I don't see how this is relevant - it checks the version of vdsm, not of the cluster.

Comment 10 Yedidyah Bar David 2017-04-25 08:58:20 UTC
Two simple solutions I can think of:

1. Add an env key that allows preventing installation of these packages.

2. Make the failure non-fatal, so that host-deploy will emit an error (or warning) but continue.

(1.) will require adding a conf file in /etc to prevent attempting installation of these packages.

(2.) will not require anything, but has the downside that unrelated installation failures might go unnoticed.
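
A minimal sketch of option (2.) - the helper name and exception type are assumptions; the real change is in the gerrit patches listed in the Links table above:

    import logging

    logger = logging.getLogger(__name__)

    OPTIONAL_METRICS_PACKAGES = ('collectd', 'collectd-disk', 'fluentd')

    def install_metrics_packages(install_packages):
        # install_packages is a stand-in for the real packager call and is
        # assumed to raise RuntimeError when a package cannot be found.
        try:
            install_packages(OPTIONAL_METRICS_PACKAGES)
        except RuntimeError as e:
            # Non-fatal: log a warning and continue. As noted above, this
            # can also mask unrelated installation failures.
            logger.warning('Failed to install metrics packages: %s', e)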

Comment 11 Yaniv Kaul 2017-04-25 09:00:25 UTC
(In reply to Yedidyah Bar David from comment #10)
> Two simple solutions I can think of:
> 
> 1. Add an env key that allows preventing installation of these packages.
> 
> 2. Make the failure non-fatal, so that host-deploy will emit an error (or
> warning) but continue.
> 
> (1.) will require adding a conf file in /etc to prevent attempting
> installation of these packages.
> 
> (2.) will not require anything, but has the downside that unrelated
> installation failures might go unnoticed.

#2 sounds like a reasonable short-term solution to me.

Comment 12 Yedidyah Bar David 2017-04-25 12:57:50 UTC
*** Bug 1438005 has been marked as a duplicate of this bug. ***

Comment 13 Dan Kenigsberg 2017-04-25 13:57:00 UTC
(In reply to Yedidyah Bar David from comment #9)
> 
> I don't see how this is relevant - it checks the version of vdsm, not of
> the cluster.

It is enough for our use case: collectd-disk is available wherever a fresh vdsm-4.19 is.

Comment 14 Dan Kenigsberg 2017-04-25 14:00:28 UTC
*** Bug 1444450 has been marked as a duplicate of this bug. ***

Comment 15 Yedidyah Bar David 2017-04-25 14:10:34 UTC
(In reply to Dan Kenigsberg from comment #13)
> (In reply to Yedidyah Bar David from comment #9)
> > 
> > I don't see how this is relevant - it checks the version of vdsm, not
> > of the cluster.
> 
> It is enough for our use case: collectd-disk is available wherever a
> fresh vdsm-4.19 is.

Does it also handle bug 1438005?

Comment 17 Sandro Bonazzola 2017-05-02 14:12:24 UTC
*** Bug 1440111 has been marked as a duplicate of this bug. ***

Comment 18 Michael Burman 2017-05-07 14:06:28 UTC
Please note that although the failure is no longer fatal, the engine still complains about it with a misleading error message in the event log, and this should be improved.

Installing Host orchid-vds2.qa.lab.tlv.redhat.com. Stage: Package installation.

Failed to install Host orchid-vds2.qa.lab.tlv.redhat.com. Yum Cannot queue package collectd: Package collectd cannot be found.

Host orchid-vds2.qa.lab.tlv.redhat.com installation in progress . Failed to install collectd packages: Package collectd cannot be found.

Failed to install Host orchid-vds2.qa.lab.tlv.redhat.com. Yum Cannot queue package fluentd: Package fluentd cannot be found.

Host orchid-vds2.qa.lab.tlv.redhat.com installation in progress . Failed to install fluentd packages: Package fluentd cannot be found.

Installing Host orchid-vds2.qa.lab.tlv.redhat.com. Yum Status: Downloading Packages.

Those error messages are confusing, as they report that the host installation has failed even though it eventually succeeded.

Comment 19 Yedidyah Bar David 2017-05-08 05:38:14 UTC
Please attach complete host-deploy log and relevant part of engine.log. Thanks.

Comment 20 Michael Burman 2017-05-08 06:37:11 UTC
2017-05-08 09:32:00,811+03 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (VdsDeploy) [327b5669] EVENT_ID: VDS_INSTALL_IN_PROGRESS(509), Correlation ID: 327b5669, Call Stack: null, Custom Event ID: -1, Message: Installing Host orchid-vds2.qa.lab.tlv.redhat.com. Stage: Package installation.
2017-05-08 09:32:01,456+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (VdsDeploy) [327b5669] EVENT_ID: VDS_INSTALL_IN_PROGRESS_ERROR(511), Correlation ID: 327b5669, Call Stack: null, Custom Event ID: -1, Message: Failed to install Host orchid-vds2.qa.lab.tlv.redhat.com. Yum Cannot queue package collectd: Package collectd cannot be found.
2017-05-08 09:32:01,499+03 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (VdsDeploy) [327b5669] EVENT_ID: VDS_INSTALL_IN_PROGRESS_WARNING(510), Correlation ID: 327b5669, Call Stack: null, Custom Event ID: -1, Message: Host orchid-vds2.qa.lab.tlv.redhat.com installation in progress . Failed to install collectd packages: Package collectd cannot be found.
2017-05-08 09:32:01,527+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (VdsDeploy) [327b5669] EVENT_ID: VDS_INSTALL_IN_PROGRESS_ERROR(511), Correlation ID: 327b5669, Call Stack: null, Custom Event ID: -1, Message: Failed to install Host orchid-vds2.qa.lab.tlv.redhat.com. Yum Cannot queue package fluentd: Package fluentd cannot be found.
2017-05-08 09:32:01,545+03 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (VdsDeploy) [327b5669] EVENT_ID: VDS_INSTALL_IN_PROGRESS_WARNING(510), Correlation ID: 327b5669, Call Stack: null, Custom Event ID: -1, Message: Host orchid-vds2.qa.lab.tlv.redhat.com installation in progress . Failed to install fluentd packages: Package fluentd cannot be found.
2017-05-08 09:32:03,308+03 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (VdsDeploy) [327b5669] EVENT_ID: VDS_INSTALL_IN_PROGRESS(509), Correlation ID: 327b5669, Call Stack: null, Custom Event ID: -1, Message: Installing Host orchid-vds2.qa.lab.tlv.redhat.com. Yum Status: Downloading Packages.

Comment 21 Michael Burman 2017-05-08 06:40:27 UTC
Created attachment 1276978 [details]
engine logs

Comment 22 Yedidyah Bar David 2017-05-08 07:36:23 UTC
1. I agree the message is confusing, but that's actually an engine bug - the engine emits it on every ERROR sent to it from host-deploy [1]. I suggest opening a bug to get this message changed, but it should be discussed with the engine people. IMO it should be possible for host-deploy to emit an error and still not fail, without a wrong message to the user.

[1] https://gerrit.ovirt.org/gitweb?p=ovirt-engine.git;a=blob;f=backend/manager/modules/dal/src/main/resources/bundles/AuditLogMessages.properties;h=861a3ee3afcb26445b072ba54f64bedb58494246;hb=HEAD#l402

2. I have now pushed a patch to make host-deploy not emit these errors. Can you please try again with the jenkins build [2]? If you think it's important enough for the current bug, you can move it to ASSIGNED (or POST) (but then I am not sure it will enter 4.1.2), or open a new bug otherwise.

[2] http://jenkins.ovirt.org/job/ovirt-host-deploy_master_check-patch-el7-x86_64/155/artifact/exported-artifacts/

3. Please note that this does not prevent all the errors in the attached engine.log. In particular, the missing -nest package during "check-for-updates" is unrelated - it should be handled by bug 1436655 downstream; I do not think we have one for upstream, so we should probably open one and have bug 1426901 depend on it (or both).
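
An illustrative sketch of the engine-side mapping from point 1: the event names and concrete messages are taken from the log excerpt in comment 20 above; the templates and dispatch function are simplified, hypothetical stand-ins for the engine code:

    # The engine renders every host-deploy event by severity; any
    # ERROR-severity event produces the "Failed to install Host ..."
    # audit message, even when the deployment ultimately succeeds.
    AUDIT_MESSAGES = {
        'VDS_INSTALL_IN_PROGRESS': 'Installing Host {host}. {message}',
        'VDS_INSTALL_IN_PROGRESS_WARNING':
            'Host {host} installation in progress . {message}',
        'VDS_INSTALL_IN_PROGRESS_ERROR':
            'Failed to install Host {host}. {message}',
    }

    SEVERITY_TO_EVENT = {
        'INFO': 'VDS_INSTALL_IN_PROGRESS',
        'WARNING': 'VDS_INSTALL_IN_PROGRESS_WARNING',
        'ERROR': 'VDS_INSTALL_IN_PROGRESS_ERROR',
    }

    def audit_message(severity, host, message):
        # Demoting a failure from ERROR to WARNING on the host-deploy side
        # (the approach of gerrit 76570) changes the user-visible text from
        # "Failed to install Host ..." to the in-progress warning.
        return AUDIT_MESSAGES[SEVERITY_TO_EVENT[severity]].format(
            host=host, message=message)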

Comment 23 Michael Burman 2017-05-08 08:08:58 UTC
(In reply to Yedidyah Bar David from comment #22)
> 1. I agree the message is confusing, but that's actually an engine bug -
> the engine emits it on every ERROR sent to it from host-deploy [1]. I
> suggest opening a bug to get this message changed, but it should be
> discussed with the engine people. IMO it should be possible for
> host-deploy to emit an error and still not fail, without a wrong message
> to the user.
Reported a bug to improve the error message - BZ 1448798

> 
> [1]
> https://gerrit.ovirt.org/gitweb?p=ovirt-engine.git;a=blob;f=backend/manager/
> modules/dal/src/main/resources/bundles/AuditLogMessages.properties;
> h=861a3ee3afcb26445b072ba54f64bedb58494246;hb=HEAD#l402
> 
> 2. I have now pushed a patch to make host-deploy not emit these errors.
> Can you please try again with the jenkins build [2]? If you think it's
> important enough for the current bug, you can move it to ASSIGNED (or
> POST) (but then I am not sure it will enter 4.1.2), or open a new bug
> otherwise.
- If it's on the latest master, then I will try it.
- I don't think this is important enough for the current bug. New bug reported, see above.

> 
> [2]
> http://jenkins.ovirt.org/job/ovirt-host-deploy_master_check-patch-el7-x86_64/
> 155/artifact/exported-artifacts/
> 
> 3. Please note that this does not prevent all the errors in the attached
> engine.log. In particular, the missing -nest package during
> "check-for-updates" is unrelated - it should be handled by bug 1436655
> downstream; I do not think we have one for upstream, so we should
> probably open one and have bug 1426901 depend on it (or both).

Comment 24 Yedidyah Bar David 2017-05-08 09:21:22 UTC
(In reply to Michael Burman from comment #23)
> (In reply to Yedidyah Bar David from comment #22)
> > 1. I agree the message is confusing, but that's actually an engine bug
> > - the engine emits it on every ERROR sent to it from host-deploy [1].
> > I suggest opening a bug to get this message changed, but it should be
> > discussed with the engine people. IMO it should be possible for
> > host-deploy to emit an error and still not fail, without a wrong
> > message to the user.
> Reported a bug to improve the error message - BZ 1448798

Thanks. I commented there and changed the summary line.

> 
> > 
> > [1]
> > https://gerrit.ovirt.org/gitweb?p=ovirt-engine.git;a=blob;f=backend/manager/
> > modules/dal/src/main/resources/bundles/AuditLogMessages.properties;
> > h=861a3ee3afcb26445b072ba54f64bedb58494246;hb=HEAD#l402
> > 
> > 2. I have now pushed a patch to make host-deploy not emit these
> > errors. Can you please try again with the jenkins build [2]? If you
> > think it's important enough for the current bug, you can move it to
> > ASSIGNED (or POST) (but then I am not sure it will enter 4.1.2), or
> > open a new bug otherwise.
> - If it's on the latest master, then I will try it.
> - I don't think this is important enough for the current bug. New bug
> reported, see above.

No, that's a different one.
The patch below is for host-deploy, not for the engine. It applies directly to the current bug.

Sandro - do we have time to include this in 4.1.2? Or should we postpone it (and thus have another bug to track it, if at all)?

> > [2]
> > http://jenkins.ovirt.org/job/ovirt-host-deploy_master_check-patch-el7-x86_64/
> > 155/artifact/exported-artifacts/

This is the patch:

https://gerrit.ovirt.org/76570

Comment 25 Red Hat Bugzilla Rules Engine 2017-05-10 11:54:57 UTC
Target release should be set once a package build is known to fix an issue. Since this bug is not in the MODIFIED state, the target version has been reset. Please use the target milestone to plan a fix for an oVirt release.

Comment 26 Lukas Svaty 2017-05-15 16:15:49 UTC
Verified with ovirt-host-deploy-1.6.5-1.el7ev.noarch.

The host is installed even though the collectd package is not available; a warning is shown:
Host lunar installation in progress . Failed to install collectd packages. Please check the log for details.

Comment 27 Jiri Belka 2017-05-16 14:18:10 UTC
A 3.6 NGN host could be successfully added to a 4.1 engine.

