Bug 1417687

Summary: when adding oVirt node 4.0 to 4.0 cluster in 4.1 engine fluentd wants to be installed
Product: [oVirt] ovirt-host-deploy Reporter: Jiri Belka <jbelka>
Component: GeneralAssignee: Yedidyah Bar David <didi>
Status: CLOSED WONTFIX QA Contact: Pavel Stehlik <pstehlik>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 1.6.0CC: bugs, didi, jbelka, mgoldboi, oourfali, ykaul, ylavi
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2017-02-02 09:45:01 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Metrics RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Jiri Belka 2017-01-30 16:38:53 UTC
Description of problem:

when adding rhvh 4.0 to 4.0 cluster in 4.1 engine fluentd wants to be installed.

this could break in the future when 4.1 will be released and rhvh 4.0 won't have rhvh-4.1 channel with fluentd.

Version-Release number of selected component (if applicable):
ovirt-host-deploy-1.6.0-1.el7ev.noarch

How reproducible:
1/1

Steps to Reproduce:
1. have 4.1 engine with 4.0 clstr compat level
2. add 4.0 node into the clstr
3.

Actual results:
Failed to install Host dell-r210ii-03. Failed to execute stage 'Package installation': Package collectd cannot be found.

Expected results:
maybe host-deploy should distinguish required packages per cluster compat level and thus not require 4.1 packages for 4.0 hosts.

Additional info:

Comment 1 Sandro Bonazzola 2017-01-30 17:07:40 UTC
oVirt 4.0 hosts should be updated to 4.1 when using oVirt Engine 4.1.
oVirt Node NG 4.0 is not going to be supported with 4.1 release, being 4.1 able to work in 4.0 compatibility mode.

Assigning to didi for further review but I'm inclined closing this as won't fix.
Yaniv, what do you think?

Comment 2 Sandro Bonazzola 2017-01-30 17:08:02 UTC
Moran?

Comment 3 Yedidyah Bar David 2017-01-31 07:49:07 UTC
(In reply to Sandro Bonazzola from comment #1)
> oVirt 4.0 hosts should be updated to 4.1 when using oVirt Engine 4.1.
> oVirt Node NG 4.0 is not going to be supported with 4.1 release, being 4.1
> able to work in 4.0 compatibility mode.

The question is what exactly is "work". If you can't add it, it won't work.

This flow does work (IMO):
1. Install and setup 4.0 engine and hosts/nodes
2. Upgrade engine to 4.1

The update manager will probably tell you eventually that there are updates to the 4.0 hosts (depending on repos etc), but you can ignore it.

A workaround to prevent it from installing packages might be to use the offline packager. Please check the README file [1], also available on the engine machine in /usr/share/doc/ovirt-host-deploy* .

[1] https://gerrit.ovirt.org/gitweb?p=ovirt-host-deploy.git;a=blob;f=README#l92

Jiri, can you please try that? If it does not work, you can try again with the following patch:

https://gerrit.ovirt.org/#/q/I86bf3e1ac03886bae5b0fc8526338907404cc241,n,z

Inside you can find links to jenkins jobs that built it for el7 and fc24.

> 
> Assigning to didi for further review but I'm inclined closing this as won't
> fix.

I think that the exact intended behavior for this flow, as well as similar ones, was not defined. All we have is bug 1371530 and bug 1405810.

> Yaniv, what do you think?

Comment 4 Yaniv Lavi 2017-01-31 12:19:34 UTC
(In reply to Sandro Bonazzola from comment #1)
> oVirt 4.0 hosts should be updated to 4.1 when using oVirt Engine 4.1.
> oVirt Node NG 4.0 is not going to be supported with 4.1 release, being 4.1
> able to work in 4.0 compatibility mode.
> 
> Assigning to didi for further review but I'm inclined closing this as won't
> fix.
> Yaniv, what do you think?

Why are we checking fluentd on node? Doesn't it come baked in the image?
On non image hypervisor it make sense to show this, since updates are for VDSM to 4.1 and all other needed packages.

Comment 5 Yedidyah Bar David 2017-01-31 14:39:16 UTC
(In reply to Yaniv Dary from comment #4)
> Why are we checking fluentd on node?

I guess because no-one said we should not.

> Doesn't it come baked in the image?

I guess it does now, see bug 1373101 .

This bug moved to MODIFIED a month after bug 1371530 did, so for a month, it didn't.

> On non image hypervisor it make sense to show this, since updates are for
> VDSM to 4.1 and all other needed packages.

Any other condition we want to add? Perhaps add some new env var for this, so that user or engine can block installing them in the future/at will?

Please also consider the current behavior of bug 1405813. There, if central metrics store params are configured, the script configures all hosts that are in state UP with compatibility level >= 4.1. So we assume that all of these already have collectd/fluentd+plugins. If this is optional, it will fail for them. If we do not want it to fail, we have to check their state somehow, perhaps also remember it somewhere, which complicates things.

Comment 6 Yaniv Lavi 2017-01-31 14:49:15 UTC
(In reply to Yedidyah Bar David from comment #5)
> (In reply to Yaniv Dary from comment #4)
> > Why are we checking fluentd on node?
> 
> I guess because no-one said we should not.

I believe we only check for update of the image rpm, nothing else. Am I mistaken?

> 
> > Doesn't it come baked in the image?
> 
> I guess it does now, see bug 1373101 .
> 
> This bug moved to MODIFIED a month after bug 1371530 did, so for a month, it
> didn't.
> 
> > On non image hypervisor it make sense to show this, since updates are for
> > VDSM to 4.1 and all other needed packages.
> 
> Any other condition we want to add? Perhaps add some new env var for this,
> so that user or engine can block installing them in the future/at will?
> 
> Please also consider the current behavior of bug 1405813. There, if central
> metrics store params are configured, the script configures all hosts that
> are in state UP with compatibility level >= 4.1. So we assume that all of
> these already have collectd/fluentd+plugins. If this is optional, it will
> fail for them. If we do not want it to fail, we have to check their state
> somehow, perhaps also remember it somewhere, which complicates things.\\

Why should it fail? These packages are not optional on 4.1.

Comment 7 Yedidyah Bar David 2017-01-31 14:59:09 UTC
(In reply to Yaniv Dary from comment #6)
> (In reply to Yedidyah Bar David from comment #5)
> > (In reply to Yaniv Dary from comment #4)
> > > Why are we checking fluentd on node?
> > 
> > I guess because no-one said we should not.
> 
> I believe we only check for update of the image rpm, nothing else. Am I
> mistaken?

I don't think you are. But due to your question in comment 4, I looked at most/all host-deploy code that installs or updates packages. There are many such places, each with it's own set of conditions. Many also check that it's not node, not sure about all. It seems like many/most are configured also by the engine setting optional env keys.

> 
> > 
> > > Doesn't it come baked in the image?
> > 
> > I guess it does now, see bug 1373101 .
> > 
> > This bug moved to MODIFIED a month after bug 1371530 did, so for a month, it
> > didn't.
> > 
> > > On non image hypervisor it make sense to show this, since updates are for
> > > VDSM to 4.1 and all other needed packages.
> > 
> > Any other condition we want to add? Perhaps add some new env var for this,
> > so that user or engine can block installing them in the future/at will?

I also asked that ^^^^^. If you'd have replied "Yes, please add such an option", then ...

> > 
> > Please also consider the current behavior of bug 1405813. There, if central
> > metrics store params are configured, the script configures all hosts that
> > are in state UP with compatibility level >= 4.1. So we assume that all of
> > these already have collectd/fluentd+plugins. If this is optional, it will
> > fail for them. If we do not want it to fail, we have to check their state
> > somehow, perhaps also remember it somewhere, which complicates things.\\
> 
> Why should it fail? These packages are not optional on 4.1.

... it would fail.

So don't? Just check that it's not node?

Comment 8 Yedidyah Bar David 2017-01-31 15:06:07 UTC
BTW, I am pretty certain that we decided that we do want to install collectd/fluentd on 4.0 hosts (perhaps not nodes), and this was also discussed in the context of bug 1405810. IIRC 4.0 and 4.1 will have the same channel, so current bug will not actually happen. Although we still might want to solve it - to not install them on node. Adding also needinfo on ykaul, I think he specifically talked about this too.

Comment 9 Yaniv Kaul 2017-01-31 15:31:40 UTC
(In reply to Yedidyah Bar David from comment #8)
> BTW, I am pretty certain that we decided that we do want to install
> collectd/fluentd on 4.0 hosts (perhaps not nodes), and this was also
> discussed in the context of bug 1405810. IIRC 4.0 and 4.1 will have the same
> channel, so current bug will not actually happen. Although we still might
> want to solve it - to not install them on node. Adding also needinfo on
> ykaul, I think he specifically talked about this too.

That was my understanding as well.
1. I'm not sure we wish to support adding ovirt-node 4.*0* to 4.1 setups. If you have one - great, it'll work. Reinstalling or installing should be with 4.1 already.
2. Same with regular hosts - they share the channel, so they will get automagically VDSM 4.1, for example.

Comment 10 Yedidyah Bar David 2017-02-02 09:45:01 UTC
Following above comments and some internal discussion, closing this bug.

A 4.1 engine+host-deploy should not be used to update a 4.0 host/node inside 4.0.z, and in RHV also cannot be (as they use the same channel).

Also, I am pretty certain that under certain conditions, other packages can cause similar errors, as the code currently never checks if we are NODE, only if we are OVIRT_VINTAGE_NODE.

Please reopen if relevant.