Bug 1421149
| Summary: | [downstream clone - 4.0.7] engine should fail nicely when adding a 3.6 host to a 4.0 cluster. | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Virtualization Manager | Reporter: | rhev-integ |
| Component: | ovirt-engine | Assignee: | Dominik Holler <dholler> |
| Status: | CLOSED ERRATA | QA Contact: | Artyom <alukiano> |
| Severity: | low | Docs Contact: | |
| Priority: | medium | ||
| Version: | unspecified | CC: | alukiano, bugs, danken, dholler, lsurette, nsednev, rbalakri, Rhev-m-bugs, srevivo, stirabos, ykaul, ylavi |
| Target Milestone: | ovirt-4.0.7 | Keywords: | Triaged, ZStream |
| Target Release: | --- | ||
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | ovirt-engine-4.0.7 | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | 1375573 | Environment: | |
| Last Closed: | 2017-03-16 15:33:29 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | Network | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 1375573 | ||
| Bug Blocks: | |||
|
Description
rhev-integ
2017-02-10 13:43:46 UTC
This bug is opened from https://bugzilla.redhat.com/show_bug.cgi?id=1356127#c11.
Adding components from host:
ovirt-vmconsole-1.0.4-1.el7ev.noarch
ovirt-setup-lib-1.0.1-1.el7ev.noarch
qemu-kvm-rhev-2.3.0-31.el7_2.21.x86_64
ovirt-hosted-engine-setup-1.3.7.3-1.el7ev.noarch
sanlock-3.2.4-3.el7_2.x86_64
libvirt-client-1.2.17-13.el7_2.5.x86_64
mom-0.5.6-1.el7ev.noarch
ovirt-host-deploy-1.4.1-1.el7ev.noarch
vdsm-4.17.35-1.el7ev.noarch
rhevm-sdk-python-3.6.9.1-1.el7ev.noarch
ovirt-hosted-engine-ha-1.3.5.8-1.el7ev.noarch
rhev-release-3.6.9-2-001.noarch
ovirt-vmconsole-host-1.0.4-1.el7ev.noarch
Linux version 3.10.0-327.28.3.el7.x86_64 (mockbuild.eng.bos.redhat.com) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-4) (GCC) ) #1 SMP Fri Aug 12 13:21:05 EDT 2016
Linux 3.10.0-327.28.3.el7.x86_64 #1 SMP Fri Aug 12 13:21:05 EDT 2016 x86_64 x86_64 x86_64 GNU/Linux
Red Hat Enterprise Linux Server release 7.2 (Maipo)
(Originally by Nikolai Sednev)
Created attachment 1200497 [details]
sosreport from host
(Originally by Nikolai Sednev)
Created attachment 1200499 [details]
sosreport from engine
(Originally by Nikolai Sednev)
Dan,
here Nikolai is trying to deploy hosted-engine using a 4.0 engine on a 3.6 host and the host is not coming up due to:
2016-09-13 15:31:06,523 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.CollectVdsNetworkDataAfterInstallationVDSCommand] (org.ovirt.thread.pool-6-thread-1) [7c16adab] HostName = alma03.qa.lab.tlv.redhat.com
2016-09-13 15:31:06,524 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.CollectVdsNetworkDataAfterInstallationVDSCommand] (org.ovirt.thread.pool-6-thread-1) [7c16adab] Failed in 'CollectVdsNetworkDataAfterInstallationVDS' method, for vds: 'alma03.qa.lab.tlv.redhat.com'; host: 'alma03.qa.lab.tlv.redhat.com': Required SwitchType is not reported.
2016-09-13 15:31:06,525 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.CollectVdsNetworkDataAfterInstallationVDSCommand] (org.ovirt.thread.pool-6-thread-1) [7c16adab] Command 'CollectVdsNetworkDataAfterInstallationVDSCommand(HostName = alma03.qa.lab.tlv.redhat.com, CollectHostNetworkDataVdsCommandParameters:{runAsync='true', hostId='cef7ba08-33c3-4a9f-9646-d2b5a42d347f', vds='Host[alma03.qa.lab.tlv.redhat.com,cef7ba08-33c3-4a9f-9646-d2b5a42d347f]'})' execution failed: Required SwitchType is not reported.
2016-09-13 15:31:06,526 INFO [org.ovirt.engine.core.utils.transaction.TransactionSupport] (org.ovirt.thread.pool-6-thread-1) [7c16adab] transaction rolled back
2016-09-13 15:31:06,526 ERROR [org.ovirt.engine.core.bll.hostdeploy.InstallVdsInternalCommand] (org.ovirt.thread.pool-6-thread-1) [7c16adab] Exception: org.ovirt.engine.core.common.errors.EngineException: EngineException: java.lang.IllegalStateException: Required SwitchType is not reported. (Failed with error ENGINE and code 5001)
Is it somehow related to the OVS integration since 4.0?
hosted-engine-setup is not lowering the cluster compatibility level since, AFAIK, it requires a host to be active.
(Originally by Simone Tiraboschi)
Which Engine is it, exactly? Using switchType=Legacy as a default should have been fixed by bug 1373112 in ovirt-engine-4.0.2.7-0.1.el7ev
(Originally by danken)
(In reply to Dan Kenigsberg from comment #5)
> Which Engine is it, exactly? Using switchType=Legacy as a default should
> have been fixed by bug 1373112 in ovirt-engine-4.0.2.7-0.1.el7ev
ovirt-engine-4.0.4.2-0.1.el7ev.noarch, so it seems it is still there.
(Originally by Simone Tiraboschi)
Martin, could you see how this can be?
(Originally by danken)
Bug tickets must have version flags set prior to targeting them to a release. Please ask maintainer to set the correct version flags and only then set the target milestone.
(Originally by rule-engine)
(In reply to Dan Kenigsberg from comment #5, which had a wrong bz# mentioned)
Very odd, since 1367483 has been verified, and Ib09c1c826919c8e5084148dd9599612f20f11938 is in ovirt-engine-4.0.4 branch.
(Originally by danken)
Looking at master code (with limited logs), this function can produce this error:
    private static SwitchType getSwitchType(Version clusterVersion, Map<String, Object> networkProperties) {
        Object switchType = networkProperties.get(VdsProperties.SWITCH_KEY);
        boolean switchTypeShouldBeReportedByVdsm = FeatureSupported.ovsSupported(clusterVersion);
        if (switchTypeShouldBeReportedByVdsm && switchType == null) {
            throw new IllegalStateException("Required SwitchType is not reported.");
        }
        return SwitchType.parse(Objects.toString(switchType, SwitchType.LEGACY.getOptionValue()));
    }
So it means this: for the given 'clusterVersion', switchType is supported; we know that by reading FeatureSupported.ovsSupported(clusterVersion). If OVS is supported, an adequate VDSM (one that also supports it) should be used, and in that case switchType should be reported.
So the reason here can be, that incompatible engine and vdsm are used at the same time.
——
comparing master to 4.0.4 it seems there are no differences in this method — all related code was backported.
(Originally by Martin Mucha)
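To make the failure mode above concrete, here is a minimal, self-contained sketch of the same check. It is not the engine's code: `SwitchTypeCheckSketch`, the simplified `SwitchType` enum, the `"switch"` key, and the integer cluster version are hypothetical stand-ins, and the rule that OVS support starts at cluster level 4.0 is an assumption inferred from this thread (a 4.0 cluster with a 3.6 VDSM hit the exception).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class SwitchTypeCheckSketch {

    // Hypothetical stand-in for the engine's SwitchType enum.
    public enum SwitchType {
        LEGACY("legacy"), OVS("ovs");

        private final String optionValue;

        SwitchType(String optionValue) {
            this.optionValue = optionValue;
        }

        public String getOptionValue() {
            return optionValue;
        }

        public static SwitchType parse(String value) {
            for (SwitchType type : values()) {
                if (type.optionValue.equalsIgnoreCase(value)) {
                    return type;
                }
            }
            throw new IllegalArgumentException("Unknown switch type: " + value);
        }
    }

    // Assumption: the key under which VDSM reports the switch type.
    static final String SWITCH_KEY = "switch";

    // Assumption: OVS (and therefore a reported switch type) is expected
    // from cluster level 4.0 onwards.
    public static boolean ovsSupported(int clusterMajor) {
        return clusterMajor >= 4;
    }

    // Same shape as the quoted method: throw if the cluster level requires
    // a switch type but the host did not report one; otherwise default to LEGACY.
    public static SwitchType getSwitchType(int clusterMajor, Map<String, Object> networkProperties) {
        Object switchType = networkProperties.get(SWITCH_KEY);
        if (ovsSupported(clusterMajor) && switchType == null) {
            throw new IllegalStateException("Required SwitchType is not reported.");
        }
        return SwitchType.parse(Objects.toString(switchType, SwitchType.LEGACY.getOptionValue()));
    }

    public static void main(String[] args) {
        // A 3.6-era host reports no switch type at all.
        Map<String, Object> oldHostReport = new HashMap<>();

        // In a 3.x cluster the missing value silently defaults to LEGACY.
        System.out.println(getSwitchType(3, oldHostReport));

        // In a 4.0 cluster the same report triggers the exception from the log.
        try {
            getSwitchType(4, oldHostReport);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under these assumptions the exception is a symptom of the version mismatch rather than of a missing default, which matches the conclusion that the engine and VDSM in use were incompatible.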
This is not a supported flow. We should add an error.
But overall there must be a match between appliance major and minor version and host major and minor version.
(Originally by Yaniv Dary)
(In reply to Martin Mucha from comment #10)
> So the reason here can be, that incompatible engine and vdsm are used at the
> same time.
The engine was at 4.0.4 and VDSM at 3.6.9
(Originally by Simone Tiraboschi)
(In reply to Yaniv Dary from comment #11)
> This is not a supported flow. We should add an error.
> But overall there must be a match between appliance major and minor version
> and host major and minor version.
Indeed we have, at least on master and 4.0, that's why Nikolai had to force it deploying the engine VM via PXE.
(Originally by Simone Tiraboschi)
(In reply to Simone Tiraboschi from comment #13)
> (In reply to Yaniv Dary from comment #11)
> > This is not a supported flow. We should add an error.
> > But overall there must be a match between appliance major and minor version
> > and host major and minor version.
>
> Indeed we have, at least on master and 4.0, that's why Nikolai had to force
> it deploying the engine VM via PXE.
PXE is not supported anymore, only appliance. Let's add a warning in the engine as well then when you add a 3.6 host to 4.0 cluster.
(Originally by Yaniv Dary)
(In reply to Yaniv Dary from comment #14)
> (In reply to Simone Tiraboschi from comment #13)
> > (In reply to Yaniv Dary from comment #11)
> > > This is not a supported flow. We should add an error.
> > > But overall there must be a match between appliance major and minor version
> > > and host major and minor version.
> >
> > Indeed we have, at least on master and 4.0, that's why Nikolai had to force
> > it deploying the engine VM via PXE.
>
> PXE is not supported anymore, only appliance. Let's add a warning in the
> engine as well then when you add a 3.6 host to 4.0 cluster.
If PXE is not supported, then why it still exposed to customer?
(Originally by Nikolai Sednev)
*** Bug 1375240 has been marked as a duplicate of this bug. ***
(Originally by Yaniv Dary)
(In reply to Nikolai Sednev from comment #15)
> (In reply to Yaniv Dary from comment #14)
> > (In reply to Simone Tiraboschi from comment #13)
> > > (In reply to Yaniv Dary from comment #11)
> > > > This is not a supported flow. We should add an error.
> > > > But overall there must be a match between appliance major and minor version
> > > > and host major and minor version.
> > >
> > > Indeed we have, at least on master and 4.0, that's why Nikolai had to force
> > > it deploying the engine VM via PXE.
> >
> > PXE is not supported anymore, only appliance. Let's add a warning in the
> > engine as well then when you add a 3.6 host to 4.0 cluster.
>
> If PXE is not supported, then why it still exposed to customer?
It will be removed in the next version. We have put a message on deprecation in the docs.
(Originally by Yaniv Dary)
Removing flag until doc text is provided.
(Originally by Yaniv Dary)
4.0.6 has been the last oVirt 4.0 release, please re-target this bug.
(Originally by Sandro Bonazzola)
May you please provide the target release?
Verified on
# Host
# rpm -qa | grep vdsm
vdsm-jsonrpc-4.17.37-1.el7ev.noarch
vdsm-xmlrpc-4.17.37-1.el7ev.noarch
vdsm-cli-4.17.37-1.el7ev.noarch
vdsm-4.17.37-1.el7ev.noarch
vdsm-infra-4.17.37-1.el7ev.noarch
vdsm-yajsonrpc-4.17.37-1.el7ev.noarch
vdsm-python-4.17.37-1.el7ev.noarch
vdsm-hook-vmfex-dev-4.17.37-1.el7ev.noarch
# Engine
# rpm -qa | grep rhevm
rhevm-doc-4.0.7-1.el7ev.noarch
rhevm-spice-client-x86-msi-4.0-3.el7ev.noarch
rhevm-branding-rhev-4.0.0-7.el7ev.noarch
rhevm-dependencies-4.0.0-1.el7ev.noarch
rhevm-guest-agent-common-1.0.12-4.el7ev.noarch
rhevm-4.0.7.3-0.1.el7ev.noarch
rhevm-spice-client-x64-msi-4.0-3.el7ev.noarch
rhevm-setup-plugins-4.0.0.3-1.el7ev.noarch
When the HE deploy adds the host, I can see the expected message in the engine log:
2017-02-28 16:20:07,365 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler6) [480bdf83] Correlation ID: 480bdf83, Job ID: c7d066be-3e5c-4c1a-8962-057c9f41d0b9, Call Stack: null, Custom Event ID: -1, Message: Host hosted_engine_1 is compatible with versions (3.4,3.5,3.6) and cannot join Cluster Default which is set to version 4.0.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory, and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.
https://rhn.redhat.com/errata/RHBA-2017-0542.html |
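The audit message quoted in the verification comment suggests a check of the host's supported cluster levels against the cluster's level before the host is allowed to join. The sketch below illustrates that idea only. The class `HostClusterCompatibilitySketch` and its methods are hypothetical, versions are modelled as plain strings, and none of this is taken from the actual ovirt-engine fix.

```java
import java.util.Arrays;
import java.util.List;

public class HostClusterCompatibilitySketch {

    // A host may join only if the cluster level is among the levels it reports.
    public static boolean canJoin(List<String> hostSupportedVersions, String clusterVersion) {
        return hostSupportedVersions.contains(clusterVersion);
    }

    // Builds a message with the same wording as the audit log entry in this report.
    public static String describeMismatch(String hostName, List<String> hostSupportedVersions,
            String clusterName, String clusterVersion) {
        return String.format(
                "Host %s is compatible with versions (%s) and cannot join Cluster %s which is set to version %s.",
                hostName, String.join(",", hostSupportedVersions), clusterName, clusterVersion);
    }

    public static void main(String[] args) {
        // Values taken from the verification comment: a 3.6 host and a 4.0 cluster.
        List<String> supported = Arrays.asList("3.4", "3.5", "3.6");
        if (!canJoin(supported, "4.0")) {
            System.out.println(describeMismatch("hosted_engine_1", supported, "Default", "4.0"));
        }
    }
}
```

Checking compatibility up front lets the failure surface as a readable message instead of an IllegalStateException raised later while collecting network data.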