Bug 1373847
| Summary: | Host that is set with protocol=xml fails cluster upgrade | ||
|---|---|---|---|
| Product: | [oVirt] ovirt-engine | Reporter: | Arik <ahadas> |
| Component: | BLL.Infra | Assignee: | Moti Asayag <masayag> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Jiri Belka <jbelka> |
| Severity: | medium | Docs Contact: | |
| Priority: | high | ||
| Version: | 4.0.4 | CC: | ahadas, bugs, jbelka, masayag, mgoldboi |
| Target Milestone: | ovirt-4.0.5 | Flags: | rule-engine: ovirt-4.0.z+ mgoldboi: planning_ack+ masayag: devel_ack+ pstehlik: testing_ack+ |
| Target Release: | 4.0.5.2 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: |
Cause:
Since 4.0, we support only hosts with 3.6 cluster compatibility and above. Therefore the only supported host protocol is JSON-RPC. Hosts using the XML-RPC protocol are no longer supported.
Consequence:
Before the fix, the admin couldn't change the host protocol from XML RPC to JSON RPC without DB intervention.
Fix:
All hosts will be updated to JSON RPC.
Result:
Installing a host that doesn't support JSON RPC will end up with an installation failure.
An attempt to activate a host which doesn't support JSON RPC (3.4 hosts and lower) will leave the host in the Non-responsive state.
|
Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2017-01-18 07:38:40 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | Infra | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
|
Description
Arik
2016-09-07 09:18:28 UTC
Actually the severity is high, the priority can be lower as we are not sure how comes that the host on rhev.tlv was set with XML

Moti - can you take a look?

We shouldn't get into a situation where cluster 3.6 contains host with XML-RPC protocol.

The only way to get to this stage is by re-installing the host or by adding host which failed to communicate with the engine by JSON-RPC and fallback to XML-RPC.

If the host does support JSON-RPC, we should investigate the reason for failing to communicate with the engine via JSON-RPC, else, if the host doesn't support JSON-RPC, it should not be part of any 3.6 cluster.

In order to recover from that state - one can remove the host from the engine and add it again so there will be another attempt to communicate with the host via JSON-RPC (no need to deal with the DB).

Are there any logs from the failed installation of that host ?

(In reply to Moti Asayag from comment #3)
> We shouldn't get into a situation where cluster 3.6 contains host with
> XML-RPC protocol.
>
> The only way to get to this stage is by re-installing the host or by adding
> host which failed to communicate with the engine by JSON-RPC and fallback to
> XML-RPC.
>
> If the host does support JSON-RPC, we should investigate the reason for
> failing to communicate with the engine via JSON-RPC, else, if the host
> doesn't support JSON-RPC, it should not be part of any 3.6 cluster.
>
> In order to recover from that state - one can remove the host from the
> engine and add it again so there will be another attempt to communicate with
> the host via JSON-RPC (no need to deal with the DB).
>
> Are there any logs from the failed installation of that host ?

so, are you suggesting to move xml-rpc based hosts to non-operational mode once we found them active on 3.6 cluster?

(In reply to Moran Goldboim from comment #4)
> (In reply to Moti Asayag from comment #3)
> > We shouldn't get into a situation where cluster 3.6 contains host with
> > XML-RPC protocol.
> >
> > The only way to get to this stage is by re-installing the host or by adding
> > host which failed to communicate with the engine by JSON-RPC and fallback to
> > XML-RPC.
> >
> > If the host does support JSON-RPC, we should investigate the reason for
> > failing to communicate with the engine via JSON-RPC, else, if the host
> > doesn't support JSON-RPC, it should not be part of any 3.6 cluster.
> >
> > In order to recover from that state - one can remove the host from the
> > engine and add it again so there will be another attempt to communicate with
> > the host via JSON-RPC (no need to deal with the DB).
> >
> > Are there any logs from the failed installation of that host ?
>
> so, are you suggesting to move xml-rpc based hosts to non-operational mode
> once we found them active on 3.6 cluster?

Yes, we should.

(In reply to Moti Asayag from comment #3)
> Are there any logs from the failed installation of that host ?

I didn't look for a failed installation. It happened on rhev.tlv, maybe the log still exists there.

The fix for this patch for 4.0.x will include the following:
1. Add an upgrade script to move all 3.6 and above hosts to json.
2. Remove the fallback code which used to reconnect failed 3.6 hosts via json to xmlrpc.
As a result, failed attempt to communicate with 3.6 hosts via json rpc during installation will end up with an installation failure.

ok, ovirt-engine-4.0.5.2-0.2.el7ev.noarch
before engine update:
engine=# select protocol from vds_static where vds_name = 'dell-r210ii-04';
protocol
----------
0
(1 row)
after engine update:
engine=# select protocol from vds_static where vds_name = 'dell-r210ii-04';
protocol
----------
1
(1 row)
engine-setup went smoothly.
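
For readers who want to see what the before/after queries above imply, here is a minimal sketch of the kind of update the upgrade step performs. It is not the actual ovirt-engine upgrade script. It assumes that `protocol = 0` means XML-RPC and `protocol = 1` means JSON-RPC (inferred only from the query output above), it uses an in-memory SQLite table instead of the engine's PostgreSQL database, and `move_hosts_to_jsonrpc`, `demo` and `hypothetical-host-02` are made-up names for illustration.

```python
import sqlite3

# Assumption: meaning of vds_static.protocol inferred from the before/after
# query output in this report (0 = XML-RPC, 1 = JSON-RPC).
XML_RPC = 0
JSON_RPC = 1


def move_hosts_to_jsonrpc(conn: sqlite3.Connection) -> int:
    """Switch every host still marked XML-RPC to JSON-RPC; return rows changed."""
    cur = conn.execute(
        "UPDATE vds_static SET protocol = ? WHERE protocol = ?",
        (JSON_RPC, XML_RPC),
    )
    conn.commit()
    return cur.rowcount


def demo() -> list:
    """Build a toy vds_static table, run the update, and return the result."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE vds_static (vds_name TEXT PRIMARY KEY, protocol INTEGER)"
    )
    conn.executemany(
        "INSERT INTO vds_static VALUES (?, ?)",
        # The second host name is hypothetical and only there for contrast.
        [("dell-r210ii-04", XML_RPC), ("hypothetical-host-02", JSON_RPC)],
    )
    changed = move_hosts_to_jsonrpc(conn)
    rows = conn.execute(
        "SELECT vds_name, protocol FROM vds_static ORDER BY vds_name"
    ).fetchall()
    conn.close()
    return [changed, rows]


if __name__ == "__main__":
    print(demo())
```

Only the host that was still on XML-RPC is touched, which matches the single row flipping from 0 to 1 in the verification above.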
Jiri,

Could you attempt to install a 3.5 host and after its failed installation to activate it and see that it behaves as expected:

1. Installation should fail.
2. Activation should move the host to Non-responsive.

(In reply to Moti Asayag from comment #9)
> Jiri,
>
> Could you attempt to install a 3.5 host and after its failed installation to
> activate it and see that it behaves as expected:
>
> 1. Installation should fail.
> 2. Activation should move the host to Non-responsive.

Ad 2 - the host ends in non-operational state as 3.5 host can't be managed by 4.0 engine as 4.0 supports only 3.6 and 4.0 cluster level.

(In reply to Jiri Belka from comment #10)
> (In reply to Moti Asayag from comment #9)
> > Jiri,
> >
> > Could you attempt to install a 3.5 host and after its failed installation to
> > activate it and see that it behaves as expected:
> >
> > 1. Installation should fail.
> > 2. Activation should move the host to Non-responsive.
>
> Ad 2 - the host ends in non-operational state as 3.5 host can't be managed
> by 4.0 engine as 4.0 supports only 3.6 and 4.0 cluster level.

You are right: This is the expected behavior with 3.5 host, since it supports also jsonrpc. 3.5 was the first host version which supported json-rpc, that's the reason why the ovirt-engine managed to communicate with it.

Could you try the same with 3.4 host ? In 3.4 there is no JSON rpc support, and such case might occur only if the admin mis-configure the repositories file on the host, or attempted to register 3.4 rhev-h.
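
The activation outcomes discussed in this thread can be summarized in a small sketch. This is only a restatement of what the comments say, not engine code: `activation_outcome` and `parse_version` are hypothetical names, and the `"supported"` return value is a placeholder, since the thread does not describe a forced failure state for 3.6 and newer hosts.

```python
def parse_version(version: str) -> tuple:
    """Turn a string such as '3.5' into a comparable (major, minor) tuple."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def activation_outcome(host_version: str) -> str:
    """Expected result of activating a host on a 4.0 engine, per this thread."""
    version = parse_version(host_version)
    if version < (3, 5):
        # 3.4 and lower: no JSON-RPC support, so the engine cannot reach the host.
        return "Non-responsive"
    if version < (3, 6):
        # 3.5: JSON-RPC works, but 4.0 supports only 3.6 and 4.0 cluster levels.
        return "Non-operational"
    # 3.6 and above: placeholder, the thread describes no forced failure state.
    return "supported"


if __name__ == "__main__":
    for v in ("3.4", "3.5", "3.6", "4.0"):
        print(v, activation_outcome(v))
```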