Bug 1149655 - Cannot register rhev-h 3.4.z host to rhevm 3.5 due to protocol incompatibility
Summary: Cannot register rhev-h 3.4.z host to rhevm 3.5 due to protocol incompatibility
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 3.5.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: ---
Target Release: 3.5.0
Assignee: Piotr Kliczewski
QA Contact: Petr Kubica
URL:
Whiteboard: infra
Duplicates: 1156322
Depends On:
Blocks: 1078055 1101565 1111075 1117393 1152191
 
Reported: 2014-10-06 11:57 UTC by Petr Kubica
Modified: 2016-02-10 19:09 UTC
CC List: 17 users

Fixed In Version: org.ovirt.engine-root-3.5.0-18
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-02-17 17:08:43 UTC
oVirt Team: Infra
Target Upstream Version:
Embargoed:


Attachments
logs from engine and host (798.83 KB, application/x-gzip)
2014-10-06 11:57 UTC, Petr Kubica
ifconfig (1.30 KB, text/plain)
2014-10-07 10:25 UTC, Petr Kubica
ip addr show (1.52 KB, text/plain)
2014-10-07 10:26 UTC, Petr Kubica
rhev-h admin tui (10.18 KB, image/png)
2014-10-07 10:29 UTC, Petr Kubica
rhev-h tui eth0 (7.86 KB, image/png)
2014-10-07 10:34 UTC, Petr Kubica


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 34098 0 ovirt-engine-3.5 MERGED core: protocol fall back for older vdsms Never
oVirt gerrit 34255 0 master MERGED core: protocol fall back for older vdsms Never

Description Petr Kubica 2014-10-06 11:57:21 UTC
Created attachment 944218 [details]
logs from engine and host

Description of problem:
I wanted to register rhev-h to rhevm. When I approved the host in rhevm, installation of the host began. The installation ended with an error:

Host rhevh installation failed. Command returned failure code 1 during SSH session 'root.62.49'.

I had the same problem with ovirt-node 3.0.x when registering to ovirt-engine 3.5.0-0.0.master.20140923231936.git42065cc.el6.

I tried registering from rhevh/ovirt-node, and I also tried setting a root password and adding the host manually from the engine. The result is always the same.

Version-Release number of selected component (if applicable):
rhev-h release 6.5 (20140930.1.el6ev)
rhevm 3.5.0-0.13.beta.el6ev

Attached log from rhevh and rhevm

How reproducible:
always 

Steps to Reproduce:
1. Have a rhev manager and rhev-h
2a. Try to register rhev-h to rhevm
-- or --
2b. Try to add as a new host rhev-h to rhevm

Actual results:
Error during installation of host

Expected results:
rhevh host in rhevm

Additional info:
---

Comment 1 Douglas Schilling Landgraf 2014-10-06 13:18:40 UTC
(In reply to Petr Kubica from comment #0)

Hi Petr,

Does RHEV-H still have network connectivity? Could you please check whether the interfaces are still present? What is the output of ip addr show or ifconfig?

In engine.log I see errors such as:
2014-10-06 13:29:57,252 ERROR [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo] (DefaultQuartzScheduler_Worker-76) Failure to refresh Vds runtime info: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException: VDSGenericException: VDSNetworkException: Message timeout which can be caused by communication issues'

Comment 2 Jiri Belka 2014-10-06 13:40:08 UTC
I had the same issue with RHEVH 20140725.0.el6ev.

I tried my best to track it down, but without much success...

(run manually on the host after extracting ovirt-host-deploy.tar)

# ./setup DIALOG/dialect=str:machine DIALOG/customization=bool:True
***L:INFO Stage: Initializing
***CONFIRM DEPLOY_PROCEED Proceed with ovirt-host-deploy
### Continuing will configure this host for serving as hypervisor. Are you sure you want to continue? (yes/no) 
### Response is CONFIRM DEPLOY_PROCEED=yes|no or ABORT DEPLOY_PROCEED
CONFIRM DEPLOY_PROCEED=yes
***L:INFO Stage: Environment setup
### Configuration files: ['/etc/ovirt-host-deploy.conf.d/50-offline-packager.conf']
### Log file: /tmp/ovirt-host-deploy-20141006132917-w4trmb.log
### Version: otopi-1.3.0_master (otopi-1.3.0-0.0.1.master.el6ev)
### Version: ovirt-host-deploy-1.3.0_master (ovirt-host-deploy-1.3.0-0.0.4.master.el6ev)
***L:INFO Stage: Environment packages setup
***L:INFO Stage: Programs detection
***L:INFO Stage: Environment customization
***L:INFO Kdump unsupported
***Q:STRING CUSTOMIZATION_COMMAND
###
### Customization phase, use 'install' to proceed
### COMMAND> 
install
***L:INFO Stage: Setup validation
***L:INFO Hardware supports virtualization
***L:INFO Stage: Transaction setup
***L:INFO Stage: Misc configuration
***L:INFO Stage: Package installation
***L:INFO Stage: Misc configuration
### Setting up PKI
###
###
### Please issue VDSM certificate based on this certificate request
###
***D:MULTI-STRING VDSM_CERTIFICATE_REQUEST --=451b80dc-996f-432e-9e4f-2b29ef6d1141=--
-----BEGIN CERTIFICATE REQUEST-----
MIICRDCCASwCADAAMIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAtewy
k63ufTiJWX2p+QxHcR76+aqImKtUSpRMeUa75hpr8fmP2rxflbX7HNENUhCgEWbA
mh4zpq8chZj3Wl1tgfUxdlKUiv5hTvZSs/t2GBosBjQJjI47AnI++ZG5o7elcPky
+bTNS89MkW9Z81HoEDXwRUnDwhX371pU1hGwmDHd6SnQggOvmFBN3AQ05qDAvEgo
0sHw3ZujOa6HSCel7nzTD5PNJMMkBwUYsYNTDGi1rXaXcCXD5elhtb2PyzDqDqQR
Xq2UagHWNm3HOS8o1zc7AaQ/z3+znrmcVSWj3rbxnrtOFZC7JiUSlLXDTYNZ6XE/
CLfvATGbugxBnGlIvwIDAQABoAAwDQYJKoZIhvcNAQEFBQADggEBAE2DlkFKb09Z
lF0uVyaJ3KZu5mkU1NVNvIjHEAfwrsG6XtYqUYogzG3idZSdVdImxLuH46Bz6K1P
5PO44/2PnAQXD7obY5lIEnbnHTDl4O7GwSgPbS5sFYgEt7m1ZLoS7hkFze2RM1mL
JX4VpKgK4S8ytDPQ2AOY5tglcJqCE5xUBYRU6IyEBnaunQnByx2HFETfCjMqvdAb
IbShIwD1M/xrErqAzW4Usiv7LA+BsWtsTnEPVDlQ3GAJ5pcSUKERysdvgNcRfmEo
dGklMlBt+e1gneuT6wHHOkR+uxN6WfcxGdVRgaJQvOxrjBuf948F/8iZKLlkgI9e
ErlvEHjbo1w=
-----END CERTIFICATE REQUEST-----
--=451b80dc-996f-432e-9e4f-2b29ef6d1141=--
***Q:MULTI-STRING VDSM_CERTIFICATE_CHAIN --=451b80dc-996f-432e-9e4f-2b29ef6d1141=-- --=451b80dc-996f-ABORT-9e4f-2b29ef6d1141=--
###
###
### Please input VDSM certificate chain that matches certificate request, top is issuer
###
### type '--=451b80dc-996f-432e-9e4f-2b29ef6d1141=--' in own line to mark end, '--=451b80dc-996f-ABORT-9e4f-2b29ef6d1141=--' aborts
DqQRXq2UagHWNm3HOS8o1zc7AaQ/
z3+znrmcVSWj3rbxnrtOFZC7JiUSlLXDTYNZ6XE/CLfvATGbugxBnGlIvwIDAQAB
o4IBmDCCAZQwHQYDVR0OBBYEFO+jb9PMFBR9hqGe7vYEhL+SJgreMIGdBggrBgEF
BQcBAQSBkDCBjTCBigYIKwYBBQUHMAKGfmh0dHA6Ly9qYi1yaGV2bTM1LnJoZXYu
bGFiLmVuZy5icnEucmVkaGF0LmNvbTo4MC9vdmlydC1lbmdpbmUvc2VydmljZXMv
cGtpLXJlc291cmNlP3Jlc291cmNlPWNhLWNlcnRpZmljYXRlJmZvcm1hdD1YNTA5
LVBFTS1DQTCBlQYDVR0jBIGNMIGKgBSpqLNZHQ3vhn1hU/AVdwwPo+93+qFupGww
ajELMAkGA1UEBhMCVVMxJDAiBgNVBAoTG3JoZXYubGFiLmVuZy5icnEucmVkaGF0
LmNvbTE1MDMGA1UEAxMsamItcmhldm0zNS5yaGV2LmxhYi5lbmcuYnJxLnJlZGhh
dC5jb20uNDk0ODeCAhAAMAkGA1UdEwQCMAAwDgYDVR0PAQH/BAQDAgWgMCAGA1Ud
JQEB/wQWMBQGCCsGAQUFBwMBBggrBgEFBQcDAjANBgkqhkiG9w0BAQUFAAOCAQEA
KzoCH5/XPJbLnb3zxbNzp+77OQgNl7F5zNM0hhFTNcOLUAaQhywCxXnpRlWHcUL5
cVyiCN56CQq1SQMEsyP3Q9QJmDvYQTPB3uwgizSJ8rOqf7/PXPTLO/T4hDh5Z0Kw
S85Rk2QV+o5xnD4TYImQ3X9Q9D0qbkpGzSQ1GmIavQpmK9n+fpoqavEOCHRjd6n/
5qpqNpI4a9w87lsHleWCCTDvrXplLAkOs5I8MmfwI69aGiXwAThkKZkiKPFAd5K3
2ugaGftvAR+ikKK41jXrN8tOwBjNMiExzG55iYIcx6H9gRtzAF/tz4j3t469rpJN
LxCCxO8mZDhpwzewneiFJw==
-----END CERTIFICATE-----
--=451b80dc-996f-ABORT-9e4f-2b29ef6d1141=--
***L:WARNING Aborted
***L:INFO Stage: Pre-termination
***Q:STRING TERMINATION_COMMAND
###
### Processing ended, use 'quit' to quit
### COMMAND> 
^C***L:ERROR Failed to execute stage 'Pre-termination': SIG2
***L:INFO Stage: Termination
***TERMINATE
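
The transcript above mixes log lines, prompts, and certificate data, which makes the warnings and errors easy to miss. A minimal sketch for pulling out only the `***L:<LEVEL>` lines follows. The line format is inferred solely from this transcript and is an assumption, not a documented otopi interface; `log_events` is a hypothetical helper name.

```python
import re

# Assumed format, taken from the transcript: "***L:<LEVEL> <message>".
# search() is used instead of match() because a line may carry a prefix
# such as the "^C" echoed by the terminal.
LOG_LINE = re.compile(r"\*\*\*L:([A-Z]+) (.*)")


def log_events(transcript, levels=("WARNING", "ERROR")):
    """Return (level, message) pairs for log lines at the given levels."""
    events = []
    for line in transcript.splitlines():
        match = LOG_LINE.search(line)
        if match and match.group(1) in levels:
            events.append((match.group(1), match.group(2)))
    return events


sample = """\
***L:INFO Stage: Misc configuration
### Setting up PKI
***L:WARNING Aborted
***L:INFO Stage: Pre-termination
^C***L:ERROR Failed to execute stage 'Pre-termination': SIG2
"""

for level, message in log_events(sample):
    print(level, message)
```

Applied to the sample, this prints only the WARNING and ERROR lines and skips the INFO stages and the `###` notes.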

Comment 3 Jiri Belka 2014-10-06 13:40:58 UTC
FYI, the host-deploy log is totally useless. Is there a way to trace/debug what is going on during host deployment?

Comment 4 Jiri Belka 2014-10-06 13:45:22 UTC
I discovered that /etc/init.d/libvirtd is missing on these hypervisors.

Comment 5 Fabian Deutsch 2014-10-06 13:49:46 UTC
Hey Jiri, yes, we discovered that libvirtd is missing on el6.

This will be fixed in the next build.

Comment 6 Fabian Deutsch 2014-10-06 13:51:39 UTC

*** This bug has been marked as a duplicate of bug 1149658 ***

Comment 7 Fabian Deutsch 2014-10-06 15:12:10 UTC
Re-Opening this because this might be something else.

Comment 8 Alon Bar-Lev 2014-10-06 15:20:40 UTC
(In reply to Jiri Belka from comment #3)
> FYI, the host-deploy log is totally useless. Is there a way to trace/debug
> what is going on during host deployment?

Everything that host-deploy does is available in its log. What else do you expect?

Comment 9 Alon Bar-Lev 2014-10-06 15:22:35 UTC
(In reply to Fabian Deutsch from comment #7)
> Re-Opening this because this might be something else.

I cannot see what is happening because ovirtfunctions kills the log. It should be solved in the next build, in which ovirtfunctions is no longer used.

But it may be related to other node issues.

Comment 10 Petr Kubica 2014-10-07 10:25:33 UTC
(In reply to Douglas Schilling Landgraf from comment #1)
> Hi Petr,
> 
> In RHEV-H do you still have network? Could you please check if you have the
> interfaces? What's the output of ip addr show or ifconfig? 
> 
> On engine.log I see some:
> 2014-10-06 13:29:57,252 ERROR
> [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo]
> (DefaultQuartzScheduler_Worker-76) Failure to refresh Vds runtime info:
> org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException:
> VDSGenericException: VDSNetworkException: Message timeout which can be
> caused by communication issues'

It's difficult to answer. In RHEV-H, all network interfaces show the status 'Unconfigured' after I tried to add RHEV-H in the TUI, but I can still ssh to the host and the ifconfig output seems OK.
The outputs of ifconfig and ip addr show are in the attachments.

Comment 11 Petr Kubica 2014-10-07 10:25:57 UTC
Created attachment 944509 [details]
ifconfig

Comment 12 Petr Kubica 2014-10-07 10:26:29 UTC
Created attachment 944510 [details]
ip addr show

Comment 13 Petr Kubica 2014-10-07 10:29:56 UTC
Created attachment 944523 [details]
rhev-h admin tui

Comment 14 Petr Kubica 2014-10-07 10:34:03 UTC
Created attachment 944524 [details]
rhev-h tui eth0

Comment 15 Jiri Belka 2014-10-08 10:18:20 UTC
I tried to approve a 3.4.2 rhevh (20140821.1.el6ev) into a 3.5 (vt4) setup, with the cluster at the 3.4 level, and it failed.

...
2014-10-08 12:09:31,488 INFO  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (VdsDeploy) Correlation ID: 755e70e3, Call Stack: null, Custom Event ID: -1, Message: Installing Host dell-r210ii-13.rhev.lab.eng.brq.redhat.com. Retrieving installation logs to: '/var/log/ovirt-engine/host-deploy/ovirt-20141008120931-10.34.62.205-755e70e3.log'.
2014-10-08 12:09:31,650 INFO  [org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient] (SSL Stomp Reactor) Connecting to /10.34.62.205
2014-10-08 12:09:33,655 ERROR [org.ovirt.vdsm.jsonrpc.client.reactors.Reactor] (SSL Stomp Reactor) Unable to process messages: java.io.IOException: Broken pipe
        at sun.nio.ch.FileDispatcherImpl.write0(Native Method) [rt.jar:1.7.0_65]
        at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47) [rt.jar:1.7.0_65]
        at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93) [rt.jar:1.7.0_65]
        at sun.nio.ch.IOUtil.write(IOUtil.java:65) [rt.jar:1.7.0_65]
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:487) [rt.jar:1.7.0_65]
        at org.ovirt.vdsm.jsonrpc.client.reactors.SSLEngineNioHelper.write(SSLEngineNioHelper.java:95) [vdsm-jsonrpc-java-client.jar:]
        at org.ovirt.vdsm.jsonrpc.client.reactors.SSLClient.write(SSLClient.java:90) [vdsm-jsonrpc-java-client.jar:]
        at org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient.processOutgoing(ReactorClient.java:199) [vdsm-jsonrpc-java-client.jar:]
        at org.ovirt.vdsm.jsonrpc.client.reactors.ReactorClient.process(ReactorClient.java:170) [vdsm-jsonrpc-java-client.jar:]
        at org.ovirt.vdsm.jsonrpc.client.reactors.SSLClient.process(SSLClient.java:115) [vdsm-jsonrpc-java-client.jar:]
        at org.ovirt.vdsm.jsonrpc.client.reactors.Reactor.processChannels(Reactor.java:86) [vdsm-jsonrpc-java-client.jar:]
        at org.ovirt.vdsm.jsonrpc.client.reactors.Reactor.run(Reactor.java:62) [vdsm-jsonrpc-java-client.jar:]

2014-10-08 12:12:33,736 ERROR [org.ovirt.engine.core.vdsbroker.vdsbroker.PollVDSCommand] (ajp-/127.0.0.1:8702-9) [755e70e3] org.ovirt.engine.core.vdsbroker.vdsbroker.VDSNetworkException: VDSGenericException: VDSNetworkException: Message timeout which can be caused by communication issues'
...

...
5: rhevm: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN 
    link/ether d4:ae:52:c7:1a:39 brd ff:ff:ff:ff:ff:ff
    inet 10.34.62.205/22 brd 10.34.63.255 scope global rhevm
    inet6 2620:52:0:223c:d6ae:52ff:fec7:1a39/64 scope global dynamic 
       valid_lft 2591654sec preferred_lft 604454sec
    inet6 fe80::d6ae:52ff:fec7:1a39/64 scope link 
       valid_lft forever preferred_lft forever

Comment 16 Oved Ourfali 2014-10-08 10:37:11 UTC
You shouldn't get jsonrpc code running if the cluster level is 3.4. Can you make sure, through the new/edit host dialog, that the hosts don't have it enabled?

Comment 17 Petr Kubica 2014-10-08 15:54:29 UTC
My cluster is at 3.5. When I turned off the JSON protocol and reinstalled the host, I got the error: "compatible with versions (3.0,3.1,3.2,3.3,3.4) and cannot join Cluster Default which is set to version 3.5". I then created a cluster with compatibility version 3.4 and tried to add the host, and that was successful.
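
The error quoted in this comment amounts to a membership check: the cluster's compatibility version has to be one of the versions the host reports. A minimal sketch of that check, assuming versions are plain dotted strings; `parse_version` and `can_join` are hypothetical names for illustration, not the engine's actual code.

```python
def parse_version(text):
    """Turn '3.5' into (3, 5) so versions compare as numbers, not strings."""
    return tuple(int(part) for part in text.split("."))


def can_join(host_supported, cluster_level):
    """True if the cluster's compatibility level is one the host supports."""
    supported = {parse_version(version) for version in host_supported}
    return parse_version(cluster_level) in supported


# Version list copied from the error message in this comment.
host_supported = ["3.0", "3.1", "3.2", "3.3", "3.4"]

print(can_join(host_supported, "3.5"))  # False: host cannot join a 3.5 cluster
print(can_join(host_supported, "3.4"))  # True: a 3.4 cluster accepts the host
```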

Comment 18 Piotr Kliczewski 2014-10-09 07:00:19 UTC
This is expected behavior. Older versions are not able to understand jsonrpc, which is the default protocol for clusters with the compatibility version set to 3.5.

We understand that this is confusing for users so I pushed http://gerrit.ovirt.org/#/c/33728/ which falls back to xmlrpc.
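
The fall back described here can be pictured as "try jsonrpc first, retry with xmlrpc if the host does not answer". The sketch below only illustrates that idea. It is not the code from the gerrit change; `connect`, `try_protocol`, and `ProtocolError` are hypothetical names, and a real client would open a connection and could time out rather than fail immediately.

```python
class ProtocolError(Exception):
    """Raised when a host does not answer on the attempted protocol."""


def connect(host_protocols, preferred="jsonrpc", fallback="xmlrpc"):
    """Return the name of the protocol that reached the host.

    host_protocols is the set of protocols the (simulated) host understands.
    """
    def try_protocol(name):
        # Stand-in for a real connection attempt.
        if name not in host_protocols:
            raise ProtocolError(f"host does not speak {name}")
        return name

    try:
        return try_protocol(preferred)
    except ProtocolError:
        # Older hosts: retry once with the legacy protocol.
        return try_protocol(fallback)


print(connect({"xmlrpc"}))             # older host, ends up on xmlrpc
print(connect({"jsonrpc", "xmlrpc"}))  # newer host, stays on jsonrpc
```

If the host understands neither protocol, the second attempt raises `ProtocolError`, so the failure stays visible to the caller instead of being swallowed.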

Comment 19 Jiri Belka 2014-10-09 07:26:47 UTC
Comment #15 states that it is a 3.4 cluster level in a 3.5 rhevm. I said _approve_, meaning the host is added from the TUI and then approved.

Comment 20 Piotr Kliczewski 2014-10-13 18:05:26 UTC
The fix for this issue was merged on 22 September: http://gerrit.ovirt.org/#/c/33137/.

Can you please check whether your engine version contains this fix?

Comment 21 Oved Ourfali 2014-10-14 10:21:23 UTC
Reducing urgency. There is an issue here, but it will only affect 3.4 hosts in 3.5 clusters, which isn't supposed to work anyway. We should provide the non-operational reason, and that's indeed what we'll do.

Comment 22 Piotr Kliczewski 2014-10-24 09:10:34 UTC
*** Bug 1156322 has been marked as a duplicate of this bug. ***

Comment 23 Petr Kubica 2014-10-30 13:31:28 UTC
Verified in 3.5.0-0.18.beta.el6ev

Comment 24 Eyal Edri 2015-02-17 17:08:43 UTC
RHEV 3.5.0 was released. Closing.

