Bug 1211236 - [upgrade] failed to start hosts once they upgraded - SSLEngine error
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 3.5.1
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Yaniv Bronhaim
QA Contact: Eldad Marciano
URL:
Whiteboard: infra
Depends On:
Blocks:
Reported: 2015-04-13 11:50 UTC by Eldad Marciano
Modified: 2016-02-10 19:01 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-05-06 07:43:36 UTC
oVirt Team: Infra
Target Upstream Version:
Embargoed:


Attachments
logs (946.75 KB, application/zip)
2015-04-29 08:26 UTC, Eldad Marciano

Description Eldad Marciano 2015-04-13 11:50:56 UTC
Description of problem:
Once the hosts were upgraded from vt14 to vt14.1, they failed to start and became "Non-responsive".
According to the logs, the problem seems to be related to a certificate:

2015-04-12 08:29:49,620 DEBUG [org.ovirt.vdsm.jsonrpc.client.internal.ResponseWorker] (ResponseWorker) Message received: {"jsonrpc":"2.0","error":{"message":"General SSLEngine problem","code":"host05-rack04.scale.openstack.engineering.redhat.com:738906710"},"id":null}
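
When triaging many such lines, it can help to pull the JSON-RPC error out of the log entry programmatically. The following is a minimal sketch, assuming the payload always follows the "Message received: " marker as in the line above; the function name extract_jsonrpc_error is hypothetical and not part of the engine or vdsm.

```python
import json


def extract_jsonrpc_error(log_line):
    """Return (message, code) from a 'Message received:' log line, or None.

    Assumes the JSON payload is everything after the marker, as in the
    engine log line quoted in this report.
    """
    marker = "Message received: "
    _, found, payload = log_line.partition(marker)
    if not found:
        return None
    try:
        body = json.loads(payload)
    except ValueError:
        # Payload was truncated or is not valid JSON.
        return None
    if not isinstance(body, dict):
        return None
    error = body.get("error")
    if not isinstance(error, dict):
        return None
    return error.get("message"), error.get("code")


# The log line quoted in the description above.
line = (
    '2015-04-12 08:29:49,620 DEBUG '
    '[org.ovirt.vdsm.jsonrpc.client.internal.ResponseWorker] (ResponseWorker) '
    'Message received: {"jsonrpc":"2.0","error":{"message":"General SSLEngine problem",'
    '"code":"host05-rack04.scale.openstack.engineering.redhat.com:738906710"},"id":null}'
)
print(extract_jsonrpc_error(line))
```

For this line the function returns the message "General SSLEngine problem" together with the host-qualified code, which is enough to group failures by host.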


Not sure what went wrong; this is a standard use case and should work fine.

When I tried to reinstall the hosts, they ran perfectly.

Version-Release number of selected component (if applicable):
vt14.1

How reproducible:
100%

Steps to Reproduce:
1. run engine on top of vt14.1
2. run vdsm using vt14 and upgrade to vt14.1 (add vt14.1 repo then yum update)
3. start the hosts via engine webadmin

Actual results:
Hosts fail to start and become non-responsive.

Expected results:
hosts start as expected with no errors

Additional info:
Reinstalling the hosts resolves the problem.

Comment 2 Oved Ourfali 2015-04-14 07:47:56 UTC
Eldad - Can you check vt14.3?
I suspect this may be related to:
Bug 1208752 - Vdsm upgrade 3.4 >> 3.5.1 doesn't restart vdsmd service

but I'm not sure.

Comment 3 Oved Ourfali 2015-04-14 07:50:33 UTC
In addition, can you attach all relevant logs?

Comment 4 Yaniv Bronhaim 2015-04-14 07:57:34 UTC
Please check with the latest vdsm for 3.5 (vdsm-4.16.13), as Oved already asked. vdsm.log, /var/log/messages, and /var/log/yum.log should be enough to figure out the errors.

Comment 5 Yaniv Bronhaim 2015-04-26 05:53:18 UTC
If it still appears, please reopen with the requested info.

Comment 6 Eldad Marciano 2015-04-29 08:25:26 UTC
I have installed vt14.3 and the problem still reproduces.
Logs will be attached.

Comment 7 Eldad Marciano 2015-04-29 08:26:13 UTC
Created attachment 1020024
logs

Comment 8 Piotr Kliczewski 2015-05-03 20:38:26 UTC
Please attach the engine logs as well.

Comment 10 Yaniv Bronhaim 2015-05-06 07:43:36 UTC
We see a certificate exception in vdsm.log caused by the installation flow. After a reinstall, the certificate is installed correctly by host-deploy on the host side.

The steps that led to this error: Eldad added this host to the engine, then manually removed the vdsm RPMs on the host and installed new ones (then did the upgrade, but that is not related). This flow is not expected to work without re-adding the host or reinstalling it from the engine (using host-deploy). Manual RPM installation requires the user to copy the engine's certificate as well. A host already being part of the engine setup does not mean it will keep working as expected after the user manually changes its configuration.
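
Before starting such a host from the engine, one could check whether the PKI files are still present on it. The sketch below is only an illustration: the three paths are assumed default vdsm locations and were not taken from the attached logs, and the helper name missing_pki_files is hypothetical. It only checks that the files exist and are non-empty; it does not verify that the certificate was actually issued by the engine's CA.

```python
import os

# Assumed default vdsm PKI locations; adjust if the host is configured
# differently. These paths are an assumption, not taken from this report.
ASSUMED_PKI_FILES = (
    "/etc/pki/vdsm/certs/cacert.pem",
    "/etc/pki/vdsm/certs/vdsmcert.pem",
    "/etc/pki/vdsm/keys/vdsmkey.pem",
)


def missing_pki_files(paths=ASSUMED_PKI_FILES):
    """Return the paths that are absent or empty, in the order given."""
    missing = []
    for path in paths:
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            missing.append(path)
    return missing


if __name__ == "__main__":
    for path in missing_pki_files():
        print("missing or empty:", path)
```

If anything is reported as missing, reinstalling the host from the engine, as described above, is the supported way to restore it.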

Comment 11 Christopher Pereira 2015-05-13 05:23:41 UTC
Confirmed. Removing the VDSM RPMs causes the certificates to be lost.
The solution is to put the host into maintenance mode and reinstall it.
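
To tell whether a host went through this flow, /var/log/yum.log can be scanned for vdsm packages that were erased. This is a rough sketch, assuming the usual yum.log line layout of "Mon DD HH:MM:SS Action: package"; the helper name vdsm_events is hypothetical, and the sample lines are invented for illustration and are not taken from the attached logs.

```python
import re

# Assumed yum.log layout: "Apr 12 08:20:01 Erased: vdsm"
_YUM_LINE = re.compile(
    r"^(\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) (Installed|Updated|Erased): (\S+)"
)


def vdsm_events(lines, action="Erased"):
    """Return (timestamp, package) pairs for vdsm packages with the given action."""
    events = []
    for line in lines:
        match = _YUM_LINE.match(line)
        if match and match.group(2) == action and match.group(3).startswith("vdsm"):
            events.append((match.group(1), match.group(3)))
    return events


# Hypothetical sample lines, for illustration only.
sample = [
    "Apr 12 08:20:01 Erased: vdsm",
    "Apr 12 08:20:02 Erased: vdsm-python",
    "Apr 12 08:25:10 Installed: vdsm-4.16.13-1.el6ev.x86_64",
    "Apr 12 08:26:00 Updated: libvirt-0.10.2-46.el6.x86_64",
]
for timestamp, package in vdsm_events(sample):
    print(timestamp, package)
```

On a real host, the lines would come from opening /var/log/yum.log; any "Erased" entry for a vdsm package after the host was added to the engine suggests a reinstall is needed.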

