Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1148663

Summary: After host reboot engine gets VDSNetworkException: Message timeout which can be caused by communication issues
Product: Red Hat Enterprise Virtualization Manager
Reporter: Gal Amado <gamado>
Component: ovirt-engine
Assignee: Piotr Kliczewski <pkliczew>
Status: CLOSED DUPLICATE
QA Contact: Pavel Stehlik <pstehlik>
Severity: urgent
Docs Contact:
Priority: unspecified
Version: 3.5.0
CC: acanan, ecohen, gklein, iheim, lpeer, lsurette, oourfali, pkliczew, rbalakri, Rhev-m-bugs, yeylon
Target Milestone: ---
Target Release: 3.5.0
Hardware: Unspecified
OS: Unspecified
Whiteboard: infra
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2014-10-06 08:46:14 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Infra
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Gal Amado 2014-10-02 04:45:01 UTC
Description of problem:
Rebooting a host causes a VDSNetworkException on the engine.
As a result, the host stays in a faulty "not responding" status indefinitely (tested for more than 15 hours) in the engine's admin GUI.

Version-Release number of selected component (if applicable):
Red Hat Enterprise Virtualization Manager Version: 3.5.0-0.13.beta.el6ev
vdsm: vdsm-4.16.5-2.el6ev.x86_64


How reproducible:
Happens all the time.

Steps to Reproduce:
Setup: an up-and-running engine with 1 host.
1. On the host, run "reboot".
2. Wait for the vdsm service to come up, plus some reasonable idle time for the engine to update (about 5 minutes).
3. Check the engine log; it should be clean.
4. Check the host status in the engine's admin GUI; it should be OK (green).
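
For automation, steps 2 and 4 could be expressed as a polling loop along the lines below. This is only a sketch: `wait_for_status`, the `get_status` callable, and the "up" status string are hypothetical names chosen for illustration, not part of the engine or vdsm. The 300-second default mirrors the roughly 5 minutes of idle time mentioned in step 2.

```python
import time
from typing import Callable


def wait_for_status(
    get_status: Callable[[], str],
    expected: str = "up",        # assumed status string, adjust to your client
    timeout: float = 300.0,      # about 5 minutes, as in step 2
    interval: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll get_status() until it returns `expected` or `timeout` seconds pass.

    get_status is a caller-supplied (hypothetical) function that asks the
    engine for the current host status and returns it as a string.
    """
    deadline = clock() + timeout
    while True:
        if get_status() == expected:
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

With this bug present, such a loop would be expected to return False after the timeout, because the host never leaves the "not responding" state until the engine service is restarted.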

Actual results:
Exception in the engine log:
VDSNetworkException: Message timeout which can be caused by communication issues
In the engine's admin GUI:
- Host status is red
- Repeated message in the engine's event log: "Host1 is not responding ..."
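
A test could check the engine log for this exception with a small filter like the one below. It is a minimal sketch: `find_timeouts` is a hypothetical helper, and the log path in the usage comment is an assumption about a default installation, not something stated in this report.

```python
import re
from typing import Iterable, List

# Matches the exception text quoted in this report.
TIMEOUT_PATTERN = re.compile(r"VDSNetworkException: Message timeout")


def find_timeouts(lines: Iterable[str]) -> List[str]:
    """Return the log lines that contain the VDSNetworkException timeout."""
    return [line.rstrip("\n") for line in lines if TIMEOUT_PATTERN.search(line)]


# Usage (path is an assumption, adjust for your installation):
# with open("/var/log/ovirt-engine/engine.log") as log:
#     hits = find_timeouts(log)
```

An empty result after the reboot and idle period would correspond to the clean log expected in step 3.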
 
Expected results:
- Some time after the host is up and running, the host status should be OK in the admin GUI, with no exceptions in the engine log.


Additional info:
- Restarting the engine service (with "service ovirt-engine restart") makes the host appear OK again.
This bug blocks automation tests that reboot hosts.

Comment 1 Oved Ourfali 2014-10-05 11:11:44 UTC
Seems like a duplicate of Bug 1148688 (the description is different, but same symptoms).
Piotr - can you verify, and if so close it as duplicate?

Comment 2 Piotr Kliczewski 2014-10-06 08:46:14 UTC
Yes. This issue was already fixed for Bug 1148688.

*** This bug has been marked as a duplicate of bug 1148688 ***