Bug 1052921

Summary: novnc didn't connect because the clocks of websocket proxy and the host weren't in sync.
Product: Red Hat Enterprise Virtualization Manager Reporter: Ilanit Stein <istein>
Component: ovirt-websocket-proxy Assignee: Frantisek Kobzik <fkobzik>
Status: CLOSED CURRENTRELEASE QA Contact: Ilanit Stein <istein>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 3.3.0 CC: alonbl, bazulay, gklein, iheim, mavital, michal.skrivanek, yeylon, yzaslavs
Target Milestone: ---   
Target Release: 3.5.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: virt
Fixed In Version: ovirt-3.5.0-alpha2 Doc Type: Bug Fix
Doc Text:
The fix adds a 5-second time tolerance to the websocket proxy to prevent connection refusals caused by very small drifts between the clocks of the engine and the machine where the proxy is deployed.
Story Points: ---
Clone Of: Environment:
Last Closed: 2015-02-17 08:28:54 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1142923, 1156165    

Description Ilanit Stein 2014-01-14 12:05:02 UTC
Description of problem:

The noVNC console failed to connect when the websocket proxy was configured on a host other than the engine, because the clocks of the engine host and the websocket proxy server were not synced.

fkobzik: "The machine with engine was 40 seconds ahead of the host. So engine issues a ticket that is not yet valid on the host..." 

Version-Release number of selected component (if applicable):
is30

Comment 1 Michal Skrivanek 2014-01-15 13:07:09 UTC
IIRC there was an infra request to NTP-sync everything... is it still planned?
Even just reporting host clocks would be great, to see when they are out of sync.
I don't think we should fix this specific case, as there are quite likely other races like this.

Comment 2 Frantisek Kobzik 2014-01-15 15:02:28 UTC
But the problem is that we tolerate some drift (120 secs by default), but only "to the future". We have 0 tolerance to the past. And even if we have all hosts in sync using NTP, a small drift can still be present, which causes the ticket to be invalid.
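
To illustrate the asymmetry, here is a minimal sketch of such a validity check. It is a hypothetical model, not the actual ovirt-websocket-proxy code: the function name, the constant, and the assumption that the 120 seconds act as a window starting at the ticket's issue time are all illustrative.

```python
from datetime import datetime, timedelta

# Hypothetical model of the ticket check; names are illustrative and not
# taken from ovirt-websocket-proxy.
TICKET_LIFETIME = timedelta(seconds=120)  # assumed meaning of the 120 s default


def is_ticket_valid(issued_at: datetime, now: datetime) -> bool:
    """Accept the ticket only between its issue time and its expiry."""
    return issued_at <= now <= issued_at + TICKET_LIFETIME


# The same instant as seen by two clocks: the engine is 40 s ahead.
engine_now = datetime(2014, 1, 14, 12, 0, 40)
proxy_now = datetime(2014, 1, 14, 12, 0, 0)

# The ticket is stamped with the engine's clock, so the proxy sees a ticket
# issued 40 s in its own future and refuses it.
rejected = not is_ticket_valid(engine_now, proxy_now)
```

In this model the upper bound leaves 120 seconds of slack, while the lower bound leaves none, so any proxy clock that lags the engine at all can produce a refusal.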

Comment 3 Michal Skrivanek 2014-01-16 08:36:58 UTC
yeah, but if you're NTP-synced it should never exceed the RTT you need for getting the ticket, acting on it in the backend (blazing fast!), and then sending it

Comment 4 Alon Bar-Lev 2014-03-16 20:41:21 UTC
We can have a 5-second tolerance into the past; there is no need for more than that in a sane environment.

Clock synchronization is a must. Even in a disconnected environment, ntpd can be installed on the engine machine to sync the entire environment.
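
A rough sketch of what a 5-second tolerance into the past could look like, assuming timestamps in epoch seconds and a 120-second ticket lifetime. The names `accepts_ticket`, `TICKET_LIFETIME` and `PAST_TOLERANCE` are hypothetical and do not come from the proxy's code.

```python
# Hypothetical constants; the real proxy may name and store these differently.
TICKET_LIFETIME = 120.0  # seconds a ticket stays valid after it is issued
PAST_TOLERANCE = 5.0     # seconds the proxy clock may lag behind the engine


def accepts_ticket(valid_from: float, proxy_now: float,
                   past_tolerance: float = PAST_TOLERANCE) -> bool:
    """Return True if proxy_now falls inside the widened validity window."""
    earliest = valid_from - past_tolerance
    latest = valid_from + TICKET_LIFETIME
    return earliest <= proxy_now <= latest


if __name__ == "__main__":
    issued = 1_000_000.0  # engine timestamp of the ticket (arbitrary value)
    for lag in (0, 3, 5, 40):
        # lag = how many seconds the proxy clock is behind the engine clock
        print(lag, accepts_ticket(issued, issued - lag))
```

Under these assumptions a lag of a few seconds is accepted, while a 40-second lag like the one in the description is still refused, which is why clock synchronization remains necessary.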

Comment 5 Michal Skrivanek 2014-03-24 10:42:32 UTC
(In reply to Alon Bar-Lev from comment #4)
+1
there should be an infra feature to alert on nonsynchronized hosts
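
As a sketch of the idea only: such an alert could compare each host's reported clock with the engine's and flag the ones that are too far off. The function name, the input shape (host name mapped to epoch seconds) and the 5-second threshold are assumptions for illustration; no such engine API is implied.

```python
from typing import Dict, List


def unsynchronized_hosts(engine_time: float,
                         host_times: Dict[str, float],
                         max_offset: float = 5.0) -> List[str]:
    """Return the names of hosts whose clock differs from the engine's
    by more than max_offset seconds, in either direction."""
    return sorted(
        name
        for name, reported in host_times.items()
        if abs(reported - engine_time) > max_offset
    )


if __name__ == "__main__":
    # Made-up sample readings, all taken at the same instant.
    readings = {"host-a": 1000.0, "host-b": 960.0, "host-c": 1003.0}
    for name in unsynchronized_hosts(1000.0, readings):
        print("clock out of sync:", name)
```

A real check would also have to account for the time it takes to collect the readings, which this sketch ignores.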

Comment 6 Ilanit Stein 2014-08-12 14:15:50 UTC
Verified on ovirt-engine 3.5 -rc1.

Set the host on which the websocket proxy runs to 1 minute behind the engine, and saw that the noVNC console failed to connect.

Then synced the time back, and saw that the noVNC console works OK.

Comment 7 Michal Skrivanek 2014-09-01 12:49:47 UTC
Barak, I'm actually still interested in the answer;-)
Are there any plans to manage time keeping between hosts&engine?

Comment 8 Barak 2014-09-02 18:29:23 UTC
This is actually under discussion.
The original feature was to enable this as part of the engine setup (whether to install and configure ntpd) and then configure all hypervisors to sync with it.

There were various issues with such an implementation:
- it looks like the ntp config should be per DC, as a DC may be remote, which will influence sync
- post-installation this cannot be changed without redeploying each DC
and ...

Currently we are exploring the option of integrating with Foreman (through Puppet) to configure the hypervisors' ntp.

The time sync mostly influences migration ...

Comment 9 Omer Frenkel 2015-02-17 08:28:54 UTC
RHEV-M 3.5.0 has been released