In RFE bug 970711 we are to report the migration downtime. Libvirt reports it to vdsm, but the value does not take clock differences between src and dst into account. It is computed in a simple way: a src clock value is passed along, and the dst host subtracts it from its local time to get how long it took (this is a part of the total reported number).

We therefore depend on precise time synchronization between hosts. For most migrations the downtime values are on the order of tens of milliseconds, so we need to make sure that the src and dst host clocks are synchronized with at least that precision. Otherwise the reported value is biased beyond being meaningful.

Customers do use the parameter for realtime workloads where they have a hard requirement on the allowed downtime, e.g. 100ms, which we are supposed to set as a migration convergence criterion.

There might be other ways to make sure the time is in sync, but:
- in <3.2 we used to enable ntpd (depending on the admin's proper setup)
- current code monitors for host time differences, but only in a naive way, with seconds precision and a default of 300s to alert

We need better time sync reporting, and we should either help with setting it up correctly or manage it ourselves. E.g. if DHCP provides NTP server addresses we should start ntpd.
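To illustrate why the skew matters, here is a minimal Python sketch of the subtraction described above. It is not the actual libvirt or vdsm code; the function names and the simulated timestamps are hypothetical, and the only thing it models is "dst local time minus src clock value".

```python
def naive_downtime_ms(src_pause_ms, dst_resume_ms):
    """Downtime as described above: dst local time minus the src clock value."""
    return dst_resume_ms - src_pause_ms


def simulate(true_downtime_ms, dst_clock_offset_ms, src_pause_ms=1_000_000):
    """Hypothetical simulation: the dst clock reads (src clock + offset)."""
    # Timestamp the dst host would read from its own clock at resume.
    dst_resume_ms = src_pause_ms + true_downtime_ms + dst_clock_offset_ms
    return naive_downtime_ms(src_pause_ms, dst_resume_ms)


if __name__ == "__main__":
    for offset in (0, 50, -50):
        print("offset %+d ms -> reported %d ms" % (offset, simulate(20, offset)))
```

With a real downtime of 20ms, a dst clock that runs 50ms ahead makes the reported value 70ms, and one that runs 50ms behind makes it negative, so any skew comparable to the downtime itself makes the number unusable.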
Just to be sure, since this bug has been assigned to me but the whiteboard says infra: is the request here to add the ntp service as an ovirt-engine-setup dependency and have engine-setup configure and start ntpd? Looking at the description, this doesn't seem to be an installer issue, since the ntpd daemon should be running on the hosts, so ovirt-host-deploy looks like a better component for this issue.
as we discussed in irc, host-deploy cannot just enable the ntpd service and assume that ntpd is functioning. yes, there is a chance that a valid ntpd configuration is available via dhcp, but that is not an assumption we should make; it is for the sysadmin to decide, not us. if we require it as a feature of vdsm we must be sure that it is functioning. we should also not assume which ntp service the sysadmin uses to sync clocks, there are multiple choices out there. these are all minor but important technicalities.

the more important issue is that host-deploy is just automation of vdsm setup, nothing more. if you added a feature of *vdsm* that *requires* clock synchronization, then vdsm should take care of ntpd management, just as it takes care of iscsi or any other dependency. this will make sure that even if the sysadmin stopped ntpd after host-deploy or removed it from the start-at-boot list, starting vdsm will trigger ntpd start.

I would also suggest that every timestamp that is sent and that requires clock synchronization also carries a boolean saying whether the clock is indeed synchronized or not, so the manager can consider only those that were actually synchronized.

I myself would have tried very hard to implement this feature without any need for clock synchronization, by calculating downtime differences and not absolute times. I am not sure what data is available to you, but each host should know the time when its process starts and the time when its process ends, so it could calculate its own delta, and the delta between itself and the remote host, even if clocks are not synchronized.
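One possible way to get "the delta between itself and the remote host" without synchronized clocks is an NTP-style round-trip offset estimate, sketched below in Python. This is a swapped-in, well-known technique and not something vdsm or libvirt is known to do; every name here (`estimate_offset_ms`, `corrected_downtime_ms`, `DowntimeReport`, `usable_reports`) is hypothetical, and the estimate assumes the network delay is the same in both directions. The `clock_synchronized` field models the suggested boolean sent with each timestamp.

```python
from dataclasses import dataclass


@dataclass
class DowntimeReport:
    downtime_ms: float
    clock_synchronized: bool  # the suggested flag sent along with the value


def estimate_offset_ms(t0, t1, t2, t3):
    """Estimate (dst clock - src clock) from one request/reply exchange.

    t0: src sends request    (src clock)
    t1: dst receives request (dst clock)
    t2: dst sends reply      (dst clock)
    t3: src receives reply   (src clock)
    Assumes symmetric network delay in both directions.
    """
    return ((t1 - t0) + (t2 - t3)) / 2.0


def corrected_downtime_ms(src_pause_ms, dst_resume_ms, offset_ms):
    """Translate the dst timestamp into the src clock before subtracting."""
    return (dst_resume_ms - offset_ms) - src_pause_ms


def usable_reports(reports):
    """Manager side: keep only values whose clocks were flagged as in sync."""
    return [r for r in reports if r.clock_synchronized]


if __name__ == "__main__":
    # Simulated exchange: dst clock is 50 ms ahead, 2 ms delay each way,
    # 1 ms processing time on dst.
    offset = estimate_offset_ms(t0=1000, t1=1052, t2=1053, t3=1005)
    print(offset)
    print(corrected_downtime_ms(2000, 2070, offset))
```

The accuracy of such an estimate is bounded by the asymmetry of the network delay, so it would still be worth reporting how the offset was obtained alongside the downtime value.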
note this can be solved by implementing bug 1162588
*** This bug has been marked as a duplicate of bug 1162588 ***