Description of problem:
static void _spice_timer_set(SpiceTimer *timer, uint32_t ms, uint32_t now)
The _spice_timer_set() function takes a 32-bit integer for the "now" value. However, the value passed in can exceed 2^32: it is in milliseconds and derived from CLOCK_MONOTONIC, so it overflows an unsigned 32-bit integer after about 49.7 days.
If the now value passed in exceeds 2^32, it is truncated, so timers are inserted into the active list with expiry values before the current time. They trigger immediately and, unless they make themselves inactive, are reinserted with an expiry that is still before the current time.
This leads to an infinite loop in spice_timer_queue_cb().
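To illustrate the failure mode, here is a minimal sketch of the arithmetic. It is a simplified model, not the actual spice code: the helper names (set_expiry_truncated, set_expiry_wide, dispatch_pass) and the MAX_FIRES cap are hypothetical and exist only so the demonstration terminates. It assumes a single self-rearming timer and one dispatch pass at a fixed 64-bit "now".

```c
#include <stdint.h>

#define MAX_FIRES 1000  /* hypothetical cap so the demonstration terminates */

/* Models the buggy setter: "now" arrives as a 32-bit value, so any
 * high bits of the real 64-bit time have already been dropped. */
static uint64_t set_expiry_truncated(uint32_t ms, uint32_t now)
{
    return (uint64_t)now + ms;
}

/* Same arithmetic with a 64-bit "now": nothing is lost. */
static uint64_t set_expiry_wide(uint32_t ms, uint64_t now)
{
    return now + ms;
}

/* One dispatch pass at time now_ms over a single periodic timer that is
 * due right now and re-arms itself each time it fires.
 * Returns how many times it fired, capped at MAX_FIRES. */
static int dispatch_pass(uint64_t now_ms, uint32_t period_ms, int truncate)
{
    uint64_t expiry = now_ms;
    int fires = 0;

    while (expiry <= now_ms && fires < MAX_FIRES) {
        fires++;
        /* The timer re-arms itself from inside its callback. */
        expiry = truncate
            ? set_expiry_truncated(period_ms, (uint32_t)now_ms)
            : set_expiry_wide(period_ms, now_ms);
    }
    return fires;
}
```

With now_ms = 2^32 + 5000 and a 100 ms period, the truncated path computes an expiry of 5100, which is still far in the past, so the loop only stops at the cap. The 64-bit path fires once and exits. Below 2^32 both paths behave the same, which is why the problem only shows up after roughly 49.7 days of uptime.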
Version-Release number of selected component (if applicable):
(but the same bug is in upstream git).
Created attachment 870759
Attaching a tentative fix for the bug. This also adds some extra casts to make sure we can't truncate time values in a few other places.
Patch looks good to me; moving to POST to reflect that.
Sent upstream http://lists.freedesktop.org/archives/spice-devel/2014-March/016302.html
Ok, I'm a bit baffled as to how the bug is not appearing under those conditions.
Do you have an active spice console to the VM? Is it responding correctly?
(In reply to David Gibson from comment #12)
> Ok, I'm a bit baffled as to how the bug is not appearing under those
> conditions.
> Do you have an active spice console to the VM? Is it responding correctly?
Hm, yes. I opened two spice consoles to the VMs, which have been running since January 31st, and they seem fine and responsive; the host seems fine as well. Since we are not able to reproduce the bug, I suggest SanityOnly verification, provided the fix solved the customer's problem.
Ok, I'm really baffled as to how those systems can fail to show the bug.
Could you use gcore to grab a core from the running qemu processes on that high-uptime system so I can investigate what the triggering factor is?
David, did you get a chance to look at Marian's core?
Sorry, I've been busy.
I did take a look at the core, but it wasn't as useful as I hoped. Without a running process, or a core at exactly the right moment, it's very difficult to tell why this bug isn't triggering.
I don't really have time to track this down more thoroughly, so I guess we'll just have to go with the sanity checking we've had.
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.