Cause: an integer overflow on a 32-bit timer value
Consequence: infinite loop in spice-server on long-running VMs (> 46 days), causing SPICE sessions to become unresponsive
Fix: use 64-bit timer values where appropriate
Result:
Description of problem:
static void _spice_timer_set(SpiceTimer *timer, uint32_t ms, uint32_t now)
The _spice_timer_set() function takes a 32-bit integer for the "now" value. The value passed in, however, can exceed 2^32: it is in milliseconds and derived from CLOCK_MONOTONIC, which wraps around a 32-bit integer after around 46 days.
If the now value passed in exceeds 2^32, timers are inserted into the active list with expiry values before the current time. They trigger immediately and, if they don't make themselves inactive, are reinserted with an expiry that is still before the current time.
This leads to an infinite loop in spice_timer_queue_cb().
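The mechanism described above can be illustrated with a small sketch. This is not the spice-server code: buggy_expiry() and rearm_count() are hypothetical names, and the sketch assumes the expiry is computed as now + ms and later compared against a 64-bit current time. It only shows how truncating the now value to 32 bits puts the expiry in the past, so a timer that re-arms itself keeps firing.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a timer-set routine with a 32-bit "now" parameter.
 * The caller holds a 64-bit millisecond clock; the cast below stands in for
 * the implicit truncation at the call site. */
static uint64_t buggy_expiry(uint32_t ms, uint64_t now_ms)
{
    uint32_t now = (uint32_t)now_ms;      /* high 32 bits are lost here */
    return (uint64_t)(uint32_t)(now + ms); /* expiry computed in 32 bits */
}

/* Count how many times a self-re-arming timer fires while the clock stays
 * at now_ms. The cap keeps the demonstration finite; without it the loop
 * would never end once the clock has passed 2^32 ms. */
static int rearm_count(uint64_t now_ms, uint32_t ms, int cap)
{
    int fired = 0;
    uint64_t expiry;

    assert(cap > 0);
    expiry = buggy_expiry(ms, now_ms);
    while (expiry <= now_ms && fired < cap) {
        fired++;                           /* timer callback runs */
        expiry = buggy_expiry(ms, now_ms); /* callback re-arms the timer */
    }
    return fired;
}
```

With a clock value below 2^32 the timer does not fire early and rearm_count() returns 0; with a value just past 2^32 it returns the cap, which corresponds to the endless loop reported here.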
Version-Release number of selected component (if applicable):
spice-server-0.12.4-6.el6.x86_64
(but the same bug is in upstream git).
Created attachment 870759
Tentative fix
Attaching a tentative fix for the bug. This also adds some extra casts to make sure we can't truncate time values in a few other places.
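For readers without access to the attachment, the general shape of such a change might look like the sketch below. It is not the attached patch: DemoTimer, demo_timer_set(), timespec_to_ms() and monotonic_now_ms() are hypothetical names. The sketch assumes the approach is to carry the current time as a 64-bit millisecond value and to widen each term before doing arithmetic on it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Convert a timespec to milliseconds. Both terms are cast to 64 bits
 * before the arithmetic so nothing is truncated on the way. */
static uint64_t timespec_to_ms(const struct timespec *ts)
{
    return (uint64_t)ts->tv_sec * 1000u + (uint64_t)ts->tv_nsec / 1000000u;
}

/* Current CLOCK_MONOTONIC time in milliseconds, as a 64-bit value. */
static uint64_t monotonic_now_ms(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

    assert(rc == 0);
    (void)rc; /* silence unused warning when NDEBUG is defined */
    return timespec_to_ms(&ts);
}

/* Hypothetical timer record; only the fields needed for the sketch. */
typedef struct DemoTimer {
    uint64_t expiry_time; /* absolute expiry, in ms */
    int is_active;
} DemoTimer;

/* Same idea as the quoted signature, but "now" is 64 bits wide, so the
 * expiry stays ahead of the current time after the clock passes 2^32 ms. */
static void demo_timer_set(DemoTimer *timer, uint32_t ms, uint64_t now)
{
    timer->is_active = 1;
    timer->expiry_time = now + ms;
}
```

A caller would pass monotonic_now_ms() as the now argument; the interval ms can stay 32 bits wide because only the absolute time needs the extra range.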
Comment 4 Marc-Andre Lureau
2014-03-07 12:57:23 UTC
Patch looks good to me; moving to POST to reflect that.
Comment 5 Christophe Fergeau
2014-03-10 11:45:39 UTC
Ok, I'm a bit baffled as to how the bug is not appearing under those conditions.
Do you have an active spice console to the VM? Is it responding correctly?
(In reply to David Gibson from comment #12)
> Ok, I'm a bit baffled as to how the bug is not appearing under those
> conditions.
>
> Do you have an active spice console to the VM? Is it responding correctly?
Hm, yes. I opened two SPICE consoles to the VMs, which have been running since January 31st, and they seem fine and responsive; the host seems fine as well. Since we are not able to reproduce the bug, I suggest SanityOnly verification, provided the fix solved the customer's problem.
Ok, I'm really baffled as to how those systems can fail to show the bug.
Could you use gcore to grab a core from the running qemu processes on that high-uptime system so I can investigate what the triggering factor is?
Comment 18 Christophe Fergeau
2014-05-30 13:26:20 UTC
David, did you get a chance to look at Marian's core?
Sorry, I've been busy.
I did take a look at the core, but it wasn't as useful as I hoped. Without a running process, or a core at exactly the right moment, it's very difficult to tell why this bug isn't triggering.
I don't really have time to track this down more thoroughly, so I guess we'll just have to go with the sanity checking we've had.
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.
For information on the advisory, and where to find the updated
files, follow the link below.
If the solution does not work for you, open a new bug report.
http://rhn.redhat.com/errata/RHBA-2014-1435.html