Description of problem:
After doing a save/restore of a Xen guest, e.g. to reboot dom0, crond no longer starts cron jobs. Restarting crond after a save/restore works around the issue.

Version-Release number of selected component (if applicable):
vixie-cron-4.1-70.el5

Steps to Reproduce:
1. Have a dom0 with various (paravirt) Xen guests.
2. Set up a regular cron job in one of the guests.
3. Type "init 6" in dom0.
4. Watch the guests get saved to disk.
5. After the reboot, watch the guests get loaded.
6. Watch /var/log/cron in the guest with the test cron job.
7. Cron jobs do not get started after the guest restore.

Expected results:
Cron jobs continue to be started after a save/restore.

Additional info:
This seems to happen in both RHEL4 and RHEL5 paravirt guests.
I am not yet sure how to reliably reproduce this, or whether it also affects other programs. However, I did just notice that I had two-week-old email sitting in the queue, and kicking exim dislodged it. I have only seen that once, though, so it could be another fluke. The problem of crond not starting any cron jobs after a save/restore is something I have seen multiple times on 3 virtual machines, so that much is confirmed...
I just found other stuck processes, so it's not just crond.

# strace -p 24859
Process 24859 attached - interrupt to quit
restart_syscall(<... resuming interrupted call ...>

Unfortunately, I have no idea which syscall it is restarting. The good news is that the only syscalls that should take a long time in this process are nanosleep and wait4 - both of which are potential candidates for the crond trouble, too.
I have the same issue with migration at a customer site, with dom0s and domUs running RHEL5u1 x86_64. The first time a domU migrates to another dom0 it seems to be okay. The second migration succeeds according to xen migrate, but not all services/applications come up again. In this case the clock (date) is no longer running, and ssh access isn't there anymore either.
Here is some time information about the migration.
- server3 and server4 are the dom0s
- student08 is the domU migrated from server3 to server4

#Before migration
[root@satellite ~]# for i in server3 server4 student12 student08; do echo $i: `ssh $i date`; done
server3: Fri Feb 8 12:41:14 CET 2008
server4: Fri Feb 8 12:41:14 CET 2008
student12: Fri Feb 8 12:41:14 CET 2008
student08: Fri Feb 8 12:41:14 CET 2008

#After migration
[root@satellite ~]# for i in server3 server4 student12 student08; do echo $i: `ssh $i date`; done
server3: Fri Feb 8 12:41:22 CET 2008
server4: Fri Feb 8 12:41:22 CET 2008
student12: Fri Feb 8 12:41:22 CET 2008
student08: Fri Feb 8 12:48:16 CET 2008

#HW clock, checked at the same time.
[root@server3 virt]# hwclock --show
Fri 08 Feb 2008 12:51:50 PM CET  -0.739546 seconds
[root@server4 ~]# hwclock --show
Fri 08 Feb 2008 12:45:49 PM CET  -0.291921 seconds

################################
Fixing the HW clock: hwclock -w

#Before migration
[root@satellite ~]# for i in server3 server4 student12 student08; do echo $i: `ssh $i date`; done
server3: Fri Feb 8 13:06:00 CET 2008
server4: Fri Feb 8 13:06:00 CET 2008
student12: Fri Feb 8 13:06:00 CET 2008
student08: Fri Feb 8 13:06:00 CET 2008

#After migration
[root@satellite ~]# for i in server3 server4 student12 student08; do echo $i: `ssh $i date`; done
server3: Fri Feb 8 13:09:22 CET 2008
server4: Fri Feb 8 13:09:22 CET 2008
student12: Fri Feb 8 13:09:23 CET 2008
student08: Fri Feb 8 13:16:16 CET 2008

#HW clock
[root@satellite ~]# for i in server3 server4; do echo $i date: `ssh $i date`; echo $i hwclock: `ssh $i hwclock --show`; done
server3 date: Fri Feb 8 13:10:31 CET 2008
server3 hwclock: Fri 08 Feb 2008 01:10:33 PM CET -0.858803 seconds
server4 date: Fri Feb 8 13:10:32 CET 2008
server4 hwclock: Fri 08 Feb 2008 01:10:34 PM CET -0.707565 seconds
The same time is reported by multiple date commands over several seconds:

#Before migration
[root@satellite ~]# for i in server3 server4 student12 student08; do echo $i: `ssh $i date`; done
server3: Fri Feb 8 14:18:54 CET 2008
server4: Fri Feb 8 14:18:54 CET 2008
student12: Fri Feb 8 14:18:54 CET 2008
student08: Fri Feb 8 14:18:54 CET 2008

#After migration of student08 from server4 to server3, the date command:
[root@student08 ~]# date
Fri Feb 8 14:18:59 CET 2008
[root@student08 ~]# date
Fri Feb 8 14:18:59 CET 2008
[root@student08 ~]# date
Fri Feb 8 14:18:59 CET 2008
[root@student08 ~]# date
Fri Feb 8 14:18:59 CET 2008
[root@student08 ~]# date
Fri Feb 8 14:18:59 CET 2008
[root@student08 ~]# date
Fri Feb 8 14:18:59 CET 2008
[root@student08 ~]# date
Fri Feb 8 14:18:59 CET 2008
[root@student08 ~]# date
Fri Feb 8 14:18:59 CET 2008
[root@student08 ~]# date
Fri Feb 8 14:18:59 CET 2008
[root@student08 ~]# date
Fri Feb 8 14:18:59 CET 2008
[root@student08 ~]# date
Fri Feb 8 14:18:59 CET 2008
[root@student08 ~]# date
Fri Feb 8 14:18:59 CET 2008

Time diffs:
[root@satellite ~]# for i in server3 server4 student12 student08; do echo $i: `ssh $i date`; done
server3: Fri Feb 8 14:19:34 CET 2008
server4: Fri Feb 8 14:19:34 CET 2008
student12: Fri Feb 8 14:19:34 CET 2008
student08: Fri Feb 8 14:18:59 CET 2008
*** Bug 430245 has been marked as a duplicate of this bug. ***
I have experienced the same issue doing live migrations. Normally this is fixed by rebooting all the nodes in the cluster. Time on the guest is fine before the migration; once the migration completes, the clock on the guest will either be stuck (time will not change) or it will be set to some arbitrary time/date which I have not been able to correlate to anything of note. Using the date command has worked once or twice to get a stuck clock going again; however, if the clock is still running but at an incorrect value, nothing helps aside from shutting the guest down and starting it again.

Once the guest is broken I get this on the console, which may or may not be related:

BUG: soft lockup detected on CPU#1!

Call Trace:
 <IRQ> [<ffffffff802aaa83>] softlockup_tick+0xd5/0xe7
 [<ffffffff8026cb4a>] timer_interrupt+0x396/0x3f2
 [<ffffffff80210afe>] handle_IRQ_event+0x2d/0x60
 [<ffffffff802aae0b>] __do_IRQ+0xa4/0x105
 [<ffffffff80288712>] _local_bh_enable+0x61/0xc5
 [<ffffffff8026a90e>] do_IRQ+0xe7/0xf5
 [<ffffffff8025db2b>] child_rip+0x11/0x12
 [<ffffffff8039664c>] evtchn_do_upcall+0x86/0xe0
 [<ffffffff8025d8ce>] do_hypervisor_callback+0x1e/0x2c
 <EOI> [<ffffffff802063aa>] hypercall_page+0x3aa/0x1000
 [<ffffffff802063aa>] hypercall_page+0x3aa/0x1000
 [<ffffffff8026be87>] raw_safe_halt+0x84/0xa8
 [<ffffffff80269453>] xen_idle+0x38/0x4a
 [<ffffffff80247b8e>] cpu_idle+0x97/0xba
No, that softlockup is probably not related, and, in fact, that message is probably fixed in 5.2 by the patch in BZ 250994. The time stopping is definitely a problem, though. Chris Lalancette
I have just upgraded my home system to 5.2 beta and rebooted dom0. Inside the guests, crond got stuck and I took a crashdump of one of the guests. I'll try to get more info on what is going wrong soon.
We commonly experience this bug in GLS. Let me know if any resources from the training organization might be helpful in dealing with this issue. -Zak Brown
Hi Zak, one of the things you can do is take a crash dump of the domain after migration and analyze the domain to see what's going on. This is the first time I have used a crash dump to analyze a bug (I started kernel hacking before tools were available, and I still don't use them :)), so maybe you'll find something that I miss...
I experienced the same issue:
- Red Hat Enterprise Linux 5.1 Xen host
- Xen guest running Red Hat Enterprise Linux 4 U6, 64-bit

As a workaround for the varying time, I isolated the Xen guest from its host clock:

echo "1" > /proc/sys/xen/independent_wallclock

For persistence, add:

xen.independent_wallclock = 1

to /etc/sysctl.conf and activate with sysctl -p.

When migrating live Xen guests within a Cluster Suite environment, this solved the wandering time issue I was experiencing, with only a short pause on the guest. This does mean, however, that you will need to rely on NTP for your Xen guests to maintain time sync. Hope this is useful.
This sysctl.conf parameter worked fine for me on RHEL 5.1 guests on RHEL 5.1 hosts. No more clock issues with live migration.
I was at the customer site today and tested the independent_wallclock parameter with success. So at this customer, too, live migration works smoothly with RHEL 5.1 dom0s and domUs.
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
One report of this problem mentions that in their case, time goes forward or stalls when migrating to a system with greater uptime than the originating host, and shifts backward when migrating to a system with shorter uptime. Not sure if this is related/relevant or already known, but I didn't see it mentioned in this bugzilla.
Thanks for info.
For your information, I've got the same problem with Xen 3.2.1 from xen.org. I have only tested it with live migration: I move a VM from A to B and then back to A again, and the clock stops.
http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=1282
The echo "1" > /proc/sys/xen/independent_wallclock workaround works for me. I set this on both dom0s and the domU.
The independent_wallclock setting works fairly well for live migration, but has the issue that the wall clock time does not advance across a dom0 reboot (where the guests are saved and restored), which puts the guest several minutes behind for me. I suspect we should just reset the monotonic_tv values on restore and live migrate. This can cause time to go backwards in guests (if the dom0 clocks are out of sync), but at least things will continue to run. I will run some experiments with this.
Created attachment 311400 [details] patch to reset monotonic_tv.* on backwards time jump If the monotonic_tv time is more than 1/8th of a second (maximum drift ntpd allows) ahead of the hypervisor time, reset monotonic_tv. This patch is still untested.
Created attachment 311410 [details] add printk, timespec uses ns, timeval uses usec
Comment on attachment 311410 [details] add printk, timespec uses ns, timeval uses usec Since gettimeofday should never go backwards, I believe it is better to have it return the same time over and over again than make a backwards jump.
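The intent of the two attachments above can be sketched as follows. This is an illustration only, not the actual patch: the function and variable names (xtime_sample, monotonic_ns, hv_time_ns) and the flat-nanosecond representation are mine, but the two rules match the comments above: reset the stored monotonic value if it runs more than 1/8 s (the maximum drift ntpd allows) ahead of the hypervisor time, and otherwise never report a time earlier than the last one handed out.

```c
#include <stdint.h>

#define MAX_DRIFT_NS 125000000LL   /* 1/8 s, maximum drift ntpd allows */

/* Return the time gettimeofday should report, given the current
 * hypervisor time and the last value handed out (monotonic floor). */
int64_t xtime_sample(int64_t hv_time_ns, int64_t *monotonic_ns)
{
    if (*monotonic_ns - hv_time_ns > MAX_DRIFT_NS) {
        /* Stored value is impossibly far ahead (e.g. after a restore
         * onto a host with a smaller clock): reset it rather than keep
         * reporting a bogus future time forever. */
        *monotonic_ns = hv_time_ns;
        return hv_time_ns;
    }
    if (hv_time_ns < *monotonic_ns)
        return *monotonic_ns;       /* never step backwards */
    *monotonic_ns = hv_time_ns;
    return hv_time_ns;
}
```

With small backwards noise the caller keeps seeing the same time over and over (as comment above prefers), while a large jump triggers the reset.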
NTP syncing is no fix for this problem. Jiffies and realtime in the guest seem to advance normally after a live migration between two NTP synced dom0s, but the wakeup event still gets lost! I guess the main problem is that wakeups can get lost after a live migrate. Not having the time the same between two hosts can be fixed by NTP syncing the dom0s involved.
OK, it turns out NTP syncing does not always fix this problem. However, an observed time jump correlates very nicely with the difference in uptime between the host systems!

On the guest:

# while sleep 1 ; do date ; done
Wed Aug 6 14:45:48 EDT 2008
Sun Aug 10 20:38:32 EDT 2008

The dom0s in question:

[root@tethys ~]# uptime
 15:23:52 up 6 days, 6:25, 1 user, load average: 0.11, 0.07, 0.01
[root@kenny xen]# uptime
 15:23:54 up 2 days, 33 min, 1 user, load average: 0.00, 0.00, 0.00
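The correlation can be checked arithmetically from the transcript above; this is just a throwaway calculation, not part of any fix:

```c
/* Convert an uptime reported as "N days, HH:MM" to seconds. */
long uptime_seconds(long days, long hours, long minutes)
{
    return days * 86400 + hours * 3600 + minutes * 60;
}

/* Difference in host uptimes: tethys up 6 days 6:25, kenny up 2 days 0:33. */
long host_uptime_diff(void)
{
    return uptime_seconds(6, 6, 25) - uptime_seconds(2, 0, 33);
}

/* Observed guest clock jump: Aug 6 14:45:48 -> Aug 10 20:38:32,
 * i.e. 4 days, 5 hours, 52 minutes, 44 seconds. */
long observed_jump(void)
{
    return uptime_seconds(4, 5, 52) + 44;
}
```

The jump (366764 s) matches the uptime difference (366720 s) to within the one-minute precision of the uptime output.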
OK, found a problem. On the kernel side, get_time_values_from_xen() gets its time by looking at the vcpu_time_info struct that is exported by the hypervisor. This contains the field "system_time", which is the time in nanoseconds since system bootup - host system bootup! This means the guest keeps its clock by looking at how much time has elapsed since the host booted up. This totally breaks down in a live migrate, when the guest moves from a host with one uptime to a host with another uptime. I'll start a discussion upstream to see what kind of fix would be best.
Fixed with upstream changeset xen-unstable.hg:15706. It turns out it was a userspace bug, with the restore code overwriting some time data of the current hypervisor with data from the old hypervisor. The problem is that Xen time is calculated as (HV boot time + HV uptime), but on a restore the userland code would clobber the HV boot time with the boot time of the HV on which the guest previously ran. This screws up timekeeping. Avoiding that clobber avoids the problem.
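A toy model of the arithmetic described in the last two comments makes the failure mode easy to see. The struct and field names here are illustrative stand-ins, not the real Xen structures: guest wallclock is HV boot time plus HV uptime, and the restore bug was that the old HV's boot time got combined with the new HV's uptime.

```c
#include <stdint.h>

/* Illustrative stand-in for the time fields a PV guest reads from
 * the hypervisor (not the real Xen shared-info layout). */
struct hv_time {
    int64_t wc_boot_ns;    /* wallclock at hypervisor boot */
    int64_t system_time;   /* ns elapsed since hypervisor boot */
};

/* Guest wallclock = HV boot time + HV uptime. */
int64_t guest_wallclock(const struct hv_time *t)
{
    return t->wc_boot_ns + t->system_time;
}
```

With both hosts agreeing on the current wallclock, each host's pair is internally consistent, but mixing the old host's boot time with the new host's uptime shifts the guest clock by exactly the difference in host uptimes, which is precisely what comment #27's measurements showed.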
Release note added. If any revisions are required, please set the "requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.
it looks like this is slated for 5.3. any chance of an interim fix until then?
Making hotfix RPMs available is something only support can do. However, as an engineer I can provide *test* RPMs. Note that these contain all kinds of other commits (they are a snapshot of RHEL 5.3 development), some of which may end up being reverted before 5.3 comes out. If you still want to test them, you can get a test RPM from http://people.redhat.com/riel/.bz426861/ If you feel adventurous, feel free to try it out. Please let us know whether or not the RPM works for you. With NTP-synced dom0s and the test RPM, live migration of paravirtualized guests should work correctly.
is NTP syncing of the clocks necessary? or just the updated userland restore tool?
Joe, a paravirtualized Xen guest derives its clock from the hypervisor clock. As a consequence, if you have two host systems with their clocks out of sync, a live migration will cause a clock jump in the guest. If the clocks of both hosts are synced (preferably using ntp), everything will work fine.
*** Bug 459384 has been marked as a duplicate of this bug. ***
Rik, there is no i386 RPM in your test download, and I had trouble building one:

+ chmod +x tools/check/check_libvncserver
+ CFLAGS='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i386 -mtune=generic -fasynchronous-unwind-tables'
+ /usr/bin/make XENFB_TOOLS=n XEN_PYTHON_NATIVE_INSTALL=1 DESTDIR=/var/tmp/xen-3.0.3-69.0-root tools docs
/usr/bin/make -C tools install
make[1]: Entering directory `/usr/src/redhat/BUILD/xen-3.1.0-src/tools'
/usr/bin/make -C check
make[2]: Entering directory `/usr/src/redhat/BUILD/xen-3.1.0-src/tools/check'
make[2]: *** ../../.config: Is a directory.  Stop.
make[2]: Leaving directory `/usr/src/redhat/BUILD/xen-3.1.0-src/tools/check'
make[1]: *** [check] Error 2
make[1]: Leaving directory `/usr/src/redhat/BUILD/xen-3.1.0-src/tools'
make: *** [install-tools] Error 2
error: Bad exit status from /var/tmp/rpm-tmp.4331 (%build)

RPM build errors:
Bad exit status from /var/tmp/rpm-tmp.4331 (%build)

Any chance of getting an i386 RPM?
i grabbed the srpm that rik provided and rebuilt on my system and it has fixed the problem. so hopefully this will get pushed into a real release sometime soon.
Release note updated. If any revisions are required, please set the "requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -1,3 +1 @@
-In order for the time of a paravirtualized Xen guest to stay constant during a live migration, the dom0s should have their clocks in sync. Using NTP to sync the clocks of the hosts is recommended.
+In live migrations of paravirtualized guests, time-dependent guest processes may function improperly if the corresponding hosts' (dom0) times are not synchronized. Use NTP to synchronize system times for all corresponding hosts before migration.
-
-(please wordsmith this into something readable :))
Built into xen-3.0.3-71.el5
Re Comment #54. You need to get the hotfix flag set to ? I expect. I cannot do it...
Release note updated. If any revisions are required, please set the "requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -1 +1,3 @@
+The following release note is no longer required.
+
 In live migrations of paravirtualized guests, time-dependent guest processes may function improperly if the corresponding hosts' (dom0) times are not synchronized. Use NTP to synchronize system times for all corresponding hosts before migration.
Release note updated. If any revisions are required, please set the "requires_release_notes" flag to "?" and edit the "Release Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -1,3 +1 @@
-The following release note is no longer required.
-
 In live migrations of paravirtualized guests, time-dependent guest processes may function improperly if the corresponding hosts' (dom0) times are not synchronized. Use NTP to synchronize system times for all corresponding hosts before migration.
I can confirm that xen-3.0.3-69.0 recompiled for 32 bits fixes this issue, and this is the first time since RHEL 5.0 (one year ago) that I am able to live migrate all my VMs several times between two hosts without any glitch! It is a pity having to wait for the 5.3 release for such a blocking bug! Good job correcting it. Regards,
I can also confirm that this fixed my live migrations as well. Please don't make us wait for 5.3 to get this issue fixed.
Created attachment 319209 [details] Don't clobber wallclock on restore The final patch applied to the RPM for RHEL-5.3
now that there is a new version of xen (3.0.3-64.el5_2.3), for those of us who have installed the 3.0.3-69.0 version, what should we do? is the errata RHSA-2008:0892-10 an issue for 3.0.3-69?
Seems -69.0 already contains those security updates, so I assume you can continue using it. regards, Florian La Roche
Further, I'm not quite sure what running -69.0 is buying you; the patch to fix this issue wasn't committed until -71. That being said, it is all beta stuff, and not supported at all yet, so I would usually recommend to stay with the supported stuff. There is an updated xen package coming out soon (if it's not already out) with this fix in it. Chris Lalancette
(In reply to comment #71)
> now that there is a new version of xen (3.0.3-64.el5_2.3), for those of us who
> have installed the 3.0.3-69.0 version, what should we do? is the errata
> RHSA-2008:0892-10 an issue for 3.0.3-69?

According to bug 464455 the fix will be in 3.0.3-64.el5_2.4.

(In reply to comment #73)
> Further, I'm not quite sure what running -69.0 is buying you; the patch to fix
> this issue wasn't committed until -71.

I don't know which issue you are talking about, but with the existing -64, live migrate will freeze the domU timer about 80% of the time. With -69 I've yet to have the clock freeze. So obviously there is something between -64 and -69 that fixes this issue.
Huh, very odd. Like I said, the patch we specifically proposed and integrated for this BZ was committed in -71 (I know, I did it!). If something else between -64 and -69 helped the situation, then great, but we also need this patch (and the fix that's in this BZ is also the one that is going to be committed to 3.0.3-64.el5_2.4). Chris Lalancette
Chris, I believe the -69 they are referring to is the test RPM I put on my people.redhat.com page, not the -69 directly from our CVS tree. Joe, Shad, xen 3.0.3-64.el5_2.4 will have the migrate fix that you need.
yes, i am talking about the -69 rpm that rik provided. when is the 3.0.3-64.el5_2.4 version going to be released? and i've always wondered what the best way is to go backwards on an rpm. is rpm -Uvh --force the best way?
Joe, you'll want to use rpm -Uvh --oldpackage
I'm also refering to the one in Rik's page.
so rik, do you have an idea when the 2.4 rpm will be released?
*** Bug 467253 has been marked as a duplicate of this bug. ***
This is retested with -80. Verifying the bug.
Hey, many people are waiting for this fix, so please release it for 5.2! Nobody understands why this small but urgent fix needs months of QA.

Sincerely,
Klaus
Klaus, the fixed Xen package for RHEL 5.2 (xen-3.0.3-64.el5_2.4) was released on November 11th.
2.4 is not available via yum; 2.3 is the latest version I see:

# yum list xen
Loading "rhnplugin" plugin
rhel-i386-server-5        100% |=========================| 1.4 kB 00:00
rhel-i386-server-vt-5     100% |=========================| 1.4 kB 00:00
rhn-tools-rhel-i386-serve 100% |=========================| 1.2 kB 00:00
Available Packages
xen.i386    3.0.3-64.el5_2.3    rhel-i386-server
Rik, are you really sure the package has gone out? I can't find any errata regarding xen-3.0.3-64.el5_2.4, nor any source RPM on the Red Hat servers. Also, both CentOS and Scientific Linux don't have it, so I suspect it has not gone out.

Sincerely,
Klaus
You are right, it appears that xen-3.0.3-64.el5_2.4 is not on RHN. From what I understood it was supposed to have been released. I will try to figure out what happened.
Hi Rik,

did you find out what happened? Can we expect the release of xen-3.0.3-64.el5_2.4?

Sincerely,
Klaus
(In reply to comment #99)
> Hi Rik,
>
> did you find out what happened? Can we expect the release of
> xen-3.0.3-64.el5_2.4?

Klaus, it was just pushed out today (January 7th).

http://rhn.redhat.com/errata/RHSA-2009-0003.html

Thanks.
Klaus, the reason for the delay is that this fix was in an RPM update that also contained two security fixes. These required quite extensive QA before we could release them, to avoid the risk of regressions, which unfortunately delayed the release of this timer/migration bug fix. As Gurhan just mentioned, this should be on RHN today.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2009-0118.html
*** Bug 360741 has been marked as a duplicate of this bug. ***