Bug 623729 - Windows HVM guest save/restore doesn't work on AMD platform
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel-xen
Version: 5.7
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Xen Maintainance List
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks: 514489
Reported: 2010-08-12 16:11 UTC by Michal Novotny
Modified: 2014-02-02 22:38 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-11-26 13:51:41 UTC
Target Upstream Version:
Embargoed:



Description Michal Novotny 2010-08-12 16:11:38 UTC
Description of problem:
When a user tries to save and restore a Windows HVM guest on an AMD platform, the restore fails because the guest is not saved properly, and the following errors are printed in the `xm dmesg` output:

(XEN) save.c:174:d0 HVM restore: Xen changeset was not saved.
(XEN) lapic_load to rearm the actimer:bus cycle is 10ns, saved tmict count 116770000, period 1167700000ns, irq=253

This happens only on AMD hardware (tried on both Barcelona and Phenom B2) and only for Windows guests. For a Linux HVM guest I can see the following messages in `xm dmesg`, but the guest works fine:

(XEN) save.c:115:d0 HVM save: CPU
(XEN) save.c:115:d0 HVM save: PIC
(XEN) save.c:115:d0 HVM save: IOAPIC
(XEN) save.c:115:d0 HVM save: LAPIC
(XEN) save.c:115:d0 HVM save: LAPIC_REGS
(XEN) save.c:115:d0 HVM save: PCI_IRQ
(XEN) save.c:115:d0 HVM save: ISA_IRQ
(XEN) save.c:115:d0 HVM save: PCI_LINK
(XEN) save.c:115:d0 HVM save: PIT
(XEN) save.c:115:d0 HVM save: RTC
(XEN) save.c:115:d0 HVM save: HPET
(XEN) save.c:115:d0 HVM save: PMTIMER
(XEN) save.c:174:d0 HVM restore: Xen changeset was not saved.
(XEN) save.c:218:d0 HVM restore: CPU 0
(XEN) save.c:218:d0 HVM restore: PIC 0
(XEN) save.c:218:d0 HVM restore: PIC 1
(XEN) save.c:218:d0 HVM restore: IOAPIC 0
(XEN) save.c:218:d0 HVM restore: LAPIC 0
(XEN) save.c:218:d0 HVM restore: LAPIC_REGS 0
(XEN) lapic_load to rearm the actimer:bus cycle is 10ns, saved tmict count 6249, period 999840ns, irq=239
(XEN) save.c:218:d0 HVM restore: PCI_IRQ 0
(XEN) save.c:218:d0 HVM restore: ISA_IRQ 0
(XEN) save.c:218:d0 HVM restore: PCI_LINK 0
(XEN) save.c:218:d0 HVM restore: PIT 0
(XEN) save.c:218:d0 HVM restore: RTC 0
(XEN) save.c:218:d0 HVM restore: HPET 0
(XEN) save.c:218:d0 HVM restore: PMTIMER 0

On Windows guests the system simply hangs, both with and without PV drivers. On the Intel platform no such messages are printed in the `xm dmesg` output.
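The Linux guest log above pairs every "HVM save" record with a matching "HVM restore" record. A minimal sketch for checking that pairing in a captured `xm dmesg` dump follows. The function names are hypothetical, and the patterns assume the exact line layout quoted in this report, with one message per line.

```python
import re

# Patterns follow the "(XEN) save.c:NNN:d0 HVM save: NAME" and
# "(XEN) save.c:NNN:d0 HVM restore: NAME INSTANCE" lines quoted in this report.
SAVE_RE = re.compile(r"HVM save: (\w+)\s*$")
RESTORE_RE = re.compile(r"HVM restore: (\w+) (\d+)\s*$")


def summarize_hvm_records(dmesg_text):
    """Return (saved, restored) sets of HVM record names found in the log."""
    saved, restored = set(), set()
    for line in dmesg_text.splitlines():
        match = SAVE_RE.search(line)
        if match:
            saved.add(match.group(1))
            continue
        match = RESTORE_RE.search(line)
        if match:
            restored.add(match.group(1))
    return saved, restored


def missing_on_restore(dmesg_text):
    """Record types that were logged on save but never logged on restore."""
    saved, restored = summarize_hvm_records(dmesg_text)
    return sorted(saved - restored)
```

An empty result only says that every saved record type was also logged on restore. It does not show that the restored state was correct, and the "Xen changeset was not saved" line is deliberately ignored by both patterns.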

Version-Release number of selected component (if applicable):
kernel-2.6.18-194.8.1.el5xen
xen-3.0.3-115.el5virttest31.g7e4798b

How reproducible:
100%

Steps to Reproduce:
1. Create a Windows HVM guest on an AMD machine
2. Try to migrate the guest locally (xm migrate $dom localhost)
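The report covers two scenarios, save/restore and local migration, so it can help to script both and capture `xm dmesg` after each run. The following is a rough sketch only: the domain name, the save file path, and the helper names are assumptions, and it presumes a Python 3.7+ interpreter is available in dom0.

```python
import subprocess


def build_commands(domain, save_path="/var/tmp/guest.save"):
    """Return the xm command lines for each scenario as argv lists.

    save_path is a placeholder location, not one taken from the report.
    """
    return {
        "save_restore": [
            ["xm", "save", domain, save_path],
            ["xm", "restore", save_path],
        ],
        "local_migrate": [
            ["xm", "migrate", domain, "localhost"],
        ],
        "collect_log": [
            ["xm", "dmesg"],
        ],
    }


def run_scenario(domain, scenario, runner=subprocess.run):
    """Run one scenario, then return the hypervisor log as text."""
    commands = build_commands(domain)
    for argv in commands[scenario]:
        runner(argv, check=True)
    result = runner(commands["collect_log"][0], check=True,
                    capture_output=True, text=True)
    return result.stdout
```

Running `run_scenario("win-guest", "local_migrate")` and `run_scenario("win-guest", "save_restore")` separately keeps the two logs apart, which matters here because the two scenarios did not fail equally often.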

Actual results:
Windows guest hangs and you can't access the guest at all.

Expected results:
The Windows guest should keep working, as it does on an Intel machine.

Additional info:

We've been investigating this and found that both local and remote migration work fine on Intel, but neither local nor remote migration works on AMD.

I can see no error in xend.log, but at the end of the `xm dmesg` output I found the following messages:

(XEN) traps.c:1877:d0 Domain attempted WRMSR 00000000c001001f from 00582000:00000008 to 00586000:00000008.
(XEN) save.c:174:d0 HVM restore: Xen changeset was not saved.
(XEN) lapic_load to rearm the actimer:bus cycle is 10ns, saved tmict count 116770000, period 1167700000ns, irq=253
(XEN) save.c:174:d0 HVM restore: Xen changeset was not saved.
(XEN) lapic_load to rearm the actimer:bus cycle is 10ns, saved tmict count 116770000, period 1167700000ns, irq=253
(XEN) save.c:174:d0 HVM restore: Xen changeset was not saved.
(XEN) lapic_load to rearm the actimer:bus cycle is 10ns, saved tmict count 2562350000, period 4148663520ns, irq=253
(XEN) save.c:174:d0 HVM restore: Xen changeset was not saved.
(XEN) lapic_load to rearm the actimer:bus cycle is 10ns, saved tmict count 2562350000, period 4148663520ns, irq=253
(XEN) save.c:174:d0 HVM restore: Xen changeset was not saved.
(XEN) lapic_load to rearm the actimer:bus cycle is 10ns, saved tmict count 3403050000, period 3965728928ns, irq=253

So I guess this is hypervisor related, since according to the testing it always works on Intel but never on AMD.
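The lapic_load lines above print a tmict count, a 10ns bus cycle, and a period, so the three numbers can be cross-checked. The sketch below does only that arithmetic. The divide value is not printed in the log, so the divisor of 1 used for the Windows samples is an assumption inferred from the numbers, and the 32-bit wrap-around is likewise an inference from the printed values rather than something confirmed in this report. Whether it has anything to do with the hang is not established here.

```python
BUS_CYCLE_NS = 10  # "bus cycle is 10ns" in the quoted log lines


def full_period_ns(tmict, divisor=1, bus_cycle_ns=BUS_CYCLE_NS):
    """Timer period without any truncation: count * bus cycle * divisor."""
    return tmict * bus_cycle_ns * divisor


def wrapped_u32(value):
    """Value as it would appear if stored in an unsigned 32-bit field."""
    return value & 0xFFFFFFFF


# (tmict, assumed divisor, period printed in the log) from the lines above.
WINDOWS_SAMPLES = [
    (116770000, 1, 1167700000),
    (2562350000, 1, 4148663520),
    (3403050000, 1, 3965728928),
]


def check_samples(samples):
    """For each sample, report whether the printed period matches the full
    product, and whether it matches the product truncated to 32 bits."""
    report = []
    for tmict, divisor, printed in samples:
        full = full_period_ns(tmict, divisor)
        report.append((tmict, full == printed, wrapped_u32(full) == printed))
    return report
```

With these inputs the first sample matches the untruncated product, while the other two match only after truncation to 32 bits. That suggests the printed period for large counts is a wrapped value, which may be worth keeping in mind when reading these log lines.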

Comment 1 Michal Novotny 2010-08-13 16:01:39 UTC
That's strange. I tested this once more with plain save/restore and the guest kept working, so I'm afraid it's not always reproducible with save/restore. According to my testing it is 100% reproducible with local migration, though.

Here's the `xm dmesg` output from a case where save and restore worked fine on an AMD Opteron (so neither the missing changeset in the headers nor the "(XEN) lapic_load to rearm the actimer:bus cycle is 10ns, saved tmict count 3962750000, period 972794336ns, irq=253" message is fatal):

(XEN) save.c:115:d0 HVM save: CPU
(XEN) save.c:115:d0 HVM save: PIC
(XEN) save.c:115:d0 HVM save: IOAPIC
(XEN) save.c:115:d0 HVM save: LAPIC
(XEN) save.c:115:d0 HVM save: LAPIC_REGS
(XEN) save.c:115:d0 HVM save: PCI_IRQ
(XEN) save.c:115:d0 HVM save: ISA_IRQ
(XEN) save.c:115:d0 HVM save: PCI_LINK
(XEN) save.c:115:d0 HVM save: PIT
(XEN) save.c:115:d0 HVM save: RTC
(XEN) save.c:115:d0 HVM save: HPET
(XEN) save.c:115:d0 HVM save: PMTIMER
(XEN) sysctl.c:51: Allowing physinfo call with newer ABI version
(XEN) sysctl.c:51: Allowing physinfo call with newer ABI version
(XEN) sysctl.c:51: Allowing physinfo call with newer ABI version
(XEN) sysctl.c:51: Allowing physinfo call with newer ABI version
(XEN) sysctl.c:51: Allowing physinfo call with newer ABI version
(XEN) sysctl.c:51: Allowing physinfo call with newer ABI version
(XEN) save.c:174:d0 HVM restore: Xen changeset was not saved.
(XEN) save.c:218:d0 HVM restore: CPU 0
(XEN) save.c:218:d0 HVM restore: PIC 0
(XEN) save.c:218:d0 HVM restore: PIC 1
(XEN) save.c:218:d0 HVM restore: IOAPIC 0
(XEN) save.c:218:d0 HVM restore: LAPIC 0
(XEN) save.c:218:d0 HVM restore: LAPIC_REGS 0
(XEN) lapic_load to rearm the actimer:bus cycle is 10ns, saved tmict count 3962750000, period 972794336ns, irq=253
(XEN) save.c:218:d0 HVM restore: PCI_IRQ 0
(XEN) save.c:218:d0 HVM restore: ISA_IRQ 0
(XEN) save.c:218:d0 HVM restore: PCI_LINK 0
(XEN) save.c:218:d0 HVM restore: PIT 0
(XEN) save.c:218:d0 HVM restore: RTC 0
(XEN) save.c:218:d0 HVM restore: HPET 0
(XEN) save.c:218:d0 HVM restore: PMTIMER 0
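Since the messages above also appear in a working run, they cannot explain the hang on their own. One way to narrow things down is to list the message kinds that occur only in a failing log. The sketch below is a rough illustration under stated assumptions: the helper names are hypothetical, each message is expected on a single line, and digit runs are collapsed so that lines differing only in numbers compare equal.

```python
import re


def normalize(line):
    """Drop the (XEN) prefix and collapse digit runs to 'N'."""
    line = re.sub(r"^\(XEN\)\s*", "", line.strip())
    return re.sub(r"\d+", "N", line)


def message_kinds(log_text):
    """Set of normalized, non-empty lines in a log."""
    return {normalize(line) for line in log_text.splitlines() if line.strip()}


def only_in_failing(failing_log, working_log):
    """Message kinds seen in the failing log but not in the working one."""
    return sorted(message_kinds(failing_log) - message_kinds(working_log))
```

A message that shows up only in the failing log is a candidate for closer inspection, not a diagnosis. The excerpts quoted in this report may also be partial, so an absent line does not prove the message was never logged.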

Hope it can be helpful,
Michal

Comment 2 Paolo Bonzini 2010-08-20 07:50:29 UTC
So, save/restore and remote migration both work on AMD?

Comment 3 Michal Novotny 2010-08-20 11:10:33 UTC
(In reply to comment #2)
> So, save/restore and remote migration both work on AMD?

Well, I saw remote migration fail, while save/restore sometimes worked and sometimes didn't in my testing. Maybe it's machine dependent, i.e. bad hardware or something.

Michal

Comment 4 Paolo Bonzini 2010-11-23 15:50:45 UTC
Michal, can we close this?

Comment 5 Michal Novotny 2010-11-23 16:22:39 UTC
(In reply to comment #4)
> Michal, can we close this?

Did you retest save/restore on AMD? It's been a while since I reported this, and I don't recall whether I've seen it working since then.

Michal

Comment 6 Michal Novotny 2010-11-23 16:37:01 UTC
Ok, I retested it now and I was unable to reproduce it. It's OK to close it.

Michal

Comment 7 Paolo Bonzini 2010-11-26 13:51:41 UTC
see comment #6

