Description of problem:
When a user tries to save and restore a Windows HVM guest on an AMD platform, the restore fails because the guest is not saved properly, and the following errors are printed in the `xm dmesg` output:

(XEN) save.c:174:d0 HVM restore: Xen changeset was not saved.
(XEN) lapic_load to rearm the actimer:bus cycle is 10ns, saved tmict count 116770000, period 1167700000ns, irq=253

This happens only on AMD hardware (tried on both Barcelona and Phenom B2) and only for Windows guests. For a Linux HVM guest I can see the following messages in `xm dmesg`, but the guest works fine:

(XEN) save.c:115:d0 HVM save: CPU
(XEN) save.c:115:d0 HVM save: PIC
(XEN) save.c:115:d0 HVM save: IOAPIC
(XEN) save.c:115:d0 HVM save: LAPIC
(XEN) save.c:115:d0 HVM save: LAPIC_REGS
(XEN) save.c:115:d0 HVM save: PCI_IRQ
(XEN) save.c:115:d0 HVM save: ISA_IRQ
(XEN) save.c:115:d0 HVM save: PCI_LINK
(XEN) save.c:115:d0 HVM save: PIT
(XEN) save.c:115:d0 HVM save: RTC
(XEN) save.c:115:d0 HVM save: HPET
(XEN) save.c:115:d0 HVM save: PMTIMER
(XEN) save.c:174:d0 HVM restore: Xen changeset was not saved.
(XEN) save.c:218:d0 HVM restore: CPU 0
(XEN) save.c:218:d0 HVM restore: PIC 0
(XEN) save.c:218:d0 HVM restore: PIC 1
(XEN) save.c:218:d0 HVM restore: IOAPIC 0
(XEN) save.c:218:d0 HVM restore: LAPIC 0
(XEN) save.c:218:d0 HVM restore: LAPIC_REGS 0
(XEN) lapic_load to rearm the actimer:bus cycle is 10ns, saved tmict count 6249, period 999840ns, irq=239
(XEN) save.c:218:d0 HVM restore: PCI_IRQ 0
(XEN) save.c:218:d0 HVM restore: ISA_IRQ 0
(XEN) save.c:218:d0 HVM restore: PCI_LINK 0
(XEN) save.c:218:d0 HVM restore: PIT 0
(XEN) save.c:218:d0 HVM restore: RTC 0
(XEN) save.c:218:d0 HVM restore: HPET 0
(XEN) save.c:218:d0 HVM restore: PMTIMER 0

Windows guests simply hang, both with and without PV drivers. On the Intel platform no such messages are printed in the `xm dmesg` output.
Version-Release number of selected component (if applicable):
kernel-2.6.18-194.8.1.el5xen
xen-3.0.3-115.el5virttest31.g7e4798b

How reproducible:
100%

Steps to Reproduce:
1. Create a Windows HVM guest on an AMD machine.
2. Try to migrate the guest locally (xm migrate $dom localhost).

Actual results:
The Windows guest hangs and cannot be accessed at all.

Expected results:
The Windows guest should keep working, as it does on an Intel machine.

Additional info:
We've been investigating this and found that both local and remote migration work fine on Intel, but neither local nor remote migration works on AMD. I can see no error in xend.log, but at the end of the `xm dmesg` output I found the following messages:

(XEN) traps.c:1877:d0 Domain attempted WRMSR 00000000c001001f from 00582000:00000008 to 00586000:00000008.
(XEN) save.c:174:d0 HVM restore: Xen changeset was not saved.
(XEN) lapic_load to rearm the actimer:bus cycle is 10ns, saved tmict count 116770000, period 1167700000ns, irq=253
(XEN) save.c:174:d0 HVM restore: Xen changeset was not saved.
(XEN) lapic_load to rearm the actimer:bus cycle is 10ns, saved tmict count 116770000, period 1167700000ns, irq=253
(XEN) save.c:174:d0 HVM restore: Xen changeset was not saved.
(XEN) lapic_load to rearm the actimer:bus cycle is 10ns, saved tmict count 2562350000, period 4148663520ns, irq=253
(XEN) save.c:174:d0 HVM restore: Xen changeset was not saved.
(XEN) lapic_load to rearm the actimer:bus cycle is 10ns, saved tmict count 2562350000, period 4148663520ns, irq=253
(XEN) save.c:174:d0 HVM restore: Xen changeset was not saved.
(XEN) lapic_load to rearm the actimer:bus cycle is 10ns, saved tmict count 3403050000, period 3965728928ns, irq=253

So I guess this is hypervisor related, since according to the testing it always works on Intel but never on AMD.
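To make it easier to compare the `xm dmesg` output between the AMD and Intel hosts, the relevant lines can be pulled out with a small script. This is only a rough sketch: the function names `parse_lapic_rearm` and `missing_changeset_count` are made up for illustration, and the regular expression assumes the message format is exactly as quoted in this report.

```python
import re

# Matches the lapic_load message exactly as it appears in the logs quoted
# in this report; other Xen versions may word it differently (assumption).
LAPIC_RE = re.compile(
    r"lapic_load to rearm the actimer:bus cycle is (\d+)ns, "
    r"saved tmict count (\d+), period (\d+)ns, irq=(\d+)"
)

CHANGESET_MSG = "HVM restore: Xen changeset was not saved."


def parse_lapic_rearm(dmesg_text):
    """Return one dict per lapic_load message found in dmesg_text."""
    return [
        {
            "bus_cycle_ns": int(bus),
            "tmict": int(tmict),
            "period_ns": int(period),
            "irq": int(irq),
        }
        for bus, tmict, period, irq in LAPIC_RE.findall(dmesg_text)
    ]


def missing_changeset_count(dmesg_text):
    """Count how many restores reported a missing Xen changeset."""
    return dmesg_text.count(CHANGESET_MSG)
```

The text to parse would be whatever `xm dmesg` prints on the host, for example saved to a file on each machine and read back in. The script only extracts the values; it does not say anything about why the guest hangs.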
That's strange. I tested this once again, this time only save/restore, and the guest was working. So I'm afraid it's not always reproducible with save/restore, but according to my testing it's 100% reproducible with local migration. Here's the `xm dmesg` output from a run where save and restore worked fine on an AMD Opteron (so the missing changeset in the headers is not fatal, and neither is the "(XEN) lapic_load to rearm the actimer:bus cycle is 10ns, saved tmict count 3962750000, period 972794336ns, irq=253" message):

(XEN) save.c:115:d0 HVM save: CPU
(XEN) save.c:115:d0 HVM save: PIC
(XEN) save.c:115:d0 HVM save: IOAPIC
(XEN) save.c:115:d0 HVM save: LAPIC
(XEN) save.c:115:d0 HVM save: LAPIC_REGS
(XEN) save.c:115:d0 HVM save: PCI_IRQ
(XEN) save.c:115:d0 HVM save: ISA_IRQ
(XEN) save.c:115:d0 HVM save: PCI_LINK
(XEN) save.c:115:d0 HVM save: PIT
(XEN) save.c:115:d0 HVM save: RTC
(XEN) save.c:115:d0 HVM save: HPET
(XEN) save.c:115:d0 HVM save: PMTIMER
(XEN) sysctl.c:51: Allowing physinfo call with newer ABI version
(XEN) sysctl.c:51: Allowing physinfo call with newer ABI version
(XEN) sysctl.c:51: Allowing physinfo call with newer ABI version
(XEN) sysctl.c:51: Allowing physinfo call with newer ABI version
(XEN) sysctl.c:51: Allowing physinfo call with newer ABI version
(XEN) sysctl.c:51: Allowing physinfo call with newer ABI version
(XEN) save.c:174:d0 HVM restore: Xen changeset was not saved.
(XEN) save.c:218:d0 HVM restore: CPU 0
(XEN) save.c:218:d0 HVM restore: PIC 0
(XEN) save.c:218:d0 HVM restore: PIC 1
(XEN) save.c:218:d0 HVM restore: IOAPIC 0
(XEN) save.c:218:d0 HVM restore: LAPIC 0
(XEN) save.c:218:d0 HVM restore: LAPIC_REGS 0
(XEN) lapic_load to rearm the actimer:bus cycle is 10ns, saved tmict count 3962750000, period 972794336ns, irq=253
(XEN) save.c:218:d0 HVM restore: PCI_IRQ 0
(XEN) save.c:218:d0 HVM restore: ISA_IRQ 0
(XEN) save.c:218:d0 HVM restore: PCI_LINK 0
(XEN) save.c:218:d0 HVM restore: PIT 0
(XEN) save.c:218:d0 HVM restore: RTC 0
(XEN) save.c:218:d0 HVM restore: HPET 0
(XEN) save.c:218:d0 HVM restore: PMTIMER 0

Hope it can be helpful,
Michal
So, save/restore and remote migration both work on AMD?
(In reply to comment #2)
> So, save/restore and remote migration both work on AMD?

Well, I saw remote migration fail, while save/restore sometimes worked and sometimes did not in my testing. Maybe it's machine dependent, i.e. bad hardware or something.

Michal
Michal, can we close this?
(In reply to comment #4)
> Michal, can we close this?

Did you retest save/restore on AMD again? It's been a while since I reported this and I don't recall whether I have seen it working since then or not.

Michal
Ok, I retested it now and I was unable to reproduce it. It's OK to close it. Michal
see comment #6