Bug 1027565 - fail to reboot guest after migration from RHEL6.5 host to RHEL7.0 host
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: qemu-kvm
Version: 7.0
Hardware: x86_64 Linux
Priority: high Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Laszlo Ersek
QA Contact: Virtualization Bugs
Keywords: ZStream
Duplicates: 1074963
Depends On:
Blocks: RHEL7.0Virt-PostBeta(0-Day) 1091322 1103579
Reported: 2013-11-07 01:19 EST by FuXiangChun
Modified: 2015-03-05 03:01 EST

See Also:
Fixed In Version: qemu-kvm-1.5.3-61.el7
Doc Type: Bug Fix
Doc Text:
Prior to this update, a bug in the migration code caused the following error on specific machine types: after a Red Hat Enterprise Linux 6.5 guest was migrated from a Red Hat Enterprise Linux 6.5 host to a Red Hat Enterprise Linux 7.0 host and then restarted, the boot failed and the guest automatically restarted. Thus, the guest entered an endless loop. With this update, the migration code has been fixed and the Red Hat Enterprise Linux 6.5 guests migrated in the aforementioned scenario now boot properly.
Story Points: ---
Clone Of:
: 1103579 (view as bug list)
Environment:
Last Closed: 2015-03-05 03:01:59 EST
Type: Bug
Attachments
- screenshot of reboot failed (31.51 KB, image/png), 2014-04-11 00:28 EDT, huiqingding
- the screenshot of system_reset failed. (21.64 KB, image/png), 2014-04-11 00:43 EDT, huiqingding
- the screenshot of shutdown/system_powerdown failed (15.40 KB, image/png), 2014-04-11 01:10 EDT, huiqingding
- proof of concept patch for the analysis in comment 16 (1.60 KB, patch), 2014-04-12 01:09 EDT, Laszlo Ersek
- info mtree diff between migrated-from-rhel6 and cold-booted-on-rhel7 (8.27 KB, patch), 2014-04-12 01:14 EDT, Laszlo Ersek
- dump ramblocks in stage 1 of the RHEL-6 outgoing migration (debug patch) (2.36 KB, patch), 2014-04-12 19:03 EDT, Laszlo Ersek
- proposed patch (downstream only) (4.01 KB, patch), 2014-04-13 03:48 EDT, Laszlo Ersek
- compare reboot and system_reset (4.25 KB, patch), 2014-04-15 10:09 EDT, FuXiangChun
- proposed patch (downstream only), v2; approach #2 (4.83 KB, patch), 2014-04-15 14:15 EDT, Laszlo Ersek
- guest call trace from console (9.84 KB, text/plain), 2014-04-16 07:02 EDT, FuXiangChun
- qemu-kvm command line, please search usb device in cli (4.56 KB, text/plain), 2014-04-16 07:04 EDT, FuXiangChun
- 4 patches for approach #1 (illustration only), mbox format (17.54 KB, text/plain), 2014-04-16 13:45 EDT, Laszlo Ersek
- call trace log after migration and system_reset (9.35 KB, text/plain), 2014-04-21 02:16 EDT, huiqingding
- proposed patch for the *separate* USB problem (downstream only) (5.15 KB, patch), 2014-04-21 17:58 EDT, Laszlo Ersek

Comment 2 FuXiangChun 2013-11-10 21:02:04 EST
Re-tested this issue with a win7-32bit guest and hit the same issue.
Comment 3 Orit Wasserman 2013-11-26 10:27:33 EST
Can you please provide the output of /proc/cpuinfo on both hosts?
Comment 6 FuXiangChun 2014-02-26 21:10:37 EST
Re-tested this issue with NFS; it is still reproducible.
Steps:
1. Set up an NFS server on the RHEL7 host.
# cat /etc/exports
/home *(rw,no_root_squash,async)
# systemctl restart nfs-server.service
# mount 10.66.11.149:/home /mnt

2. On the RHEL6.5 host, mount the NFS share and then boot the RHEL6.5 guest.
# mount 10.66.11.149:/home /mnt
# /usr/libexec/qemu-kvm -M rhel6.5.0 -cpu Opteron_G2 -enable-kvm -m 2G  -smp 4 -name rhel6.5 -uuid 6afa5f93-2d4f-420f-81c6-e5fdddbd1c83 -drive file=/mnt/RHEL-Server-6.5-64.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,serial=40c061dd-5d60-4fc5-865f-55db700407f0,cache=none,werror=stop,rerror=stop,aio=threads -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0 -net none -vnc :1  -monitor stdio -serial unix:/tmp/monitor2,server,nowait

3. Boot the RHEL6.5 guest in listening (-incoming) mode on the RHEL7 host.
# /usr/libexec/qemu-kvm -M rhel6.5.0 -cpu Opteron_G2 -enable-kvm -m 2G  -smp 4 -name rhel6.5 -uuid 6afa5f93-2d4f-420f-81c6-e5fdddbd1c83 -drive file=/mnt/RHEL-Server-6.5-64.qcow2,if=none,id=drive-virtio-disk0,format=qcow2,serial=40c061dd-5d60-4fc5-865f-55db700407f0,cache=none,werror=stop,rerror=stop,aio=threads -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0 -net none -vnc :1  -monitor stdio -serial unix:/tmp/monitor2,server,nowait -incoming tcp:0:5555

4. Do the migration.
(qemu) migrate -d tcp:10.66.11.149:5555

5. Reboot the RHEL6.5 guest after migration.
# reboot    (run inside the guest)

Result:
The guest hangs. This is the last part of the guest console output:

Please stand by while rebooting the system...
Restarting system.
machine restart


RHEL7 host:
# uname -r
3.10.0-86.el7.x86_64
# rpm -qa|grep qemu
qemu-kvm-rhev-1.5.3-47.el7.x86_64

RHEL6 host:
# uname -r
2.6.32-446.el6.x86_64
# rpm -qa|grep qemu
qemu-kvm-rhev-0.12.1.2-2.421.el6.x86_64
Comment 7 huiqingding 2014-04-11 00:28:55 EDT
Created attachment 885244 [details]
screenshot of reboot failed

I migrated a win8-64 guest from a RHEL6.5 host to a RHEL7.0 host. After migration, a reboot issued inside the guest fails; the attached screenshot shows the result.

Version-Release number of selected component (if applicable):
RHEL6.5 host:
kernel-2.6.32-456.el6.x86_64
qemu-kvm-0.12.1.2-2.423.el6.x86_64

RHEL7.0 host:
kernel-3.10.0-121.el7.x86_64
qemu-kvm-1.5.3-60.el7.x86_64

The command line of src RHEL6.5 host:
# /usr/libexec/qemu-kvm -M rhel6.5.0 -cpu Westmere -enable-kvm  -m 2048 -realtime mlock=off -smp 4,sockets=2,cores=2,threads=1,maxcpus=160 -drive file=/mnt/win8-64-ide.qcow2,if=none,media=disk,id=drive-ide1-1-0,format=qcow2,werror=stop,rerror=stop,cache=none,boot=on -device ide-drive,drive=drive-ide1-1-0,id=ide-disk0  -vnc :10  -monitor stdio -nodefconfig -net none

The command line of dest RHEL7.0 host:
# /usr/libexec/qemu-kvm -M rhel6.5.0 -cpu Westmere -enable-kvm -m 2048 -realtime mlock=off -smp 4,sockets=2,cores=2,threads=1,maxcpus=160 -drive file=/mnt/win8-64-ide.qcow2,if=none,media=disk,id=drive-ide1-1-0,format=qcow2,werror=stop,rerror=stop,cache=none,boot=on -device ide-drive,drive=drive-ide1-1-0,id=ide-disk0 -vnc :10 -monitor stdio -nodefconfig -net none -incoming tcp:0:5800
Comment 8 huiqingding 2014-04-11 00:43:10 EDT
Created attachment 885245 [details]
the screenshot of system_reset failed.

Using the command line from comment 7, I migrated a win8-64 guest from the RHEL6.5 host to the RHEL7.0 host and then ran "system_reset" on the destination qemu-kvm. The guest hangs; the attached screenshot shows the result. On the destination qemu-kvm, "info status" reports:
(qemu) info status
VM status: running
Comment 10 huiqingding 2014-04-11 01:10:30 EDT
Created attachment 885252 [details]
the screenshot of shutdown/system_powerdown failed
Comment 11 Laszlo Ersek 2014-04-11 12:41:18 EDT
I think this could be similar to bug 1049860 -- the RHEL-7 destination qemu-kvm process may not bring up the piix4-pm memory region because some PM register state could be lost during migration.
Comment 12 Laszlo Ersek 2014-04-11 12:43:45 EDT
(In reply to huiqingding from comment #8)
> Created attachment 885245 [details]
> the screenshot of system_reset failed.
> 
> Use the command line of comment 7 to migrate a win8-64 guest from RHEL6.5
> host to RHEL7.0 host, do "system_reset" on dest qemu-kvm, the guest is hang
> and the screenshot is as the attachment file. On dest qemu-kvm, run "info
> status":
> (qemu) info status
> VM status: running

Can you please issue, at the qemu monitor, the "info mtree" command:
(1) on the source host, before migration, when the guest is up and running,
(2) on the target host, after migration, while the guest is running (apparently OK),
(3) on the target host, after migration, when you're shutting down the guest, and it hangs?

Comparing these three "info mtree" outputs could help. Thanks.
Comment 13 Laszlo Ersek 2014-04-11 18:30:50 EDT
- RHEL-6 qemu-kvm doesn't support "info mtree".
- the "piix4-pm" memory region is created on the RHEL7 dst host.
- I could reproduce the hang on the target host, with a RHEL-6.5 guest (see comment 6)

Versions:
- source:
  kernel-2.6.32-431.11.2.el6.x86_64
  qemu-kvm-0.12.1.2-2.415.el6_5.7.x86_64
- destination:
  kernel-3.10.0-121.el7.x86_64
  qemu-kvm-1.5.3-60.el7.x86_64

command line (without -incoming option):

/usr/libexec/qemu-kvm \
  -S \
  -M rhel6.5.0 \
  -enable-kvm  \
  -m 768 \
  -smp 4 \
  -drive if=none,cache=none,media=disk,id=disk1,format=qcow2,werror=stop,rerror=stop,file=/mnt/tmp/guest.qcow2 \
  -device virtio-blk-pci,drive=disk1,bootindex=1 \
  -monitor stdio \
  -nodefconfig \
  -net none \
  -vnc :0
Comment 14 Laszlo Ersek 2014-04-11 18:55:24 EDT
I can shut down the RHEL-6.5 guest after migration just fine (with "shutdown -h now" in the guest). However, I can't reboot it (see again comment 6).

The hang is actually not a hang, it's a busy loop. I looked at some VCPU IP values, and I also dumped a guest vmcore. It looks like the guest is spinning in SeaBIOS.
Comment 15 Laszlo Ersek 2014-04-11 19:20:32 EDT
Alright, so the problem is that the low-mapped BIOS is either not migrated in the wire format, or that on the destination side, it is not recreated on incoming migration. (Or maybe the PAM visibility is programmed differently.)

I dumped the memory region [1MB-128KB, 1MB) on the source, and on the target host as well:

(qemu) dump-guest-memory 0xe0000 0x20000

and then ran "strings" and "hexdump -C" on the resultant partial ELF vmcores. The dump before the migration contains a bunch of seabios strings (ACPI table names, seabios version strings etc), while on the target host, it's a big bucket of zeroes.
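
The same check ("strings" on each dump, then look for SeaBIOS text or a run of zeroes) can be sketched in Python. This is only an illustration: the helper names are invented for this example, the two buffers are synthetic stand-ins rather than real vmcores, and the offset 0x1000 is arbitrary.

```python
import re

def find_marker_strings(data, marker=b"seabios-", min_len=4):
    """Return (offset, text) for each printable-ASCII run that contains marker.

    Roughly what `strings -tx FILE | grep seabios-` reports: the offset is
    where the printable run starts within the buffer.
    """
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [(m.start(), m.group().decode("ascii"))
            for m in pattern.finditer(data) if marker in m.group()]

def is_all_zero(data):
    """True if the buffer holds nothing but zero bytes."""
    return not any(data)

# Synthetic stand-ins for the two 128 KB dumps (not real guest memory).
WINDOW = 0x20000
version = b"seabios-0.6.1.2-28.el6"
source_dump = bytearray(WINDOW)
source_dump[0x1000:0x1000 + len(version)] = version
target_dump = bytes(WINDOW)

print(find_marker_strings(bytes(source_dump)))  # SeaBIOS string present
print(is_all_zero(target_dump))                 # nothing but zeroes
```

With real dumps, the data would come from reading the vmcore files in binary mode, and the offsets would be file offsets, not guest-physical addresses.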
Comment 16 Laszlo Ersek 2014-04-12 01:08:48 EDT
(1) vmcore of RHEL-6.5 guest running on RHEL-6.5 source host
------------------------------------------------------------
PhysAddr            MemSiz        PhysAddr            MemSiz
0x0000000000000000  0x000a0000       0                640 KB
0x00000000000c0000  0x00020000     768 KB             128 KB
0x00000000000e0000  0x00020000     896 KB             128 KB <-- pc.bios
0x0000000000100000  0x1ff00000       1 MB             511 MB
0x00000000f0000000  0x01000000    3840 MB              16 MB <-- vga.vram

crash> search -c -p "seabios-"
f2a64: seabios-0.6.1.2-28.el6...R..|R...R...R...R...5..q9...?..
^^^^^

(2) vmcore of RHEL-6.5 guest after migrated to RHEL-7 destination host
----------------------------------------------------------------------
PhysAddr            MemSiz        PhysAddr            MemSiz
0x0000000000000000  0x000a0000       0                640    KB
0x00000000000c0000  0x1ff40000     768     KB         511.25 MB
0x00000000f0000000  0x01000000    3840     MB          16    MB <-- vga.vram
0x00000000fffe0000  0x00020000    4095.875 MB         128    KB <-- pc.bios

crash> search -c -p "seabios-"
[no matches]

crash> rd -p 0x00000000fffe0000  0x00020000 # pc.bios
...
        ffff2a60:  626165730000006c 2e362e302d736f69   l...seabios-0.6.
           ^^^^^
        ffff2a70:  652e38322d322e31 000f52880000366c   1.2-28.el6...R..
...

(3) vmcore of RHEL-6.5 guest booted originally on RHEL-7 host
-------------------------------------------------------------
PhysAddr            MemSiz        PhysAddr            MemSiz
0x0000000000000000  0x000a0000       0                640    KB
0x00000000000c0000  0x1ff40000     768     KB         511.25 MB
0x00000000fc000000  0x01000000    4032 MB              16    MB <-- vga.vram
0x00000000fffe0000  0x00020000    4095.875 MB         128    KB <-- pc.bios

crash> search -c -p "seabios-"
c8c40: seabios-1.7.2.2-12.el7..................................
f5be4: seabios-1.7.2.2-12.el7...................Ý..............
^^^^^
fc980: seabios-1.7.2.2/src/virtio-ring.c.BUG: failure at %s:%d/

crash> rd -p 0x00000000fffe0000  0x00020000
...
        ffff5be0:  62616573000f5bd8 2e372e312d736f69   .[..seabios-1.7.
           ^^^^^
        ffff5bf0:  652e32312d322e32 000000009000376c   2.2-12.el7......
...


(-M rhel6.5.0 was always used)

Analysis:

The migration from the RHEL-6 host to the RHEL-7 host does transfer the
contents of the "pc.bios" RAMBlock (connected by the "pc.bios" ID string),
however its guest-physical mapping changes, from 896 KB to 4095.875 MB. This
should not be a problem per se:

When a guest is booted on the RHEL-7 host fresh, pc.bios ends up at 4095.875
MB too. The following alias makes it appear at 896 KB too:

aliases
pci
0000000000000000-7ffffffffffffffe (prio 0, RW): pci
  00000000000e0000-00000000000fffff (prio 1, R-): alias isa-bios @pc.bios 0000000000000000-000000000001ffff

According to "info mtree", the "isa-bios" alias is in place in both cases
(2) and (3). For some reason it doesn't have the desired effect in case (2);
the alias doesn't show the contents of the aliased memory range.

I think I have an extremely convoluted theory about what happens.

In both RHEL-6 and RHEL-7 qemu, the PAM registers start out as "pam-pci"
(accesses PCI address space on read).

RHEL-6 maps the BIOS in PCI address space directly at 896 KB (pc.bios).

RHEL-7 maps the BIOS in PCI address space just below 4GB (pc.bios):

(*)
  old_pc_system_rom_init() [hw/i386/pc_sysfw.c]
    /* map all the bios at the top of memory */
    memory_region_add_subregion(rom_memory, (uint32_t)(-bios_size), bios)

but it creates an alias, called "isa-bios", that makes it appear at 896 KB
too (still PCI address space).

The upshot is that when SeaBIOS starts running, it can find itself on both
hosts at 896 KB, in PCI address space. It then shadows itself in-place, to
the same guest-phys address (at 896 KB), but to RAM address space, flipping
the PAM values back and forth (see src/shadow.c). When the guest OS starts
running, the E segment is "pam-ram", and the F segment is "pam-rom" --
different write perms, but they both read as RAM.

Importantly, when the VM is reset, the PAM registers are *not* reset. (We
tried that earlier, and it breaks S3 resume in SeaBIOS.) So, after reset,
the in-RAM copy of SeaBIOS runs, on both hosts, from the F segment, without
having to shadow itself again. (I think the code in src/shadow.c recognizes
the "pam-ram"/"pam-rom" settings.)

Now, if we migrate from RHEL-6 to RHEL-6, *or* RHEL-7 to RHEL-7, then
whichever RAMBlock contains the result of the shadowing, ie. the SeaBIOS
code, that appears at 896 KB in RAM address space, will be migrated to the
same place on the target. The source ramblock is associated with the target
ramblock by name, and they also have the same guest-phys mapping. Resetting
the VM on either host doesn't change the PAMs, so the migration target host
will run the in-RAM copy of SeaBIOS that has been shadowed on the source
host, before migration.

However, when we migrate from RHEL-6 to RHEL-7, the ramblock in question
seems to have a different guest-phys mapping between the source and
destination hosts. Since the PAMs are not reset to "pam-pci" during VM
reset, the "isa-bios" trick doesn't work. We jump to empty memory.

As proof for the above theory, the attached RHEL-7 qemu-kvm patch "solves"
(actually, papers over) the issue for me. It resets the PAM registers to
"pam-pci" on VM reset, hence SeaBIOS will be visible through the isa-bios
alias, and SeaBIOS will shadow itself again.

Of course, as I said before, this patch is not good, because it breaks S3
resume in SeaBIOS.

The patch is from David Woodhouse BTW.

http://thread.gmane.org/gmane.comp.bios.coreboot.seabios/5561/focus=5624

I don't know how we can solve this problem. In one sentence, the root
cause is that the RAMBlock called "pc.bios" has completely different
guest-phys mappings between RHEL-6 and RHEL-7 hosts: under the former, it
is mapped at 896 KB, in RAM address space, while under the latter, it is
mapped at 4GB-128KB, in PCI address space. The "isa-bios" alias makes it
work under RHEL-7, but only as long as the relevant PAM register is in
"pam-pci" mode (which is the case on cold boot only, at which point
SeaBIOS shadows itself).
Comment 17 Laszlo Ersek 2014-04-12 01:09:48 EDT
Created attachment 885649 [details]
proof of concept patch for the analysis in comment 16
Comment 18 Laszlo Ersek 2014-04-12 01:14:10 EDT
Created attachment 885650 [details]
info mtree diff between migrated-from-rhel6 and cold-booted-on-rhel7
Comment 19 Laszlo Ersek 2014-04-12 01:40:21 EDT
Basically, we need another internal flag (global variable), to be set differently by -M pc-i440fx-rhel7.0.0 and -M rhel6.5.0, which should control the placement of the "pc.bios" memory range. It's an extra compat aspect that we now must retrofit. If 6.x machtypes are selected, then "pc.bios" must go precisely in the old place (and isa-bios should not be used).
Comment 20 Laszlo Ersek 2014-04-12 19:03:31 EDT
Created attachment 885795 [details]
dump ramblocks in stage 1 of the RHEL-6 outgoing migration (debug patch)

(Click "Unwrap comments" for this comment.)

This patch produces the following output, when starting the migration:

(qemu) migrate -d tcp:192.168.1.105:5555

ramblock.00.0000000000000000.0000000020000000.00000000,pc.ram
ramblock.01.0000000020040000.0000000001000000.00000000,vga.vram
ramblock.02.0000000021040000.0000000000010000.00000000,0000:00:02.0_cirrus_vga.rom
ramblock.03.0000000020000000.0000000000020000.00000000,pc.bios
ramblock.04.0000000020020000.0000000000020000.00000000,pc.rom

Reordered by block->offset:

ramblock.00.0000000000000000.0000000020000000.00000000,pc.ram
ramblock.03.0000000020000000.0000000000020000.00000000,pc.bios
ramblock.04.0000000020020000.0000000000020000.00000000,pc.rom
ramblock.01.0000000020040000.0000000001000000.00000000,vga.vram
ramblock.02.0000000021040000.0000000000010000.00000000,0000:00:02.0_cirrus_vga.rom

Only the "pc.bios" file contains the "seabios-" string:

$ strings -tx ramblock.03.* |grep seabios-
  12a64 seabios-0.6.1.2-28.el6

(See also comment 16 for the offset 0xe0000 + 0x12a64 == 0xf2a64.)
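
The offset arithmetic can be checked mechanically. The short Python sketch below uses only values quoted in this bug (the 896 KB window base, the 4GB-128KB high mapping, and the 0x12a64 offset reported by "strings"); the variable names are made up for the example.

```python
ISA_WINDOW_BASE = 0xE0000                # 896 KB, low mapping of pc.bios
HIGH_BIOS_BASE = 0x100000000 - 0x20000   # 4 GB - 128 KB, high mapping
STRING_OFFSET = 0x12A64                  # offset of "seabios-" inside pc.bios

low_addr = ISA_WINDOW_BASE + STRING_OFFSET
high_addr = HIGH_BIOS_BASE + STRING_OFFSET

print(hex(low_addr))    # address found by "search" in case (1) of comment 16
print(hex(high_addr))   # falls in the ffff2a60 line of "rd" in case (2)
```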

RHEL-6 lacks the memory region stuff; it uses
cpu_register_physical_memory(). I'll try to follow how it stacks up the
memory areas. The priority of memory regions is basically given by the order
of cpu_register_physical_memory() calls; if a later call overlaps the
guest-phys range established by an earlier call, the new assignment takes
priority.
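
A toy Python model of this "last registration wins" rule is sketched below. It is not RHEL-6 code: the registration order is an assumption (only the ranges and names are taken from the ramblock list and the diagram), and the VGA window at 640 KB is deliberately left out.

```python
def resolve(registrations, addr):
    """Return the name of the most recent registration that covers addr."""
    for start, size, name in reversed(registrations):
        if start <= addr < start + size:
            return name
    return None

# (guest-phys start, size, backing block); later entries override earlier ones.
rhel6_like = [
    (0x00000000, 0x20000000, "pc.ram"),    # 512 MB of RAM registered first
    (0x000C0000, 0x00020000, "pc.rom"),    # overlays pc.ram at 768 KB
    (0x000E0000, 0x00020000, "pc.bios"),   # overlays pc.ram at 896 KB
    (0xF0000000, 0x01000000, "vga.vram"),
    (0xFFFE0000, 0x00020000, "pc.bios"),   # same block again, just below 4 GB
]

for addr in (0x1000, 0xC8000, 0xF2A64, 0xFFFE0000):
    print(hex(addr), resolve(rhel6_like, addr))
```

In this model a guest access at 0xf2a64 lands in pc.bios and never touches the pc.ram bytes underneath it.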

guest visible physical addresses
0  0xa0000     0xc0000  0xe0000   0x100000 0x20000000 0xf0000000 0xf1000000 0xfffe0000 0x100000000
    640 KB     768 KB   896 KB    1 MB         512 MB 3840 MB    3856 MB    4GB-128KB  4GB
| pc.ram |     | pc.rom | pc.bios |     pc.ram      | | vga.vram |          | pc.bios  |
|        |     |        |         |                 | |          |          |IO_MEM_ROM|
|        |     |        |         |                 | |          |          |          |
|        |     |        |         |                 | |          |          |          |
..................................................................................................
|        |              |         |                 | [ vga.vram )          |          |
|        |              |         |                 |                       |          |
|        |     [ pc.rom )         |                 |                       |          |
|        |              |         |                 |                       |          |
|        |              [ pc.bios )                 |                       [  pc.bios )
|        |                        |                 |
[\\\\\\\\\ pc.ram: 512 MB, see pc_init1() //////////)
RAMBlocks

The contents of the "pc.ram" RAMBlock, in the offset range 0xe0000..0x100000
(ie. where the "pc.bios" RAMBlock overlays it, as far as the guest sees):

$ hexdump -C -s $((16#e0000)) -n $((128*1024)) \
      ramblock.00.0000000000000000.0000000020000000.00000000,pc.ram

000e0000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00100000

It's full of zeroes. When SeaBIOS shadows itself, the writes go into the
pc.bios RAMBlock, not pc.ram.

Migration transfers the RAMBlocks fine. On the RHEL-7 target, we get:

guest visible physical addresses
0  0xa0000     0xc0000  0xe0000   0x100000 0x20000000 0xf0000000 0xf1000000 0xfffe0000 0x100000000
    640 KB     768 KB   896 KB    1 MB         512 MB 3840 MB    3856 MB    4GB-128KB  4GB
| pc.ram |     | pc.ram | pc.ram  |     pc.ram      | | vga.vram |          | pc.bios  |
|        |     |  not   |  not    |                 | |          |          | io-mem   |
|        |     | pc.rom | pc.bios |                 | |          |          |          |
|        |     |        |         |                 | |          |          |          |
..................................................................................................
|        |     |        |         |                 | [ vga.vram )          |          |
|        |     |        |         |                 |                       |          |
|        |     |        |         |                 |                       |          |
|        |     |        |         |                 |                       |          |
|        |     |        |         |                 |                       [  pc.bios )
|        |     |        |         |                 |
[\\\\\\\\\ pc.ram: 512 MB //////////////////////////)
RAMBlocks

Because, due to the contents of the PAM registers, the 0xc0000..0x100000
range is a window into pc.ram:
                                        vvvvvvv
c0000-c3fff (prio 1, R-): alias pam-rom @pc.ram c0000-c3fff
c4000-c7fff (prio 1, R-): alias pam-rom @pc.ram c4000-c7fff
c8000-cbfff (prio 1, R-): alias pam-rom @pc.ram c8000-cbfff
cc000-cffff (prio 1, RW): alias pam-ram @pc.ram cc000-cffff
d0000-d3fff (prio 1, RW): alias pam-ram @pc.ram d0000-d3fff
d4000-d7fff (prio 1, RW): alias pam-ram @pc.ram d4000-d7fff
d8000-dbfff (prio 1, RW): alias pam-ram @pc.ram d8000-dbfff
dc000-dffff (prio 1, RW): alias pam-ram @pc.ram dc000-dffff
e0000-e3fff (prio 1, RW): alias pam-ram @pc.ram e0000-e3fff
e4000-e7fff (prio 1, RW): alias pam-ram @pc.ram e4000-e7fff
e8000-ebfff (prio 1, RW): alias pam-ram @pc.ram e8000-ebfff
ec000-effff (prio 1, RW): alias pam-ram @pc.ram ec000-effff
f0000-fffff (prio 1, R-): alias pam-rom @pc.ram f0000-fffff
                                        ^^^^^^^
The PAM registers are not cleared on VM reset, hence after reboot we find a
bunch of zeroes at 0xf0000. If the PAM registers were reset (which would
break S3, again), then these ranges would alias the isa-bios PCI address
range, which in turn aliases the pc.bios RAMBlock:

                                         vvvvvvvv
e0000-fffff (prio 1, R-): alias isa-bios @pc.bios 00000-1ffff
                                         ^^^^^^^^

The PAM registers switch between PCI and RAM visibility in both RHEL-6 and
RHEL-7; we shouldn't change that. What we *should* bring into sync is this:

RHEL-6 code, "hw/pc.c", function pc_init1():

    /* kvm tpr optimization needs the bios accessible for write, at least to qemu itself */
    cpu_register_physical_memory(0x100000 - isa_bios_size,
                                 isa_bios_size,
                                 (bios_offset + bios_size - isa_bios_size) /* | IO_MEM_ROM */);

Notice how "IO_MEM_ROM" is commented out? That means that in RHEL-6, the
"pc.bios" RAMBlock stands for guest-phys *RAM* address space. It means that
any shadowing SeaBIOS does is written into "pc.bios", and that "pc.bios" is
visible (for read accesses) when the PAM regs say "pam-rom".

Compare the RHEL-7 code, file "hw/i386/pc_sysfw.c", function
old_pc_system_rom_init():

    memory_region_init_alias(isa_bios, "isa-bios", bios,
                             bios_size - isa_bios_size, isa_bios_size);
    memory_region_add_subregion_overlap(rom_memory,
                                        0x100000 - isa_bios_size,
                                        isa_bios,
                                        1);

The first argument, "rom_memory", is what we're overlapping with the
isa-bios "trick". It means that the isa-bios window into pc.bios will only
work if the PAM regs say "pam-pci".
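
The address arithmetic behind the high mapping and the isa-bios window can be sketched in Python. The expressions follow the quoted calls ((uint32_t)(-bios_size), 0x100000 - isa_bios_size, bios_size - isa_bios_size); capping the window at 128 KB is an assumption of this sketch, and the function name is invented.

```python
def bios_layout(bios_size, isa_window_max=128 * 1024):
    """Compute where pc.bios and its isa-bios alias land in guest-phys space."""
    isa_bios_size = min(bios_size, isa_window_max)   # assumed 128 KB cap
    return {
        # (uint32_t)(-bios_size): the BIOS ends exactly at 4 GB
        "high_base": (-bios_size) & 0xFFFFFFFF,
        # the alias window ends exactly at 1 MB
        "isa_base": 0x100000 - isa_bios_size,
        # the alias shows the final isa_bios_size bytes of pc.bios
        "alias_offset": bios_size - isa_bios_size,
        "isa_size": isa_bios_size,
    }

layout = bios_layout(0x20000)   # 128 KB image, as in the mtree output
print({k: hex(v) for k, v in layout.items()})
```

For the 128 KB image seen here the alias offset is 0, which matches the "alias isa-bios @pc.bios 00000-1ffff" line in the mtree output.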

Note that the *exact* same argument holds not only for pc.bios, but for
"pc.rom" too. See RHEL-7 code, file "hw/i386/pc.c", function
pc_memory_init():

    option_rom_mr = g_malloc(sizeof(*option_rom_mr));
    memory_region_init_ram(option_rom_mr, "pc.rom", PC_ROM_SIZE);
    vmstate_register_ram_global(option_rom_mr);
    memory_region_add_subregion_overlap(rom_memory, // <------- here
                                        PC_ROM_MIN_VGA,
                                        option_rom_mr,
                                        1);

Again, in RHEL-6, pc.rom is mapped as RAM address space:

    option_rom_offset = qemu_ram_alloc(NULL, "pc.rom", PC_ROM_SIZE);
    cpu_register_physical_memory(PC_ROM_MIN_VGA, PC_ROM_SIZE, option_rom_offset);

No IO_MEM_ROM or'd in the last parameter!

The following fixes are needed in RHEL-7, for the RHEL-6.x machine types:

- The "isapc_ram_fw" flag needs to be set to "true". This will mean that
  both the high-mapped "pc.bios" memory region and the low-mapped "isa-bios"
  window will remain writable. Because, they are writable on RHEL-6
  qemu-kvm.

- The other change is more intrusive. Basically, we have to pass
  "ram_memory" *as* "rom_memory" to pc_memory_init(), for the RHEL-6.x
  machine types. The rom_memory parameter is used in the following spots
  (note that we have no pflash drive, *and* we already have isapc_ram_fw
  set):

  pc_memory_init()
    pc_system_firmware_init()
      old_pc_system_rom_init()
        memory_region_add_subregion_overlap() // <-- isa-bios, good
        memory_region_add_subregion() // <-- pc.bios, *not* good
    memory_region_add_subregion_overlap() // <--- pc.rom, good

See the RHEL-6 memory map diagram:
- pc.rom is RAM, so the RHEL-7 fix is clear.

- The low mapped instance of pc.bios is also RAM in RHEL-6, which also
  clarifies what we must do with isa-bios in RHEL-7, for the RHEL-6.x
  machtype.

- However, the high-mapped pc.bios is in PCI address space (== iomem) in
  *both* RHEL-6 and RHEL-7 (rhel-6.x machtype). We must not break this!
Comment 21 Laszlo Ersek 2014-04-12 20:24:16 EDT
Regardless of what kind of patches I or anyone else manages to come up with, QE will have to test the heck out of RHEL-6 machine types. Minimally:

(1) sanity checking of RHEL-6 machtype guests that have been *cold booted* on
    RHEL-7
(2) in-VM reboot of the same
(3) S3 suspend/resume of the same

(4) to (6): same three tests for RHEL-6 machine types that have been cold booted
            on RHEL-7 and then migrated to RHEL-7.

(7) to (9): same three tests for RHEL-6 machine types that have been cold booted
            on *RHEL-6* and then migrated to RHEL-7.

(This BZ is about case (8) in fact: RHEL-6 machtype booted on RHEL-6 host, migrated to RHEL-7, and then in-VM reboot fails.)
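
The nine cases are simply the product of three guest origins and three checks. A small Python sketch (the labels are paraphrased from the list above; nothing here is part of any test suite) reproduces the numbering:

```python
from itertools import product

origins = [
    "cold-booted on RHEL-7",
    "cold-booted on RHEL-7, migrated to RHEL-7",
    "cold-booted on RHEL-6, migrated to RHEL-7",
]
checks = ["sanity check", "in-VM reboot", "S3 suspend/resume"]

# Number the cases 1..9 in the same order as the list above.
cases = dict(enumerate(product(origins, checks), start=1))

for number, (origin, check) in cases.items():
    print(f"({number}) {origin}: {check}")
```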
Comment 22 Laszlo Ersek 2014-04-13 01:40:10 EDT
I was so wrong.

I tried to adapt the RHEL-7 emulator's memory layout (for the rhel6.x.0 machtype) to the RHEL-6 emulator's memory layout. This is completely wrong. Because RHEL-7 is right, and RHEL-6 is broken. Namely, the handling of the PAM registers. (We mostly care about PAM0 here.)

RHEL-7 works (almost) like real hardware. The PAM registers actually work.
PAM0 WE RE
      0  0  [0] (pam-pci) RAM access disabled
      0  1  [1] (pam-rom) reads from RAM, writes thrown away
      1  0  [2] (pam-pci) reads from PCI (==high-mapped BIOS), writes to RAM
      1  1  [3] (pam-ram) reads/writes from/to RAM

When a rhel6.x.0 machtype guest is cold-booted on RHEL-7, the shadowing logic in SeaBIOS (seabios-0.6.1.2-28.el6.x86_64) actually works, just like on real hardware. It makes the F segment writeable, by setting the high nibble of PAM0 to 3, and copies the BIOS from below 4GB to the F segment. The result of shadowing is in pc.ram.

RHEL-7 qemu-kvm supports this by
- mapping the pc.bios RAMBlock under 4GB as read-only, in PCI address space
- creating an alias called isa-bios under 1MB as read only, in PCI address space
- the isa-bios alias shows the final 128KB of the pc.bios RAMBlock.

(See old_pc_system_rom_init()).

RE==1 modes alias pc.ram. RE==0 modes (pam-pci) alias pci_address_space, where the guest will either directly reach pc.bios (below 4GB), or look at the isa-bios alias (and find the same contents). See init_pam().
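
The PAM0 table can be written down as a small lookup in Python. This is a sketch of the table above, not of init_pam(): the tuple layout and function name are invented, and taking the F-segment field from the low two bits of PAM0's high nibble follows the "high nibble of PAM0 to 3" remark.

```python
# index = (WE << 1) | RE -> (mode name, where reads go, whether writes reach RAM)
PAM_MODES = {
    0b00: ("pam-pci", "PCI", False),   # RAM access disabled
    0b01: ("pam-rom", "RAM", False),   # reads from RAM, writes thrown away
    0b10: ("pam-pci", "PCI", True),    # reads from PCI, writes to RAM
    0b11: ("pam-ram", "RAM", True),    # reads and writes go to RAM
}

def f_segment_mode(pam0):
    """Decode the F-segment field from the high nibble of PAM0."""
    return PAM_MODES[(pam0 >> 4) & 0b11]

for value in (0x00, 0x10, 0x20, 0x30):
    print(hex(value), f_segment_mode(value))
```

Only the RE bit decides what the guest reads: with RE==1 the F segment shows pc.ram, whatever that happens to contain.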

All is fine when the rhel6.x.0 machtype guest boots on RHEL-7 qemu-kvm: cold boot, reboot, and S3 all work.

However. RHEL-6's PAM handling is nonexistent under KVM. See "hw/piix_pci.c". The update_pam() function would do the heavy lifting (actually changing what appears to the guest, by calling cpu_register_physical_memory()). But update_pam() is only reachable from i440fx_update_memory_mappings(), which starts with the following snippet:

    if (kvm_enabled()) {
        /* FIXME: Support remappings and protection changes. */
        return;
    }

Why thank you. This means that PAM registers have no effect; their contents can be set and queried by the guest, but that's it. So what RHEL-6 qemu-kvm does is a huge theater. It makes the same pc.bios RAMBlock appear under 1MB and 4GB, pre-populated. Whatever shadowing SeaBIOS might or might not do makes no difference -- the writes never reach pc.ram (at best the same data is written back to pc.bios), but it works, because no matter what PAM settings are in effect, the 896KB to 1024KB range *always* shows the BIOS.

And that's the important point. After cold boot, on either RHEL-6 qemu-kvm, or RHEL-7 qemu-kvm, we have a state where 896KB to 1024KB presents executable BIOS code in *both* PCI and RAM address space. For the RHEL-6 host this holds because it doesn't emulate the PAMs at all, it just shows the initial BIOS code there, in pc.bios. For the RHEL-7 code this holds because SeaBIOS's shadowing at cold boot actually works, and the binary code part of pc.ram and the "parallel" PCI memory say the same.

The data (variables) of SeaBIOS work a bit differently. On the RHEL-6 emulator, such writes (made in the 896 KB to 1024 KB range) are permanent in pc.bios, and appear under 4GB too. This probably bothers no one. In the RHEL-7 emulator, these variables live in pc.ram only. But, importantly, when SeaBIOS selects a RE==1 mode (it wants to read variables from RAM), it can -- because RHEL-6 has nothing else to offer, and because RHEL-7 correctly provides such modes.

What breaks though is migration. When SeaBIOS cold boots on a RHEL-6 emulator, its shadowing code has no effect. Sure, the pc.bios RAMBlock has the right contents (even before SeaBIOS starts, because qemu preloads it), but pc.ram is unchanged. When we migrate to the RHEL-7 emulator, the fact that the shadowing actually *never happened* (that pc.ram has never been populated) bites us. The pc.bios RAMBlock appears under 4GB alright, but the pc.ram region under 1MB that SeaBIOS *would have* filled, had it been cold-booted on RHEL-7, is now empty.

Basically, we must make up for SeaBIOS's lost shadowing opportunity after migration! Messing with the RHEL-7 emulator's memory regions would be wrong; they work correctly, and the PAM registers work too. It's RHEL-6 that's broken, and we must patch up the guest-visible RAM (the E and F segments) from pc.bios after migration completes.
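
The idea of making up for the lost shadowing can be illustrated with a toy Python model. This is not the attached patch: the function name is invented, byte buffers stand in for the pc.ram and pc.bios RAMBlocks, and the copy is done unconditionally for simplicity.

```python
E_SEG_START = 0xE0000   # 896 KB, start of the E segment
ONE_MB = 0x100000       # end of the F segment

def patch_low_bios(pc_ram, pc_bios):
    """Copy the final 128 KB of pc.bios over the E and F segments of pc.ram."""
    window = ONE_MB - E_SEG_START
    pc_ram[E_SEG_START:ONE_MB] = pc_bios[-window:]

# Toy state after an incoming RHEL-6 migration: pc.ram never got shadowed.
pc_ram = bytearray(2 * ONE_MB)         # zero-filled stand-in for guest RAM
pc_bios = bytes(range(256)) * 512      # 128 KB of recognisable filler

patch_low_bios(pc_ram, pc_bios)
print(pc_ram[E_SEG_START:ONE_MB] == pc_bios)
```

After the copy, a read through a RE==1 PAM mode finds BIOS contents in RAM, which is what a cold boot on RHEL-7 would have left there.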

The pc.rom RAMBlock should be fine. It is mapped into the same place in both emulators.
Comment 23 Laszlo Ersek 2014-04-13 03:48:18 EDT
Created attachment 885838 [details]
proposed patch (downstream only)

The attached patch seems to fix the issue for me.

guest OS: RHEL-6.5
machine type: rhel6.5.0

/usr/libexec/qemu-kvm \
  -S \
  -M rhel6.5.0 \
  -enable-kvm  \
  -m 512 \
  -smp 4 \
  -drive if=none,cache=none,media=disk,id=disk1,format=qcow2,werror=stop,rerror=stop,file=/mnt/tmp/guest.qcow2 \
  -device virtio-blk-pci,drive=disk1,bootindex=1 \
  -monitor stdio \
  -nodefconfig \
  -net none \
  -vnc :0 \
  -global PIIX4_PM.disable_s3=0

hosts:
- RHEL-6.5:       qemu-kvm-0.12.1.2-2.415.el6_5.7.x86_64
                  seabios-0.6.1.2-28.el6.x86_64
- RHEL-7.0:       qemu-kvm-1.5.3-60.el7.x86_64
                  seabios-bin-1.7.2.2-12.el7.x86_64
- RHEL-7.0+patch: qemu-kvm-1.5.3-60.el7.bz1027565_ef_seg.x86_64
                  (Brew task 7335922)
                  seabios-bin-1.7.2.2-12.el7.x86_64

host                            cold-boot  reboot    suspend/resume
------------------------------  ---------  --------  --------------
RHEL-6.5                        PASS       PASS      PASS
RHEL-6.5->RHEL-7.0              n/a        FAIL[1]   FAIL[2]
RHEL-7.0                        PASS       PASS      FAIL[3]
RHEL-7.0+patch                  PASS       PASS      FAIL[4]
RHEL-6.5->RHEL-7.0+patch        n/a        PASS[5]   PASS[6]
RHEL-7.0->RHEL-7.0              n/a        untested  untested
RHEL-7.0+patch->RHEL-7.0+patch  n/a        untested  untested

[1] subject of this BZ
[2] same symptoms as [1]
[3] Cirrus is dead, but the VM actually works (separate issue)
[4] same as [3], not a regression
[5] fixes this BZ
[6] fixes S3 resume too
Comment 28 huiqingding 2014-04-15 06:23:41 EDT
Hi, Laszlo,

We tested Windows guests using qemu-kvm-1.5.3-60.el7.bz1027565_ef_seg.x86_64 according to comment 23.

After migrating a win8-64 guest with "-vga cirrus", reboot/shutdown inside the guest fails; the screenshot is the same as the attachment "screenshot of reboot failed" from comment 7. Should we file a new bug about this failure?

The detailed results are as follows:

guest: win8-64
host                            vga     cold-boot  reboot  shutdown
------------------------------  ------  ---------  ------  --------
RHEL-6.5->RHEL-7.0+patch        cirrus  PASS       FAIL    FAIL
RHEL-6.5->RHEL-7.0+patch        std     PASS       PASS    PASS
RHEL-6.5->RHEL-7.0+patch        qxl     PASS       PASS    PASS
RHEL-7.0                        cirrus  PASS       PASS    PASS
RHEL-7.0+patch->RHEL-7.0+patch  cirrus  PASS       PASS    PASS

I also tested a win7-32 guest with "-vga cirrus"; the result is PASS.
guest: win7-32
host                            vga     cold-boot  reboot  shutdown
------------------------------  ------  ---------  ------  --------
RHEL-6.5->RHEL-7.0+patch        cirrus  PASS       PASS    PASS
RHEL-6.5->RHEL-7.0+patch        std     PASS       PASS    PASS
RHEL-6.5->RHEL-7.0+patch        qxl     PASS       PASS    PASS


We are also testing rhel6.5/rhel7.0 guests, and we will update the results once finished.

Best regards
Huiqing
Comment 29 Laszlo Ersek 2014-04-15 08:13:54 EDT
Ah sorry I didn't mean to clear the needinfo, I just wanted to make comment 28 public.

So... The 64-bit Windows 8 guest is interesting. It looks like the patch from comment 23 has no effect on it. (Or maybe it would have an effect, but too late.)

I'm thinking that win8-64 fails due to some other reason *first*, and then lack of a shadowed BIOS only second, because the screenshot in comment 7 is common between the "missing bios" (==unpatched) and the "shadowed bios" (==patched) cases.

In my understanding, in comment 7, the BSOD / DRIVER_POWER_STATE_FAILURE is displayed, and the reboot that is promised on that screen ("we'll restart for you") never happens (ie. the VM starts to spin at 86% completion).

Please clarify: does the exact same happen in comment 28 too? I mean, beside the BSOD, does the VM *also* hang (spin) in comment 28, or does it indeed restart after the BSOD? Because, if it restarts, then maybe we can analyze the "error info" that win8-64 collects.

Also, please provide the following:
- "info mtree" output of the win8-64 guest started on RHEL-7+patch
- "info mtree" output of the win8-64 guest started on RHEL-6, and then migrated
  to RHEL-7+patch (just before attempting reboot)

Thanks.
Comment 30 Laszlo Ersek 2014-04-15 09:57:45 EDT
I think my patch in comment 23 is incomplete.

I mentioned pc.rom before, and now I actually looked at the dump (see "ramblock.04.0000000020020000.0000000000020000.00000000,pc.rom" in comment 20). It contains the vgabios binary (on RHEL-6, this means vgabios-0.6b-3.7.el6.noarch).

00000010  00 00 00 00 00 00 00 00  0f 01 00 00 00 00 49 42  |..............IB|
00000020  4d 00 50 6c 65 78 38 36  2f 42 6f 63 68 73 20 56  |M.Plex86/Bochs V|
00000030  47 41 42 69 6f 73 20 28  50 43 49 29 20 00 63 75  |GABios (PCI) .cu|
00000040  72 72 65 6e 74 2d 63 76  73 20 30 38 20 4e 6f 76  |rrent-cvs 08 Nov|
00000050  20 32 30 31 32 0a 0d 00  28 43 29 20 32 30 30 38  | 2012...(C) 2008|
00000060  20 74 68 65 20 4c 47 50  4c 20 56 47 41 42 69 6f  | the LGPL VGABio|
00000070  73 20 64 65 76 65 6c 6f  70 65 72 73 20 54 65 61  |s developers Tea|
00000080  6d 0a 0d 00 54 68 69 73  20 56 47 41 2f 56 42 45  |m...This VGA/VBE|
00000090  20 42 69 6f 73 20 69 73  20 72 65 6c 65 61 73 65  | Bios is release|
000000a0  64 20 75 6e 64 65 72 20  74 68 65 20 47 4e 55 20  |d under the GNU |
000000b0  4c 47 50 4c 0a 0d 0a 0d  00 50 6c 65 61 73 65 20  |LGPL.....Please |
000000c0  76 69 73 69 74 20 3a 0a  0d 20 2e 20 68 74 74 70  |visit :.. . http|
000000d0  3a 2f 2f 62 6f 63 68 73  2e 73 6f 75 72 63 65 66  |://bochs.sourcef|
000000e0  6f 72 67 65 2e 6e 65 74  0a 0d 20 2e 20 68 74 74  |orge.net.. . htt|
000000f0  70 3a 2f 2f 77 77 77 2e  6e 6f 6e 67 6e 75 2e 6f  |p://www.nongnu.o|
00000100  72 67 2f 76 67 61 62 69  6f 73 0a 0d 0a 0d 00 50  |rg/vgabios.....P|

SeaBIOS doesn't only shadow the F segment in __make_bios_writable_intel(), it shadows the C, D, and E segments too. My patch only cares about E and F, and pc.rom is mapped at [0xc0000, 0xe0000), ie. on C and D.

(I was right about pc.rom in comment 20, and wrong about it in comment 22. It is indeed mapped in the same place in both emulators, but the visibility is different if the PAM registers are set to RE + WE.)
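
The segment arithmetic behind this can be checked with a short sketch. It assumes the conventional 64 KB real-mode segments named after the top hex digit of the address; the helper name segments_covered is hypothetical.

```python
SEGMENT_SIZE = 0x10000  # 64 KB per segment letter below 1 MB


def segments_covered(start, end):
    """Return the segment letters touched by the range [start, end) below 1 MB."""
    assert 0 <= start < end <= 0x100000
    first = start // SEGMENT_SIZE
    last = (end - 1) // SEGMENT_SIZE
    return [format(n, "X") for n in range(first, last + 1)]


pc_rom_segments = segments_covered(0xC0000, 0xE0000)     # pc.rom mapping
bios_segments = segments_covered(0xE0000, 0x100000)      # range handled by the patch
print(pc_rom_segments, bios_segments)
```

The two ranges do not overlap, so a patch that only covers E and F never touches the area where pc.rom is mapped.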

I'm currently installing a win8-64 guest and I'll try to reproduce the problem in comment 28. If I can do that, then I'll try to extend my patch to the C and D segments too (option roms) and see if it helps.
Comment 31 FuXiangChun 2014-04-15 10:06:43 EDT
QE tested RHEL7.0 and RHEL6.5 guests with qemu-kvm-1.5.3-60.el7.bz1027565_ef_seg.x86_64, using the rhel6.5.0 and pc-i440fx-rhel7.0.0 machine types.

These are the test results.

For RHEL7.0 guest

host                            cold-boot  reboot       suspend/resume
------------------------------  ---------  -----------  -----------------
RHEL-6.5->RHEL-7.0              FAIL       bug 1027565  exist bug 1083478
RHEL-7.0                        PASS       PASS         exist bug 1083478
RHEL-7.0+patch                  PASS       PASS         exist bug 1083478
RHEL-6.5->RHEL-7.0+patch        PASS       PASS         exist bug 1083478
RHEL-7.0->RHEL-7.0              PASS       PASS         exist bug 1083478
RHEL-7.0+patch->RHEL-7.0+patch  PASS       PASS         exist bug 1083478

[2][5][6]tested machine type rhel6.5.0 & pc-i440fx-rhel7.0.0

For RHEL6.5 guest.

host                            cold-boot  reboot       suspend/resume
------------------------------  ---------  -----------  --------------
RHEL-6.5->RHEL-7.0              FAIL       bug 1027565  pass
RHEL-7.0                        PASS       PASS         pass
RHEL-7.0+patch                  PASS       PASS         pass
RHEL-6.5->RHEL-7.0+patch        PASS       PASS         pass
RHEL-7.0->RHEL-7.0              PASS       PASS         pass
RHEL-7.0+patch->RHEL-7.0+patch  PASS       PASS         pass

[2][3][5]tested machine type rhel6.5.0 & pc-i440fx-rhel7.0.0

New issues.
N1. For the rhel6.5 guest: run the "reboot" command twice inside the guest after migrating from the RHEL6.5 host to the RHEL7.0 host. The second reboot reproduces this bug (as in comment 0). The rhel7.0 guest doesn't hit this problem.

N2. For both the rhel6.5 and rhel7.0 guests: run the "reboot" command inside the guest after migrating, and then run "system_reset" after the guest has rebooted. The guest hangs. I will add the mtree information as an attachment.

If these are two new issues, QE will file two bugs to track them. If they are the same issue as this bug, then the build qemu-kvm-1.5.3-60.el7.bz1027565_ef_seg.x86_64 didn't fix this bug.

The qemu-kvm command line is the same as in comment 23.
Comment 32 FuXiangChun 2014-04-15 10:09:36 EDT
Created attachment 886494 [details]
compare reboot and system_reset
Comment 33 FuXiangChun 2014-04-15 10:12:43 EDT
Laszlo,
could you take a look at the two issues in comment 31? Are they new issues?
Comment 34 Laszlo Ersek 2014-04-15 11:11:49 EDT
I can reproduce N1. After the second "reboot" command, the guest seems to be stuck in this SeaBIOS code:

(qemu) xp /02i 0xf4127
0x00000000000f4127:  hlt    
0x00000000000f4128:  jmp    0xf4127

It looks related to the panic() function in RHEL-6 SeaBIOS. I'll have to enable debug output for SeaBIOS to see what it complains about.

I can also reproduce N2 (I tried with the RHEL-6.5 guest). The guest is stuck in the same halt loop.

Hence, N1 and N2 seem to be the same. I think they belong to this BZ.

Regarding my windows-8-64 test, I simply couldn't install a windows 8 guest on my RHEL-6 host that actually worked (so I didn't even attempt migration). I installed the guest (build 9200, "en_windows_8_debug_checked_build_x64_dvd_917558.iso"), and it boots up, but then it fails to open the task manager, or cmd.exe, and RHEL-6.5 qemu-kvm just keeps producing 100% load on one of the host CPU cores. I have no idea what's going on, but it's unbearable to work with this guest, so I've put it aside for now. (My intent was to enable the administrator account and to disable fast startup (ie. auto-S4) before trying migration.)

I submitted another Brew build of RHEL-7 qemu-kvm, with my patch in comment 23 extended to the C and D segments. I'll try to see if this helps with the RHEL-6.5 guest.
Comment 35 Laszlo Ersek 2014-04-15 11:23:28 EDT
Copying the oprom (C and D segments) doesn't help.
Comment 36 Laszlo Ersek 2014-04-15 12:07:02 EDT
*groan*

Regarding the 2nd reboot that fails, this is what SeaBIOS says (I checked with my patch in comment 23):

  In resume (status=0)
  In 32bit resume
  Attempting a hard reboot

This is where I start laughing hysterically and tearing out my hair, because these messages belong to *RHEL-7* SeaBIOS.

Let's repeat this exercise:

(In reply to Laszlo Ersek from comment #16)

> (2) vmcore of RHEL-6.5 guest after migrated to RHEL-7 destination host
> ----------------------------------------------------------------------
> PhysAddr            MemSiz        PhysAddr            MemSiz
> 0x0000000000000000  0x000a0000       0                640    KB
> 0x00000000000c0000  0x1ff40000     768     KB         511.25 MB
> 0x00000000f0000000  0x01000000    3840     MB          16    MB <-- vga.vram
> 0x00000000fffe0000  0x00020000    4095.875 MB         128    KB <-- pc.bios
> 
> crash> search -c -p "seabios-"
> [no matches]
> 
> crash> rd -p 0x00000000fffe0000  0x00020000 # pc.bios
> ...
>         ffff2a60:  626165730000006c 2e362e302d736f69   l...seabios-0.6.
>            ^^^^^
>         ffff2a70:  652e38322d322e31 000f52880000366c   1.2-28.el6...R..
> ...

Now, with my patch in place, the results for the same crash commands are:

crash> search -c -p "seabios-"
f5be4: seabios-1.7.2.2-12.el7..................................
fc980: seabios-1.7.2.2/src/virtio-ring.c.BUG: failure at %s:%d/

crash> rd -p 0x00000000fffe0000  0x00020000 # pc.bios
        ffff2a60:  626165730000006c 2e362e302d736f69   l...seabios-0.6.
        ffff2a70:  652e38322d322e31 000f52880000366c   1.2-28.el6...R..
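
As a cross-check of the rd output above, the 64-bit words can be decoded by hand. The sketch below assumes the words are displayed as little-endian quadwords (x86) and renders non-printable bytes as dots, as the ASCII column does; decode_words is a hypothetical helper name.

```python
import struct


def decode_words(words):
    """Unpack little-endian 64-bit words and render them like the ASCII column."""
    raw = b"".join(struct.pack("<Q", word) for word in words)
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


line1 = decode_words([0x626165730000006C, 0x2E362E302D736F69])
line2 = decode_words([0x652E38322D322E31, 0x000F52880000366C])
print(line1)
print(line2)
```

Both decoded strings match the ASCII column printed by crash, so the pc.bios block at 0xfffe0000 does carry the RHEL-6 version string.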

So, what happens (and I only figured this out because Dave asked me to add a warning printf() to the shadow_bios() function in my patch) is that ram_load() is called *several times* during migration (I didn't expect that, and I have absolutely no clue why this happens). So, consider what the patch does in this case (which I intended to run *only* after the entire migration of RAM completed):

    if (buffer_is_zero(ef_seg_host, bios_size)) {
        memcpy(ef_seg_host, memory_region_get_ram_ptr(bios->mr), bios_size);
    }

Suppose this code runs *before* the pc.bios block has been migrated. Then the "pc.bios" ramblock still contains the SeaBIOS blob loaded from the RHEL-7 host, and the target E and F segments are empty. Hence, we copy RHEL-7 SeaBIOS to the E and F segments (in error).

Then, at some point pc.bios is in fact loaded from the source host (containing the RHEL-6 SeaBIOS). However, at that point buffer_is_zero() returns false (because the E and F segments are already populated by the RHEL-7 SeaBIOS binary), and we end up with the following situation:

- the high-mapped PCI area contains RHEL-6 BIOS (because that's what pc.bios is), with modified (live) variables
- the low-mapped (shadowed) *RAM* area contains RHEL-7 BIOS (due to the above error in my patch)
- the low-mapped *PCI* area (the isa-bios alias) shows RHEL-6 bios, with modified (live) variables.

The above is of course garbage. RHEL-7 SeaBIOS uses a variable called HaveRunPost (that is supposed to survive both reboot and suspend/resume) to distinguish a normal warm reboot from suspend/resume; there's no telling whatever contents it has (and where it falls at all) in the above mess.

I think the idea in my patch in comment 23 is sane, it's just not implemented correctly. I need to delay that action until all incoming RAMBlocks are loaded.
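
The ordering problem can be reproduced in a toy model. This is a sketch only, not QEMU code: each list entry stands for one ram_load() call, the E and F segments are reduced to a short buffer, and the names patch_up and migrate are hypothetical. The version strings are the ones found by crash above.

```python
# Toy model of the ordering problem; not QEMU code.
HOST_BIOS = b"seabios-1.7.2.2-12.el7"     # preloaded on the destination host
GUEST_BIOS = b"seabios-0.6.1.2-28.el6"    # arrives from the source host


def patch_up(state):
    """The check from the comment 23 patch: copy only if the target is all zeros."""
    if not any(state["ef_seg"]):
        state["ef_seg"][:] = state["pc.bios"]


def migrate(chunks, patch_each_call):
    state = {
        "pc.bios": bytearray(HOST_BIOS),
        "ef_seg": bytearray(len(HOST_BIOS)),
    }
    for name, data in chunks:             # one iteration models one ram_load() call
        if name == "pc.bios":
            state["pc.bios"][:] = data
        if patch_each_call:
            patch_up(state)               # premature: pc.bios may not have arrived yet
    if not patch_each_call:
        patch_up(state)                   # deferred: all RAMBlocks are loaded
    return bytes(state["ef_seg"])


chunks = [("pc.ram", b""), ("pc.bios", GUEST_BIOS)]
buggy = migrate(chunks, patch_each_call=True)
fixed = migrate(chunks, patch_each_call=False)
print(buggy, fixed)
```

With the copy attempted on every call, the shadow area ends up holding the destination host's BIOS; with the copy deferred until the end, it holds the migrated one.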
Comment 37 Laszlo Ersek 2014-04-15 14:15:42 EDT
Created attachment 886593 [details]
proposed patch (downstream only), v2; approach #2

So, I have a patch now that seems to work with the RHEL-6.5 guest, even for
repeated reboots and S3 suspend/resume cycles, even intermixed with each
other. QE, please test it; I'll paste the Brew link in another (private)
comment.

If needed, I could also pursue the other direction (which I abandoned in
comment 22). This is a summary of the two approaches:

Approach #1:
- For the RHEL-6.x.0 machine type, change the MemoryRegions in such a way,
  in RHEL-7 qemu-kvm, that they resemble RHEL-6 qemu-kvm. Namely, pc.bios
  and pc.rom should be mapped to guest-phys addresses similarly to what
  happens on RHEL-6. I already have a patch for this step (I didn't attach
  it yet).
- This also needs compatibility code for PAM emulation. Basically, PAM
  emulation should be turned off for the rhel6.x.0 machine type. I didn't
  try to do this yet (see comment 22) because the way PAM emulation works in
  RHEL-6 is fundamentally broken (it doesn't work at all -- it's absent).
  Also, messing with the PAM emulation in RHEL-7 could be very intrusive and
  have hidden ties to other stuff.

Approach #2:
- For the RHEL-6.x.0 machine type, keep the RHEL-7 emulator's memory layout
  and PAM emulation that is *otherwise* correct, in the general sense (as
  in, it matches physical hardware closely; much better than the RHEL-6
  emulator does). When incoming migration from RHEL-6 finishes, patch up the
  guest RAM *contents* (the contents of the pc.ram RAMBlock) to make it look
  as if SeaBIOS had actually shadowed oproms and the BIOS itself from ROM to
  the Upper Memory Area in RAM (which is what it does otherwise on real
  hardware, and on the RHEL-7 emulator, but it couldn't work on the RHEL-6
  source host).

The patch that I'm attaching now is for approach #2.

I can try to develop approach #1 a bit more too, and if it works out, people
can vote :)
Comment 39 Laszlo Ersek 2014-04-15 15:25:45 EDT
I tried to finish approach #1, but disabling the PAM emulation breaks the video console. I feared exactly this -- the PAM emulation apparently hooks into a lot of things. I think approach #2 (comment 37 and comment 38) is our best chance.
Comment 41 Radim Krčmář 2014-04-15 15:36:17 EDT
*** Bug 1074963 has been marked as a duplicate of this bug. ***
Comment 42 Radim Krčmář 2014-04-15 16:08:06 EDT
Great job! The patch in comment #37 fixes the reproducer from bug 1074963, which hung if migration occurred in very early boot.
(Good thing I decided to search bugzilla after thinking about the code, the
 world wouldn't have been a better place with two patches for this :)

I think that bug 1031488 and bug 1037480 are solved as well, going to recommend testing with your package.
Comment 43 FuXiangChun 2014-04-16 06:59:25 EDT
QE is testing this bug with the build in comment 38 and will post the full test results later.

So far we have hit only one new issue. Could you take a look and tell us whether QE needs to file a new bug to track it?

Scenario:

1. Boot a RHEL6.5 guest with usb-tablet, usb-kbd, or usb-mouse on a RHEL6.5 host.

2. Migrate to a RHEL7.0 host.

3. Reboot the guest.

Result:
The guest hangs, and the guest's console prints a call trace.

I will attach the console messages and the qemu-kvm command line. Without usb-tablet, usb-kbd, and usb-mouse, the guest reboots successfully.
Comment 44 FuXiangChun 2014-04-16 07:02:41 EDT
Created attachment 886840 [details]
guest call trace from console
Comment 45 FuXiangChun 2014-04-16 07:04:28 EDT
Created attachment 886841 [details]
qemu-kvm command line, please search usb device in cli
Comment 46 Laszlo Ersek 2014-04-16 07:35:55 EDT
Seems more related to the HDA sound card (see the azx_* functions).

I've also been testing with "-usbdevice tablet" without such calltraces.
Comment 47 Laszlo Ersek 2014-04-16 10:22:15 EDT
I've spent another day on approach #1 (see comment 37).
- I reparented pc.rom to pc.ram
- I reparented isa-bios to pc.ram
- I made isa-bios and pc.bios writeable
- I disabled SMRAM emulation
- I disabled PAM emulation
- I hooked into rom_reset() so that a writeable pc.bios does not get rewritten
  (from the RHEL-7 host-side bios.bin) at each reboot.

Whatever I do, I can't make all of the following work in the same build:
- at least two reboots after migration
- S3 after migration
- at least two reboots when booting a rhel6.x.0 machtype locally
- S3 when booting a rhel6.x.0 machtype locally

So, I'm giving up on approach #1. It just doesn't work.

I hope that QE's test results for approach #2 (comment 37, comment 38) will allow me to post that patch. If the results are good for basic VM configs, then I'd prefer if device-related problems were filed separately. Thanks.
Comment 51 Laszlo Ersek 2014-04-16 13:45:22 EDT
Created attachment 886964 [details]
4 patches for approach #1 (illustration only), mbox format

I just reached a breakthrough. No, it won't solve our problem, but it *proves* that approach #1 is not feasible.

With my most recent patches for approach #1 (uploaded for illustration / posterity only), I achieved the following:

(1) RHEL-6.5 guest migrated from RHEL-6.5 host to patched RHEL-7.0 host (machtype rhel6.5.0) works well, surviving several reboots and suspend/resume cycles on the target host in sequence. The patchset achieves this by creating a guest-phys layout in the RHEL-7 emulator (for the rhel6.x.0 machtype) that very closely resembles the layout established by the RHEL-6.5 emulator. (No manual shadowing takes place after migration.) Note that in this test, the SeaBIOS version in question is seabios-0.6.1.2-28.el6.x86_64, originating (ie. migrated) from the source host.

(2) However, when I cold-booted the same guest (same machtype) on the patched RHEL-7 host (seabios firmware: seabios-1.7.2.2-12.el7.x86_64), then the guest didn't survive the first reboot (the RHEL-7 SeaBIOS binary mis-recognizes the situation as a resume, "tries a hard reboot", according to the log, and falls into an infinite loop).

This failure could imply one of two things:

(a) my RHEL-7 patchset for approach #1 is broken (NB: that's *not* what QE is currently testing, from comment 37 -- that one is approach #2),

OR

(b) the RHEL-7 SeaBIOS binary (seabios-1.7.2.2-12.el7.x86_64) simply *cannot reboot* on a guest-phys memory layout (+ PAM emulation level) that the RHEL-6.5 emulator provides.

In order to confirm (b), I copied the RHEL-7 SeaBIOS binary to my RHEL-6 host, and booted the RHEL-6.5 guest with it, on the RHEL-6.5 emulator. Surprise, as soon as I try to reboot the guest there, using the RHEL-7 SeaBIOS binary, it logs

  In resume (status=0)
  In 32bit resume
  Attempting a hard reboot

and the qemu-kvm process *exits*.

So, my patchset for approach #1 is good enough; it approximates the RHEL-6 emulator layout in RHEL-7 sufficiently. The reason it doesn't work for test (2) is because "seabios-1.7.2.2-12.el7.x86_64" is simply incompatible with RHEL-6.5 qemu-kvm's guest-phys layout + PAM support (which my patchset establishes well enough for RHEL-7 seabios to break).

(RHEL-6 SeaBIOS works quite OK on the RHEL-7 emulator's memory layout + PAM support, as evidenced by the approach #2 patch in comment 37.)

So, unless we ship the RHEL-6 SeaBIOS binary in the RHEL-7 package, and tie it to the rhel6.x.0 machtype of the RHEL-7 emulator, approach #1 (a RHEL-6.5 compatible memory layout + PAM support) will never work; and our only chance is approach #2.
Comment 52 Laszlo Ersek 2014-04-16 13:53:14 EDT
(Of course we could always try to retrofit seabios-1.7.2.2-12.el7 to the RHEL-6.5 memory layout + PAM (non-)support... but that's an academic proposition :))
Comment 53 FuXiangChun 2014-04-17 04:51:16 EDT
Re-tested this issue with qemu-kvm-1.5.3-60.el7.bz1027565_cdef_seg.x86_64.
seabios version:
seabios-bin-1.7.2.2-12.el7.x86_64 for rhel7.0 host
seabios-0.6.1.2-28.el6.x86_64     for rhel6.5 host

These are the test results.

1. For RHEL7.0 guest

host                            cold-boot  twice reboot  suspend/resume
------------------------------  ---------  ------------  -----------------
RHEL-6.5->RHEL-7.0              FAIL       bug 1027565   exist bug 1083478
RHEL-7.0                        PASS       PASS          pass
RHEL-7.0+patch                  PASS       PASS          pass
RHEL-6.5->RHEL-7.0+patch        PASS       PASS          pass
RHEL-7.0->RHEL-7.0              PASS       PASS          pass
RHEL-7.0+patch->RHEL-7.0+patch  PASS       PASS          pass

[1][4]tested machine type rhel6.5.0
[2][3][5][6]tested machine type rhel6.5.0 & pc-i440fx-rhel7.0.0


2. For RHEL6.5 guest

host                            cold-boot  twice reboot  suspend/resume
------------------------------  ---------  ------------  --------------
RHEL-6.5->RHEL-7.0              FAIL       bug 1027565   pass
RHEL-7.0                        PASS       PASS          pass
RHEL-7.0+patch                  PASS       PASS          pass
RHEL-6.5->RHEL-7.0+patch        PASS       PASS          pass
RHEL-7.0->RHEL-7.0              PASS       PASS          pass
RHEL-7.0+patch->RHEL-7.0+patch  PASS       PASS          pass

[1][4] tested machine type rhel6.5.0
[2][3][5][6]tested machine type rhel6.5.0 & pc-i440fx-rhel7.0.0

3. For RHEL6.5 guest

Machine type  cold-boot  twice reboot  suspend/resume
------------  ---------  ------------  --------------
RHEL6.4.0     PASS       PASS          PASS
RHEL6.3.0     PASS       PASS          PASS
RHEL6.2.0     PASS       PASS          PASS
RHEL6.1.0     PASS       PASS          PASS
RHEL6.0.0     PASS       PASS          PASS

Additionally, regarding the issue hit in comment 43: QE will continue to debug it and will file a new bug to track it.
Comment 54 huiqingding 2014-04-17 05:34:27 EDT
Re-tested this issue with qemu-kvm-1.5.3-60.el7.bz1027565_cdef_seg.x86_64.
seabios version:
seabios-bin-1.7.2.2-12.el7.x86_64 for rhel7.0 host
seabios-0.6.1.2-28.el6.x86_64     for rhel6.5 host

These are the test results.

1. For Windows 7 32bit guest

host                            cold-boot  twice reboot  suspend/resume
------------------------------  ---------  ------------  --------------
RHEL-7.0+patch                  PASS       PASS          PASS
RHEL-6.5->RHEL-7.0+patch        PASS       PASS          PASS
RHEL-7.0+patch->RHEL-7.0+patch  PASS       PASS          PASS

[1][2][3] tested cold-boot/reboot using three drivers: -vga qxl/cirrus/std
[1][2][3] tested suspend/resume using only one driver: -vga qxl
[2]tested machine type rhel6.5.0
[1][3]tested machine type rhel6.5.0 & pc-i440fx-rhel7.0.0

2. For Windows 8 64bit guest

host                            cold-boot  twice reboot  shutdown
------------------------------  ---------  ------------  --------
RHEL-7.0+patch                  PASS       PASS          PASS
RHEL-6.5->RHEL-7.0+patch        PASS       PASS          PASS
RHEL-7.0+patch->RHEL-7.0+patch  PASS       PASS          PASS

[1][2][3] tested cold-boot/reboot/shutdown using three drivers: -vga qxl/cirrus/std
[2]tested machine type rhel6.5.0
[1][3]tested machine type rhel6.5.0 & pc-i440fx-rhel7.0.0
Comment 55 Laszlo Ersek 2014-04-17 06:35:04 EDT
Thank you for the results!

The number of possible variations is huge, so I'll prepare a table for the
following component values only:

source hosts:
- RHEL-6.5
- RHEL-7.0 ("unpatched")
- RHEL-7.0p ("patched")

destination host:
- RHEL-7.0p ("patched") *only*. We only care if the patch fixes the bug or
  regresses something else, on the migration target host. All of the tests
  in the table below imply a RHEL-7.0p (patched) destination host.

machine types:
- rhel6.5.0
- pc-i440fx-rhel7.0.0

guests:
- RHEL-6.5
- RHEL-7.0
- win7-32
- win8-64

Numbers in brackets (as in [x]) are *not* test numbers, they are remarks
explained under the table.

I'm filling in the table from the results in comment 53 and comment 54.

Basically, what seems to remain is migration from an unpatched RHEL-7.0
source host to a patched RHEL-7.0p destination host. Eduardo asked me if
this would work "as a bonus", and I think that it should, yes. Can you
please test these cases too? Thank you very much.

  #  source     machtype             guest     2x reboot on    S3 on
     host                                      RHEL-7.0p dest  RHEL-7.0p dest
---  ------     --------             -----     --------------  --------------
000  RHEL-6.5   rhel6.5.0            RHEL-6.5  pass[1]         pass
001  RHEL-6.5   rhel6.5.0            RHEL-7.0  pass            pass
002  RHEL-6.5   rhel6.5.0            win7-32   pass            pass
003  RHEL-6.5   rhel6.5.0            win8-64   pass            no driver
004  RHEL-7.0   rhel6.5.0            RHEL-6.5  ?               ?
005  RHEL-7.0   rhel6.5.0            RHEL-7.0  ?               ?
006  RHEL-7.0   rhel6.5.0            win7-32   ?               ?
007  RHEL-7.0   rhel6.5.0            win8-64   ?               ?
008  RHEL-7.0   pc-i440fx-rhel7.0.0  RHEL-6.5  ?               ?
009  RHEL-7.0   pc-i440fx-rhel7.0.0  RHEL-7.0  ?               ?
010  RHEL-7.0   pc-i440fx-rhel7.0.0  win7-32   ?               ?
011  RHEL-7.0   pc-i440fx-rhel7.0.0  win8-64   ?               ?
012  RHEL-7.0p  rhel6.5.0            RHEL-6.5  pass            pass
013  RHEL-7.0p  rhel6.5.0            RHEL-7.0  pass            pass
014  RHEL-7.0p  rhel6.5.0            win7-32   pass            pass
015  RHEL-7.0p  rhel6.5.0            win8-64   pass            no driver
016  RHEL-7.0p  pc-i440fx-rhel7.0.0  RHEL-6.5  pass            pass
017  RHEL-7.0p  pc-i440fx-rhel7.0.0  RHEL-7.0  pass            pass
018  RHEL-7.0p  pc-i440fx-rhel7.0.0  win7-32   pass            pass
019  RHEL-7.0p  pc-i440fx-rhel7.0.0  win8-64   pass            no driver

[1] Hangs with usb-tablet, or usb-kbd, or usb-mouse. See comment 43. QE to
file new BZ.
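
The row numbering of the table above follows from enumerating the component values in order. The sketch below regenerates the 20 combinations; the filter that skips pc-i440fx-rhel7.0.0 for a RHEL-6.5 source host is an assumption inferred from the table (rows 000 to 003 only list rhel6.5.0), and build_matrix is a hypothetical helper name.

```python
from itertools import product

SOURCES = ["RHEL-6.5", "RHEL-7.0", "RHEL-7.0p"]
MACHTYPES = ["rhel6.5.0", "pc-i440fx-rhel7.0.0"]
GUESTS = ["RHEL-6.5", "RHEL-7.0", "win7-32", "win8-64"]


def build_matrix():
    """Enumerate (source host, machine type, guest) in the table's row order."""
    rows = []
    for source, machtype, guest in product(SOURCES, MACHTYPES, GUESTS):
        # Assumption: a RHEL-6.5 source is only tested with the rhel6.5.0 machtype.
        if source == "RHEL-6.5" and machtype != "rhel6.5.0":
            continue
        rows.append((source, machtype, guest))
    return rows


rows = build_matrix()
for number, (source, machtype, guest) in enumerate(rows):
    print(f"{number:03d}  {source:<9}  {machtype:<19}  {guest}")
```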
Comment 56 juzhang 2014-04-17 06:38:24 EDT
Hi Huding,

Can you run the tests and fill in the table in comment 55 tomorrow?

Best Regards,
Junyi
Comment 57 Laszlo Ersek 2014-04-17 06:42:06 EDT
Actually, never mind! The patch only affects the incoming migration code, so it doesn't matter whether you migrate from a patched or an unpatched RHEL-7 host. These results are sufficient; I'll post the patch. Thank you very much!
Comment 58 juzhang 2014-04-17 06:43:58 EDT
(In reply to Laszlo Ersek from comment #57)
> Actually, never mind! The patch only affects the incoming migration code, so
> it doesn't matter whether you migrate from a patched or an unpatched RHEL-7
> host. These results are sufficient; I'll post the patch. Thank you very much!

OK. Feel free to let QE know if further testing is needed.

Best Regards,
Junyi
Comment 61 Laszlo Ersek 2014-04-20 12:37:47 EDT
I think I have a reasonable guess at the problem, and I'm CC'ing Gerd and Hans as USB experts for discussion.

In my opinion, the problem raised in comment 60 is unrelated to bug 1027565 (ie. this bug) and unrelated to my proposed patch for it (comment 37); it's an independent migration problem in the ich9-usb-uhci1 device. I think my patch might only unmask or otherwise expose the independent UHCI problem. As such I recommend that we
- postpone the ich9-usb-uhci1 problem,
- accept my fix for this bug,
- wait until the fix goes into an official build,
- report the USB problem against that build.

Regarding the USB problem itself: the RHEL-6 migration stream doesn't serialize the "pending_int_mask" member of UHCIState ("Interrupts that should be raised at the end of the current frame").

Hans' upstream commit ecfdc15f ("uhci: Fix pending interrupts getting lost on migration"), present in RHEL-7, bumped the version identifier of "vmstate_uhci" from 2 to 3, and made sure that "pending_int_mask" would be migrated.

Now, this migration field seems to be correctly versioned (that is, the RHEL-7 recipient side can cope when the lower versioned stream, incoming from RHEL-6, lacks the field), but I think *maybe* the uhci_post_load() function should make up for it dynamically, just as it does for "expire_time". Without extra logic in uhci_post_load(), "pending_int_mask" is likely initialized to zero on incoming migration, and could cause problems in the guest.

I have no idea if it is possible to recreate some sensible "pending set" for UHCI when the actual set has not been saved in the migration stream. If not, then maybe the device and the guest should be somehow instructed to do a full reconnect after migration. Maybe just mark all relevant interrupts as pending, and let the guest handle a spurious instance for each affected IRQ number after migration. (A single spurious interrupt, one per IRQ type, is likely more benign (triggering a warning at worst) than lost interrupts (hang).)

In any case I strongly believe this UHCI problem is unrelated to the reboot problem.
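
The versioning behaviour discussed in this comment can be sketched as a toy model. This is not QEMU's vmstate code: the dictionary-based stream, the load_uhci helper, the assume_pending_when_missing switch and the ALL_PENDING value are all hypothetical and only illustrate the two options (default to zero versus mark everything pending) when a version 2 stream lacks the field.

```python
# Toy model of a versioned migration field; not QEMU's vmstate code.
ALL_PENDING = 0xFF  # hypothetical "every interrupt pending" mask, for illustration


def load_uhci(stream, version_id, assume_pending_when_missing=False):
    """Load pending_int_mask from a stream, coping with older stream versions."""
    state = {"pending_int_mask": 0}
    if version_id >= 3:
        # The field only exists in version 3 and later streams.
        state["pending_int_mask"] = stream["pending_int_mask"]
    elif assume_pending_when_missing:
        # Speculative policy: prefer spurious interrupts over lost ones.
        state["pending_int_mask"] = ALL_PENDING
    return state


from_rhel6 = load_uhci({}, version_id=2)
from_rhel6_alt = load_uhci({}, version_id=2, assume_pending_when_missing=True)
from_rhel7 = load_uhci({"pending_int_mask": 0x02}, version_id=3)
print(from_rhel6, from_rhel6_alt, from_rhel7)
```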
Comment 62 Laszlo Ersek 2014-04-20 15:35:44 EDT
In addition, uhci_reset() doesn't seem to clear UHCIState.pending_int_mask, and the only function that does clear it is uhci_frame_timer(), which reads it first however.
Comment 63 Laszlo Ersek 2014-04-20 18:12:18 EDT
Also, I cannot reproduce the after-reboot hang at all, using the following command line:

/usr/libexec/qemu-kvm \
  -S \
  -M rhel6.5.0 \
  -enable-kvm \
  -m 1024 \
  -smp 4 \
  -drive if=none,cache=none,media=disk,id=disk1,format=qcow2,werror=stop,rerror=stop,file=$HOME/tmp/rhel650.qcow2 \
  -device virtio-blk-pci,drive=disk1,bootindex=1 \
  -drive if=none,cache=none,media=cdrom,id=cd1,format=raw,file=/mnt/data/isos/RHEL6.5-20131111.0-Server-x86_64-DVD1.iso \
  -device ide-drive,bus=ide.1,unit=0,drive=cd1,bootindex=2 \
  -monitor stdio \
  -nodefconfig \
  -net none \
  -vnc :0 \
  -vga cirrus \
  -device ich9-usb-uhci1 \
  -device usb-tablet \
  -debugcon file:debug.log -global isa-debugcon.iobase=0x402 \
  -global PIIX4_PM.disable_s3=0

I used an internal snapshot for testing -- I saved it on a RHEL-6 host, and loaded it on the patched RHEL-7 host. After loadvm, reboot works fine, and the tablet works too, just very slowly.

There is without a doubt some problem with the migration of ich9-usb-uhci1 and/or usb-tablet and/or HDA sound, but the patch in comment 37 doesn't touch device states at all, and this BZ concerns rebooting (and related S3 functionality.)

I recommend an official build for this BZ, with the patch in comment 37, and then a separate BZ, with a minimal command line, about the USB problem. RHEL6->RHEL7 migration is probably broken for several devices; we must isolate the issues from each other as much as possible (by minimizing the command lines etc.) Thank you.
Comment 64 huiqingding 2014-04-21 01:03:55 EDT
Hi, Laszlo

Thanks very much for the reply.

When the guest has both balloon and UHCI devices, it hits a call trace after migration and reboot. With only the UHCI device, the guest does not hit the call trace.

The command line we used is as follows:
 /usr/libexec/qemu-kvm \
   -M rhel6.5.0 \
   -cpu Westmere,hv_relaxed \
  -enable-kvm  \
  -m 4096 -smp 4,sockets=2,cores=2,threads=1,maxcpus=160 \
  -k en-us \
  -device virtio-balloon-pci,id=ballooning,bus=pci.0,addr=0x5 \
  -drive file=/home/huding/RHEL-Server-6.5-64-virtio.qcow2,if=none,id=drive-virtio-disk,format=qcow2,cache=none,aio=native,werror=stop,rerror=stop,media=disk,snapshot=off,bus=1,unit=1 \
  -device virtio-blk-pci,scsi=off,drive=drive-virtio-disk,id=virtio-disk,bus=pci.0,addr=0x7,bootindex=1 \
  -device ich9-usb-uhci1,id=usb1,addr=0x11 \
  -device usb-tablet,id=input0 \
  -device usb-mouse,id=input1 \
  -monitor stdio \
  -serial unix:/tmp/monitor,server,nowait \
  -net none \
  -vnc :1


> I recommend an official build for this BZ, with the patch in comment 37, and
> then a separate BZ, with a minimal command line, about the USB problem.

It's OK on the QE side to open a new BZ after the official build comes out.
Comment 65 huiqingding 2014-04-21 02:15:53 EDT
> When guest with balloon and uhci device, after migration and reboot, guest
> will hit call trace.

The versions tested are as follows:
RHEL6.5 host:
2.6.32-456.el6.x86_64
qemu-kvm-0.12.1.2-2.423.el6.x86_64

RHEL7.0 host:
3.10.0-121.el7.x86_64
qemu-kvm-1.5.3-60.el7.bz1027565_cdef_seg.x86_64

The machine type is rhel6.5.0 and the command line is in comment 64.

After migration, I checked the serial console and found the same call trace as in the existing bug 1085701.
# nc -U /tmp/monitor
Clocksource tsc unstable (delta = 3568989818 ns)
irq 10: nobody cared (try booting with the "irqpoll" option)
Pid: 0, comm: swapper Not tainted 2.6.32-431.el6.x86_64 #1
Call Trace:
 <IRQ>  [<ffffffff810e8feb>] ? __report_bad_irq+0x2b/0xa0
 [<ffffffff810e91ec>] ? note_interrupt+0x18c/0x1d0
 [<ffffffff81034b41>] ? ack_apic_level+0x191/0x1b0
 [<ffffffff810e998d>] ? handle_fasteoi_irq+0xcd/0xf0
 [<ffffffff8100faf9>] ? handle_irq+0x49/0xa0
 [<ffffffff81530fdc>] ? do_IRQ+0x6c/0xf0
 [<ffffffff8100b9d3>] ? ret_from_intr+0x0/0x11
 [<ffffffff8107a893>] ? __do_softirq+0x73/0x1e0
 [<ffffffff810ac9ca>] ? tick_program_event+0x2a/0x30
 [<ffffffff8100c30c>] ? call_softirq+0x1c/0x30
 [<ffffffff8100fa75>] ? do_softirq+0x65/0xa0
 [<ffffffff8107a795>] ? irq_exit+0x85/0x90
 [<ffffffff815310aa>] ? smp_apic_timer_interrupt+0x4a/0x60
 [<ffffffff8100bb93>] ? apic_timer_interrupt+0x13/0x20
 <EOI>  [<ffffffff8103eacb>] ? native_safe_halt+0xb/0x10
 [<ffffffff810167bd>] ? default_idle+0x4d/0xb0
 [<ffffffff81009fc6>] ? cpu_idle+0xb6/0x110
 [<ffffffff8150cbea>] ? rest_init+0x7a/0x80
 [<ffffffff81c26f8f>] ? start_kernel+0x424/0x430
 [<ffffffff81c2633a>] ? x86_64_start_reservations+0x125/0x129
 [<ffffffff81c26453>] ? x86_64_start_kernel+0x115/0x124
handlers:
[<ffffffffa00556b0>] (vp_interrupt+0x0/0x60 [virtio_pci])
Disabling IRQ #10

Then I ran "system_reset" on the dest qemu-kvm, and the guest hit a call trace similar to the one in comment 43. I have also uploaded the call trace log as an attachment.
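
A check like the one above can be automated by scanning a saved serial console log. The sketch below is a hypothetical helper, not part of any existing QE tooling; the function name `find_bad_irqs` is an assumption, and the two patterns are taken from the kernel messages quoted in this comment.

```python
import re

# Patterns copied from the kernel messages quoted above.
NOBODY_CARED = re.compile(r"^irq (\d+): nobody cared")
DISABLING = re.compile(r"^Disabling IRQ #(\d+)")


def find_bad_irqs(log_text):
    """Return (reported, disabled) sets of IRQ numbers found in a console log."""
    reported, disabled = set(), set()
    for line in log_text.splitlines():
        line = line.strip()
        m = NOBODY_CARED.match(line)
        if m:
            reported.add(int(m.group(1)))
            continue
        m = DISABLING.match(line)
        if m:
            disabled.add(int(m.group(1)))
    return reported, disabled


sample = """\
Clocksource tsc unstable (delta = 3568989818 ns)
irq 10: nobody cared (try booting with the "irqpoll" option)
handlers:
[<ffffffffa00556b0>] (vp_interrupt+0x0/0x60 [virtio_pci])
Disabling IRQ #10
"""
print(find_bad_irqs(sample))
```

An empty pair of sets would correspond to the "no call trace" outcome; any IRQ number in either set means the guest reported an unhandled interrupt.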
Comment 66 huiqingding 2014-04-21 02:16:49 EDT
Created attachment 888022 [details]
call trace log after migration and system_reset
Comment 67 Hans de Goede 2014-04-21 11:21:27 EDT
Hi,

The whole pending_int_mask thingie is only a problem with migration of usb-devices using async packet completion, such as usbredir. It is never a problem with the usb-tablet since that is not using async usb packet completion. So I think that what you're seeing is not caused by this. Also given that the pending_int_mask not being migrated only is an issue when the migration hits between an async packet completion and the frame-timer running, and that I'm afraid that reporting interrupts which are not actually there to host may cause issues, I believe that initializing pending_int_mask to 0 is correct.

Note I'm no longer in the spice/virt team, so next time if you need any usb-info please put this in need-info for Gerd, thanks!

Regards,

Hans
Comment 68 Laszlo Ersek 2014-04-21 13:47:46 EDT
(1) Thanks, Hans.

(2) I can reproduce the issue in comment 65, similarly; ie. without rebooting the VM after migration finishes. I only need to let migration finish, and then allow the VM to run a little bit on the target host. In a few seconds or so the symptoms show. I have no clue what the origin of such an IRQ 10 is. Apparently it doesn't matter what device(s) is (are) otherwise associated with IRQ 10 (or IRQ 11), virtio-balloon, uhci, or HDA. Something injects an IRQ 10 (or IRQ 11) that is unrelated to whatever device otherwise owns the interrupt.

(Again, this problem is unrelated to the patch in comment 37.)

I'll retry with -no-kvm-irqchip. (Because, despite bug 1031488 comment 11, RHEL-6 does allow turning off the in-kernel irqchip, just the option has a different name. "-no-kvm-irqchip" is parseable by both RHEL-6 and RHEL-7.)
Comment 69 Laszlo Ersek 2014-04-21 14:07:14 EDT
I specified " -no-kvm-irqchip" on both sides; it doesn't help.
Comment 70 Laszlo Ersek 2014-04-21 17:45:29 EDT
It's hard for me to believe, but I found & fixed the problem! :)

The crucial insight was to let the migrated VM run on the target host
without touching it *at all*. The call trace didn't show. It also didn't
show when only the keyboard was used. But, a bad interrupt was reported as
soon as the mouse was moved into the VNC guest window -- that is, as soon as
the UHCI controller generated interrupts.

It was quite straightforward from there. I'll attach the patch and provide 
the link to the new build.
Comment 71 Laszlo Ersek 2014-04-21 17:58:08 EDT
Created attachment 888273 [details]
proposed patch for the *separate* USB problem (downstream only)

This patch fixes the UHCI problem ("irq 10: nobody cared") for me.

Please note that this patch is *in addition* to the patch in comment 37, and in fact it is the subject of another BZ -- to be submitted by QE later, separately, for the UHCI problem.

I'm not repeating the analysis of the UHCI bug here; please open the patch (as a plaintext file, ie. not just as a diff) and read the commit message.

Indeed this patch should fix bug 1031488 that Radim pointed out in comment 42 and comment 48 here. Hence we might not even need a new BZ.
Comment 73 juzhang 2014-04-21 21:49:46 EDT
Hi Huding,

Could you please retest your comment 65 scenario and the bug 1031488 scenario using the comment 72 scratch build, and update the BZ with the test results?

Best Regards,
Junyi
Comment 74 huiqingding 2014-04-22 02:07:11 EDT
(In reply to juzhang from comment #73)
> Hi Huding,
> 
> Can you please have a try your comment65 scenario and bug 1031488 scenario
> by using comment72 scratch build and update the testing result in the bz?
> 

The versions tested are as follows:
RHEL6.5 host:
2.6.32-459.el6.x86_64
qemu-kvm-0.12.1.2-2.424.el6.x86_64

RHEL7.0 host:
3.10.0-121.el7.x86_64
qemu-kvm-1.5.3-60.el7.bz1027565_cdef_seg_ich_uhci.x86_64

Tested steps:
1. boot a rhel6.5-64 guest with balloon and uhci devices on the src and dest hosts
# /usr/libexec/qemu-kvm \
   -M rhel6.5.0 \
   -cpu Westmere,hv_relaxed \
  -enable-kvm  \
  -m 4096 -smp 4,sockets=2,cores=2,threads=1,maxcpus=160 \
  -k en-us \
  -device virtio-balloon-pci,id=ballooning,bus=pci.0,addr=0x5 \
  -drive file=/home/huding/RHEL-Server-6.5-64-virtio.qcow2,if=none,id=drive-virtio-disk,format=qcow2,cache=none,aio=native,werror=stop,rerror=stop,media=disk,snapshot=off,bus=1,unit=1 \
  -device virtio-blk-pci,scsi=off,drive=drive-virtio-disk,id=virtio-disk,bus=pci.0,addr=0x7,bootindex=1 \
  -device ich9-usb-uhci1,id=usb1,addr=0x11 \
  -device usb-tablet,id=input0 \
  -device usb-mouse,id=input1 \
  -monitor stdio \
  -serial unix:/tmp/monitor,server,nowait \
  -net none \
  -vnc :1
2. migrate the guest from the rhel6.5 host to the rhel7.0 host
3. run "system_reset" on the dest qemu-kvm
Actual result:
after step 2, checked the serial console: no call trace
after step 3, checked the serial console and dmesg: no call trace

Additional info:
I also tested "-M rhel6.1.0/rhel6.2.0/rhel6.3.0/rhel6.4.0"; no call trace was hit.
Comment 75 huiqingding 2014-04-22 04:43:36 EDT
I also used the comment 72 scratch build to test bug 1031488 and updated the result in bug 1031488.
Comment 76 Laszlo Ersek 2014-04-23 09:05:44 EDT
OK, after the newest QE results (great job!), bug 1031488 is a different issue; I've just re-scoped it.

In addition, the ICH9-UHCI[123] migration problem, covered in comments 64 to 74 in this bug, belongs to the previously reported bug 1085701.

Therefore this bug is in the correct state now: POST, with the patch in comment 37 pending review.
Comment 78 Miroslav Rezanina 2014-05-02 05:01:29 EDT
Fix included in qemu-kvm-1.5.3-61.el7
Comment 80 huiqingding 2014-10-16 23:08:16 EDT
Tested a win7-64 guest on an AMD host using the following versions:
RHEL6.6 src host:
kernel-2.6.32-505.el6.x86_64
qemu-kvm-rhev-0.12.1.2-2.448.el6.x86_64

RHEL7.1 dest host:
kernel-3.10.0-187.el7.x86_64
qemu-kvm-rhev-2.1.2-3.el7.x86_64

The system disk is virtio-blk; the test results are as follows:
host                  cold-boot    twice reboot    shutdown
------------------    ---------    ------------    --------
RHEL6.6 -> RHEL7.1    PASS         PASS            PASS

The system disk is virtio-scsi; the test results are as follows:
host                  cold-boot    twice reboot    shutdown
------------------    ---------    ------------    --------
RHEL6.6 -> RHEL7.1    bz1123812    bz1123812       PASS

The command line is as follows:
/usr/libexec/qemu-kvm -cpu Opteron_G1 \
-enable-kvm  -m 4096 -realtime mlock=off -smp 4,sockets=2,cores=2,threads=1,maxcpus=160 -numa node,cpus=0 \
-numa node,cpus=1 -numa node,cpus=2 -numa node,cpus=3 \
-nodefconfig -nodefaults \
-global PIIX4_PM.disable_s3=0 \
-global PIIX4_PM.disable_s4=0 \
-global ide-drive.physical_block_size=4096 \
-global ide-drive.logical_block_size=4096 \
-global virtio-blk-pci.physical_block_size=512 \
-global virtio-blk-pci.logical_block_size=512 \
-boot order=cdn,once=n,menu=on,strict=on,reboot-timeout=60000 -k en-us \
-device virtio-balloon-pci,id=ballooning,bus=pci.0,addr=0x5,indirect_desc=on,event_idx=on,multifunction=on,rombar=100 \
-monitor stdio \
-name test-all-qemu-kvm-option -uuid `uuidgen` \
-usbdevice tablet -usbdevice mouse  \
-drive file=/mnt/win7-64.qcow2,if=none,id=drive-virtio-disk,format=qcow2,cache=writethrough,aio=native,werror=stop,rerror=stop,media=disk,snapshot=off,bus=1,unit=1 \
-device virtio-blk-pci,scsi=off,drive=drive-virtio-disk,id=virtio-disk,bus=pci.0,addr=0x7,physical_block_size=512,logical_block_size=512,multifunction=on,scsi=on,event_idx=on,indirect_desc=on,vectors=16,x-data-plane=off,ioeventfd=on,serial=fuxc,discard_granularity=1,min_io_size=4096,opt_io_size=4096,bootindex=1 \
-netdev tap,id=hostnet0,vhost=on,id=hostnet0,script=/etc/qemu-ifup \
-device virtio-net-pci,netdev=hostnet0,id=virtio-net-pci0,mac=e2:d5:ac:bf:61:02,bus=pci.0,addr=0x9,multifunction=on,status=on,gso=on,ioeventfd=on,vectors=8,indirect_desc=on,event_idx=off,guest_tso4=off,guest_tso6=on,guest_ecn=off,guest_ufo=on,host_tso4=off,host_tso6=on,host_ecn=on,mrg_rxbuf=off,ctrl_vq=on,host_ufo=on,mrg_rxbuf=on,ctrl_rx=on,ctrl_vlan=on,ctrl_rx_extra=on,ctrl_mac_addr=on \
-netdev tap,id=hostnet1,vhost=off,script=/etc/qemu-ifup \
-device e1000,netdev=hostnet1,id=virtio-net-pci1,mac=fa:8c:53:1c:ce:01,bus=pci.0,addr=0xa,multifunction=off \
-netdev tap,id=hostnet2,vhost=off,script=/etc/qemu-ifup \
-device rtl8139,netdev=hostnet2,id=virtio-net-pci2,mac=4a:eb:1f:b8:64:04,bus=pci.0,addr=0xb,multifunction=off \
-serial unix:/tmp/monitor2,server,nowait \
-rtc base=utc \
-drive file=/mnt/ide-disk,if=none,id=drive-data-disk,format=raw,cache=none,aio=native,werror=stop,rerror=stop,copy-on-read=off,serial=fux-ide,media=disk \
-device ide-drive,drive=drive-data-disk,id=system-disk,logical_block_size=512,physical_block_size=512,min_io_size=32,opt_io_size=64,discard_granularity=512,ver=fuxc-ver,bus=ide.0,unit=0  \
-chardev tty,id=serial1,path=/dev/ttyS0 \
-device isa-serial,chardev=serial1 \
-chardev socket,id=channel1,path=/tmp/helloworld1,server,nowait  \
-chardev socket,id=channel2,path=/tmp/helloworld2,server,nowait \
-device virtio-serial-pci,id=virtio-serial0,max_ports=16,vectors=0 \
-chardev file,id=channel3,path=/mnt/helloworld1.txt \
-device virtserialport,chardev=channel3,name=com.redhat.rhevm.vdsm1,bus=virtio-serial0.0,id=port1,nr=1 \
-chardev socket,id=isa-serial-1,path=/tmp/isa-serial-1,server,nowait \
-device isa-serial,chardev=isa-serial-1 -global pvpanic.ioport=0x0505 \
-machine rhel6.5.0,dump-guest-core=off \
-drive file=/mnt/en_windows_7_ultimate_x64_dvd_x15-65922.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw \
-device ide-drive,bus=ide.1,unit=1,drive=drive-ide0-1-0,id=ide0-1-0,logical_block_size=512,physical_block_size=512,min_io_size=32,opt_io_size=64,discard_granularity=512,unit=1,ver=fuxc-ver-cdrom,bus=ide.0,unit=1 \
-drive file=/mnt/virtio-scsi-disk,if=none,id=drive-scsi-disk,format=raw,cache=writethrough,werror=stop,rerror=stop \
-device virtio-scsi-pci,id=scsi0,addr=0x13 \
-device scsi-hd,drive=drive-scsi-disk,bus=scsi0.0,id=data-disk2 \
-device sga -spice port=5901,password=redhat-vga,disable-ticketing -vga qxl -global qxl-vga.vram_size=33554432 \
-chardev socket,path=/tmp/foo,server,nowait,id=foo \
-drive file=/usr/share/virtio-win/virtio-win_amd64.vfd,if=none,id=drive-fdc0-0-0,readonly=on,format=raw \
-global isa-fdc.driveA=drive-fdc0-0-0 \
-cdrom /mnt/driver.iso \
-device usb-ehci,id=ehci \
-device usb-storage,drive=drive-usb-0-1,id=usb-0-1,removable=on,bus=ehci.0,port=1 \
-drive file=/mnt/usb-ehci,if=none,id=drive-usb-0-1,media=disk,format=qcow2
Comment 81 huiqingding 2014-10-16 23:26:05 EDT
Tested a rhel7.1 guest on an Intel host using the following versions:
RHEL6.6 src host:
kernel-2.6.32-504.el6.x86_64
qemu-kvm-0.12.1.2-2.448.el6_6.x86_64

RHEL7.1 dest host:
kernel-3.10.0-188.el7.x86_64
qemu-kvm-1.5.3-75.el7.x86_64

The system disk is virtio-blk or virtio-scsi; the test results are the same in both cases:
host                  cold-boot    twice reboot    shutdown
------------------    ---------    ------------    --------
RHEL6.6 -> RHEL7.1    PASS         PASS            PASS

The command line is as follows:
/usr/libexec/qemu-kvm -cpu Penryn \
-enable-kvm  -m 4096 -realtime mlock=off -smp 4,sockets=2,cores=2,threads=1,maxcpus=160 -numa node,cpus=0 \
-numa node,cpus=1 -numa node,cpus=2 -numa node,cpus=3 \
-nodefconfig -nodefaults \
-global PIIX4_PM.disable_s3=0 \
-global PIIX4_PM.disable_s4=0 \
-global ide-drive.physical_block_size=4096 \
-global ide-drive.logical_block_size=4096 \
-global virtio-blk-pci.physical_block_size=512 \
-global virtio-blk-pci.logical_block_size=512 \
-boot order=cdn,once=n,menu=on,strict=on,reboot-timeout=60000 -k en-us \
-device virtio-balloon-pci,id=ballooning,bus=pci.0,addr=0x5,indirect_desc=on,event_idx=on,multifunction=on,rombar=100 \
-monitor stdio \
-name test-all-qemu-kvm-option -uuid `uuidgen` \
-usbdevice tablet -usbdevice mouse  \
-drive file=/mnt/rhel7_1.qcow2,if=none,id=drive-virtio-disk,format=qcow2,cache=writethrough,aio=native,werror=stop,rerror=stop,media=disk,snapshot=off,bus=1,unit=1 \
-device virtio-blk-pci,scsi=off,drive=drive-virtio-disk,id=virtio-disk,bus=pci.0,addr=0x7,physical_block_size=512,logical_block_size=512,multifunction=on,scsi=on,event_idx=on,indirect_desc=on,vectors=16,x-data-plane=off,ioeventfd=on,serial=fuxc,discard_granularity=1,min_io_size=4096,opt_io_size=4096,bootindex=1 \
-netdev tap,id=hostnet0,vhost=on,id=hostnet0,script=/etc/qemu-ifup \
-device virtio-net-pci,netdev=hostnet0,id=virtio-net-pci0,mac=e2:d5:ac:bf:61:02,bus=pci.0,addr=0x9,multifunction=on,status=on,gso=on,ioeventfd=on,vectors=8,indirect_desc=on,event_idx=off,guest_tso4=off,guest_tso6=on,guest_ecn=off,guest_ufo=on,host_tso4=off,host_tso6=on,host_ecn=on,mrg_rxbuf=off,ctrl_vq=on,host_ufo=on,mrg_rxbuf=on,ctrl_rx=on,ctrl_vlan=on,ctrl_rx_extra=on,ctrl_mac_addr=on \
-netdev tap,id=hostnet1,vhost=off,script=/etc/qemu-ifup \
-device e1000,netdev=hostnet1,id=virtio-net-pci1,mac=fa:8c:53:1c:ce:01,bus=pci.0,addr=0xa,multifunction=off \
-netdev tap,id=hostnet2,vhost=off,script=/etc/qemu-ifup \
-device rtl8139,netdev=hostnet2,id=virtio-net-pci2,mac=4a:eb:1f:b8:64:04,bus=pci.0,addr=0xb,multifunction=off \
-serial unix:/tmp/monitor2,server,nowait \
-rtc base=utc \
-drive file=/mnt/ide-disk,if=none,id=drive-data-disk,format=raw,cache=none,aio=native,werror=stop,rerror=stop,copy-on-read=off,serial=fux-ide,media=disk \
-device ide-drive,drive=drive-data-disk,id=system-disk,logical_block_size=512,physical_block_size=512,min_io_size=32,opt_io_size=64,discard_granularity=512,ver=fuxc-ver,bus=ide.0,unit=0  \
-chardev tty,id=serial1,path=/dev/ttyS0 \
-device isa-serial,chardev=serial1 \
-chardev socket,id=channel1,path=/tmp/helloworld1,server,nowait  \
-chardev socket,id=channel2,path=/tmp/helloworld2,server,nowait \
-device virtio-serial-pci,id=virtio-serial0,max_ports=16,vectors=0 \
-chardev file,id=channel3,path=/mnt/helloworld1.txt \
-device virtserialport,chardev=channel3,name=com.redhat.rhevm.vdsm1,bus=virtio-serial0.0,id=port1,nr=1 \
-chardev socket,id=isa-serial-1,path=/tmp/isa-serial-1,server,nowait \
-device isa-serial,chardev=isa-serial-1 -global pvpanic.ioport=0x0505 \
-machine rhel6.5.0,dump-guest-core=off \
-drive file=/mnt/en_windows_7_ultimate_x64_dvd_x15-65922.iso,if=none,media=cdrom,id=drive-ide0-1-0,readonly=on,format=raw \
-device ide-drive,bus=ide.1,unit=1,drive=drive-ide0-1-0,id=ide0-1-0,logical_block_size=512,physical_block_size=512,min_io_size=32,opt_io_size=64,discard_granularity=512,unit=1,ver=fuxc-ver-cdrom,bus=ide.0,unit=1 \
-drive file=/mnt/virtio-scsi-disk,if=none,id=drive-scsi-disk,format=raw,cache=writethrough,werror=stop,rerror=stop \
-device virtio-scsi-pci,id=scsi0,addr=0x13 \
-device scsi-hd,drive=drive-scsi-disk,bus=scsi0.0,id=data-disk2 \
-device sga -vnc :2 -vga cirrus \
-chardev socket,path=/tmp/foo,server,nowait,id=foo \
-drive file=/usr/share/virtio-win/virtio-win_amd64.vfd,if=none,id=drive-fdc0-0-0,readonly=on,format=raw \
-global isa-fdc.driveA=drive-fdc0-0-0 \
-cdrom /mnt/driver.iso \
-device usb-ehci,id=ehci \
-device usb-storage,drive=drive-usb-0-1,id=usb-0-1,removable=on,bus=ehci.0,port=1 \
-drive file=/mnt/usb-ehci,if=none,id=drive-usb-0-1,media=disk,format=qcow2
Comment 85 errata-xmlrpc 2015-03-05 03:01:59 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHSA-2015-0349.html
