Bug 603026 - CPU save version is now 9, but the format is _very_ different from non-RHEL5 version 9
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kvm
Version: 5.5
Hardware: x86_64 Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assigned To: Paolo Bonzini
QA Contact: Virtualization Bugs
Depends On:
Blocks: Rhel5KvmTier2 603027 603142
Reported: 2010-06-11 06:35 EDT by Paolo Bonzini
Modified: 2011-01-13 18:36 EST (History)

See Also:
Fixed In Version: kvm-83-198.el5
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
: 603027
Environment:
Last Closed: 2011-01-13 18:36:01 EST
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:


Attachments
qemu patch (1.21 KB, patch)
2010-06-11 07:48 EDT, Paolo Bonzini
qemu patch v2 (4.95 KB, patch)
2010-07-26 14:10 EDT, Paolo Bonzini

Description Paolo Bonzini 2010-06-11 06:35:54 EDT
Commit a82c8e4d836121cec49ccd9031438a3110f2e192 bumped the CPU version to 9, however the format is very different from the version 9 of upstream QEMU.  This causes problems in crash, which uses QEMU's savefiles as kvm core dumps.

Until now, the differences caused no problems, but for version 9 upstream does this:

                int32_t pending_irq = (int32_t) get_be32 (fp);
                if (pending_irq >= 0)
                        dx86->kvm.int_bitmap[pending_irq / 64] |=

instead of this:

                for (i = 0; i < 4; i++)
                        dx86->kvm.int_bitmap[i] = get_be64 (fp);

(Source code from qemu-load.c in git://git.engineering.redhat.com/users/pbonzini/qemu-reader.git).  In other words, the first 32 bits of the bitmap are treated as an index, causing an out-of-bounds access.

Of course, adding a "<= 255" check is easily done, but it's only a matter of time until RHEL5's version hits 12 and we have serious problems handling both RHEL5 and RHEL6 dumps.

I suggest adding a fake __rhel5 section in the dumps for 5.5.z and 5.6, so that we can look for that in crash.  I'll attach the patch soon.
Comment 1 Paolo Bonzini 2010-06-11 07:48:04 EDT
Created attachment 423249
qemu patch
Comment 2 Lawrence Lim 2010-07-13 04:40:28 EDT
Hi Paolo,
Could you please suggest how we could verify this patch effectively?

Thanks.
Comment 3 Paolo Bonzini 2010-07-25 18:32:51 EDT
You can try grepping a dump for the string __rhel5.  If you do the dump early enough, possibly while grub is running, the chance of a false positive is ~zero (and it is pretty unlikely even if the system has already finished booting).
Comment 4 Paolo Bonzini 2010-07-26 14:10:07 EDT
Created attachment 434483
qemu patch v2

Unlike the previous one, this patch doesn't break backwards migration.
Comment 8 Cao, Chen 2010-11-14 21:54:33 EST
Verified on:

# rpm -q kvm
kvm-83-207.el5

# uname -r
2.6.18-231.el5

# grep __rhel5 /var/crash/2010-11-15-10:36/vmcore 
Binary file vmcore matches


host dmesg:
# dmesg |grep crashkernel
Command line: ro root=LABEL=/ crashkernel=128M@16M
Kernel command line: ro root=LABEL=/ crashkernel=128M@16M

and /proc/iomem
# grep -i crash /proc/iomem 
  01000000-08ffffff : Crash kernel

guest launching cmd:
/usr/libexec/qemu-kvm -name 'vm1' -monitor stdio -drive file='/home/RHEL-Server-6.0-64-virtio.qcow2',index=0,if=virtio,media=disk,cache=none,boot=on,format=qcow2 -net nic,vlan=0,model=virtio,macaddr='9a:30:70:9c:34:b4' -net tap,vlan=0,ifname='virtio_xxx_5900',script='/home/qemu-ifup-switch',downscript='no' -m 4096 -smp 2 -soundhw ac97 -vnc :0  -rtc-td-hack -M rhel5.6.0 -usbdevice tablet
Comment 11 errata-xmlrpc 2011-01-13 18:36:01 EST
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0028.html
