603026 – CPU save version is now 9, but the format is _very_ different from non-RHEL5 version 9

Bug 603026 - CPU save version is now 9, but the format is _very_ different from non-RHEL5 version 9

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 5
Classification:	Red Hat
Component:	kvm
Sub Component:
Version:	5.5
Hardware:	x86_64
OS:	Linux
Priority:	low
Severity:	medium
Target Milestone:	rc
Target Release:	---
Assignee:	Paolo Bonzini
QA Contact:	Virtualization Bugs
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	Rhel5KvmTier2 603027 603142

Reported:	2010-06-11 10:35 UTC by Paolo Bonzini
Modified:	2011-01-13 23:36 UTC
CC List:	7 users
Fixed In Version:	kvm-83-198.el5
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Clones:	603027
Environment:
Last Closed:	2011-01-13 23:36:01 UTC
Target Upstream Version:
Embargoed:
Dependent Products:

Attachments
qemu patch (1.21 KB, patch) 2010-06-11 11:48 UTC, Paolo Bonzini
qemu patch v2 (4.95 KB, patch) 2010-07-26 18:10 UTC, Paolo Bonzini

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHSA-2011:0028	0	normal	SHIPPED_LIVE	Low: kvm security and bug fix update	2011-01-13 11:03:39 UTC

Description Paolo Bonzini 2010-06-11 10:35:54 UTC

Commit a82c8e4d836121cec49ccd9031438a3110f2e192 bumped the CPU version to 9, however the format is very different from the version 9 of upstream QEMU.  This causes problems in crash, which uses QEMU's savefiles as kvm core dumps.

Until now, the differences were harmless, but for version 9 upstream does this:

                int32_t pending_irq = (int32_t) get_be32 (fp);
                if (pending_irq >= 0)
                        dx86->kvm.int_bitmap[pending_irq / 64] |=

instead of this:

                for (i = 0; i < 4; i++)
                        dx86->kvm.int_bitmap[i] = get_be64 (fp);

(Source code from qemu-load.c in git://git.engineering.redhat.com/users/pbonzini/qemu-reader.git).  In other words, the first 32 bits of the bitmap are treated as an index, causing an out-of-bounds access.

Of course, adding a "<= 255" check is easily done, but it's only a matter of time until RHEL5's version hits 12, and then we'll have serious problems handling both RHEL5 and RHEL6 dumps.
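
To make the failure mode concrete, here is a minimal sketch of the "<= 255" check mentioned above. It is not the actual qemu-load.c code: the helper names set_pending_irq and first_be32_of_word are hypothetical, and the bit-setting expression is an assumption about what the upstream-format reader does, since the excerpt above is cut off at the "|=".

```c
#include <stdint.h>

#define INT_BITMAP_WORDS 4   /* 4 x 64 bits = 256 interrupt vectors */

/* Hypothetical helper: the first 32 bits of a big-endian 64-bit bitmap
 * word, i.e. what an upstream-format reader would take as pending_irq
 * when it is handed a RHEL5-format dump. */
static int32_t first_be32_of_word(uint64_t word)
{
    return (int32_t)(uint32_t)(word >> 32);
}

/* Hypothetical helper: record a pending IRQ in the bitmap, refusing
 * indices outside 0..255.  Returns 0 on success (including "nothing
 * pending"), -1 if the index is out of range.  The right-hand side of
 * the |= is an assumption, not copied from the reader. */
static int set_pending_irq(uint64_t bitmap[INT_BITMAP_WORDS],
                           int32_t pending_irq)
{
    if (pending_irq < 0)
        return 0;               /* negative means no IRQ is pending */
    if (pending_irq > 255)
        return -1;              /* would index past the end of the bitmap */
    bitmap[pending_irq / 64] |= (uint64_t)1 << (pending_irq % 64);
    return 0;
}
```

For example, a RHEL5 bitmap whose first word has only bit 40 set is 0x0000010000000000; its first 32 bits read as 256, which the check rejects. A word with only bit 32 set reads as 1, which passes the check but is still the wrong vector, so the check avoids the out-of-bounds access without making the dump readable.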

I suggest adding a fake __rhel5 section in the dumps for 5.5.z and 5.6, so that we can look for that in crash.  I'll attach the patch soon.

Comment 1 Paolo Bonzini 2010-06-11 11:48:04 UTC

Created attachment 423249 [details]
qemu patch

Comment 2 Lawrence Lim 2010-07-13 08:40:28 UTC

Hi Paolo,
Could you please suggest how we could verify this patch effectively?

Thanks.

Comment 3 Paolo Bonzini 2010-07-25 22:32:51 UTC

You can try grepping a dump for the string __rhel5.  If you do the dump early enough, possibly while grub is running, the chance of a false positive is ~zero (and it is pretty unlikely even if the system has already finished booting).
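
The same check can be done programmatically, which is roughly what a dump reader would need in order to tell the two formats apart. The following is only a sketch under assumptions: it searches the raw bytes for the string __rhel5 and knows nothing about the savefile section layout; buffer_has_marker and file_has_marker are hypothetical names.

```c
#include <stdio.h>
#include <string.h>

#define MARKER     "__rhel5"
#define MARKER_LEN (sizeof(MARKER) - 1)

/* Return 1 if the marker bytes occur anywhere in buf[0..len), else 0. */
static int buffer_has_marker(const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i + MARKER_LEN <= len; i++)
        if (memcmp(buf + i, MARKER, MARKER_LEN) == 0)
            return 1;
    return 0;
}

/* Scan a file in chunks.  The last MARKER_LEN - 1 bytes of each chunk are
 * carried over so a marker that straddles a chunk boundary is still found.
 * Returns 1 if found, 0 if not found, -1 on open or read error. */
static int file_has_marker(const char *path)
{
    unsigned char buf[4096 + MARKER_LEN - 1];
    size_t keep = 0;
    FILE *fp = fopen(path, "rb");

    if (!fp)
        return -1;
    for (;;) {
        size_t n = fread(buf + keep, 1, sizeof(buf) - keep, fp);
        if (n == 0)
            break;
        size_t len = keep + n;
        if (buffer_has_marker(buf, len)) {
            fclose(fp);
            return 1;
        }
        keep = len < MARKER_LEN - 1 ? len : MARKER_LEN - 1;
        memmove(buf, buf + len - keep, keep);
    }
    int err = ferror(fp);
    fclose(fp);
    return err ? -1 : 0;
}
```

As with grep, a match only shows that the bytes are present somewhere in the file, so the false-positive caveat above applies equally here.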

Comment 4 Paolo Bonzini 2010-07-26 18:10:07 UTC

Created attachment 434483 [details]
qemu patch v2

Unlike the previous one, this patch doesn't break backwards migration.

Comment 8 Cao, Chen 2010-11-15 02:54:33 UTC

Verified on:

# rpm -q kvm
kvm-83-207.el5

# uname -r
2.6.18-231.el5

# grep __rhel5 /var/crash/2010-11-15-10:36/vmcore 
Binary file vmcore matches


host dmesg:
# dmesg |grep crashkernel
Command line: ro root=LABEL=/ crashkernel=128M@16M
Kernel command line: ro root=LABEL=/ crashkernel=128M@16M

and /proc/iomem
# grep -i crash /proc/iomem 
  01000000-08ffffff : Crash kernel

guest launching cmd:
/usr/libexec/qemu-kvm -name 'vm1' -monitor stdio -drive file='/home/RHEL-Server-6.0-64-virtio.qcow2',index=0,if=virtio,media=disk,cache=none,boot=on,format=qcow2 -net nic,vlan=0,model=virtio,macaddr='9a:30:70:9c:34:b4' -net tap,vlan=0,ifname='virtio_xxx_5900',script='/home/qemu-ifup-switch',downscript='no' -m 4096 -smp 2 -soundhw ac97 -vnc :0  -rtc-td-hack -M rhel5.6.0 -usbdevice tablet

Comment 11 errata-xmlrpc 2011-01-13 23:36:01 UTC

An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHSA-2011-0028.html
