RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.
Bug 591176 - migration fails since virtio-serial-bus is using uninitialized memory
Summary: migration fails since virtio-serial-bus is using uninitialized memory
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: qemu-kvm
Version: 6.0
Hardware: All
OS: Linux
low
high
Target Milestone: beta
: ---
Assignee: Alon Levy
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2010-05-11 15:35 UTC by Alon Levy
Modified: 2014-08-04 22:08 UTC (History)
7 users (show)

Fixed In Version: qemu-kvm-0.12.1.2-2.62.el6
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2010-06-23 10:17:10 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)

Description Alon Levy 2010-05-11 15:35:39 UTC
Description of problem:
migration, also probably save to file (migration to file) fails with a virtio-serial device. It fails because the ports_map is filled with garbage since it is allocated too small by a factor of 4 (sizeof(unit32)).

Version-Release number of selected component (if applicable):


How reproducible:
always


Steps to Reproduce:
1. start a vm with -device virtio-serial -device virt-console
2. start a vm with same and -incoming tcp:localhost:9000 (assuming same machine)
3. migrate first vm to second
  
Actual results:
Migration fail with a complaint about virt-console device failing


Expected results:
Migration success


Additional info:
Patch below solves this problem (tested):

diff --git a/hw/virtio-serial-bus.c b/hw/virtio-serial-bus.c
index 3284d85..76c0153 100644
--- a/hw/virtio-serial-bus.c
+++ b/hw/virtio-serial-bus.c
@@ -810,7 +810,8 @@ VirtIODevice *virtio_serial_init(DeviceState *dev, uint32_t max_nr_ports)
     }

     vser->config.max_nr_ports = max_nr_ports;
-    vser->ports_map = qemu_mallocz((max_nr_ports + 31) / 32);
+    vser->ports_map = qemu_mallocz(((max_nr_ports + 31) / 32)
+        * sizeof(vser->ports_map[0]));
     /*
      * Reserve location 0 for a console port for backward compat
      * (old kernel, new qemu)
--
1.7.0.1

Comment 3 RHEL Program Management 2010-05-11 16:44:35 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux major release.  Product Management has requested further
review of this request by Red Hat Engineering, for potential inclusion in a Red
Hat Enterprise Linux Major release.  This request is not yet committed for
inclusion.

Comment 7 lihuang 2010-05-25 03:39:39 UTC
Hi Alon
I can not reproduce the failure on qemu-kvm-0.12.1.2-2.52.el6.x86_64.
could please you have a look ?


src cmd line :
root     16082 12108  6 03:11 pts/0    01:24:17 /usr/libexec/qemu-kvm -m 4096 -smp 2 -monitor stdio -boot c -drive file=rhel6-64.qcow2,if=virtio,boot=on,cache=none -device virtio-serial-pci,id=virtio-serial0,max_ports=16,vectors=4,bus=pci.0,addr=0x4 -device virtserialport,bus=virtio-serial0.0,chardev=channel0,name=org.linux-kvm.port.0 -chardev socket,path=channel0,server,nowait,id=channel0 -net nic -net tap -vnc :1 -usbdevice tablet -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8

dst cmd line
root      5898 13365  6 23:19 pts/1    00:00:13 /usr/libexec/qemu-kvm -m 4096 -smp 2 -monitor stdio -boot c -drive file=rhel6-64.qcow2,if=virtio,boot=on,cache=none -device virtio-serial-pci,id=virtio-serial0,max_ports=16,vectors=4,bus=pci.0,addr=0x4 -device virtserialport,bus=virtio-serial0.0,chardev=channel0,name=org.linux-kvm.port.0 -chardev socket,path=channel01,server,nowait,id=channel0 -net nic -net tap -vnc :2 -usbdevice tablet -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8 -incoming tcp:0:6789

after migration :
from src qemu monitor
(qemu) info migrate
Migration status: completed
(qemu) info status
VM status: paused
(qemu)

from dst qemu monitor
(qemu) info status
VM status: running

Comment 8 Alon Levy 2010-05-25 13:53:04 UTC
ok, I have no idea why I can't reproduce it either right now. I'm sure this is a bug because
 a. I saw it fail in virtio-serial-bus.c on virtio_serial_load when comparing the port_map gotten and the one initialized (the check is that they are the same, the bug was that the migrating machine which created the device allocated too few bytes so some of that memory was uninitialized, i.e. garbage).
 b. The original (before the patch) allocates too little space.

But for some reason I'm not seeing the bug now. Actually I'm seeing values which are weird (port_maps[0] should be 0x1, but it is 2866261505), but it is the same weird at the migrator and the target..

 It could be that migration changed a little. I will try going back a few commits. I am trying to use debug (--enable-debug) since I think I was running a debug version.

I'll update you.

Alon

Comment 9 Amit Shah 2010-05-25 16:09:22 UTC
Please try without the ,max_ports= parameter.

Comment 10 lihuang 2010-05-26 06:10:36 UTC
(In reply to comment #9)
> Please try without the ,max_ports= parameter.    

without max_ports also pass on qemu-kvm-0.12.1.2-2.52.el6.x86_64.

Comment 11 Alon Levy 2010-05-26 13:33:49 UTC
ok, it's very easy to see the erronous memory, since ports_map[0] should be 1 with a single port, and it is actually GGG1, where G stands for "garbage". The problem is that the garbage seems pretty consistent, so migration succeeds.

I'm going to stop trying to recreate the original problem (migration fails) that prompted me to look for this and find this bug.

I mean:
 a) this is clearly a bug (no one is disputing this - it is a clear that allocation is too small)
 b) the only reason it isn't happening now is the behavior of the memory - trying to run many instances of qemu I always get both the same address allocated to ports_map and the same contents at that address. This causes the comparison, which failed for me previously, to succeed, since the memory is actually the same - uninitialized in the same way.
 c) so to summarize the summary, migration doesn't fail now, but this is still a bug. And yes, if you still must have me recreate I will, it's just that it more work and I don't see the point. (i.e. I already tried and failed and am giving up right now unless you don't accept my explanation as enough). (btw, I haven't tested this, but I assume that two *different* machines will have different garbage).

Alon

Comment 14 lihuang 2010-06-23 10:16:53 UTC
(In reply to comment #11)
> ok, it's very easy to see the erronous memory, since ports_map[0] should be 1
> with a single port, and it is actually GGG1, where G stands for "garbage". The
> problem is that the garbage seems pretty consistent, so migration succeeds.
> 
> I'm going to stop trying to recreate the original problem (migration fails)
> that prompted me to look for this and find this bug.
> 
> I mean:
>  a) this is clearly a bug (no one is disputing this - it is a clear that
> allocation is too small)
>  b) the only reason it isn't happening now is the behavior of the memory -
> trying to run many instances of qemu I always get both the same address
> allocated to ports_map and the same contents at that address. This causes the
> comparison, which failed for me previously, to succeed, since the memory is
> actually the same - uninitialized in the same way.
>  c) so to summarize the summary, migration doesn't fail now, but this is still
> a bug. And yes, if you still must have me recreate I will, it's just that it
> more work and I don't see the point. (i.e. I already tried and failed and am
> giving up right now unless you don't accept my explanation as enough). (btw, I
> haven't tested this, but I assume that two *different* machines will have
> different garbage).
> 
> Alon    

Hi Alon
Thanks for the analyse. 
I have reproduced it on -2.49.el6 and verified in -2.62.el6.


Note You need to log in before you can comment on or make changes to this bug.