Bug 591176 - migration fails since virtio-serial-bus is using uninitialized memory
migration fails since virtio-serial-bus is using uninitialized memory
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: qemu-kvm (Show other bugs)
6.0
All Linux
low Severity high
: beta
: ---
Assigned To: Alon Levy
Virtualization Bugs
:
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2010-05-11 11:35 EDT by Alon Levy
Modified: 2014-08-04 18:08 EDT (History)
7 users (show)

See Also:
Fixed In Version: qemu-kvm-0.12.1.2-2.62.el6
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2010-06-23 06:17:10 EDT
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments (Terms of Use)

  None (edit)
Description Alon Levy 2010-05-11 11:35:39 EDT
Description of problem:
migration, also probably save to file (migration to file) fails with a virtio-serial device. It fails because the ports_map is filled with garbage since it is allocated too small by a factor of 4 (sizeof(unit32)).

Version-Release number of selected component (if applicable):


How reproducible:
always


Steps to Reproduce:
1. start a vm with -device virtio-serial -device virt-console
2. start a vm with same and -incoming tcp:localhost:9000 (assuming same machine)
3. migrate first vm to second
  
Actual results:
Migration fail with a complaint about virt-console device failing


Expected results:
Migration success


Additional info:
Patch below solves this problem (tested):

diff --git a/hw/virtio-serial-bus.c b/hw/virtio-serial-bus.c
index 3284d85..76c0153 100644
--- a/hw/virtio-serial-bus.c
+++ b/hw/virtio-serial-bus.c
@@ -810,7 +810,8 @@ VirtIODevice *virtio_serial_init(DeviceState *dev, uint32_t max_nr_ports)
     }

     vser->config.max_nr_ports = max_nr_ports;
-    vser->ports_map = qemu_mallocz((max_nr_ports + 31) / 32);
+    vser->ports_map = qemu_mallocz(((max_nr_ports + 31) / 32)
+        * sizeof(vser->ports_map[0]));
     /*
      * Reserve location 0 for a console port for backward compat
      * (old kernel, new qemu)
--
1.7.0.1
Comment 3 RHEL Product and Program Management 2010-05-11 12:44:35 EDT
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux major release.  Product Management has requested further
review of this request by Red Hat Engineering, for potential inclusion in a Red
Hat Enterprise Linux Major release.  This request is not yet committed for
inclusion.
Comment 7 lihuang 2010-05-24 23:39:39 EDT
Hi Alon
I can not reproduce the failure on qemu-kvm-0.12.1.2-2.52.el6.x86_64.
could please you have a look ?


src cmd line :
root     16082 12108  6 03:11 pts/0    01:24:17 /usr/libexec/qemu-kvm -m 4096 -smp 2 -monitor stdio -boot c -drive file=rhel6-64.qcow2,if=virtio,boot=on,cache=none -device virtio-serial-pci,id=virtio-serial0,max_ports=16,vectors=4,bus=pci.0,addr=0x4 -device virtserialport,bus=virtio-serial0.0,chardev=channel0,name=org.linux-kvm.port.0 -chardev socket,path=channel0,server,nowait,id=channel0 -net nic -net tap -vnc :1 -usbdevice tablet -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8

dst cmd line
root      5898 13365  6 23:19 pts/1    00:00:13 /usr/libexec/qemu-kvm -m 4096 -smp 2 -monitor stdio -boot c -drive file=rhel6-64.qcow2,if=virtio,boot=on,cache=none -device virtio-serial-pci,id=virtio-serial0,max_ports=16,vectors=4,bus=pci.0,addr=0x4 -device virtserialport,bus=virtio-serial0.0,chardev=channel0,name=org.linux-kvm.port.0 -chardev socket,path=channel01,server,nowait,id=channel0 -net nic -net tap -vnc :2 -usbdevice tablet -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8 -incoming tcp:0:6789

after migration :
from src qemu monitor
(qemu) info migrate
Migration status: completed
(qemu) info status
VM status: paused
(qemu)

from dst qemu monitor
(qemu) info status
VM status: running
Comment 8 Alon Levy 2010-05-25 09:53:04 EDT
ok, I have no idea why I can't reproduce it either right now. I'm sure this is a bug because
 a. I saw it fail in virtio-serial-bus.c on virtio_serial_load when comparing the port_map gotten and the one initialized (the check is that they are the same, the bug was that the migrating machine which created the device allocated too few bytes so some of that memory was uninitialized, i.e. garbage).
 b. The original (before the patch) allocates too little space.

But for some reason I'm not seeing the bug now. Actually I'm seeing values which are weird (port_maps[0] should be 0x1, but it is 2866261505), but it is the same weird at the migrator and the target..

 It could be that migration changed a little. I will try going back a few commits. I am trying to use debug (--enable-debug) since I think I was running a debug version.

I'll update you.

Alon
Comment 9 Amit Shah 2010-05-25 12:09:22 EDT
Please try without the ,max_ports= parameter.
Comment 10 lihuang 2010-05-26 02:10:36 EDT
(In reply to comment #9)
> Please try without the ,max_ports= parameter.    

without max_ports also pass on qemu-kvm-0.12.1.2-2.52.el6.x86_64.
Comment 11 Alon Levy 2010-05-26 09:33:49 EDT
ok, it's very easy to see the erronous memory, since ports_map[0] should be 1 with a single port, and it is actually GGG1, where G stands for "garbage". The problem is that the garbage seems pretty consistent, so migration succeeds.

I'm going to stop trying to recreate the original problem (migration fails) that prompted me to look for this and find this bug.

I mean:
 a) this is clearly a bug (no one is disputing this - it is a clear that allocation is too small)
 b) the only reason it isn't happening now is the behavior of the memory - trying to run many instances of qemu I always get both the same address allocated to ports_map and the same contents at that address. This causes the comparison, which failed for me previously, to succeed, since the memory is actually the same - uninitialized in the same way.
 c) so to summarize the summary, migration doesn't fail now, but this is still a bug. And yes, if you still must have me recreate I will, it's just that it more work and I don't see the point. (i.e. I already tried and failed and am giving up right now unless you don't accept my explanation as enough). (btw, I haven't tested this, but I assume that two *different* machines will have different garbage).

Alon
Comment 14 lihuang 2010-06-23 06:16:53 EDT
(In reply to comment #11)
> ok, it's very easy to see the erronous memory, since ports_map[0] should be 1
> with a single port, and it is actually GGG1, where G stands for "garbage". The
> problem is that the garbage seems pretty consistent, so migration succeeds.
> 
> I'm going to stop trying to recreate the original problem (migration fails)
> that prompted me to look for this and find this bug.
> 
> I mean:
>  a) this is clearly a bug (no one is disputing this - it is a clear that
> allocation is too small)
>  b) the only reason it isn't happening now is the behavior of the memory -
> trying to run many instances of qemu I always get both the same address
> allocated to ports_map and the same contents at that address. This causes the
> comparison, which failed for me previously, to succeed, since the memory is
> actually the same - uninitialized in the same way.
>  c) so to summarize the summary, migration doesn't fail now, but this is still
> a bug. And yes, if you still must have me recreate I will, it's just that it
> more work and I don't see the point. (i.e. I already tried and failed and am
> giving up right now unless you don't accept my explanation as enough). (btw, I
> haven't tested this, but I assume that two *different* machines will have
> different garbage).
> 
> Alon    

Hi Alon
Thanks for the analyse. 
I have reproduced it on -2.49.el6 and verified in -2.62.el6.

Note You need to log in before you can comment on or make changes to this bug.