Bug 591176
Summary: | migration fails since virtio-serial-bus is using uninitialized memory | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 6 | Reporter: | Alon Levy <alevy> |
Component: | qemu-kvm | Assignee: | Alon Levy <alevy> |
Status: | CLOSED CURRENTRELEASE | QA Contact: | Virtualization Bugs <virt-bugs> |
Severity: | high | Docs Contact: | |
Priority: | low | ||
Version: | 6.0 | CC: | amit.shah, dblechte, ehabkost, lihuang, llim, syeghiay, virt-maint |
Target Milestone: | beta | ||
Target Release: | --- | ||
Hardware: | All | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | qemu-kvm-0.12.1.2-2.62.el6 | Doc Type: | Bug Fix |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2010-06-23 10:17:10 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: |
Description
Alon Levy
2010-05-11 15:35:39 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux major release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Major release. This request is not yet committed for inclusion. Hi Alon I can not reproduce the failure on qemu-kvm-0.12.1.2-2.52.el6.x86_64. could please you have a look ? src cmd line : root 16082 12108 6 03:11 pts/0 01:24:17 /usr/libexec/qemu-kvm -m 4096 -smp 2 -monitor stdio -boot c -drive file=rhel6-64.qcow2,if=virtio,boot=on,cache=none -device virtio-serial-pci,id=virtio-serial0,max_ports=16,vectors=4,bus=pci.0,addr=0x4 -device virtserialport,bus=virtio-serial0.0,chardev=channel0,name=org.linux-kvm.port.0 -chardev socket,path=channel0,server,nowait,id=channel0 -net nic -net tap -vnc :1 -usbdevice tablet -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8 dst cmd line root 5898 13365 6 23:19 pts/1 00:00:13 /usr/libexec/qemu-kvm -m 4096 -smp 2 -monitor stdio -boot c -drive file=rhel6-64.qcow2,if=virtio,boot=on,cache=none -device virtio-serial-pci,id=virtio-serial0,max_ports=16,vectors=4,bus=pci.0,addr=0x4 -device virtserialport,bus=virtio-serial0.0,chardev=channel0,name=org.linux-kvm.port.0 -chardev socket,path=channel01,server,nowait,id=channel0 -net nic -net tap -vnc :2 -usbdevice tablet -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8 -incoming tcp:0:6789 after migration : from src qemu monitor (qemu) info migrate Migration status: completed (qemu) info status VM status: paused (qemu) from dst qemu monitor (qemu) info status VM status: running ok, I have no idea why I can't reproduce it either right now. I'm sure this is a bug because a. I saw it fail in virtio-serial-bus.c on virtio_serial_load when comparing the port_map gotten and the one initialized (the check is that they are the same, the bug was that the migrating machine which created the device allocated too few bytes so some of that memory was uninitialized, i.e. garbage). b. The original (before the patch) allocates too little space. But for some reason I'm not seeing the bug now. Actually I'm seeing values which are weird (port_maps[0] should be 0x1, but it is 2866261505), but it is the same weird at the migrator and the target.. It could be that migration changed a little. I will try going back a few commits. I am trying to use debug (--enable-debug) since I think I was running a debug version. I'll update you. Alon Please try without the ,max_ports= parameter. (In reply to comment #9) > Please try without the ,max_ports= parameter. without max_ports also pass on qemu-kvm-0.12.1.2-2.52.el6.x86_64. ok, it's very easy to see the erronous memory, since ports_map[0] should be 1 with a single port, and it is actually GGG1, where G stands for "garbage". The problem is that the garbage seems pretty consistent, so migration succeeds. I'm going to stop trying to recreate the original problem (migration fails) that prompted me to look for this and find this bug. I mean: a) this is clearly a bug (no one is disputing this - it is a clear that allocation is too small) b) the only reason it isn't happening now is the behavior of the memory - trying to run many instances of qemu I always get both the same address allocated to ports_map and the same contents at that address. This causes the comparison, which failed for me previously, to succeed, since the memory is actually the same - uninitialized in the same way. c) so to summarize the summary, migration doesn't fail now, but this is still a bug. And yes, if you still must have me recreate I will, it's just that it more work and I don't see the point. (i.e. I already tried and failed and am giving up right now unless you don't accept my explanation as enough). (btw, I haven't tested this, but I assume that two *different* machines will have different garbage). Alon (In reply to comment #11) > ok, it's very easy to see the erronous memory, since ports_map[0] should be 1 > with a single port, and it is actually GGG1, where G stands for "garbage". The > problem is that the garbage seems pretty consistent, so migration succeeds. > > I'm going to stop trying to recreate the original problem (migration fails) > that prompted me to look for this and find this bug. > > I mean: > a) this is clearly a bug (no one is disputing this - it is a clear that > allocation is too small) > b) the only reason it isn't happening now is the behavior of the memory - > trying to run many instances of qemu I always get both the same address > allocated to ports_map and the same contents at that address. This causes the > comparison, which failed for me previously, to succeed, since the memory is > actually the same - uninitialized in the same way. > c) so to summarize the summary, migration doesn't fail now, but this is still > a bug. And yes, if you still must have me recreate I will, it's just that it > more work and I don't see the point. (i.e. I already tried and failed and am > giving up right now unless you don't accept my explanation as enough). (btw, I > haven't tested this, but I assume that two *different* machines will have > different garbage). > > Alon Hi Alon Thanks for the analyse. I have reproduced it on -2.49.el6 and verified in -2.62.el6. |