Bug 240009
Summary: qemu-dm segfault installing FreeBSD 32 bit FV on heavily loaded machine
Product: [Fedora] Fedora
Component: xen
Reporter: Richard W.M. Jones <rjones>
Assignee: Richard W.M. Jones <rjones>
Status: CLOSED CURRENTRELEASE
Severity: medium
Priority: medium
Version: 9
CC: felix.schwarz, katzj, sputhenp, virt-maint, xen-maint
Hardware: x86_64
OS: Linux
Doc Type: Bug Fix
Last Closed: 2009-03-24 18:27:24 UTC

Description
Richard W.M. Jones
2007-05-14 11:29:12 UTC
Created attachment 154641 [details]
FreeBSD installation notes
I also reproduced this bug with just load, no other guests running. On Dom0 (a 4 core Athlon) I am running:

cd linux-2.6.21.1; while true; do make -j 4; make clean; done

No guests are running, except a FreeBSD 6.2 FV 32-on-64 install. After a little while the install stops, and in Dom0's dmesg:

qemu-dm[3075]: segfault at 0000000000000000 rip 0000000000000000 rsp 0000000041400c18 error 14

This bug also happens with an updated Xen hypervisor.

[Background: Dan pointed out that cset 15038,
http://xenbits.xensource.com/xen-3.1-testing.hg?rev/c00b2ab8af2c
looked like it might have had something to do with this, but even with this change the segfault is still happening.]

Created attachment 154722 [details]
Core dump from qemu-dm
Core dump from qemu-dm.
Corresponding binary:
$ rpm -qf /usr/lib64/xen/bin/qemu-dm
xen-3.1.0-0.rc7.1.fc7
Stack trace (from gdb):
Core was generated by `/usr/lib64/xen/bin/qemu-dm -d 2 -vcpus 1 -boot d -serial
pty -acpi -domain-name'.
Program terminated with signal 11, Segmentation fault.
#0 0x0000000000000000 in ?? ()
(gdb) bt
#0 0x0000000000000000 in ?? ()
#1 0x000000000042c085 in dma_thread_func (opaque=<value optimized out>)
at /usr/src/debug/xen-3.1.0-testing.hg-rc7/tools/ioemu/hw/ide.c:2402
#2 0x00000030310061b5 in start_thread () from /lib64/libpthread.so.0
#3 0x00000030304d043d in clone () from /lib64/libc.so.6
[Quite amazingly this 60K file expands to the full 39MB core dump with md5sum
189a904867814006d199f4d92c2f642c]
Stack trace from each thread:

(gdb) thread apply all bt

Thread 3 (process 29295):
#0 0x00000030304c9952 in select () from /lib64/libc.so.6
#1 0x0000000000409555 in main_loop_wait (timeout=10)
    at /usr/src/debug/xen-3.1.0-testing.hg-rc7/tools/ioemu/vl.c:5216
#2 0x000000000046d251 in main_loop ()
    at /usr/src/debug/xen-3.1.0-testing.hg-rc7/tools/ioemu/target-i386-dm/helper2.c:628
#3 0x000000000040b206 in main (argc=19, argv=0x7fff0c4f2e08)
    at /usr/src/debug/xen-3.1.0-testing.hg-rc7/tools/ioemu/vl.c:6903
#4 0x000000303041da54 in __libc_start_main () from /lib64/libc.so.6
#5 0x0000000000404809 in _start ()

Thread 2 (process 29306):
#0 0x000000303100cabb in read () from /lib64/libpthread.so.0
#1 0x000000303180197a in read_all (fd=5, data=0xc9d2f0, len=16)
    at /usr/include/bits/unistd.h:35
#2 0x00000030318019f2 in read_message (h=0xc9b6b0) at xs.c:768
#3 0x0000003031801b4c in read_thread (arg=<value optimized out>) at xs.c:821
#4 0x00000030310061b5 in start_thread () from /lib64/libpthread.so.0
#5 0x00000030304d043d in clone () from /lib64/libc.so.6

Thread 1 (process 29426):
#0 0x0000000000000000 in ?? ()
#1 0x000000000042c085 in dma_thread_func (opaque=<value optimized out>)
    at /usr/src/debug/xen-3.1.0-testing.hg-rc7/tools/ioemu/hw/ide.c:2402
#2 0x00000030310061b5 in start_thread () from /lib64/libpthread.so.0
#3 0x00000030304d043d in clone () from /lib64/libc.so.6

I compiled qemu-dm with -O0 -g and generated another core dump:

http://annexia.org/tmp/qemu-dm.bz2
http://annexia.org/tmp/core.qemu-dm.10152.1179249168.bz2

Created attachment 154763 [details]
Patch to pass structure instead of pointers to the IDE DMA thread.
This patch is currently looking solid. The FreeBSD install has got much
further than before. If it stays up overnight I'll feed it upstream.
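
For readers following along, the general idea of handing a worker thread a structure by value instead of pointers into shared state can be sketched as below. This is only an illustrative sketch and is not the code from attachment 154763: the names `dma_request`, `dma_worker`, `count_cb`, `device` and `run_demo` are all hypothetical, and the pipe-based queue is an assumption made for the example. The point it shows is that each request carries its own copy of the callback and arguments, so the worker never dereferences state that the owning thread may have reset in the meantime (which is one way a thread can end up jumping to address 0).

```c
/* Hypothetical sketch only: NOT the patch from attachment 154763. */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <unistd.h>

typedef void (*dma_cb_t)(void *opaque, int sector);

/* Everything the worker needs, copied by value when the request is queued. */
struct dma_request {
    dma_cb_t cb;      /* callback captured at queue time */
    void    *opaque;  /* argument handed back to the callback */
    int      sector;  /* illustrative payload */
};

/* Shared, mutable state owned by the producer thread. */
struct device {
    dma_cb_t cb;
};

static int req_pipe[2];  /* [0] is read by the worker, [1] written by the producer */

/* Example callback: counts how many requests were handled. */
static void count_cb(void *opaque, int sector)
{
    (void)sector;
    ++*(int *)opaque;
}

/* Worker: reads whole request structures and uses only its private copy. */
static void *dma_worker(void *arg)
{
    struct dma_request req;
    (void)arg;
    for (;;) {
        ssize_t n = read(req_pipe[0], &req, sizeof req);
        if (n != (ssize_t)sizeof req || req.cb == NULL)
            break;  /* read error, EOF, or the stop sentinel */
        req.cb(req.opaque, req.sector);
    }
    return NULL;
}

/* Queues n requests, resets the shared state, and returns how many ran. */
int run_demo(int n)
{
    pthread_t tid;
    int handled = 0;
    struct device dev = { count_cb };
    struct dma_request stop = { NULL, NULL, 0 };

    assert(n >= 0);
    if (pipe(req_pipe) != 0)
        return -1;
    if (pthread_create(&tid, NULL, dma_worker, NULL) != 0)
        return -1;

    for (int i = 0; i < n; i++) {
        struct dma_request req = { dev.cb, &handled, i };
        if (write(req_pipe[1], &req, sizeof req) != (ssize_t)sizeof req)
            return -1;
    }

    /* The owner clears its callback; requests already queued are unaffected. */
    dev.cb = NULL;

    if (write(req_pipe[1], &stop, sizeof stop) != (ssize_t)sizeof stop)
        return -1;
    pthread_join(tid, NULL);
    close(req_pipe[0]);
    close(req_pipe[1]);
    return handled;  /* safe to read: the worker has been joined */
}
```

Calling `run_demo(100)` from a `main` function queues 100 requests and returns the number the worker handled. Each request is far smaller than `PIPE_BUF`, so every `write` is atomic and the worker always reads complete structures.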
FreeBSD install finished successfully for the first time under load. Patch sent upstream.

Created attachment 154836 [details]
Screenshot of FreeBSD install failing.
Unfortunately this patch hasn't corrected the problem. I'm still seeing
FreeBSD failing during the install at the same place as before, although with a
different error. This time qemu-dm isn't segfaulting, but FreeBSD itself is
giving an error as shown in the screenshot.
The error is:
panic: initiate_write_inodeblock_ufs2: already started

FYI regarding this bug: There was a recent exchange with someone complaining about IDE multi-threading problems. Keir has checked in a patch to 3.2/3.1 that fixes that particular problem; it may also be relevant here:

http://lists.xensource.com/archives/html/xen-devel/2008-01/msg01151.html

Chris Lalancette

Changing version to '9' as part of upcoming Fedora 9 GA. More information and reason for this action is here:

http://fedoraproject.org/wiki/BugZappers/HouseKeeping