Bug 499210 - OpenSolaris installer segfaults from OOM under KVM, but not under unaccelerated QEMU
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: qemu
Version: rawhide
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: ---
Assignee: Glauber Costa
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: F11VirtTarget
Reported: 2009-05-05 15:43 UTC by Jeff Layton
Modified: 2014-06-18 07:38 UTC (History)
CC List: 10 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-05-27 12:11:48 UTC
Type: ---
Embargoed:


Attachments
screenshot of console showing SEGV messages from different processes (17.41 KB, image/png)
2009-05-05 15:43 UTC, Jeff Layton
/var/log/libvirt/qemu/opensolaris.log (5.61 KB, text/plain)
2009-05-05 17:37 UTC, Jeff Layton

Description Jeff Layton 2009-05-05 15:43:38 UTC
Created attachment 342481
screenshot of console showing SEGV messages from different processes

I've downloaded the OpenSolaris 0811 (Nov 2008) CD release and am trying to install it under KVM. The install fails consistently, with many of the processes running inside the VM segfaulting.

Attached is a screenshot of a boot to text console. Installing on the same host with an identical setup, but using plain qemu instead of KVM, seems to work just fine.

Relevant packages:

kernel-2.6.29.1-111.fc11.x86_64
qemu-common-0.10-15.fc11.x86_64
qemu-system-cris-0.10-15.fc11.x86_64
qemu-system-m68k-0.10-15.fc11.x86_64
qemu-system-mips-0.10-15.fc11.x86_64
qemu-img-0.10-15.fc11.x86_64
qemu-system-sh4-0.10-15.fc11.x86_64
qemu-system-sparc-0.10-15.fc11.x86_64
qemu-system-x86-0.10-15.fc11.x86_64
qemu-0.10-15.fc11.x86_64
qemu-system-ppc-0.10-15.fc11.x86_64
qemu-user-0.10-15.fc11.x86_64
qemu-system-arm-0.10-15.fc11.x86_64
libvirt-0.6.2-3.fc11.x86_64

...let me know if you need other info.

Comment 1 Jeff Layton 2009-05-05 15:44:22 UTC
Here are the contents of /proc/cpuinfo as well:

$ cat /proc/cpuinfo 
processor	: 0
vendor_id	: AuthenticAMD
cpu family	: 15
model		: 67
model name	: AMD Athlon(tm) 64 X2 Dual Core Processor 5200+
stepping	: 2
cpu MHz		: 2600.000
cache size	: 1024 KB
physical id	: 0
siblings	: 2
core id		: 0
cpu cores	: 2
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 1
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow rep_good pni cx16 lahf_lm cmp_legacy svm extapic cr8_legacy
bogomips	: 5211.05
TLB size	: 1024 4K pages
clflush size	: 64
cache_alignment	: 64
address sizes	: 40 bits physical, 48 bits virtual
power management: ts fid vid ttp tm stc

processor	: 1
vendor_id	: AuthenticAMD
cpu family	: 15
model		: 67
model name	: AMD Athlon(tm) 64 X2 Dual Core Processor 5200+
stepping	: 2
cpu MHz		: 2600.000
cache size	: 1024 KB
physical id	: 0
siblings	: 2
core id		: 1
cpu cores	: 2
apicid		: 1
initial apicid	: 1
fpu		: yes
fpu_exception	: yes
cpuid level	: 1
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow rep_good pni cx16 lahf_lm cmp_legacy svm extapic cr8_legacy
bogomips	: 5211.05
TLB size	: 1024 4K pages
clflush size	: 64
cache_alignment	: 64
address sizes	: 40 bits physical, 48 bits virtual
power management: ts fid vid ttp tm stc
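
For anyone comparing against their own host: a minimal Python sketch that pulls the hardware-virtualization flags (svm on AMD, vmx on Intel) out of cpuinfo text like the above. The function name virt_flags is made up for illustration, and the sample line is copied from the output in this comment.

```python
def virt_flags(cpuinfo_text):
    """Return the set of hardware-virtualization flags (svm/vmx) in cpuinfo text."""
    found = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Only look at "flags" lines, so a stray "svm" elsewhere is ignored.
        if sep and key.strip() == "flags":
            found |= {"svm", "vmx"} & set(value.split())
    return found


# Flags line for processor 0, copied from the cpuinfo output above.
SAMPLE = (
    "flags\t\t: fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat "
    "pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm "
    "3dnowext 3dnow rep_good pni cx16 lahf_lm cmp_legacy svm extapic cr8_legacy\n"
)

print(sorted(virt_flags(SAMPLE)))
```

To check a live host, the same function could be fed the contents of /proc/cpuinfo instead of SAMPLE. An empty result would suggest KVM acceleration isn't available, or is disabled in the BIOS.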

Comment 2 Mark McLoughlin 2009-05-05 17:13:30 UTC
Please also include /var/log/libvirt/qemu/$guest.log and anything interesting from dmesg

Random idea - try running the guest with more memory?

Interesting that kvm-autotest doesn't have any opensolaris tests

Comment 3 Jeff Layton 2009-05-05 17:37:54 UTC
Created attachment 342509
/var/log/libvirt/qemu/opensolaris.log

Guest logfile:

/var/log/libvirt/qemu/opensolaris.log

Comment 4 Jeff Layton 2009-05-05 17:38:58 UTC
Good call on the memory. I've doubled it from 512M to 1G and it seems to be behaving better. I won't declare victory until the install finishes though ;)

Comment 5 Mark McLoughlin 2009-05-05 18:27:11 UTC
Okay, it sounds like OpenSolaris is just lame at handling OOM? Please close the bug if victory is declared :-)

Comment 6 Jeff Layton 2009-05-05 18:39:55 UTC
I don't think so. If that were the case, shouldn't I have been seeing similar failures under "normal" qemu? When I tested it, I allocated 512M to the guest there as well.

It seems like there's something specific to KVM that was causing it to fail in this way.

Comment 7 Mark McLoughlin 2009-05-21 15:25:32 UTC
That certainly sounds like something worth investigating alright

Comment 8 Chris Lalancette 2009-05-26 10:46:56 UTC
I just tried this out on my Intel box locally, with exactly the same version of OpenSolaris and 512MB of memory, and installation went perfectly fine.  I do have newer versions of the packages, though:

qemu-0.10.4-4.fc11.x86_64
libvirt-0.6.2-8.fc11.x86_64
kernel-2.6.29.3-155.fc11.x86_64

I'm going to try again on an AMD box to see if there is a difference there.

Chris Lalancette

Comment 9 Chris Lalancette 2009-05-26 12:18:35 UTC
Well, I tried on my AMD box, with the same results as on the Intel box: the installation went fine.  Although to be fair, it's a pretty different AMD box (not running F-11, and also pretty old RevF processors).  I tried to make sure my command-line options matched closely with what is posted in this bug.

I'm wondering if this has been fixed in the meantime, or if there's another piece to reproduce that I'm missing.  Jeff, any chance you can try again with the updated versions of the qemu/kernel packages?  If you are still having problems with the updated packages, then maybe I can hop onto your box to do some debugging.

Thanks,
Chris Lalancette

Comment 10 Jeff Layton 2009-05-27 10:55:37 UTC
Installing now with 512M -- so far so good. 

kernel-2.6.29.3-155.fc11.x86_64
qemu-0.10-16.fc11.x86_64
libvirt-0.6.2-8.fc11.x86_64

...I'll note though that the hardware in this machine changed recently. It now has a quad-core AMD CPU. Here's cpuinfo from one of the cores:

processor	: 0
vendor_id	: AuthenticAMD
cpu family	: 16
model		: 4
model name	: AMD Phenom(tm) II X4 940 Processor
stepping	: 2
cpu MHz		: 800.000
cache size	: 512 KB
physical id	: 0
siblings	: 4
core id		: 0
cpu cores	: 4
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 5
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nonstop_tsc pni monitor cx16 lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt
bogomips	: 6000.75
TLB size	: 1024 4K pages
clflush size	: 64
cache_alignment	: 64
address sizes	: 48 bits physical, 48 bits virtual
power management: ts ttp tm stc 100mhzsteps hwpstate

...it's possible that the hardware change is what fixed this, but it seems more likely that the problem was due to a bug that's since been fixed. I've still got the old CPU/Motherboard in a different machine, but it's now running F10 and it'll be a while before I can upgrade it. When I do though, I'll try to remember to retest this.

I suggest that we go ahead and close this bug as fixed, and I'll plan to reopen it if the problem resurfaces at some point in the future.
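
To see what differs between the two CPUs at the feature-flag level, the flags line from comment 1 can be diffed against the one above. A rough Python sketch; the helper name flag_diff is made up for illustration, and both flag strings are copied from the cpuinfo output in this bug.

```python
# Flags of the original Athlon 64 X2 5200+ (comment 1).
OLD_FLAGS = (
    "fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 "
    "clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm "
    "3dnowext 3dnow rep_good pni cx16 lahf_lm cmp_legacy svm extapic cr8_legacy"
).split()

# Flags of the replacement Phenom II X4 940 (this comment).
NEW_FLAGS = (
    "fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 "
    "clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm "
    "3dnowext 3dnow constant_tsc rep_good nonstop_tsc pni monitor cx16 lahf_lm "
    "cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch "
    "osvw ibs skinit wdt"
).split()


def flag_diff(old, new):
    """Return (flags only in old, flags only in new), each sorted."""
    old_set, new_set = set(old), set(new)
    return sorted(old_set - new_set), sorted(new_set - old_set)


only_old, only_new = flag_diff(OLD_FLAGS, NEW_FLAGS)
print("only on old CPU:", only_old)
print("only on new CPU:", only_new)
```

This shows only that the new CPU advertises a superset of the old one's flags. It doesn't say whether any of the extra features is related to the failure, so the updated packages remain the more likely explanation.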

Comment 11 Chris Lalancette 2009-05-27 12:11:48 UTC
OK, thanks for trying again Jeff, and like you said, re-open if you hit it again.

Chris Lalancette

