Bug 158367

Summary: Xen locks up when running xm create
Product: [Fedora] Fedora
Component: xen
Version: 4
Hardware: i386
OS: Linux
Status: CLOSED RAWHIDE
Severity: medium
Priority: medium
Reporter: Ian Anderson <fedora>
Assignee: Rik van Riel <riel>
CC: itamar, lsof
Doc Type: Bug Fix
Last Closed: 2005-09-16 19:02:58 UTC
Attachments:
Output of xm dmesg (flags: none)
cat /proc/cpuinfo (flags: none)

Description Ian Anderson 2005-05-20 22:40:30 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.8) Gecko/20050512 Fedora/1.0.4-2 Firefox/1.0.4

Description of problem:
Following the instructions at http://www.fedoraproject.org/wiki/FedoraXenQuickstart, everything works well until I run 'xm create -c rawhide', at which point the machine locks up solid: no crash dump, nothing.  The same problem occurs under both runlevel 3 and runlevel 5 (X).

Version-Release number of selected component (if applicable):
xen-2-20050424

How reproducible:
Always

Steps to Reproduce:
1. Install FC4test3
2. Follow instructions at http://www.fedoraproject.org/wiki/FedoraXenQuickstart
3. xm create -c rawhide
4. Machine locks up
  

Actual Results:  Machine locks up

Expected Results:  expect to see the Xen guest OS booting up.

Additional info:

kernel-xen0-2.6.11-1.1323_FC4
kernel-xenU-2.6.11-1.1323_FC4

Asus P4R800vm motherboard, 2.6 GHz Hyperthreading CPU
ATI 9100 IGP chipset

Comment 1 Rik van Riel 2005-05-21 18:36:09 UTC
Did you get any Xen output on the serial console?

I have Xen running on several systems here, without the problem you are seeing.

Btw, I have the latest Xen RPMs up on: http://people.redhat.com/riel/xen_for_fc4/


Comment 2 Ian Anderson 2005-05-21 20:41:49 UTC
No further output on the serial console.
I looked at the XenDemo CD and noticed that it configures the hypervisor to run
with the noht (no hyperthreading) option.
I tried that option and am now able to successfully boot Xen VMs under both
runlevel 3 and runlevel 5.
noht should be considered as a default Xen configuration option.

Comment 3 Rik van Riel 2005-05-21 21:49:48 UTC
Exactly what system are you running on?

I am using Xen with hyperthreading on a 3GHz Pentium IV and it's working just fine.

Also, what output do Xen and xenolinux give?
At what stage in the boot does it hang?

Comment 4 Ian Anderson 2005-05-21 22:53:08 UTC
Created attachment 114679 [details]
Output of xm dmesg

Here is the output from the Xen kernel.  I do not see any messages when the
guest OS is booting and hyperthreading is enabled; I get "Using configfile
rawhide" and then it locks solid.  The system is an Asus P4R800vm with an ATI
9100 IGP chipset.  Everything is integrated on the motherboard, with no
additional PCI cards.  I strongly suspect it is a quirk in the chipset, which
came out not long after Intel announced Hyperthreading.

Comment 5 Rik van Riel 2005-05-21 23:02:14 UTC
Btw, do you need to specify the "noht" option on the Xen boot line, or is it
enough to specify that on the domain 0 kernel boot options?

Comment 6 Ian Anderson 2005-05-21 23:39:34 UTC
I specify it in the grub.conf file on the xen kernel line:
kernel /boot/xen.gz com1=115200,8n1 noht
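
For anyone checking their own setup, here is a minimal sketch of a test for whether the hypervisor line in grub.conf carries noht. The function name xen_has_noht and the default path /boot/grub/grub.conf are assumptions for illustration, not something taken from this report:

```shell
# Hypothetical helper: succeed if the line that loads xen.gz in a grub.conf
# includes the noht option. The default path below is an assumption.
xen_has_noht() {
    conf="${1:-/boot/grub/grub.conf}"
    # Select "kernel" lines that load xen.gz, then look for noht as a whole word.
    grep -E '^[[:space:]]*kernel[[:space:]].*xen\.gz' "$conf" | grep -qw 'noht'
}
```

Usage would be something like "xen_has_noht && echo 'noht is set'"; the check only reads the file and does not change the boot configuration.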


Comment 7 Rik van Riel 2005-05-22 00:16:09 UTC
In that case some ACPI workaround might work - and that will be automatically
imported from Linux once I upgrade to a newer version of Xen, in which all
things ACPI are implemented in domain 0 instead of the hypervisor.

I tried upgrading to such a version already, but the upstream xen tree didn't
boot on my test system when I tried, so FC4 will still have an older version of Xen.

I'll let you know when I've upgraded rawhide to a newer Xen with ACPI in domain
0 (post FC4).

Comment 8 Ian Anderson 2005-05-22 12:18:32 UTC
If it is an ACPI problem that would fit in with another symptom I've seen.  When
I boot with the Xen kernel, the system does not automatically power off when I
shut down domain 0.  It powers off fine when I use a regular kernel.

I simply assumed that is a limitation of the current Xen implementation, or do
you have systems that can automatically power off domain 0?

Comment 9 Ian Anderson 2005-05-24 22:27:16 UTC
Same problem seen with:

xen-2-20050522
kernel-xenU-2.6.11-1.1341_FC4
kernel-xen0-2.6.11-1.1341_FC4


Comment 10 Need Real Name 2005-07-01 19:55:02 UTC
I get this too.
The problem seems to be related to the init.d xendomains script failing to
start. The script fails when it tries to call "log_success_msg", which is a
function that doesn't exist.
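
One way an init script could guard against a missing log_success_msg is to define a fallback only when the function is not already available. This is a sketch of a possible workaround under that assumption, not the fix shipped in any Xen package; the echo-based fallback bodies are made up for illustration:

```shell
# Fall back to plain echo when the LSB-style logging helpers are not defined.
# Assumption: the script would normally get them from a sourced function library.
if ! type log_success_msg >/dev/null 2>&1; then
    log_success_msg() { echo "$*"; }
fi
if ! type log_failure_msg >/dev/null 2>&1; then
    log_failure_msg() { echo "$*" >&2; }
fi
```

Placed near the top of the script, this leaves an existing definition untouched and only fills the gap when the helper is absent.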

Comment 11 Need Real Name 2005-07-18 17:48:23 UTC
*** Bug 163468 has been marked as a duplicate of this bug. ***

Comment 12 Need Real Name 2005-07-18 17:49:49 UTC
Any luck with getting ACPI into domain 0, or any workarounds?
Thanks.

Comment 13 Jarkko 2005-07-19 11:19:42 UTC
Same problem here with:

xen-2-20050522
kernel-xenU-2.6.12-1.1398_FC4
kernel-xen0-2.6.12-1.1398_FC4

The host is running: Intel(R) Pentium(R) 4 CPU 2.60GHz

Comment 14 Need Real Name 2005-07-19 13:12:31 UTC
Are you getting the crash in exactly the same place?
i.e. strace xm create -c xen1
 send(3, "POST /xend/domain HTTP/1.1\rHost"..., 149, 0) = 149
 send(3, "config=%28vm%28name+xen1%29+%28"..., 403, 0
(from bug 163468)

Comment 15 Itamar Reis Peixoto 2005-07-21 02:34:08 UTC
I have the same problem.

I have tested with the latest RPM version of Xen in FC4:

xen-2-20050522
kernel-xenU-2.6.12-1.1398_FC4
kernel-xen0-2.6.12-1.1398_FC4

I have tested with noht and see the same problem: when I run xm create, the
machine crashes!



Comment 16 Need Real Name 2005-08-24 21:02:46 UTC
Rik van Riel - any update on this? Should I hang around with Fedora for Xen, or
get something else installed to test it?

Comment 17 Rik van Riel 2005-08-24 22:36:44 UTC
I built a new Xen package for FC4 recently, which will be pushed out together
with the next kernel update.  That should fix this issue.

Comment 18 Fedora Update System 2005-08-26 21:37:28 UTC
From User-Agent: XML-RPC

A new Xen version has been pushed to Fedora Core 4 updates.  This version of Xen (combined with the latest kernel update) should fix this issue.

Please reopen this bug if the problems continue with the latest Xen and kernel updates.

Comment 19 Need Real Name 2005-08-29 18:08:06 UTC
Created attachment 118220 [details]
cat /proc/cpuinfo

Booting kernel-xen0-2.6.12-1.1447_FC4, I get:

***
CPU0 fatal trap: vector = 6 (invalid operand)
[error_code=0000]
Aieee! CPU0 is toast...
***

There's a stack trace above that. If needed, I'll take a photo of it.

Comment 20 Rik van Riel 2005-08-29 18:09:57 UTC
You also need a newer Xen package, xen-2-20050823.

Comment 21 Need Real Name 2005-08-29 18:28:46 UTC
Bad dependency then? I've installed xen-2-20050823 from testing, and it fixes
the Aieee!, but boot won't finish.
First, networking fails to start properly: it can't get an IP address from my
DSL router (it can't even ping the router).
After starting Bluetooth, the computer hangs.

Last lines in /var/log/messages:
 Aug 29 20:23:30 localhost kernel: Bluetooth: HCI socket layer initialized
 Aug 29 20:23:30 localhost kernel: Bluetooth: L2CAP ver 2.7
 Aug 29 20:23:30 localhost kernel: Bluetooth: L2CAP socket layer initialized

# lspci -v|grep -i net
02:08.0 Ethernet controller: 3Com Corporation 3Com 3C920B-EMB-WNM Integrated
Fast Ethernet Controller (rev 40)

Comment 22 Rik van Riel 2005-08-29 18:39:23 UTC
I've seen this on my test system here, once or twice.  I haven't managed to
reliably reproduce this problem though ;(((

Just to rule out the vsyscall page (where I made some changes), could you try
booting with "vdso=0"?

Comment 23 Need Real Name 2005-08-29 18:53:31 UTC
No such luck unfortunately :(

I tried moving /lib/tls to /lib/tls.disabled, but I still got the warning, and I
still got the hang.

I have an onboard video card which I think shares the main RAM. Could that be
the problem? Any ideas?

lspci -v|grep -i vga
01:05.0 VGA compatible controller: ATI Technologies Inc Radeon 9100 IGP (prog-if
00 [VGA])

Comment 24 Fedora Update System 2005-08-30 03:24:50 UTC
From User-Agent: XML-RPC

A new Xen version has been pushed to Fedora Core 4 updates.  This version of Xen (together with kernel 2.6.12-1.1435 or newer) should fix this issue.

Please reopen this bug if the problems continue with the latest Xen and kernel updates.

Comment 25 Itamar Reis Peixoto 2005-08-30 03:46:24 UTC
My machine now boots with:

[root@router ~]# rpm -qa |grep xen
kernel-xenU-2.6.12-1.1447_FC4
kernel-xen0-2.6.12-1.1447_FC4
xen-2-20050823
[root@router ~]#

But when I run:

[root@router ~]# service xend start
Exception connecting to xenstored: (2, 'No such file or directory')
Trying again...
Exception connecting to xenstored: (2, 'No such file or directory')
Trying again...
[root@router ~]# Exc

xend doesn't start.

Comment 26 Ian Anderson 2005-08-30 20:57:37 UTC
(In reply to comment #21)
I'm seeing the same problem with my system, and it looks like I have the same
hardware (ASUS P4R800vm, ATI 9100 IGP chipset)

Comment 27 Ian Anderson 2005-08-30 20:59:56 UTC
I was able to get the system to finish booting by disabling the OpenCT daemon.
No network and no USB mouse.

Comment 28 Itamar Reis Peixoto 2005-08-31 14:00:44 UTC
Regarding comment #25:

I created /var/run/xenstored and /var/lib/xenstored, and xend started fine.
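
The workaround above can be sketched as a small shell function. The two directory names come from this comment; the function name and the optional root prefix (handy for trying it out under a scratch directory) are assumptions for illustration:

```shell
# Create the directories xenstored was reported to need in this comment.
# Hypothetical helper: the optional first argument is a prefix for dry runs.
ensure_xenstored_dirs() {
    root="${1:-}"
    for d in /var/run/xenstored /var/lib/xenstored; do
        # mkdir -p is a no-op when the directory already exists.
        mkdir -p "${root}${d}" || return 1
    done
}
```

Run as root with no argument, it creates the real directories, after which "service xend start" can be retried.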

Comment 29 Need Real Name 2005-09-27 15:41:06 UTC
I select the xen0 kernel, and my computer reboots.

Comment 30 Need Real Name 2005-09-27 15:50:45 UTC
kernel-xen0-2.6.12-1.1456_FC4


Comment 31 Itamar Reis Peixoto 2005-09-27 18:02:58 UTC
Use

kernel-xen0-2.6.12-1.1456_FC4
and
kernel-xenU-2.6.12-1.1456_FC4

and Xen 3.0 from

http://people.redhat.com/riel/xen_for_fc4/

This works for me.

Comment 32 Need Real Name 2005-10-01 14:27:04 UTC
Not for me. It crashes this time, just after starting Bluetooth.

I tried the 1526 kernel and got an error about not enough space on CPU0, or something like that.