Xen PV domU
Created attachment 381067
FC12 traces when processes hang
Description of problem:
I am running an x86_64 FC12 PV domU under an x86_64 Xen host.
Kernel Version is 188.8.131.52-174.fc12.x86_64.
Under heavy disk I/O (the system runs a koji build environment), the kernel hangs,
sometimes forever (it responds to ping, but nothing else can be done); other times only some processes hang.
In all cases many processes are reported as "blocked for more than 120 seconds". Sometimes it is httpd; more frequently it is pdflush, kswapd and kjournald.
The only cure is to hard reset the VM.
Initially I suspected filesystem issues, so I moved from ext4 to ext3, but the result is the same.
The domU runs on disk images (xvd block devices), not on a physical device.
If the machine is converted to HVM, the issue goes away (but it is very slow).
Attached are kernel call traces captured when the system hangs.
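To summarize which processes appear in the hung-task warnings, a small script along these lines may help. It is only a sketch: it assumes the log lines follow the usual kernel wording "INFO: task NAME:PID blocked for more than N seconds", and the function name `count_hung_tasks` is made up for illustration.

```python
import re
from collections import Counter

# Assumed format of the kernel hung-task warning, e.g.
# "INFO: task pdflush:123 blocked for more than 120 seconds."
HUNG_TASK_RE = re.compile(
    r"INFO: task (?P<comm>.+?):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def count_hung_tasks(log_text):
    """Return a Counter mapping process name -> number of hung-task warnings."""
    counts = Counter()
    for line in log_text.splitlines():
        match = HUNG_TASK_RE.search(line)
        if match:
            counts[match.group("comm")] += 1
    return counts
```

Feeding it the saved dmesg text, for example `count_hung_tasks(open("dmesg.txt").read()).most_common()`, lists the most frequently blocked process names first (`dmesg.txt` is a placeholder file name).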
Version-Release number of selected component (if applicable):
domU is FC12 PV VM with kernel 184.108.40.206-174.fc12.x86_64
dom0 kernel version is: 2.6.18-164.9.1.el5xen
dom0 (Xen host) is CentOS 5.4 with the latest CentOS Xen:
[root@xen2 images]# xm info
host : xen2
release : 2.6.18-164.9.1.el5xen
version : #1 SMP Tue Dec 15 21:31:37 EST 2009
machine : x86_64
nr_cpus : 8
nr_nodes : 1
sockets_per_node : 2
cores_per_socket : 4
threads_per_core : 1
cpu_mhz : 2333
hw_caps : bfebfbff:20000800:00000000:00000140:0004e3bd:00000000:00000001
total_memory : 8189
free_memory : 38
node_to_cpu : node0:0-7
xen_major : 3
xen_minor : 1
xen_extra : .2-164.9.1.el5
xen_caps : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
xen_pagesize : 4096
platform_params : virt_start=0xffff800000000000
xen_changeset : unavailable
cc_compiler : gcc version 4.1.2 20080704 (Red Hat 4.1.2-46)
cc_compile_by : mockbuild
cc_compile_domain : centos.org
cc_compile_date : Tue Dec 15 20:50:26 EST 2009
xend_config_format : 2
domU xen config is:
name = "koji.voismart.net"
uuid = "7fc52711-974d-f475-c268-e6a8f495be5d"
maxmem = 3096
memory = 3096
vcpus = 4
bootloader = "/usr/bin/pygrub"
on_poweroff = "destroy"
on_reboot = "restart"
on_crash = "restart"
vfb = [ "type=vnc,vncunused=1,keymap=it" ]
disk = [ "tap:aio:/var/lib/xen/images/koji.voismart.net.img,xvda,w", "tap:aio:/var/lib/xen/images/koji.voismart.net.disk2.img,xvdb,w", "tap:aio:/var/lib/xen/images/koji.voismart.net.disk3.img.img,xvdc,w" ]
vif = [ "mac=00:16:36:02:ae:f1,bridge=xenbr0,script=vif-bridge" ]
Bug 551552 looks very similar to this one.
Sorry, that should be bug 550724
Created attachment 381749
Same result with upcoming 220.127.116.11-1.fc12.x86_64
I've tested the kernel 18.104.22.168-1.fc12.x86_64, from fedoraproject koji.
The system seems more stable (it takes longer to hang and needs heavier I/O), but the hang still happens.
dmesg log attached.
As I mentioned before, I think I am seeing this in bug #550724
Do you have a test that can reliably produce this problem?
I can't get it to happen on a test system, only my production LTSP server.
I'm sure this bug will get more attention if a reliable test case can be found.
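As a starting point for such a test case, something like the following could be tried inside the domU. It is only a sketch of a generic write-heavy load (parallel writers calling fsync after every chunk); it is not known to reproduce this hang, and the function names, file names and default sizes are all assumptions chosen for illustration.

```python
import os
import threading


def write_worker(directory, index, file_size, chunk_size):
    """Write one file chunk by chunk, calling fsync after each chunk."""
    path = os.path.join(directory, f"stress-{index}.dat")
    chunk = os.urandom(chunk_size)
    written = 0
    with open(path, "wb") as f:
        while written < file_size:
            f.write(chunk)
            f.flush()
            os.fsync(f.fileno())  # push the data to the block device
            written += chunk_size


def run_stress(directory, workers=4, file_size=64 * 1024 * 1024,
               chunk_size=1024 * 1024):
    """Run several writers in parallel; return the sorted names of files in directory."""
    threads = [
        threading.Thread(target=write_worker,
                         args=(directory, i, file_size, chunk_size))
        for i in range(workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(os.listdir(directory))
```

Calling `run_stress("/path/on/xvdb", workers=8)` in a loop (the path is a placeholder) while watching dmesg for "blocked for more than 120 seconds" would show whether this load is enough to trigger the problem.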
This message is a reminder that Fedora 12 is nearing its end of life.
Approximately 30 (thirty) days from now Fedora will stop maintaining
and issuing updates for Fedora 12. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as WONTFIX if it remains open with a Fedora
'version' of '12'.
Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version prior to Fedora 12's end of life.
Bug Reporter: Thank you for reporting this issue and we are sorry that
we may not be able to fix it before Fedora 12 is end of life. If you
would still like to see this bug fixed and are able to reproduce it
against a later version of Fedora please change the 'version' of this
bug to the applicable version. If you are unable to change the version,
please add a comment here and someone will do it for you.
Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.
The process we are following is described here:
Fedora 12 changed to end-of-life (EOL) status on 2010-12-02. Fedora 12 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.
If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version.
Thank you for reporting this bug and we are sorry it could not be fixed.