Bug 754014 - Host becoming irresponsive after hundreds rounds of guest installation
Summary: Host becoming irresponsive after hundreds rounds of guest installation
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: kernel
Version: 6.2
Hardware: Unspecified
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Virtualization Maintenance
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Depends On:
Blocks: 767187
Reported: 2011-11-15 06:05 UTC by Xiaoqing Wei
Modified: 2012-06-08 23:21 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2012-06-08 23:21:01 UTC
Target Upstream Version:


Attachments
after force_reboot the host, collect all the log here. (2.18 MB, application/octet-stream)
2011-11-15 06:10 UTC, Xiaoqing Wei

Description Xiaoqing Wei 2011-11-15 06:05:14 UTC
Description of problem:

The host becomes unresponsive after hundreds of rounds of guest installation.

Version-Release number of selected component (if applicable):
kernel-2.6.32-220.el6.x86_64  (Both host and guest are RHEL6.2-20111109.1)

How reproducible:
Observed only once.

Steps to Reproduce:
1. Start installing a guest:

 qemu-kvm -monitor stdio \
 -chardev socket,id=serial_id_20111111-110011-EI1G,path=/tmp/serial-20111111-110011-EI1G,server,nowait \
 -device isa-serial,chardev=serial_id_20111111-110011-EI1G \
 -drive file='RHEL-Server-6.2-32-virtio.qcow2',index=0,if=none,id=drive-virtio-disk1,media=disk,cache=none,format=qcow2,aio=threads \
 -device virtio-blk-pci,bus=pci.0,addr=0x4,drive=drive-virtio-disk1,id=virtio-disk1 \
 -device virtio-net-pci,netdev=idtrp0j3,mac=9a:38:44:5e:0a:24,id=ndev00idtrp0j3,bus=pci.0,addr=0x3 \
 -netdev tap,id=idtrp0j3,vhost=on \
 -m 4096 -smp 2,cores=1,threads=1,sockets=2 \
 -drive file='RHEL6.2-Server-i386.iso',index=1,if=none,id=drive-ide0-0-0,media=cdrom,readonly=on,format=raw \
 -device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 \
 -drive file='ks.iso',index=2,if=none,id=drive-ide0-0-1,media=cdrom,readonly=on,format=raw \
 -device ide-drive,bus=ide.0,unit=1,drive=drive-ide0-0-1,id=ide0-0-1 \
 -cpu cpu64-rhel6,+sse2,+x2apic \
 -kernel 'rhel62-32/vmlinuz' -initrd 'rhel62-32/initrd.img' \
 -spice port=8000,disable-ticketing -vga qxl \
 -rtc base=utc,clock=host,driftfix=slew -boot order=cdn,once=d,menu=off \
 -no-kvm-pit-reinjection --append 'ks=cdrom nicdelay=60 console=ttyS0,115200 console=tty0' \
 -M rhel6.2.0 -usb -device usb-tablet -enable-kvm


2. Repeat step 1 until the bug occurs.
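
The repetition in step 2 could be driven by a small script along these lines. This is only a sketch, not the harness actually used for this report: the function name `run_rounds`, the per-round timeout, and the assumption that Python 3 is available on the machine driving the test are all illustrative. The real qemu-kvm argument list from step 1 would be passed in as `cmd`.

```python
import subprocess
import sys


def run_rounds(cmd, max_rounds, timeout_s):
    """Run cmd repeatedly and return the 1-based round that failed, or None.

    A round counts as failed if the command exits non-zero or does not
    finish within timeout_s seconds (subprocess.run kills it in that case).
    """
    for round_no in range(1, max_rounds + 1):
        try:
            result = subprocess.run(cmd, timeout=timeout_s)
        except subprocess.TimeoutExpired:
            print(f"round {round_no}: timed out", file=sys.stderr)
            return round_no
        if result.returncode != 0:
            print(f"round {round_no}: exit code {result.returncode}",
                  file=sys.stderr)
            return round_no
    return None
```

Note that if the host itself hangs, a script running on that host hangs with it, so the round number should also be logged somewhere that survives a forced reboot.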
  
Actual results:
The host becomes unresponsive and the monitor produces no output.
/var/log/messages shows:
...
Nov 11 22:01:02 intel-i7-12-4 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 11 22:01:02 intel-i7-12-4 kernel: qemu          D 0000000000000000     0 11938  11889 0x00000080
Nov 11 22:01:02 intel-i7-12-4 kernel: ffff8803211b1bd8 0000000000000082 0000000000000000 ffff880335c38d80
Nov 11 22:01:02 intel-i7-12-4 kernel: ffff8803211b1bf8 ffffffff814eca40 0000000000000000 ffffffff81095ac3
Nov 11 22:01:02 intel-i7-12-4 kernel: ffff880331697078 ffff8803211b1fd8 000000000000f4e8 ffff880331697078
Nov 11 22:01:02 intel-i7-12-4 kernel: Call Trace:
Nov 11 22:01:02 intel-i7-12-4 kernel: [<ffffffff814eca40>] ? thread_return+0x4e/0x77e
Nov 11 22:01:02 intel-i7-12-4 kernel: [<ffffffff81095ac3>] ? __hrtimer_start_range_ns+0x1a3/0x460
Nov 11 22:01:02 intel-i7-12-4 kernel: [<ffffffff814ee0ae>] __mutex_lock_slowpath+0x13e/0x180
Nov 11 22:01:02 intel-i7-12-4 kernel: [<ffffffff814edf4b>] mutex_lock+0x2b/0x50
Nov 11 22:01:02 intel-i7-12-4 kernel: [<ffffffff81112fe9>] generic_file_aio_write+0x59/0xe0
Nov 11 22:01:02 intel-i7-12-4 kernel: [<ffffffffa023ede1>] ext4_file_write+0x61/0x1e0 [ext4]
...

The full log is too long to paste here and will be attached.

Expected results:
Both guest and host work well.

Additional info:

processor	: 7
vendor_id	: GenuineIntel
cpu family	: 6
model		: 26
model name	: Intel(R) Core(TM) i7 CPU         920  @ 2.67GHz

12 GB RAM

qemu-kvm-0.12.1.2-2.209.el6.x86_64

Also, the guest installation loop sometimes triggers this bug:
"Bug 753694 - anaconda exits and call traces during installation "

Comment 1 Xiaoqing Wei 2011-11-15 06:10:52 UTC
Created attachment 533685 [details]
after force_reboot the host, collect all the log here.

The host became unresponsive: SSH did not work and the monitor produced no output,
so I unplugged the power cable and powered the host back on.

Comment 2 Xiaoqing Wei 2011-11-15 06:14:37 UTC
[root@intel-i7-12-4 sa]# /etc/init.d/kdump status 
Kdump is operational
[root@intel-i7-12-4 sa]# cat /proc/cmdline 
ro root=/dev/mapper/VolGroup-LogVol_root rd_NO_LUKS LANG=en_US.UTF-8 rd_NO_MD processor.max_cstate=1 nmi_watchdog=0 console=tty0 console=ttyS0,115200 SYSFONT=latarcyrheb-sun16 rd_LVM_LV=VolGroup/LogVol_swap rhgb crashkernel=129M@0M quiet rd_LVM_LV=VolGroup/LogVol_root  KEYBOARDTYPE=pc KEYTABLE=us rd_NO_DM


The host did not trigger kdump even after being hung for a long time.
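
When checking why no dump was taken, it can help to break the kernel command line above into its parameters. The sketch below does only that; the function name `parse_cmdline` is hypothetical. As general background rather than a diagnosis of this bug: hung-task warnings do not by themselves panic the kernel unless `kernel.hung_task_panic` is enabled, and the command line above boots with nmi_watchdog=0, so a hang without a panic would not be expected to start kdump.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into {name: [values]}.

    Parameters may repeat (for example console=), so values are kept in
    a list in order of appearance. Bare flags such as "ro" map to [None].
    """
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params.setdefault(key, []).append(value if sep else None)
    return params
```

On the host, the input would typically be the contents of /proc/cmdline.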

Comment 4 Xiaoqing Wei 2011-11-15 06:24:59 UTC
NOTE: For full details please refer to the attached tar file,
but for anyone who would like a quick glance, here are the call traces:


Nov 11 18:33:52 intel-i7-12-4 kernel: INFO: task master:7795 blocked for more than 120 seconds.
Nov 11 18:33:52 intel-i7-12-4 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 11 18:33:52 intel-i7-12-4 kernel: master        D 0000000000000004     0  7795   8150 0x00000080
Nov 11 18:33:52 intel-i7-12-4 kernel: ffff88021567faa8 0000000000000082 ffff88021567fa70 ffff88021567fa6c
Nov 11 18:33:52 intel-i7-12-4 kernel: ffff880300000000 ffff88033fc24900 ffff8800282d5fc0 0000000000000400
Nov 11 18:33:52 intel-i7-12-4 kernel: ffff88032e96ba78 ffff88021567ffd8 000000000000f4e8 ffff88032e96ba78
Nov 11 18:33:52 intel-i7-12-4 kernel: Call Trace:
Nov 11 18:33:52 intel-i7-12-4 kernel: [<ffffffff811a9420>] ? sync_buffer+0x0/0x50
Nov 11 18:33:52 intel-i7-12-4 kernel: [<ffffffff814ed1e3>] io_schedule+0x73/0xc0
Nov 11 18:33:52 intel-i7-12-4 kernel: [<ffffffff811a9460>] sync_buffer+0x40/0x50
Nov 11 18:33:52 intel-i7-12-4 kernel: [<ffffffff814edb9f>] __wait_on_bit+0x5f/0x90
Nov 11 18:33:52 intel-i7-12-4 kernel: [<ffffffff811a9420>] ? sync_buffer+0x0/0x50
Nov 11 18:33:52 intel-i7-12-4 kernel: [<ffffffff814edc48>] out_of_line_wait_on_bit+0x78/0x90
Nov 11 18:33:52 intel-i7-12-4 kernel: [<ffffffff81090c30>] ? wake_bit_function+0x0/0x50
Nov 11 18:33:52 intel-i7-12-4 kernel: [<ffffffff811a9416>] __wait_on_buffer+0x26/0x30
Nov 11 18:33:52 intel-i7-12-4 kernel: [<ffffffffa0244ef7>] __ext4_get_inode_loc+0x1e7/0x3b0 [ext4]
Nov 11 18:33:52 intel-i7-12-4 kernel: [<ffffffffa02451dc>] ext4_get_inode_loc+0x1c/0x20 [ext4]
Nov 11 18:33:52 intel-i7-12-4 kernel: [<ffffffffa0275bfc>] ext4_xattr_get+0x7c/0x2c0 [ext4]
Nov 11 18:33:52 intel-i7-12-4 kernel: [<ffffffffa0278c3b>] ext4_xattr_security_get+0x2b/0x30 [ext4]
Nov 11 18:33:52 intel-i7-12-4 kernel: [<ffffffff81199397>] generic_getxattr+0x87/0x90
Nov 11 18:33:52 intel-i7-12-4 kernel: [<ffffffff8120b755>] get_vfs_caps_from_disk+0x65/0xe0
Nov 11 18:33:52 intel-i7-12-4 kernel: [<ffffffff8120b9b8>] cap_bprm_set_creds+0x98/0x440
Nov 11 18:33:52 intel-i7-12-4 kernel: [<ffffffff8121829a>] selinux_bprm_set_creds+0x4a/0x2c0
Nov 11 18:33:52 intel-i7-12-4 kernel: [<ffffffff8113f880>] ? __vma_link_rb+0x30/0x40
Nov 11 18:33:52 intel-i7-12-4 kernel: [<ffffffff8113f92b>] ? vma_link+0x9b/0xf0
Nov 11 18:33:52 intel-i7-12-4 kernel: [<ffffffff8121887a>] ? selinux_vm_enough_memory+0x4a/0x60
Nov 11 18:33:52 intel-i7-12-4 kernel: [<ffffffff8113fa19>] ? insert_vm_struct+0x99/0x110
Nov 11 18:33:52 intel-i7-12-4 kernel: [<ffffffff8120c083>] security_bprm_set_creds+0x13/0x20
Nov 11 18:33:52 intel-i7-12-4 kernel: [<ffffffff8117cea1>] prepare_binprm+0xb1/0x110
Nov 11 18:33:52 intel-i7-12-4 kernel: [<ffffffff8117eecc>] do_execve+0x1bc/0x340
Nov 11 18:33:52 intel-i7-12-4 kernel: [<ffffffff8127722a>] ? strncpy_from_user+0x4a/0x90
Nov 11 18:33:52 intel-i7-12-4 kernel: [<ffffffff810095ea>] sys_execve+0x4a/0x80
Nov 11 18:33:52 intel-i7-12-4 kernel: [<ffffffff8100b54a>] stub_execve+0x6a/0xc0

Nov 11 22:01:02 intel-i7-12-4 kernel: INFO: task qemu:11938 blocked for more than 120 seconds.
Nov 11 22:01:02 intel-i7-12-4 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 11 22:01:02 intel-i7-12-4 kernel: qemu          D 0000000000000000     0 11938  11889 0x00000080
Nov 11 22:01:02 intel-i7-12-4 kernel: ffff8803211b1bd8 0000000000000082 0000000000000000 ffff880335c38d80
Nov 11 22:01:02 intel-i7-12-4 kernel: ffff8803211b1bf8 ffffffff814eca40 0000000000000000 ffffffff81095ac3
Nov 11 22:01:02 intel-i7-12-4 kernel: ffff880331697078 ffff8803211b1fd8 000000000000f4e8 ffff880331697078
Nov 11 22:01:02 intel-i7-12-4 kernel: Call Trace:
Nov 11 22:01:02 intel-i7-12-4 kernel: [<ffffffff814eca40>] ? thread_return+0x4e/0x77e
Nov 11 22:01:02 intel-i7-12-4 kernel: [<ffffffff81095ac3>] ? __hrtimer_start_range_ns+0x1a3/0x460
Nov 11 22:01:02 intel-i7-12-4 kernel: [<ffffffff814ee0ae>] __mutex_lock_slowpath+0x13e/0x180
Nov 11 22:01:02 intel-i7-12-4 kernel: [<ffffffff814edf4b>] mutex_lock+0x2b/0x50
Nov 11 22:01:02 intel-i7-12-4 kernel: [<ffffffff81112fe9>] generic_file_aio_write+0x59/0xe0
Nov 11 22:01:02 intel-i7-12-4 kernel: [<ffffffffa023ede1>] ext4_file_write+0x61/0x1e0 [ext4]
Nov 11 22:01:02 intel-i7-12-4 kernel: [<ffffffffa023ed80>] ? ext4_file_write+0x0/0x1e0 [ext4]
Nov 11 22:01:02 intel-i7-12-4 kernel: [<ffffffff8117619b>] do_sync_readv_writev+0xfb/0x140
Nov 11 22:01:02 intel-i7-12-4 kernel: [<ffffffff81090bf0>] ? autoremove_wake_function+0x0/0x40
Nov 11 22:01:02 intel-i7-12-4 kernel: [<ffffffff8121902b>] ? selinux_file_permission+0xfb/0x150
Nov 11 22:01:02 intel-i7-12-4 kernel: [<ffffffff8120c3c6>] ? security_file_permission+0x16/0x20
Nov 11 22:01:02 intel-i7-12-4 kernel: [<ffffffff8117722f>] do_readv_writev+0xcf/0x1f0
Nov 11 22:01:02 intel-i7-12-4 kernel: [<ffffffff810829d6>] ? group_send_sig_info+0x56/0x70
Nov 11 22:01:02 intel-i7-12-4 kernel: [<ffffffff81082a2f>] ? kill_pid_info+0x3f/0x60
Nov 11 22:01:02 intel-i7-12-4 kernel: [<ffffffff81177396>] vfs_writev+0x46/0x60
Nov 11 22:01:02 intel-i7-12-4 kernel: [<ffffffff81177452>] sys_pwritev+0xa2/0xc0
Nov 11 22:01:02 intel-i7-12-4 kernel: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
Nov 11 22:01:02 intel-i7-12-4 kernel: INFO: task qemu:12015 blocked for more than 120 seconds.
Nov 11 22:01:02 intel-i7-12-4 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 11 22:01:02 intel-i7-12-4 kernel: qemu          D 0000000000000007     0 12015  11889 0x00000080
Nov 11 22:01:02 intel-i7-12-4 kernel: ffff88021557fbd8 0000000000000082 0000000000000000 0000000000000000
Nov 11 22:01:02 intel-i7-12-4 kernel: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
Nov 11 22:01:02 intel-i7-12-4 kernel: ffff8803310c2638 ffff88021557ffd8 000000000000f4e8 ffff8803310c2638
Nov 11 22:01:02 intel-i7-12-4 kernel: Call Trace:
Nov 11 22:01:02 intel-i7-12-4 kernel: [<ffffffff814ee0ae>] __mutex_lock_slowpath+0x13e/0x180
Nov 11 22:01:02 intel-i7-12-4 kernel: [<ffffffff814edf4b>] mutex_lock+0x2b/0x50
Nov 11 22:01:02 intel-i7-12-4 kernel: [<ffffffff81112fe9>] generic_file_aio_write+0x59/0xe0
Nov 11 22:01:02 intel-i7-12-4 kernel: [<ffffffffa023ede1>] ext4_file_write+0x61/0x1e0 [ext4]
Nov 11 22:01:02 intel-i7-12-4 kernel: [<ffffffffa023ed80>] ? ext4_file_write+0x0/0x1e0 [ext4]
Nov 11 22:01:02 intel-i7-12-4 kernel: [<ffffffff8117619b>] do_sync_readv_writev+0xfb/0x140
Nov 11 22:01:02 intel-i7-12-4 kernel: [<ffffffff81090bf0>] ? autoremove_wake_function+0x0/0x40
Nov 11 22:01:02 intel-i7-12-4 kernel: [<ffffffff8121902b>] ? selinux_file_permission+0xfb/0x150
Nov 11 22:01:02 intel-i7-12-4 kernel: [<ffffffff8120c3c6>] ? security_file_permission+0x16/0x20
Nov 11 22:01:02 intel-i7-12-4 kernel: [<ffffffff8117722f>] do_readv_writev+0xcf/0x1f0
Nov 11 22:01:02 intel-i7-12-4 kernel: [<ffffffff81177396>] vfs_writev+0x46/0x60
Nov 11 22:01:02 intel-i7-12-4 kernel: [<ffffffff81177452>] sys_pwritev+0xa2/0xc0
Nov 11 22:01:02 intel-i7-12-4 kernel: [<ffffffff8100b0f2>] system_call_fastpath+0x16/0x1b
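
For anyone going through the full attached log, a short script can condense each hung-task report into the task name, PID, and the frames of the call trace. The following is a rough sketch that assumes the syslog line format shown above; the names `summarize_hung_tasks`, `HUNG_RE`, and `FRAME_RE` are made up for illustration. Frames marked with "?" are skipped because the kernel flags them as unreliable.

```python
import re

# "INFO: task qemu:11938 blocked for more than 120 seconds."
HUNG_RE = re.compile(r"INFO: task (\S+):(\d+) blocked for more than \d+ seconds")
# "[<ffffffff814edf4b>] mutex_lock+0x2b/0x50", optionally with a "? " marker
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\] (\? )?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def summarize_hung_tasks(lines):
    """Return a list of (task, pid, frames), one entry per hung-task report.

    frames holds the function names from the call trace that follows the
    report, in order, excluding entries marked with "?".
    """
    reports = []
    current = None
    for line in lines:
        hung = HUNG_RE.search(line)
        if hung:
            current = (hung.group(1), int(hung.group(2)), [])
            reports.append(current)
            continue
        frame = FRAME_RE.search(line)
        if frame and current is not None and not frame.group(1):
            current[2].append(frame.group(2))
    return reports
```

Applied to the two qemu reports above, both traces show the task waiting in mutex_lock under generic_file_aio_write, called from ext4_file_write.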

Comment 9 Xiaoqing Wei 2012-06-05 09:49:48 UTC
Hi, I tried with the latest qemu-kvm and kernel
and could not reproduce the problem in 100 rounds.

kernel-2.6.32-274.el6.x86_64 
qemu-kvm-0.12.1.2-2.295.el6.x86_64

