Bug 699310

Summary: call trace when there is high load in pv guest
Product: Red Hat Enterprise Linux 6
Reporter: Qixiang Wan <qwan>
Component: kernel
Assignee: Xen Maintainance List <xen-maint>
Status: CLOSED NOTABUG
QA Contact: Virtualization Bugs <virt-bugs>
Severity: medium
Docs Contact:
Priority: medium
Version: 6.1
CC: drjones, leiwang, pbonzini
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2011-06-01 13:54:57 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Bug Depends On:
Bug Blocks: 523117
Attachments:
Description | Flags
RHEL6.1 32bit PV DomU kernel dmesg | none

Description Qixiang Wan 2011-04-25 02:59:34 UTC
Created attachment 494602
RHEL6.1 32bit PV DomU kernel dmesg

Description of problem:
A RHEL6.1 Xen PV guest logs a call trace when the guest is under high load. Running a kernel compile loop and an iozone test loop at the same time reproduces the problem in about 2 days. The guest keeps working even though the call trace appears in the kernel log.

Version-Release number of selected component (if applicable):
kernel-2.6.32-131.0.5.el6

How reproducible:
100%

Steps to Reproduce:
1. Install a 32-bit RHEL6.1 as a Xen PV guest.
2. Run a kernel compile loop and an iozone test loop in the guest.
3. Check the guest kernel dmesg after 1~3 days.
  
Actual results:
A call trace appears in the kernel dmesg:
-------------------------------------------------------------------------
hrtimer: interrupt took 2035855 ns
INFO: task iozone:13096 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
iozone        D 00000000     0 13096   1771 0x00000080
 eaa45030 00000282 04c4fe5d 00000000 c243fe40 b750b98c ffffffff 000c2df7
 00000000 c243fe40 0000b333 d6e89cec 0000b333 c0ae2120 c0ae2120 eaa452d8
 c0ae2120 c0addb54 c0ae2120 eaa452d8 ec884000 3f4fb9e5 0001c099 42b5ace7
Call Trace:
 [<c0407391>] ? xen_sched_clock+0x21/0x90
 [<c043e925>] ? update_curr+0x185/0x2c0
 [<c0822f85>] ? schedule_timeout+0x195/0x250
 [<c043f4bd>] ? enqueue_entity+0x37d/0x400
 [<c0822ce9>] ? wait_for_common+0xe9/0x150
 [<c044bf20>] ? default_wake_function+0x0/0x10
 [<c054a3e2>] ? sync_inodes_sb+0x72/0x150
 [<c054f063>] ? __sync_filesystem+0x63/0x70
 [<c054f13c>] ? sync_filesystems+0xcc/0x100
 [<c054f1b8>] ? sys_sync+0x18/0x40
 [<c0824864>] ? syscall_call+0x7/0xb
-------------------------------------------------------------------------
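
For anyone repeating this multi-day run, a small filter can make the hung-task warning easier to spot in a long kernel log. The following is only a minimal sketch and not part of the original report: the function name `hung_tasks` is hypothetical, and it assumes the message format shown in the trace above ("INFO: task NAME:PID blocked for more than N seconds.").

```shell
# Hypothetical helper (not from the report): read kernel log text on stdin and
# print "NAME PID SECONDS" for every hung-task warning it contains.
hung_tasks() {
    sed -n 's/.*INFO: task \(.*\):\([0-9][0-9]*\) blocked for more than \([0-9][0-9]*\) seconds\..*/\1 \2 \3/p'
}

# Example: hung_tasks < saved-dmesg.txt   (or: dmesg | hung_tasks)
```

Lines that do not match the warning format are dropped, so an empty result suggests that no hung-task warning was logged.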

Expected results:
There should be no call trace.

Additional info:

Comment 2 Andrew Jones 2011-04-26 08:47:10 UTC
Hi,

A couple questions to get more info.

Was the Xen host being used to run other guests with some load? Or was dom0 running something as well? Did iozone stay in D-state? Or did it return to running, or just exit?

Thanks,
Drew
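
One way to answer the D-state question while the test is still running is to read the `State:` line from `/proc/<pid>/status`. A minimal sketch, assuming a Linux guest; the function name `state_from_status` is hypothetical, and 13096 is simply the PID taken from the trace in the description.

```shell
# Hypothetical helper: print the one-letter state code (R, S, D, ...) from the
# text of a /proc/<pid>/status file supplied on stdin.
state_from_status() {
    awk '/^State:/ { print $2; exit }'
}

# Example: state_from_status < /proc/13096/status
# "D" means uninterruptible sleep, the state shown in the hung-task report.
```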

Comment 3 Qixiang Wan 2011-04-26 09:14:18 UTC
(In reply to comment #2)
> Hi,
> 
> A couple questions to get more info.
> 
> Was the Xen host being used to run other guests with some load? Or was dom0
> running something as well? 

There are 2 RHEL6.1 PV DomUs (one 32-bit and the other 64-bit) running on the host with the same tests (kernel compile loop and iozone loop). Only the 32-bit guest showed the call trace when I checked their status; I am not sure whether this only happens with 32-bit.

> Did iozone stay in D-state? Or did it return to running, or just exit?

At least I saw iozone running in the screen output. The loop was started with "while sleep 1; do iozone -a -n 10M -g 50M -i 0 -i 1 -i 5 -R;done", so I think it either returned to running or just exited, since the loop would have been blocked if the process had stayed in D-state. I am not sure whether it exited at the time of the call trace.
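
Whether an iteration stalled or iozone exited could be checked after the fact if each run were logged. A rough sketch under that assumption; `run_logged` and the log path are hypothetical and were not used in the original test.

```shell
# Hypothetical wrapper: run a command and append its start time, end time
# (seconds since the epoch) and exit status to a log file.
run_logged() {
    rl_log=$1
    shift
    rl_start=$(date +%s)
    "$@"
    rl_rc=$?
    rl_end=$(date +%s)
    printf '%s start=%s end=%s rc=%s\n' "$1" "$rl_start" "$rl_end" "$rl_rc" >> "$rl_log"
    return "$rl_rc"
}

# Example, wrapping the loop quoted above:
# while sleep 1; do run_logged /tmp/iozone-loop.log iozone -a -n 10M -g 50M -i 0 -i 1 -i 5 -R; done
```

An entry whose end time is far later than its start time around the time of the trace would suggest a stalled iteration, while a non-zero rc would suggest that iozone exited with an error.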

Comment 4 Paolo Bonzini 2011-04-26 11:28:27 UTC
I don't think this is a bug.  It may even happen in a non-virtualized environment.