Bug 443621

Summary: kernel panic xen cluster.
Product: Red Hat Enterprise Linux 5
Reporter: makoto nohara <nohara.makoto>
Component: kernel-xen
Assignee: Xen Maintainance List <xen-maint>
Status: CLOSED NOTABUG
QA Contact: Martin Jenner <mjenner>
Severity: high
Priority: low
Version: 5.0
CC: clalance, matt.baker, prickett233, tao, xen-maint
Target Milestone: rc
Hardware: i386
OS: Linux
Doc Type: Bug Fix
Last Closed: 2009-05-04 15:46:58 UTC
Bug Blocks: 492568

Description makoto nohara 2008-04-22 15:49:33 UTC
My English is not good, so I apologize for any strange sentences. ;-)

The following kernel panic occurred.
------------------
BUG: unable to handle kernel paging request at virtual address e07ba040
 printing eip:
c04d70a2
1027c000 -> *pde = 00000000:d7f6a001
0896a000 -> *pme = 00000000:030fb067
000fb000 -> *pte = 00000000:00000000
Oops: 0000 [#1]
SMP
last sysfs file: /class/misc/evtchn/dev
Modules linked in: xt_physdev ipt_MASQUERADE ip_conntrack_ftp iptable_nat ip_nat
ip_conntrack nfnetlink iptable_filter ip_tables x_tables netloop netbk blktap
blkbk drbd(U) autofs4 hidp rfcomm l2cap bluetooth sunrpc bridge dummy 8021q ipv6
dm_mirror dm_mod video sbs i2c_ec button battery asus_acpi ac parport_pc lp
parport sg ide_cd i2c_i801 i2c_core cdrom pcspkr tg3 serio_raw 8250_pnp 8250
serial_core r8169 ata_piix libata sd_mod scsi_mod raid1 ext3 jbd ehci_hcd
ohci_hcd uhci_hcd
CPU:    0
EIP:    0061:[<c04d70a2>]    Not tainted VLI
EFLAGS: 00210282   (2.6.18-8.1.8.el5xen #1)
EIP is at csum_partial+0xca/0x120
eax: 00000000   ebx: c04d70a2   ecx: 0000000b   edx: 000005a8
esi: e07ba068   edi: 000005a8   ebp: 00000034   esp: c06fbdfc
ds: 007b   es: 007b   ss: 0069
Process at-spi-registry (pid: 11203, ti=c06fb000 task=ce1b1000 task.ti=c9ace000)
Stack: e07ba000 00000034 c059778e e07ba040 000005a8 00000000 00000010 d226d76c
       00000000 00000020 000005dc c08d6d14 c0598693 c08d6c00 000005a8 d226d76c
       d784bccc d784bce0 c06fbef8 c059c0c3 4c738383 c8eec060 00000003 c06fbef8
Call Trace:
 [<c059778e>] skb_checksum+0x111/0x27b
 [<c0598693>] pskb_expand_head+0xcf/0x113
 [<c059c0c3>] skb_checksum_help+0x64/0xb3
 [<e14462ee>] ip_nat_fn+0x42/0x185 [iptable_nat]
 [<e1446628>] ip_nat_local_fn+0x34/0xa4 [iptable_nat]
 [<c05b89a4>] dst_output+0x0/0x7
 [<c05b1738>] nf_iterate+0x30/0x61
 [<c05b89a4>] dst_output+0x0/0x7
 [<c05b185e>] nf_hook_slow+0x3a/0x90
 [<c05b89a4>] dst_output+0x0/0x7
 [<c05babc3>] ip_queue_xmit+0x37e/0x3cf
 [<c05b89a4>] dst_output+0x0/0x7
 [<e1068a97>] scsi_dispatch_cmd+0x21f/0x28c [scsi_mod]
 [<c041556a>] enqueue_task+0x29/0x39
 [<c045e7cd>] kmem_cache_alloc+0x54/0x5e
 [<c05c8527>] tcp_transmit_skb+0x5e4/0x612
 [<c05c9265>] tcp_retransmit_skb+0x4b7/0x595
 [<c05c247b>] tcp_enter_loss+0x1a2/0x1ff
 [<c05cb311>] tcp_write_timer+0x0/0x5d3
 [<c05cb710>] tcp_write_timer+0x3ff/0x5d3
 [<c0424a25>] run_timer_softirq+0x101/0x15c
 [<c041ffa7>] __do_softirq+0x5e/0xc3
 [<c040679c>] do_softirq+0x56/0xae
 [<c040673d>] do_IRQ+0xa5/0xae
 [<c053a0ad>] evtchn_do_upcall+0x64/0x9b
 [<c0404ec5>] hypervisor_callback+0x3d/0x48
 [<c05399fc>] force_evtchn_callback+0xa/0xc
 [<c05ee1fd>] unix_write_space+0x3f/0x69
 [<c0596ccb>] sock_wfree+0x21/0x36
 [<c059849b>] __kfree_skb+0x97/0xe3
 [<c05ecc9f>] unix_stream_recvmsg+0x33f/0x4a4
 [<c0592eb4>] do_sock_read+0xae/0xb7
 [<c0593411>] sock_aio_read+0x53/0x61
 [<c0461fa7>] do_sync_read+0xb6/0xf1
 [<c042cc1d>] autoremove_wake_function+0x0/0x2d
 [<c059346c>] sock_ioctl+0x0/0x1b3
 [<c04628c1>] vfs_read+0xb0/0x141
 [<c0462cfe>] sys_read+0x3c/0x63
 [<c0404cff>] syscall_call+0x7/0xb
 =======================
Code: 9c 13 46 a0 13 46 a4 13 46 a8 13 46 ac 13 46 b0 13 46 b4 13 46 b8 13 46 bc
13 46 c0 13 46 c4 13 46 c8 13 46 cc 13 46 d0 13 46 d4 <13> 46 d8 13 46 dc 13 46
e0 13 46 e4 13 46 e8 13 46 ec 13 46 f0
EIP: [<c04d70a2>] csum_partial+0xca/0x120 SS:ESP 0069:c06fbdfc
 <0>Kernel panic - not syncing: Fatal exception in interrupt

 BUG: warning at arch/i386/kernel/smp-xen.c:529/smp_call_function() (Not tainted)
 [<c040db7f>] smp_call_function+0x59/0xfe
 [<c040dc37>] smp_send_stop+0x13/0x1e
 [<c041b470>] panic+0x45/0x16d
 [<c040595a>] die+0x24e/0x282
 [<c05f6812>] do_page_fault+0xa7a/0xbeb
 [<c04d70a2>] csum_partial+0xca/0x120
 [<c05f5d98>] do_page_fault+0x0/0xbeb
 [<c0404e83>] error_code+0x2b/0x30
 [<c04d70a2>] csum_partial+0xca/0x120
 [<c04d70a2>] csum_partial+0xca/0x120
 [<c059778e>] skb_checksum+0x111/0x27b
 [<c0598693>] pskb_expand_head+0xcf/0x113
 [<c059c0c3>] skb_checksum_help+0x64/0xb3
 [<e14462ee>] ip_nat_fn+0x42/0x185 [iptable_nat]
 [<e1446628>] ip_nat_local_fn+0x34/0xa4 [iptable_nat]
 [<c05b89a4>] dst_output+0x0/0x7
 [<c05b1738>] nf_iterate+0x30/0x61
 [<c05b89a4>] dst_output+0x0/0x7
 [<c05b185e>] nf_hook_slow+0x3a/0x90
 [<c05b89a4>] dst_output+0x0/0x7
 [<c05babc3>] ip_queue_xmit+0x37e/0x3cf
 [<c05b89a4>] dst_output+0x0/0x7
 [<e1068a97>] scsi_dispatch_cmd+0x21f/0x28c [scsi_mod]
 [<c041556a>] enqueue_task+0x29/0x39
 [<c045e7cd>] kmem_cache_alloc+0x54/0x5e
 [<c05c8527>] tcp_transmit_skb+0x5e4/0x612
 [<c05c9265>] tcp_retransmit_skb+0x4b7/0x595
 [<c05c247b>] tcp_enter_loss+0x1a2/0x1ff
 [<c05cb311>] tcp_write_timer+0x0/0x5d3
 [<c05cb710>] tcp_write_timer+0x3ff/0x5d3
 [<c0424a25>] run_timer_softirq+0x101/0x15c
 [<c041ffa7>] __do_softirq+0x5e/0xc3
 [<c040679c>] do_softirq+0x56/0xae
 [<c040673d>] do_IRQ+0xa5/0xae
 [<c053a0ad>] evtchn_do_upcall+0x64/0x9b
 [<c0404ec5>] hypervisor_callback+0x3d/0x48
 [<c05399fc>] force_evtchn_callback+0xa/0xc
 [<c05ee1fd>] unix_write_space+0x3f/0x69
 [<c0596ccb>] sock_wfree+0x21/0x36
 [<c059849b>] __kfree_skb+0x97/0xe3
 [<c05ecc9f>] unix_stream_recvmsg+0x33f/0x4a4
 [<c0592eb4>] do_sock_read+0xae/0xb7
 [<c0593411>] sock_aio_read+0x53/0x61
 [<c0461fa7>] do_sync_read+0xb6/0xf1
 [<c042cc1d>] autoremove_wake_function+0x0/0x2d
 [<c059346c>] sock_ioctl+0x0/0x1b3
 [<c04628c1>] vfs_read+0xb0/0x141
 [<c0462cfe>] sys_read+0x3c/0x63
 [<c0404cff>] syscall_call+0x7/0xb
 =======================
(XEN) Domain 0 crashed: rebooting machine in 5 seconds.
-----------
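
The three "->" lines near the top of the oops are the page-table walk for the faulting address, with each entry printed as two 32-bit halves (high:low). The following is only a minimal sketch of how to read them, assuming the usual x86 convention that bit 0 of a page-table entry is the Present flag. The helper name entry_present is made up for illustration and is not part of any kernel tool.

```python
# Page-table entries copied from the oops above, in "high:low" form.
ENTRIES = {
    "pde": "00000000:d7f6a001",
    "pme": "00000000:030fb067",
    "pte": "00000000:00000000",
}


def entry_present(text):
    """Return True if bit 0 (Present) is set in a 'high:low' entry.

    Assumption: standard x86 page-table layout, where bit 0 of the
    64-bit PAE entry is the Present flag.
    """
    high, low = (int(part, 16) for part in text.split(":"))
    value = (high << 32) | low
    return bool(value & 1)


if __name__ == "__main__":
    for name, text in ENTRIES.items():
        state = "present" if entry_present(text) else "NOT present"
        print(name, text, state)
```

Under that assumption the first two levels are present and the final PTE is all zeroes, which is consistent with the "unable to handle kernel paging request" message: the address e07ba040 had no mapping at the time of the access.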


Additional info:

The system runs for 20-30 days, and then one day the kernel panics suddenly.
According to the panic message the problem occurs in csum_partial, which I
find hard to believe.
csum_partial_copy_generic is a function that is used very often.
If it had a problem, the kernel should panic much more frequently.
Instead, the system runs for 20-30 days before it panics.
Why?

The environment:
DRBD 8.0.5.1 (compiled from source)
heartbeat 2.1.2 (compiled from source)
Xen hypervisor 3.0.3-25.el5 (included in RHEL5.0)
Linux kernel 2.6.18-8.el5xen (included in RHEL5.0)
The failover cluster was built from these packages.

Comment 1 makoto nohara 2008-04-22 16:05:25 UTC
Hardware environment:
------------------
PowerEdge 860 (CPU: Intel 3040)
Memory: 4 GB
HDD: 2 x 80 GB (software RAID1)
-----------------
The failover cluster is built from machines similar to the one described above.


Comment 2 makoto nohara 2008-04-22 16:22:12 UTC
RHES4.5 runs as a guest on Xen under RHEL5.0.
In other words:
Dom0 = RHEL5.0.
DomU = RHES4.5.
When a failure occurs, the VM (DomU) is started on the other machine.



Comment 3 makoto nohara 2008-04-23 01:58:03 UTC
The Xeon 3040 installed in the Dell 860 has a microcode bug, but a BIOS
version that corrects that bug is in use.

I'm sorry that I keep adding necessary information piecemeal in follow-up
comments. After writing each one I wonder, "Is this information needed
too?".....



Comment 4 Daniel Berrangé 2008-07-09 13:27:01 UTC
The kernel and hypervisor versions you are using here:

  Xen hypervisor 3.0.3-25.el5(include RHEL5.0)
  Linux Kernel 2.6.18-8.el5xen(include RHEL5.0)

are still the RHEL-5.0 GA releases. Could you re-test with the RHEL-5.2 errata
applied to the system?

Comment 5 makoto nohara 2008-07-09 14:08:32 UTC
I am now testing a RHEL-5.2 system.
However, it has not yet been running continuously for long enough.
A little more time is needed before the result is known.

Comment 7 Chris Lalancette 2009-01-13 07:49:48 UTC
*** Bug 479756 has been marked as a duplicate of this bug. ***

Comment 8 makoto nohara 2009-02-02 09:06:05 UTC
I have been testing the RHEL-5.2 system.
However, a panic has occurred.
This panic might have a different cause from the panic that occurred before.


The environment:
Linux kernel 2.6.18-92.1.1.el5xen (downloaded from Red Hat Network)
drbd-8.2.6-3 (compiled from source)
heartbeat-2.1.3-1 (compiled from source)
xen-3.0.3-64.el5 (included in RHEL5.2)

------------
      KERNEL: /usr/lib/debug/lib/modules/2.6.18-92.1.1.el5xen/vmlinux
    DUMPFILE: /mnt/127.0.0.1-2009-01-08-12:33:06/vmcore
        CPUS: 2
        DATE: Thu Jan  8 12:32:46 2009
      UPTIME: 22 days, 20:09:27
LOAD AVERAGE: 1.73, 1.23, 1.17
       TASKS: 226
    NODENAME: XXXXX1
     RELEASE: 2.6.18-92.1.1.el5xen
     VERSION: #1 SMP Thu May 22 09:31:19 EDT 2008
     MACHINE: i686  (1866 Mhz)
      MEMORY: 520 MB
       PANIC: "Oops: 0000 [#1]" (check log for details)
         PID: 979
     COMMAND: "nautilus"
        TASK: de5d5aa0  [THREAD_INFO: cfefc000]
         CPU: 0
       STATE: TASK_RUNNING (PANIC)
------------


crash log
-------------
BUG: unable to handle kernel paging request at virtual address e071d668
 printing eip:
c04e325a
00ecb000 -> *pde = 00000000:c6a49001
197c7000 -> *pme = 00000000:3e0fc067
000fc000 -> *pte = 00000000:00000000
Oops: 0000 [#1]
SMP
last sysfs file: /devices/pci0000:00/0000:00:00.0/irq
Modules linked in: xt_physdev ip_conntrack_ftp netloop netbk blktap blkbk ipt_MASQUERADE iptable_nat ip_nat xt_state ip_conntrack nfnetlink ipt_REJECT
xt_tcpudp iptable_filter ip_tables x_tables drbd(U) autofs4 hidp rfcomm l2cap bluetooth sunrpc bridge dummy 8021q dm_mirror dm_multipath dm_mod
video sbs backlight i2c_ec button battery asus_acpi ac ipv6 xfrm_nalgo crypto_api parport_pc lp parport sg i3000_edac edac_mc ide_cd r8169 i2c_i801
i2c_core pcspkr serial_core cdrom tg3 serio_raw ata_piix libata sd_mod scsi_mod raid1 ext3 jbd uhci_hcd ohci_hcd ehci_hcd
CPU:    0
EIP:    0061:[<c04e325a>]    Tainted: G      VLI
EFLAGS: 00010296   (2.6.18-92.1.1.el5xen #1)
EIP is at csum_partial+0xca/0x120
eax: 00000000   ebx: c04e325a   ecx: 0000000b   edx: 000005a8
esi: e071d690   edi: 000005a8   ebp: 00000034   esp: c071cde8
ds: 007b   es: 007b   ss: 0069
Process nautilus (pid: 979, ti=c071c000 task=de5d5aa0 task.ti=cfefc000)
Stack: e071d000 00000034 c05a9f12 e071d668 000005a8 00000000 00000010 db73112c
       00000000 00000020 000005dc cbea7f14 c05aae25 cbea7e00 000005a8 db73112c
       c98aa2cc c98aa2e0 c071cef8 c05ae744 c4d5de4c 00000000 00000003 c071cef8
Call Trace:
 [<c05a9f12>] skb_checksum+0x111/0x27b
 [<c05aae25>] pskb_expand_head+0xd6/0x11a
 [<c05ae744>] skb_checksum_help+0x64/0xb3
 [<e14bd2ee>] ip_nat_fn+0x42/0x185 [iptable_nat]
 [<c0469c92>] kmem_cache_alloc+0x54/0x5e
 [<e14bd628>] ip_nat_local_fn+0x34/0xa4 [iptable_nat]
 [<c05cb3c8>] dst_output+0x0/0x7
 [<c05c3e3c>] nf_iterate+0x30/0x61
 [<c05cb3c8>] dst_output+0x0/0x7
 [<c05c3f62>] nf_hook_slow+0x3a/0x90
 [<c05cb3c8>] dst_output+0x0/0x7
 [<c05cd711>] ip_queue_xmit+0x3cd/0x41e
 [<c05cb3c8>] dst_output+0x0/0x7
 [<c041b708>] __activate_task+0x1c/0x29
 [<c041bfd4>] try_to_wake_up+0x309/0x313
 [<e14f3dff>] net_rx_action+0x771/0x7de [netbk]
 [<c05db34d>] tcp_transmit_skb+0x5e4/0x612
 [<c043190c>] autoremove_wake_function+0xd/0x2d
 [<c041a99f>] __wake_up_common+0x2f/0x53
 [<c05dc0a6>] tcp_retransmit_skb+0x4c0/0x59e
 [<c041b567>] __wake_up+0x2a/0x3d
 [<c05d5223>] tcp_enter_loss+0x1a2/0x1ff
 [<c05de19c>] tcp_write_timer+0x0/0x5e4
 [<c05de5a1>] tcp_write_timer+0x405/0x5e4
 [<c0429600>] run_timer_softirq+0x101/0x15c
 [<c042613e>] __do_softirq+0x5e/0xc3
 [<c0406edf>] do_softirq+0x56/0xaf
 [<c0406e80>] do_IRQ+0xa5/0xae
 [<c0549b63>] evtchn_do_upcall+0x64/0x9b
 [<c04055d9>] hypervisor_callback+0x3d/0x48
 =======================
Code: 9c 13 46 a0 13 46 a4 13 46 a8 13 46 ac 13 46 b0 13 46 b4 13 46 b8 13 46 bc 13 46 c0 13 46 c4 13 46 c8 13 46 cc 13 46 d0 13 46 d4 <13> 46 d8
 13 46 dc 13 46 e0 13 46 e4 13 46 e8 13 46 ec 13 46 f0
EIP: [<c04e325a>] csum_partial+0xca/0x120 SS:ESP 0069:c071cde8

-------------

 
crash> bt
PID: 979    TASK: de5d5aa0  CPU: 0   COMMAND: "nautilus"
 #0 [c071cd0c] die at c040606e
 #1 [c071cd38] do_page_fault at c060abfc
 #2 [c071cdb0] error_code (via page_fault) at c0405595
    EAX: 00000000  EBX: c04e325a  ECX: 0000000b  EDX: 000005a8  EBP: 00000034
    DS:  007b      ESI: e071d690  ES:  007b      EDI: 000005a8
    CS:  0061      EIP: c04e325a  ERR: ffffffff  EFLAGS: 00010296
 #3 [c071cde4] csum_partial at c04e325a
 #4 [c071cdf0] skb_checksum at c05a9f0d
 #5 [c071ce34] skb_checksum_help at c05ae73f
 #6 [c071ce48] ip_nat_fn at e14bd2e9
 #7 [c071ce6c] ip_nat_local_fn at e14bd623
 #8 [c071ce80] nf_iterate at c05c3e39
 #9 [c071cea0] nf_hook_slow at c05c3f5d
#10 [c071cecc] ip_queue_xmit at c05cd70c
#11 [c071cf60] tcp_transmit_skb at c05db34b
#12 [c071cf94] tcp_retransmit_skb at c05dc0a1
#13 [c071cfbc] tcp_write_timer at c05de59c
#14 [c071cfcc] run_timer_softirq at c04295fe
#15 [c071cfe8] __do_softirq at c042613c
--- <soft IRQ> ---
------------

crash> ps
   PID    PPID  CPU   TASK    ST  %MEM     VSZ    RSS  COMM
      0      0   0  c066f2c0  RU   0.0       0      0  [swapper]
      0      1   1  c0d60550  RU   0.0       0      0  [swapper]
      1      0   0  c0d60aa0  IN   0.1    2076    680  init
      2      1   0  c0d60000  IN   0.0       0      0  [migration/0]
      3      1   0  c0198aa0  IN   0.0       0      0  [ksoftirqd/0]
      4      1   0  c0198550  IN   0.0       0      0  [watchdog/0]
      5      1   1  c0198000  IN   0.0       0      0  [migration/1]
      6      1   1  c5568aa0  IN   0.0       0      0  [ksoftirqd/1]
      7      1   1  c5568550  IN   0.0       0      0  [watchdog/1]
      8      1   0  c5568000  IN   0.0       0      0  [events/0]
      9      1   1  c0cfeaa0  IN   0.0       0      0  [events/1]
     10      1   1  c0cfe550  IN   0.0       0      0  [khelper]
     11      1   0  c0cfe000  IN   0.0       0      0  [kthread]
     13     11   0  c0cc2550  IN   0.0       0      0  [xenwatch]
     14     11   0  c0cc2000  IN   0.0       0      0  [xenbus]
     17     11   0  c0cb8000  IN   0.0       0      0  [kblockd/0]
     18     11   1  c0c9eaa0  IN   0.0       0      0  [kblockd/1]
     19     11   0  c0c9e550  IN   0.0       0      0  [kacpid]
     98     11   0  c0c0b000  IN   0.0       0      0  [cqueue/0]
     99     11   1  c0c11aa0  IN   0.0       0      0  [cqueue/1]
    103     11   0  c0c1b550  IN   0.0       0      0  [khubd]
    105     11   0  c0c20aa0  IN   0.0       0      0  [kseriod]
    170     11   0  c07d7aa0  IN   0.0       0      0  [pdflush]
    171     11   1  c07d1000  IN   0.0       0      0  [pdflush]
    172     11   0  c07d1550  IN   0.0       0      0  [kswapd0]
    173     11   0  c07d1aa0  IN   0.0       0      0  [aio/0]
    174     11   1  c0c4a000  IN   0.0       0      0  [aio/1]
    320     11   1  c0c6aaa0  IN   0.0       0      0  [kpsmoused]
    354     11   0  c0c58aa0  IN   0.0       0      0  [ata/0]
    355     11   1  c0c58550  IN   0.0       0      0  [ata/1]
    356     11   0  c0c58000  IN   0.0       0      0  [ata_aux]
    360     11   0  c0c79550  IN   0.0       0      0  [scsi_eh_0]
    361     11   0  c0c89550  IN   0.0       0      0  [scsi_eh_1]
    364     11   0  c0c2eaa0  RU   0.0       0      0  [md1_raid1]
    367     11   0  c0c28aa0  IN   0.0       0      0  [md0_raid1]
    368     11   0  c0c28000  IN   0.0       0      0  [kjournald]
    396     11   1  c0c28550  IN   0.0       0      0  [kauditd]
    430      1   1  c0c1b000  IN   0.2    2440    888  udevd
    675  31848   1  de1bd550  IN   1.3   24024   7176  gnome-session
    823    675   0  cb8bc550  DE   0.0       0      0  Xsession
    826    675   1  d0120550  IN   0.1    6492    604  ssh-agent
    855      1   0  d1536aa0  IN   0.1    2824    788  dbus-launch
    857      1   0  d0705aa0  IN   0.2    2756    956  dbus-daemon
    875      1   1  d1ea1aa0  IN   0.7    8224   3632  gconfd-2
    876      1   1  c8f59000  IN   1.5   38228   7844  scim-panel-gtk
    877      1   1  d1536000  IN   1.5   38228   7844  scim-panel-gtk
    878      1   1  cb0f5550  IN   0.2    9224    808  scim-launcher
    903     11   1  df218000  IN   0.0       0      0  [kedac]
    905      1   0  c6fa3aa0  IN   0.1    2576    764  gnome-keyring-d
    907      1   1  d64b8aa0  IN   1.5   34872   8088  gnome-settings-
    946      1   1  de979aa0  ??   1.5   34872   8088  gnome-settings-
    972      1   1  e0754aa0  IN   2.3   28320  12200  metacity
    977      1   1  d1c3baa0  IN   3.0   58048  15960  gnome-panel
>   979      1   0  de5d5aa0  RU  11.6  137020  61616  nautilus
    983      1   1  de5d5000  IN   0.6   40764   2996  bonobo-activati
    984      1   1  c62b4000  IN   0.6   40764   2996  bonobo-activati
    985      1   1  d257eaa0  IN   0.9   23556   4880  gnome-volume-ma
    987      1   0  deefc000  IN   1.6   45820   8428  eggcups
    989      1   0  c62b4550  IN   0.7   12392   3660  gnome-vfs-daemo
   1008      1   0  dfd19aa0  IN   1.0   15412   5068  bt-applet
   1016      1   1  d480caa0  IN   5.2  119356  27476  xulrunner-bin
   1021      1   1  d64b8550  IN   1.9   46208   9992  nm-applet
   1023      1   1  de77e550  IN   0.9   16216   4696  pam-panel-icon
   1024      1   0  d257e550  RU   1.2   46028   6276  gnome-power-man
   1025   1023   1  de92c000  IN   0.1    1856    620  pam_timestamp_c
   1060      1   0  d7f31aa0  IN   2.7   57148  14348  wnck-applet
   1062      1   0  dc7c3aa0  IN   1.7   76700   8832  trashapplet
   1125      1   1  dc7c3000  IN   5.2  119356  27476  xulrunner-bin
   1155      1   1  c6fa3000  IN   0.2    8104   1228  scim-bridge
   1197      1   1  c6fa3550  IN   1.5   24076   7876  notification-ar
   1199      1   1  ccd97aa0  IN   2.6   39724  13968  clock-applet
   1201      1   1  d1536550  IN   2.6   56968  13824  mixer_applet2
   1203      1   1  d56f4aa0  IN   5.2  119356  27476  xulrunner-bin
   1302      1   1  c8949aa0  IN   0.3   43468   1428  pcscd
   1303      1   1  d7f95000  IN   5.2  119356  27476  xulrunner-bin
   1487     11   0  df157aa0  IN   0.0       0      0  [kmpathd/0]
   1488     11   1  dea3a550  IN   0.0       0      0  [kmpathd/1]
   1515     11   0  df9eaaa0  IN   0.0       0      0  [kjournald]
   1771      1   1  cd969550  IN   0.9   17996   4676  gnome-screensav
   2569      1   1  c07f3aa0  IN   0.2   13188    812  auditd
   2570      1   1  c0c4a550  IN   0.2   13188    812  auditd
   2571   2569   1  dec66aa0  IN   0.2   14112    980  audispd
   2572   2569   1  c07f9550  IN   0.2   14112    980  audispd
   2594      1   0  c0c11000  IN   0.1    1732    620  syslogd
   2597      1   0  deefc550  RU   0.1    1684    408  klogd
   2609      1   1  de2e4000  IN   0.1    2444    368  irqbalance
   2630      1   0  c0c33aa0  IN   0.1    1820    548  portmap
   2659      1   0  c0c2e000  IN   0.1    1832    740  rpc.statd
   2699      1   0  c0c20550  IN   0.1    1848    396  mdadm
   2729      1   1  de979550  IN   0.1    5452    572  rpc.idmapd
   2789      1   1  df157000  IN   0.2    2888   1104  dbus-daemon
   2800      1   0  c0cb8aa0  IN   0.1    2160    780  hcid
   2806      1   0  c0c89000  IN   0.1    1752    520  sdpd
   2829      1   0  c026caa0  IN   0.0       0      0  [krfcommd]
   2870      1   0  c07f9aa0  IN   0.3   43468   1428  pcscd
   2885      1   0  c0c4aaa0  IN   0.3   43468   1428  pcscd
   2891      1   0  de8a6550  IN   0.1    1924    464  hidd
   2907      1   0  c0c6a000  IN   0.2   10852   1320  automount
   2908      1   1  de5cdaa0  IN   0.2   10852   1320  automount
   2909      1   1  c0c0baa0  IN   0.2   10852   1320  automount
   2912      1   1  c0c64aa0  IN   0.2   10852   1320  automount
   2915      1   0  df97daa0  IN   0.2   10852   1320  automount
   2926      1   0  c0cc2aa0  IN   0.1    1684    544  acpid
   2937      1   0  c0c9e000  IN   0.1    5084    764  hpiod
   2942      1   1  de1bd000  IN   0.9   14568   4788  python
   2957      1   0  df9ea550  IN   0.2    7000   1056  sshd
   2968      1   1  de2e4550  IN   0.5   10936   2416  cupsd
   2980      1   0  df218aa0  IN   0.2    2736    904  xinetd
   2995      1   0  df9ea000  RU   0.0       0      0  [drbd0_worker]
   3005      1   0  c0c0b550  RU   0.0       0      0  [drbd0_receiver]
   3013      1   0  c07f3550  IN   0.0       0      0  [drbd0_asender]
   3030      1   0  de5cd000  IN   0.2    4412   1096  ha_logd
   3039   3030   0  de2e4aa0  RU   0.1    4412    796  ha_logd
   3079      1   1  de5d5550  IN   2.3   12108  12108  heartbeat
   3090      1   1  c0c50aa0  IN   0.1    1908    488  gpm
   3101      1   0  c07f9000  IN   0.2    6220   1120  crond
   3120   3079   1  c0c2e550  ??   1.0    5512   5512  heartbeat
   3121   3079   1  dea3a000  IN   1.0    5508   5508  heartbeat
   3122   3079   1  c07d7000  IN   1.0    5508   5508  heartbeat
   3141      1   1  dea3aaa0  IN   0.4    4320   2132  xfs
   3162      1   0  c0c50550  IN   0.1    2256    440  atd
   3185      1   0  c026c000  IN   0.3    5016   1664  libvirtd
   3212      1   0  deefcaa0  IN   0.1    4644    412  rhnsd
   3236      1   0  dec66550  IN   0.7    5844   3924  hald
   3247   3236   0  df218550  IN   0.2    3148   1084  hald-runner
   3337   3185   0  c026c550  IN   0.1    1828    748  dnsmasq
   3358   3247   0  d1ea1550  IN   0.2    2008    808  hald-addon-keyb
   3360   3247   0  df157550  IN   0.2    2012    812  hald-addon-acpi
   3365   3247   0  d232a550  IN   0.1    1968    660  hald-addon-stor
   3618      1   1  d074c000  IN   0.2    2252   1112  xenstored
   3623      1   1  d1c3b550  IN   0.8   13032   4004  python
   3624   3623   0  d066b550  IN   1.0   96184   5552  python
   3626      1   0  c0c89aa0  IN   0.1   12212    608  xenconsoled
   3627      1   1  e0754550  IN   0.1   13544    792  blktapctrl
   3628      1   0  d0120aa0  IN   0.1   12212    608  xenconsoled
   3629      1   0  c0c6a550  IN   0.1   13544    792  blktapctrl
   3630   3623   0  df97d000  IN   1.0   96184   5552  python
   3633   3623   0  d06cd550  IN   1.0   96184   5552  python
   3634   3623   1  c0c79000  IN   1.0   96184   5552  python
   3900   3623   0  e0754000  IN   1.0   96184   5552  python
   3901   3623   1  d06e0550  IN   1.0   96184   5552  python
   3902      1   0  d06e0aa0  IN   1.9   25632  10208  yum-updatesd
   3907      1   0  d0705000  IN   0.2    2696   1224  gam_server
   3974      1   1  d0705550  ??   0.2    5472   1220  livxen.sh
   3975      1   1  d071d550  IN   0.2    5472   1224  xentop-logger.s
   3976      1   1  d0720000  IN   0.2    5472   1208  swaps-logger.sh
   4211      1   0  d074c550  IN   0.1    1996    520  smartd
   4223      1   1  cb792000  IN   0.1    1668    456  mingetty
   4224      1   0  cbb9b000  IN   0.1    1668    456  mingetty
   4227      1   0  dec66000  IN   0.1    1668    452  mingetty
   4236      1   0  cb792550  IN   0.1    1668    456  mingetty
   4238      1   1  c07f3000  IN   0.1    1664    452  mingetty
   4244      1   1  cb8bcaa0  IN   0.1    1668    456  mingetty
   4246      1   0  d0720aa0  IN   0.6   16652   3028  gdm-binary
   4471   4246   1  d232aaa0  IN   0.5   17256   2764  gdm-binary
   4473      1   0  c8110550  IN   0.8   28332   4200  gdm-rh-security
   4482   4471   1  cbb9b550  IN   2.9   38816  15204  Xorg
   4578      1   0  d073e000  RU   0.3   43468   1428  pcscd
   4584      1   1  cb6da550  IN   0.8   28332   4200  gdm-rh-security
   4622     11   1  d071daa0  IN   0.0       0      0  [kjournald]
   4745      1   1  df97d550  IN   0.1   15012    724  tapdisk
   4746      1   1  de8a6aa0  IN   0.1   15012    724  tapdisk
   4751      1   1  c872d000  IN   0.1   15012    720  tapdisk
   4753      1   1  d06cdaa0  IN   0.1   15012    720  tapdisk
   4786      1   1  c0c33000  IN   0.9   32720   4732  qemu-dm
   5014      1   0  c6c8aaa0  IN   0.9   32720   4732  qemu-dm
   5033      1   0  cb0f5aa0  IN   0.9   32720   4732  qemu-dm
   5083     11   1  c0c79aa0  IN   0.0       0      0  [xvd 1]
   5084     11   0  c6aacaa0  IN   0.0       0      0  [xvd 1]
   5838   3101   0  d232a000  IN   0.3    6796   1504  crond
   5839   5838   0  c72c9000  DE   0.0       0      0  python
   5840   5838   0  d64b8000  IN   0.4    7916   2268  sendmail
   7168   3976   1  d480c550  IN   0.1    4648    488  sleep
   7169      1   0  c92fc000  IN  11.6  137020  61616  nautilus
   7178   3974   1  d06cd000  RU   0.1    5472    488  livxen.sh
>  7179   7178   1  c07d7550  RU   1.0   12144   5356  python
   7180   7178   1  ca73e550  ??   0.1    4988    768  grep
   7181   7178   1  cfb9caa0  RU   0.0    5472    168  livxen.sh
  14882   4471   0  d074caa0  IN   1.4   24180   7296  gnome-session
  14923  14882   0  d1ea1000  DE   0.0       0      0  Xsession
  14926  14882   1  c0835aa0  IN   0.1    6492    640  ssh-agent
  14955      1   0  c62b4aa0  IN   0.1    2784    624  dbus-launch
  14956      1   1  c0c11550  IN   0.2    2760    980  dbus-daemon
  14962      1   0  c0c20000  IN   0.7    8212   3580  gconfd-2
  14975      1   1  c872daa0  IN   0.4   27744   1972  scim-launcher
  14978      1   0  d06e0000  IN   0.1    2580    764  gnome-keyring-d
  14980      1   0  cb0f5000  IN   1.5   34928   8216  gnome-settings-
  14982      1   1  dfd19550  ??   1.5   34928   8216  gnome-settings-
  14997      1   1  de77e000  IN   2.5   28884  13152  metacity
  15000      1   0  c8110aa0  IN   0.1    7112    768  scim-helper-man
  15001      1   0  d1c3b000  IN   1.5   38420   7816  scim-panel-gtk
  15002      1   0  c8110000  IN   1.5   38420   7816  scim-panel-gtk
  15003      1   1  c6aac550  IN   0.2    9220    812  scim-launcher
  15012      1   1  d066baa0  IN   3.1   58344  16488  gnome-panel
  15017      1   1  c0c33550  IN   3.9   96664  20820  nautilus
  15021      1   1  cbb9baa0  IN   0.6   39736   3020  bonobo-activati
  15023      1   1  c872d550  IN   1.6   45816   8428  eggcups
  15025      1   1  c6aac000  IN   0.7   12384   3692  gnome-vfs-daemo
  15028      1   1  c0cb8550  IN   0.6   39736   3020  bonobo-activati
  15029      1   1  d0120000  IN   0.9   23556   4948  gnome-volume-ma
  15039      1   0  de8a6000  IN   0.9   15400   5044  bt-applet
  15046      1   0  c6c8a550  IN   4.0   39544  21388  puplet
  15048      1   0  de77eaa0  IN   1.9   46208   9976  nm-applet
  15060      1   1  cb8bc000  IN   0.9   16216   4692  pam-panel-icon
  15062  15060   1  d073e550  IN   0.1    1856    620  pam_timestamp_c
  15064      1   0  c6645aa0  IN   0.5   18240   2636  escd
  15065      1   1  de979000  IN   0.3   43468   1428  pcscd
  15067      1   1  d0720550  IN   1.2   46032   6360  gnome-power-man
  15068      1   1  dfd19000  IN   0.5   18240   2636  escd
  15111      1   1  c6645550  IN   2.8   57336  14672  wnck-applet
  15113      1   1  c0835550  IN   2.6   87652  13928  trashapplet
  15141      1   1  de1bdaa0  IN   0.2    2480    880  mapping-daemon
  15267      1   1  db67faa0  IN   1.5   24076   7836  notification-ar
  15270      1   1  c0c64550  IN   2.6   39740  13976  clock-applet
  15272      1   1  d066b000  IN   3.0   59112  15884  mixer_applet2
  15414      1   0  dac5baa0  IN   3.8   79896  20144  gnome-terminal
  15417      1   1  db070aa0  IN   0.2    8104   1284  scim-bridge
  15418  15414   1  c0c1baa0  IN   0.1    2484    712  gnome-pty-helpe
  15424  15414   1  c6c8a000  IN   0.3    5476   1516  bash
  15425      1   0  db67f000  ??   3.8   79896  20144  gnome-terminal
  15907      1   1  dac5b000  IN   0.9   18012   4944  gnome-screensav
  15963  15424   1  cfef5550  IN   1.0    7840   5276  vncviewer
  25896   1771   0  d7f31000  RU   2.5   28268  13456  floaters
  26928   3975   1  c0c50000  IN   0.1    4652    488  sleep
  31836   2980   1  c6645000  IN   2.7   18844  14416  Xvnc
  31848   4246   1  c0c64000  IN   0.5   17248   2684  gdm-binary
-------------

Is this problem related to the one that occurred on RHEL5.0, or is it a new problem?

Comment 9 Chris Lalancette 2009-02-02 09:34:20 UTC
The stack trace looks to be about the same to me, so it looks like the same crash.

Chris Lalancette

Comment 10 makoto nohara 2009-02-02 12:10:48 UTC
Here is some additional information.

> #0 [c071cd0c] die at c040606e
> #1 [c071cd38] do_page_fault at c060abfc
> #2 [c071cdb0] error_code (via page_fault) at c0405595
What is c040606e?
I had crash display it:

----------
crash> kmem c040606e
c040606e (T) die+568  ../debug/kernel-2.6.18/linux-2.6.18.i686/arch/i386/kernel/traps-xen.c: 469

  PAGE    PHYSICAL   MAPPING    INDEX CNT FLAGS
c10080c0    406000         0         0  1 400

crash>
----------

I examined traps-xen.c.

traps-xen.c: 469
----------
    if (in_interrupt())
        panic("Fatal exception in interrupt"); /* <- line 469 */
----------

The following comment is written at line 382:
----------
/* This is gone through when something in the kernel
 * has done something bad and is about to be terminated.
*/
----------
Something bad?

How can I find out what the "something bad" was?
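
The check quoted from traps-xen.c line 469 explains why this oops became a full panic: both call traces pass through __do_softirq and run_timer_softirq, so the fault happened in interrupt context. The following is only a rough sketch of that decision. The bit layout of the preempt counter used here (8 preempt bits, 8 softirq bits, 12 hardirq bits) is an assumption about this kernel generation, and die_outcome is a made-up name for illustration, not a kernel function.

```python
# Assumed preempt_count layout: PREEMPT 8 bits, SOFTIRQ 8 bits, HARDIRQ 12 bits.
PREEMPT_BITS, SOFTIRQ_BITS, HARDIRQ_BITS = 8, 8, 12
SOFTIRQ_SHIFT = PREEMPT_BITS
HARDIRQ_SHIFT = SOFTIRQ_SHIFT + SOFTIRQ_BITS
SOFTIRQ_MASK = ((1 << SOFTIRQ_BITS) - 1) << SOFTIRQ_SHIFT
HARDIRQ_MASK = ((1 << HARDIRQ_BITS) - 1) << HARDIRQ_SHIFT
SOFTIRQ_OFFSET = 1 << SOFTIRQ_SHIFT  # added while a softirq is being processed


def in_interrupt(preempt_count):
    """True if any softirq or hardirq bits are set in the counter."""
    return bool(preempt_count & (SOFTIRQ_MASK | HARDIRQ_MASK))


def die_outcome(preempt_count):
    """Mirror the 'if (in_interrupt()) panic(...)' check quoted above.

    Simplification: other checks that may exist in die() are ignored.
    """
    if in_interrupt(preempt_count):
        return "panic: Fatal exception in interrupt"
    return "oops only: the faulting task is killed"


if __name__ == "__main__":
    print(die_outcome(SOFTIRQ_OFFSET))  # fault inside a softirq handler
    print(die_outcome(0))               # fault in plain process context
```

So the panic call at line 469 is a consequence of where the fault happened, not its cause. The "something bad" is the page fault itself, which occurred earlier in the trace.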

Comment 11 Chris Lalancette 2009-02-02 12:27:44 UTC
That's not the interesting part of the trace; that's just showing you that something happened that we didn't like, and now we are going to panic.  Where you want to start looking is right before the error_code, namely at stack point #3.  The instruction there is the one that caused the crash; you have to look at it and figure out what was going on at that point to cause it.

Chris Lalancette
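
As an illustration of that approach using only data already in this report: on the "Code:" line the byte in angle brackets marks where EIP points, so the faulting instruction starts with the bytes 13 46 d8. The sketch below assumes the standard IA-32 encoding, in which opcode 0x13 is "ADC r32, r/m32" and a ModRM byte with mod=01 selects a base register plus a signed 8-bit displacement. The function names are made up for illustration.

```python
REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_adc_disp8(code_bytes):
    """Decode 'adc r32, [reg+disp8]' from three bytes: opcode, ModRM, disp8.

    Only this one form is handled; anything else raises ValueError.
    """
    opcode, modrm, disp = code_bytes
    if opcode != 0x13 or (modrm >> 6) != 0b01 or (modrm & 7) == 0b100:
        raise ValueError("not a plain 'adc r32, [reg+disp8]' encoding")
    dest = REGS[(modrm >> 3) & 7]
    base = REGS[modrm & 7]
    if disp & 0x80:          # sign-extend the 8-bit displacement
        disp -= 0x100
    return dest, base, disp


def fault_address(base_value, disp):
    """Effective address of the memory operand, wrapped to 32 bits."""
    return (base_value + disp) & 0xFFFFFFFF


if __name__ == "__main__":
    dest, base, disp = decode_adc_disp8([0x13, 0x46, 0xD8])
    # ESI values taken from the register dumps of the two oopses.
    for esi in (0xE07BA068, 0xE071D690):
        addr = fault_address(esi, disp)
        print(f"adc {disp:#x}(%{base}),%{dest} reads {addr:08x}")
```

The two computed addresses, e07ba040 and e071d668, are exactly the "virtual address" values reported in the two oopses. That suggests the fault is the memory read through ESI inside csum_partial, that is, the data being checksummed was not mapped, rather than a problem at the address of the instruction itself.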

Comment 12 makoto nohara 2009-02-02 15:47:15 UTC

I understand neither C nor assembler.
Therefore, I have simply picked out the parts that I expect are related.
This is the limit of my ability.

My English ability is also limited.
I apologize for the strange sentences.

-----------------
crash> kmem c04e325a
c04e325a (T) csum_partial+202  include/asm/atomic.h: 165

  PAGE    PHYSICAL   MAPPING    INDEX CNT FLAGS
c1009c60    4e3000         0         0  1 400
crash>


/**
 * atomic_add_negative - add and test if negative
 * @v: pointer of type atomic_t
 * @i: integer value to add
 *
 * Atomically adds @i to @v and returns true
 * if the result is negative, or false when
 * result is greater than or equal to zero.
 */
static __inline__ int atomic_add_negative(int i, atomic_t *v)
{
        unsigned char c;

        __asm__ __volatile__(  /* <- line 165 */
                LOCK_PREFIX "addl %2,%0; sets %1"
                :"+m" (v->counter), "=qm" (c)
                :"ir" (i) : "memory");
        return c;
}

-----------------



-----------------
crash> kmem c060abfc
c060abfc (T) do_page_fault+2688  ../debug/kernel-2.6.18/linux-2.6.18.i686/arch/i386/mm/fault-xen.c: 698

  PAGE    PHYSICAL   MAPPING    INDEX CNT FLAGS
c100c140    60a000         0         0  1 400

/*
 * Oops. The kernel tried to access some bad page. We'll have to
 * terminate things with extreme prejudice.
 */

        bust_spinlocks(1);

        if (oops_may_print()) {
        #ifdef CONFIG_X86_PAE
                if (error_code & 16) {
                        pte_t *pte = lookup_address(address);

                        if (pte && pte_present(*pte) && !pte_exec_kernel(*pte))
                                printk(KERN_CRIT "kernel tried to execute "
                                        "NX-protected page - exploit attempt? "
                                        "(uid: %d)\n", current->uid);
                }
        #endif
                if (address < PAGE_SIZE)
                        printk(KERN_ALERT "BUG: unable to handle kernel NULL "
                                        "pointer dereference");
                else
                        printk(KERN_ALERT "BUG: unable to handle kernel paging"
                                        " request");
                printk(" at virtual address %08lx\n",address);
                printk(KERN_ALERT " printing eip:\n");
                printk("%08lx\n", regs->eip);
                dump_fault_path(address);
        }
        tsk->thread.cr2 = address;
        tsk->thread.trap_no = 14;
        tsk->thread.error_code = error_code;
        die("Oops", regs, error_code);  /* <- line 698 */
        bust_spinlocks(0);
        do_exit(SIGKILL);
-----------------
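
For reference, the `error_code & 16` test in the excerpt above looks at one bit of the x86 page-fault error code, and the number printed after "Oops:" is that same code in hex. Below is a minimal sketch of decoding it, assuming the standard x86 bit layout. The macro and function names are made up for illustration and are not the kernel's.

```c
#include <stdio.h>

/* x86 page-fault error code bits. Names are illustrative, not the kernel's. */
#define PF_PRESENT 0x01 /* 0: page not present, 1: protection violation */
#define PF_WRITE   0x02 /* 0: read access,      1: write access */
#define PF_USER    0x04 /* 0: kernel mode,      1: user mode */
#define PF_RSVD    0x08 /* reserved bit was set in a paging entry */
#define PF_INSTR   0x10 /* fault was an instruction fetch (the "& 16" test) */

/* Write a human-readable description of the error code into buf; returns buf. */
static char *describe_pf_error(unsigned long code, char *buf, size_t len)
{
    snprintf(buf, len, "%s %s in %s mode%s",
             (code & PF_PRESENT) ? "protection fault on" : "not-present page on",
             (code & PF_INSTR) ? "instruction fetch"
                               : (code & PF_WRITE) ? "write" : "read",
             (code & PF_USER) ? "user" : "kernel",
             (code & PF_RSVD) ? ", reserved bit set" : "");
    return buf;
}
```

For the `Oops: 0000` in this report, `describe_pf_error(0x0, buf, sizeof buf)` gives "not-present page on read in kernel mode", which agrees with the all-zero `*pte` line in the dump.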


-----------------
crash> kmem c0405595
c0405595 (t) error_code+41  ../debug/kernel-2.6.18/linux-2.6.18.i686/arch/i386/kernel/entry.S

  PAGE    PHYSICAL   MAPPING    INDEX CNT FLAGS
c10080a0    405000         0         0  1 400
-----------------
 
The following file was not found: 
/usr/src/debug/debug/kernel-2.6.18/linux-2.6.18.i686/arch/i386/kernel/entry.S
However, this file exists: 
/usr/src/debug/kernel-2.6.18/xen/arch/x86/x86_32/entry.S

----------
.Lft16: movl %eax,%gs:8(%esi)
        test $TBF_EXCEPTION_ERRCODE,%cl
        jz   1f
        subl $4,%esi                    # push error_code onto guest frame
        movl TRAPBOUNCE_error_code(%edx),%eax
----------
.Lfx1:  sti
        SAVE_ALL_GPRS
        mov   UREGS_error_code(%esp),%esi
        pushfl                         # EFLAGS
        movl  $__HYPERVISOR_CS,%eax
        pushl %eax                     # CS
        movl  $.Ldf1,%eax
        pushl %eax                     # EIP
        pushl %esi                     # error_code/entry_vector
        jmp   handle_exception
----------
exception_with_ints_disabled:
        movl  UREGS_eflags(%esp),%eax
        movb  UREGS_cs(%esp),%al
        testl $(3|X86_EFLAGS_VM),%eax   # interrupts disabled outside Xen?
        jnz   FATAL_exception_with_ints_disabled
        pushl %esp
        call  search_pre_exception_table
        addl  $4,%esp
        testl %eax,%eax                 # no fixup code for faulting EIP?
        jz    1b
        movl  %eax,UREGS_eip(%esp)
        movl  %esp,%esi
        subl  $4,%esp
        movl  %esp,%edi
        movl  $UREGS_kernel_sizeof/4,%ecx
        rep;  movsl                     # make room for error_code/entry_vector
        movl  UREGS_error_code(%esp),%eax # error_code/entry_vector
        movl  %eax,UREGS_kernel_sizeof(%esp)
        jmp   restore_all_xen           # return to fixup code
----------
Is the information above sufficient?

It seems that "atomic_add_negative" is used as part of the "csum_partial" function. 

Is "atomic_add_negative" used to switch processing between the host OS and the guest OS?

I imagine that the panic occurred because the execution address held an incorrect value after processing switched between the host OS and the guest OS. 

When an illegal address is generated by processing in the guest OS, does the EIP reported in the panic point to "csum_partial"?

Comment 13 Herbert Xu 2009-02-09 06:07:51 UTC
So the problem is that someone has unmapped the memory behind the packet that is still being retransmitted.  Could you try to determine the socket of the packet (skb->sk) and its IP/port numbers? That should help you find the application that owns the socket (which is probably not the process in which it crashed, since the crash is in softirq context), and then perhaps we have a chance of reproducing it.

Thanks!
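
When reading the socket's address and port out of a dump by hand, note that these fields are stored in network byte order, so the raw words shown for an i386 dump look byte-swapped. Below is a minimal sketch of the decoding, assuming a little-endian dump. The function names and the sample values in the note underneath are made up for illustration, and which structure members hold the values depends on the kernel version.

```c
#include <stdint.h>
#include <stdio.h>

/* Decode an IPv4 address that was read as a 32-bit word from a
 * little-endian (i386) memory dump. The field is stored in network byte
 * order, so the lowest-addressed byte (the low byte of the word as read)
 * is the first octet. */
static char *dump_ipv4_to_str(uint32_t raw, char *buf, size_t len)
{
    snprintf(buf, len, "%u.%u.%u.%u",
             (unsigned)(raw & 0xff), (unsigned)((raw >> 8) & 0xff),
             (unsigned)((raw >> 16) & 0xff), (unsigned)((raw >> 24) & 0xff));
    return buf;
}

/* Decode a TCP/UDP port read as a 16-bit word from the same kind of dump:
 * swap the two bytes to get the port number. */
static unsigned dump_port_to_host(uint16_t raw)
{
    return ((raw & 0xffu) << 8) | (unsigned)(raw >> 8);
}
```

With the made-up raw values 0x0a01a8c0 and 0x5000, this gives 192.168.1.10 and port 80. The decoded port can then be matched against the listening and connected sockets on the host to find the owner.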

Comment 14 Matthew Baker 2009-04-22 11:06:59 UTC
Hi,

We've been experiencing a similar problem recently with Debian etch and a 2.6.18 kernel. For us the workaround was to turn off rx/tx checksumming on the relevant network interface, like so:

ethtool -K eth0 rx off tx off

Cheers,

Matt

P.S. Here's our oops message:

BUG: unable to handle kernel paging request at virtual address c081b000

printing eip:
c01bb497
0e5a2000 -> *pde = 00000000:c4871001
0e5a3000 -> *pme = 00000000:06fa3067
00fa3000 -> *pte = 00000000:00000000
Oops: 0000 [#1]
SMP
Modules linked in: netloop button ac battery ip6table_filter ip6_tables
iptablen
CPU: 0
EIP: 0061:[<c01bb497>] Not tainted VLI
EFLAGS: 00010282 (2.6.18-6-xen-686 #1)
EIP is at csum_partial+0xd3/0x120
eax: 00000000 ebx: c01bb497 ecx: 0000000b edx: 0000059c
esi: c081b01c edi: 0000059c ebp: 00000040 esp: c88dfd84
ds: 007b es: 007b ss: 0069
Process python (pid: 9966, ti=c88de000 task=cf760000 task.ti=c88de000)
Stack: c081b000 00000040 c022da42 c081b000 0000059c 00000000 00000018
c6b808ac
00000001 0000002c 000005dc cfe33b3c c022e94e cfe33a00 0000059c
c6b808ac
ce7b54f8 ce7b550c c88dfe84 c02323fb aedd2abb cdc1dce0 00000003
c88dfe84
Call Trace:
[<c022da42>] skb_checksum+0x112/0x27e
[<c022e94e>] pskb_expand_head+0xce/0x112
[<c02323fb>] skb_checksum_help+0x5d/0xac
[<d13d52ea>] ip_nat_fn+0x42/0x184 [iptable_nat]
[<d13d8092>] ipt_local_hook+0x76/0xcc [iptable_mangle]
[<d13d561e>] ip_nat_local_fn+0x34/0xaa [iptable_nat]
[<c024e3b8>] dst_output+0x0/0x7
[<c02472f0>] nf_iterate+0x30/0x61
[<c024e3b8>] dst_output+0x0/0x7
[<c0247416>] nf_hook_slow+0x3a/0x90
[<c024e3b8>] dst_output+0x0/0x7
[<c02505b0>] ip_queue_xmit+0x35f/0x3b3
[<c024e3b8>] dst_output+0x0/0x7
[<c0155fcd>] kmem_cache_alloc+0x4a/0x54
[<c022edb1>] alloc_skb_from_cache+0x48/0x110
[<c025df78>] tcp_transmit_skb+0x604/0x632
[<c025ecd4>] tcp_retransmit_skb+0x4e2/0x5c7
[<c0257e28>] tcp_enter_loss+0x1a1/0x1fd
[<c0260dab>] tcp_write_timer+0x0/0x5c9
[<c02611a3>] tcp_write_timer+0x3f8/0x5c9
[<c0123376>] run_timer_softirq+0x101/0x15c
[<c011f346>] __do_softirq+0x5e/0xc3
[<c011f3e5>] do_softirq+0x3a/0x4a
[<c0106125>] do_IRQ+0x48/0x53
[<c020c614>] evtchn_do_upcall+0x64/0x9b
[<c0104a51>] hypervisor_callback+0x3d/0x48
Code: a8 13 46 ac 13 46 b0 13 46 b4 13 46 b8 13 46 bc 13 46 c0 13 46 c4
13 46 c
EIP: [<c01bb497>] csum_partial+0xd3/0x120 SS:ESP 0069:c88dfd84
<0>Kernel panic - not syncing: Fatal exception in interrupt
(XEN) Domain 0 crashed: rebooting machine in 5 seconds.
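
For context on why the fault lands in csum_partial: computing the Internet checksum in software means reading every byte of the packet data, so if a page behind the packet is no longer mapped, this is where the access fails. Below is a minimal portable sketch of that checksum in the style of RFC 1071. It is not the kernel's optimized i386 assembly, and the name `inet_checksum` is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Internet checksum (RFC 1071 style): ones' complement of the ones'
 * complement sum of the data taken as big-endian 16-bit words.
 * Every byte of the buffer is read. The 32-bit accumulator is large
 * enough for packet-sized buffers. */
static uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;

    /* Add up the data two bytes at a time. */
    while (len > 1) {
        sum += ((uint32_t)data[0] << 8) | data[1];
        data += 2;
        len -= 2;
    }

    /* A trailing odd byte is padded with a zero low byte. */
    if (len == 1)
        sum += (uint32_t)data[0] << 8;

    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)~sum;
}
```

For the example bytes used in RFC 1071 (00 01 f2 03 f4 f5 f6 f7) the function returns 0x220d.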

Comment 15 Herbert Xu 2009-04-24 11:45:49 UTC
Matthew, are you using anything like drbd? That seems to be a common thread in the other two reports.

In any case, the culprit here is the owner of the socket.  So it would really help if you can pin-point the port number and PID of the socket whose retransmitted packet triggered this.

Comment 16 Herbert Xu 2009-04-24 11:49:25 UTC
Other things like drbd would be iSCSI, NFS, or anything that does TCP in the kernel.

Comment 17 Matthew Baker 2009-04-24 11:54:22 UTC
Hi Herbert,

Yes, we are using drbd. We are no longer experiencing the issue, and I'm reluctant to remove the workaround as the problem only occurred on production services.

I would be happy to help if I can, though. Is there anything I can do in hindsight?

Matt

Comment 18 Herbert Xu 2009-04-24 12:10:19 UTC
Based on this information my conclusion is that there is a bug in drbd where it frees pages that are still owned by the TCP socket.

Comment 20 Chris Lalancette 2009-05-04 15:46:58 UTC
Since this looks like a drbd issue (see comment #18, and the common thread that all of the reported stacks are using drbd), and since we don't support drbd in RHEL-5, I'm going to close this as NOTABUG.  If this can be reproduced without drbd, or someone finds other evidence to the contrary, please feel free to re-open the bug.

Chris Lalancette

Comment 24 Laszlo Ersek 2011-09-21 08:43:42 UTC
*** Bug 666005 has been marked as a duplicate of this bug. ***