Bug 233937

Summary: Newest kernel causes random reboots/hangs, using 2.6.19 solves problem
Product: [Fedora] Fedora
Reporter: repatch42
Component: kernel-xen-2.6
Assignee: Juan Quintela <quintela>
Status: CLOSED DUPLICATE
Severity: high
Priority: medium
Version: 6
CC: bjohnson, djuran, ehabkost, massimo, mishu, rjones, rolf
Target Milestone: ---   
Target Release: ---   
Hardware: i386   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2007-05-21 13:47:20 UTC
Attachments:
xen dmesg (flags: none)
domain0 dmesg output (flags: none)
host dmesg output (flags: none)

Description repatch42 2007-03-26 04:03:35 UTC
Description of problem:

Machine randomly locks up running 2.6.20-1.2933.fc6xen #1 SMP

Version-Release number of selected component (if applicable):
2.6.20-1.2933.fc6xen #1

How reproducible:
Random, usually within a few minutes of boot.

Steps to Reproduce:
1. Boot up with the kernel version above
2. Do anything (e.g. even opening firefox)
3. Machine either hangs, reboots or powers off
  
Actual results:


Expected results:


Additional info:
From my messages file:
Mar 25 12:43:21 PD804 kernel: last sysfs file: /block/hda/hda1/stat
Mar 25 12:43:21 PD804 kernel: Modules linked in: mga drm bridge netloop netbk
blktap blkbk autofs4 hidp rfcomm l2cap bluetooth sunrpc ib_iser rdma_cm ib_cm
iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi scsi_transport_iscsi
dm_mirror dm_multipath dm_mod video sbs i2c_ec dock button battery asus_acpi
backlight ac ipv6 lp sg snd_via82xx gameport snd_ac97_codec snd_seq_dummy
snd_seq_oss floppy snd_seq_midi_event snd_seq snd_pcm_oss snd_mixer_oss snd_pcm
snd_timer ac97_bus snd_page_alloc snd_mpu401_uart snd_rawmidi snd_seq_device snd
soundcore pcspkr saa7134 i2c_viapro video_buf compat_ioctl32 ir_kbd_i2c
ir_common videodev v4l2_common v4l1_compat i2c_core via_rhine ide_cd cdrom
pl2303 mii usbserial serial_core serio_raw parport_pc parport usb_storage
sata_via libata sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd
Mar 25 12:43:21 PD804 kernel: CPU:    1
Mar 25 12:43:21 PD804 kernel: EIP:    7f4c:[<c6e30707>]    Not tainted VLI
Mar 25 12:43:21 PD804 kernel: EFLAGS: 0a001ba7   (2.6.20-1.2933.fc6xen #1)
Mar 25 12:43:21 PD804 kernel: EIP is at 0xc6e30707
Mar 25 12:43:21 PD804 kernel: eax: 00000000   ebx: 00a68402   ecx: 00000073  
edx: 00000246
Mar 25 12:43:21 PD804 kernel: esi: bf9d8fac   edi: 0000007b   ebp: 00000000  
esp: ceecd01c
Mar 25 12:43:21 PD804 kernel: ds: 0000   es: 0000   ss: 0069
Mar 25 12:43:21 PD804 kernel: Process makewhatis (pid: 5232, ti=ceecc000
task=e53d2770 task.ti=ceecc000)
Mar 25 12:43:21 PD804 kernel: Stack: 7f5addd4 925b7795 02f5fc0a 0b2d68dd
47adf2b3 ee550afe dded54a3 34d44082 
Mar 25 12:43:21 PD804 kernel:        bec4c5e6 5fc14f75 0fdbe8b1 7fc3a757
c374d2c7 a7f4f704 540b6d08 a4adce64 
Mar 25 12:43:21 PD804 kernel:        ad4c144b 1df50978 156b7a89 b86f3b75
df69a8c3 cadaf765 f7c96901 c1e6dc8e 
Mar 25 12:43:21 PD804 kernel: Call Trace:
Mar 25 12:43:21 PD804 kernel: BUG: unable to handle kernel NULL pointer
dereference at virtual address 00000007
Mar 25 12:43:21 PD804 kernel:  printing eip:
Mar 25 12:43:21 PD804 kernel: c04055c2
Mar 25 12:43:22 PD804 kernel: 0f328000 -> *pde = 00000000:09566001
Mar 25 12:43:22 PD804 kernel: 14f66000 -> *pme = 00000000:00000000
Mar 25 12:43:22 PD804 kernel: Oops: 0000 [#2]
Mar 25 12:43:22 PD804 kernel: SMP 
Mar 25 12:43:22 PD804 kernel: last sysfs file: /block/hda/hda1/stat
Mar 25 12:43:22 PD804 kernel: Modules linked in: mga drm bridge netloop netbk
blktap blkbk autofs4 hidp rfcomm l2cap bluetooth sunrpc ib_iser rdma_cm ib_cm
iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi scsi_transport_iscsi
dm_mirror dm_multipath dm_mod video sbs i2c_ec dock button battery asus_acpi
backlight ac ipv6 lp sg snd_via82xx gameport snd_ac97_codec snd_seq_dummy
snd_seq_oss floppy snd_seq_midi_event snd_seq snd_pcm_oss snd_mixer_oss snd_pcm
snd_timer ac97_bus snd_page_alloc snd_mpu401_uart snd_rawmidi snd_seq_device snd
soundcore pcspkr saa7134 i2c_viapro video_buf compat_ioctl32 ir_kbd_i2c
ir_common videodev v4l2_common v4l1_compat i2c_core via_rhine ide_cd cdrom
pl2303 mii usbserial serial_core serio_raw parport_pc parport usb_storage
sata_via libata sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd
Mar 25 12:43:22 PD804 kernel: CPU:    1
Mar 25 12:43:22 PD804 kernel: EIP:    0061:[<c04055c2>]    Not tainted VLI
Mar 25 12:43:22 PD804 kernel: EFLAGS: 00010097   (2.6.20-1.2933.fc6xen #1)
Mar 25 12:43:22 PD804 kernel: EIP is at dump_trace+0x5c/0x93
Mar 25 12:43:22 PD804 kernel: eax: 00000ffd   ebx: 00000007   ecx: 0012362f  
edx: 013df400
Mar 25 12:43:22 PD804 kernel: esi: 00000000   edi: 00000000   ebp: c0693fce  
esp: ceecce7c
Mar 25 12:43:22 PD804 kernel: ds: 007b   es: 007b   ss: 0069
Mar 25 12:43:22 PD804 kernel: Process makewhatis (pid: 5232, ti=ceecc000
task=e53d2770 task.ti=ceecc000)
Mar 25 12:43:22 PD804 kernel: Stack: c0693e8e c0693fce 00000018 00000000
c0693fce c0405611 c06e44e0 c0693fce 
Mar 25 12:43:22 PD804 kernel:        ceecd07f c04056c0 c0693fce c0693fce
ceeccfe4 ceecd01c 00000002 0a001ba7 
Mar 25 12:43:22 PD804 kernel:        ceeccfe4 ceecd01c c0405856 c0693fce
00000010 e53d2904 00001470 ceecc000 
Mar 25 12:43:22 PD804 kernel: Call Trace:
Mar 25 12:43:22 PD804 kernel:  [<c0405611>] show_trace_log_lvl+0x18/0x2c
Mar 25 12:43:22 PD804 kernel:  [<c04056c0>] show_stack_log_lvl+0x9b/0xa3
Mar 25 12:43:22 PD804 kernel:  [<c0405856>] show_registers+0x18e/0x25d
Mar 25 12:43:22 PD804 kernel:  [<c0613405>] notifier_call_chain+0x19/0x29
Mar 25 12:43:22 PD804 kernel:  [<c0405a58>] die+0x133/0x22f
Mar 25 12:43:22 PD804 kernel:  [<c0406302>] do_iret_error+0xa7/0xb1
Mar 25 12:43:22 PD804 kernel:  [<c0417716>] __might_sleep+0x21/0xc1
Mar 25 12:43:22 PD804 kernel:  [<c040500c>] scrit+0xc/0x1a
Mar 25 12:43:22 PD804 kernel:  [<c040500d>] scrit+0xd/0x1a
Mar 25 12:43:22 PD804 kernel:  [<c040500e>] scrit+0xe/0x1a
Mar 25 12:43:22 PD804 kernel:  [<c0405013>] scrit+0x13/0x1a
Mar 25 12:43:22 PD804 kernel:  [<c042c063>] search_exception_tables+0x14/0x25
Mar 25 12:43:22 PD804 kernel:  [<c04144ef>] fixup_exception+0xb/0x20
Mar 25 12:43:22 PD804 kernel:  [<c0611b45>] do_general_protection+0x11c/0x16f
Mar 25 12:43:22 PD804 kernel:  [<c04068d1>] do_IRQ+0xc6/0xdd
Mar 25 12:43:22 PD804 kernel:  [<c0611a29>] do_general_protection+0x0/0x16f
Mar 25 12:43:22 PD804 kernel:  [<c040625b>] do_iret_error+0x0/0xb1
Mar 25 12:43:22 PD804 kernel:  [<c061162d>] error_code+0x35/0x3c
Mar 25 12:43:22 PD804 kernel:  =======================
Mar 25 12:43:22 PD804 kernel: Code: 9a d4 01 00 00 89 df 81 e7 00 f0 ff ff eb 0e
8b 4c 24 18 89 f2 89 e8 ff 51 08 83 c3 04 39 fb 76 29 8d 87 fd 0f 00 00 39 c3 73
1f <8b> 33 89 f0 e8 61 6a 02 00 85 c0 74 e2 eb d5 8b 4f 34 85 c9 74 
Mar 25 12:43:22 PD804 kernel: EIP: [<c04055c2>] dump_trace+0x5c/0x93 SS:ESP
0069:ceecce7c
Mar 25 12:43:22 PD804 kernel:  <3>BUG: sleeping function called from invalid
context at kernel/rwsem.c:20
Mar 25 12:43:22 PD804 kernel: in_atomic():0, irqs_disabled():1
Mar 25 12:43:22 PD804 kernel:  [<c043059a>] down_read+0x12/0x28
Mar 25 12:43:22 PD804 kernel:  [<c0438c0a>] acct_collect+0x38/0x13e
Mar 25 12:43:22 PD804 kernel:  [<c041fe2b>] do_exit+0x1b1/0x6f6
Mar 25 12:43:22 PD804 kernel:  [<c0405b2f>] die+0x20a/0x22f
Mar 25 12:43:22 PD804 kernel:  [<c061326f>] do_page_fault+0xab1/0xc2e
Mar 25 12:43:22 PD804 kernel:  [<c054adfc>] kcons_write_dom0+0x0/0x26
Mar 25 12:43:22 PD804 kernel:  [<c06114ff>] _spin_unlock_irqrestore+0x8/0x16
Mar 25 12:43:22 PD804 kernel:  [<c041d829>] release_console_sem+0x192/0x1d1
Mar 25 12:43:22 PD804 kernel:  [<c041de9a>] vprintk+0x2de/0x2e8
Mar 25 12:43:22 PD804 kernel:  [<c06127be>] do_page_fault+0x0/0xc2e
Mar 25 12:43:22 PD804 kernel:  [<c061162d>] error_code+0x35/0x3c
Mar 25 12:43:22 PD804 kernel:  [<c04100d8>] MPBIOS_trigger+0x4b/0xbc
Mar 25 12:43:22 PD804 kernel:  [<c04055c2>] dump_trace+0x5c/0x93
Mar 25 12:43:22 PD804 kernel:  [<c0405611>] show_trace_log_lvl+0x18/0x2c
Mar 25 12:43:22 PD804 kernel:  [<c04056c0>] show_stack_log_lvl+0x9b/0xa3
Mar 25 12:43:22 PD804 kernel:  [<c0405856>] show_registers+0x18e/0x25d
Mar 25 12:43:22 PD804 kernel:  [<c0613405>] notifier_call_chain+0x19/0x29
Mar 25 12:43:22 PD804 kernel:  [<c0405a58>] die+0x133/0x22f
Mar 25 12:43:22 PD804 kernel:  [<c0406302>] do_iret_error+0xa7/0xb1
Mar 25 12:43:22 PD804 kernel:  [<c0417716>] __might_sleep+0x21/0xc1
Mar 25 12:43:22 PD804 kernel:  [<c040500c>] scrit+0xc/0x1a
Mar 25 12:43:22 PD804 kernel:  [<c040500d>] scrit+0xd/0x1a
Mar 25 12:43:22 PD804 kernel:  [<c040500e>] scrit+0xe/0x1a
Mar 25 12:43:22 PD804 kernel:  [<c0405013>] scrit+0x13/0x1a
Mar 25 12:43:22 PD804 kernel:  [<c042c063>] search_exception_tables+0x14/0x25
Mar 25 12:43:22 PD804 kernel:  [<c04144ef>] fixup_exception+0xb/0x20
Mar 25 12:43:22 PD804 kernel:  [<c0611b45>] do_general_protection+0x11c/0x16f
Mar 25 12:43:22 PD804 kernel:  [<c04068d1>] do_IRQ+0xc6/0xdd
Mar 25 12:43:22 PD804 kernel:  [<c0611a29>] do_general_protection+0x0/0x16f
Mar 25 12:43:22 PD804 kernel:  [<c040625b>] do_iret_error+0x0/0xb1
Mar 25 12:43:22 PD804 kernel:  [<c061162d>] error_code+0x35/0x3c
Mar 25 12:43:22 PD804 kernel:  =======================
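
When comparing traces like the ones above against other reports, it can help to pull the bracketed stack frames out of the syslog text. The following is a minimal sketch in Python, assuming the i386 frame layout seen in this log (`[<address>] symbol+0xoffset/0xsize`); `parse_frames` and `symbols` are hypothetical helper names, not part of any kernel or Fedora tool.

```python
import re

# Matches one stack frame as printed in the traces above, e.g.
#   [<c0405611>] show_trace_log_lvl+0x18/0x2c
FRAME_RE = re.compile(r"\[<([0-9a-f]{8})>\]\s+(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)")


def parse_frames(log_text):
    """Return (address, symbol, offset, size) tuples for every frame found."""
    frames = []
    for addr, symbol, offset, size in FRAME_RE.findall(log_text):
        frames.append((int(addr, 16), symbol, int(offset, 16), int(size, 16)))
    return frames


def symbols(log_text):
    """Return the distinct symbol names in first-seen order."""
    seen = []
    for _, symbol, _, _ in parse_frames(log_text):
        if symbol not in seen:
            seen.append(symbol)
    return seen


# Sample lines copied from the trace in this report.
sample = (
    "Mar 25 12:43:22 PD804 kernel:  [<c0405a58>] die+0x133/0x22f\n"
    "Mar 25 12:43:22 PD804 kernel:  [<c0406302>] do_iret_error+0xa7/0xb1\n"
    "Mar 25 12:43:22 PD804 kernel:  [<c040500c>] scrit+0xc/0x1a\n"
    "Mar 25 12:43:22 PD804 kernel:  [<c040500d>] scrit+0xd/0x1a\n"
)
print(symbols(sample))
```

Lines without a `symbol+0x...` part, such as the `EIP: 0061:[<c04055c2>] Not tainted VLI` header, are skipped by the pattern.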

Comment 1 Maximilian Imgrund 2007-03-27 09:36:01 UTC
I also get this. And I get this in /var/log/messages:

Mar 27 11:01:46 heisenberg kernel: list_del corruption. next->prev should be
c34cd000, but was c0610000
Mar 27 11:01:46 heisenberg kernel: ------------[ cut here ]------------
Mar 27 11:01:46 heisenberg kernel: kernel BUG at lib/list_debug.c:72!
Mar 27 11:01:46 heisenberg kernel: invalid opcode: 0000 [#1]
Mar 27 11:01:46 heisenberg kernel: SMP
Mar 27 11:01:46 heisenberg kernel: last sysfs file: /class/misc/evtchn/dev
Mar 27 11:01:46 heisenberg kernel: Modules linked in: ipt_REDIRECT xt_physdev
ipt_MASQUERADE iptable_nat nf_nat bridge netloop netbk blktap blkbk autofs4 hidp
l2cap bluetooth sunrpc nf_conntrack_netbios_ns ipt_REJECT nf_conntrack_ipv4
xt_state nf_conntrack nfnetlink iptable_filter ip_tables ip6t_REJECT xt_tcpudp
ip6table_filter ip6_tables x_tables ipv6 dm_multipath video sbs i2c_ec dock
button battery asus_acpi backlight ac lp e1000 snd_intel8x0 snd_ac97_codec
ac97_bus snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device
snd_pcm_oss i2c_i801 i2c_core snd_mixer_oss iTCO_wdt iTCO_vendor_support snd_pcm
snd_timer ne2k_pci snd soundcore 8390 snd_page_alloc pcspkr floppy parport_pc
parport ide_cd cdrom serial_core dm_snapshot dm_zero dm_mirror dm_mod ata_piix
libata sd_mod scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd
Mar 27 11:01:46 heisenberg kernel: CPU:    0
Mar 27 11:01:46 heisenberg kernel: EIP:    0061:[<c04e1fba>]    Not tainted VLI
Mar 27 11:01:46 heisenberg kernel: EFLAGS: 00010092   (2.6.20-1.2933.fc6xen #1)
Mar 27 11:01:46 heisenberg kernel: EIP is at list_del+0x42/0x5d
Mar 27 11:01:46 heisenberg kernel: eax: 00000048   ebx: c0610000   ecx: c06e6b50
edx: f5416000
Mar 27 11:01:46 heisenberg kernel: esi: c34cd000   edi: c0d81dc0   ebp: c0d83180
esp: dc6fbef4
Mar 27 11:01:46 heisenberg kernel: ds: 007b   es: 007b   ss: 0069
Mar 27 11:01:46 heisenberg kernel: Process events/0 (pid: 8, ti=dc6fb000
task=c0da25f0 task.ti=dc6fb000)
Mar 27 11:01:46 heisenberg kernel: Stack: c06a4b86 c34cd000 c0610000 c34cd240
c046278b 00000017 00000000 00000003
Mar 27 11:01:46 heisenberg kernel:        00000001 dc80ca24 dc80ca20 00000003
dc80ca00 00000000 c046289b 00000000
Mar 27 11:01:46 heisenberg kernel:        00000000 c0d83180 c0d81de4 c0d81dc0
c0d83180 00000000 c13ad740 c0463b01
Mar 27 11:01:46 heisenberg kernel: Call Trace:
Mar 27 11:01:46 heisenberg kernel:  [<c0610000>] io_schedule_timeout+0x20/0x63
Mar 27 11:01:46 heisenberg kernel:  [<c046278b>] free_block+0x5f/0xe5
Mar 27 11:01:46 heisenberg kernel:  [<c046289b>] drain_array+0x8a/0xb5
Mar 27 11:01:46 heisenberg kernel:  [<c0463b01>] cache_reap+0x61/0x124
Mar 27 11:01:46 heisenberg kernel:  [<c042aa3a>] run_workqueue+0x85/0x125
Mar 27 11:01:46 heisenberg kernel:  [<c061149a>] _spin_lock_irqsave+0x12/0x17
Mar 27 11:01:46 heisenberg kernel:  [<c0463aa0>] cache_reap+0x0/0x124
Mar 27 11:01:46 heisenberg kernel:  [<c042b38c>] worker_thread+0xd9/0x105
Mar 27 11:01:46 heisenberg kernel:  [<c0418c8f>] default_wake_function+0x0/0xc
Mar 27 11:01:46 heisenberg kernel:  [<c042b2b3>] worker_thread+0x0/0x105

The corruption might be caused by
Mar 27 11:01:05 heisenberg kernel: ADDRCONF(NETDEV_CHANGE): vif1.0: link becomes
ready
Mar 27 11:01:05 heisenberg kernel: xenbr0: port 3(vif1.0) entering learning state
Mar 27 11:01:05 heisenberg kernel: xenbr0: topology change detected, propagating
Mar 27 11:01:05 heisenberg kernel: xenbr0: port 3(vif1.0) entering forwarding state

But I am not sure about that.

Comment 2 Peter Backes 2007-03-30 17:34:53 UTC
I have similar issues (kernel-xen-2.6.20-1.2933.fc6 hangs/crashes; works fine
with kernel-xen-2.6.19-1.2911.6.5.fc6). I can reproduce it by copying an 80GB
partition with dd if=/dev/sda6 of=/dev/sda5 bs=128b, which causes the problem
to occur within two or three minutes. Without I/O load, the system runs fine
for several days.
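
For anyone who wants sustained sequential I/O without overwriting a partition, the same block-wise copy can be sketched against scratch files. This is only an illustration in Python, assuming that file-level I/O is a reasonable stand-in for the raw partition copy; it has not been confirmed to trigger the hang. `copy_blocks` and `make_scratch_file` are hypothetical helper names. The block size mirrors `bs=128b`, since dd's `b` suffix means 512-byte units.

```python
import os
import tempfile

BLOCK_SIZE = 128 * 512  # dd's "b" suffix means 512-byte units, so bs=128b is 64 KiB


def copy_blocks(src_path, dst_path, block_size=BLOCK_SIZE):
    """Copy src to dst in fixed-size blocks, similar to dd if=src of=dst bs=128b.

    Returns the number of bytes copied.
    """
    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            block = src.read(block_size)
            if not block:
                break
            dst.write(block)
            copied += len(block)
        dst.flush()
        os.fsync(dst.fileno())  # push the data to disk, not just the page cache
    return copied


def make_scratch_file(directory, size_bytes):
    """Create a zero-filled scratch file of size_bytes (hypothetical helper)."""
    path = os.path.join(directory, "scratch.src")
    with open(path, "wb") as f:
        remaining = size_bytes
        while remaining > 0:
            chunk = min(BLOCK_SIZE, remaining)
            f.write(b"\0" * chunk)
            remaining -= chunk
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # 16 MiB keeps the sketch quick; raise the size for a longer load.
        src = make_scratch_file(tmp, 16 * 1024 * 1024)
        n = copy_blocks(src, os.path.join(tmp, "scratch.dst"))
        print(f"copied {n} bytes in {BLOCK_SIZE}-byte blocks")
```

Unlike the dd command above, this never touches a block device, so it is safe to run on a machine with data on it.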


Comment 3 Eduardo Habkost 2007-04-02 13:55:13 UTC
This looks like a duplicate of bug #233749

Comment 4 Eduardo Habkost 2007-04-04 13:48:25 UTC
*** Bug 233749 has been marked as a duplicate of this bug. ***

Comment 5 Eduardo Habkost 2007-04-04 14:04:01 UTC
Bug #234008 may also have useful information for this bug. It may have the
same cause as this one.

Comment 6 Rolf Fokkens 2007-04-14 11:42:40 UTC
The latest kernel-xen-2.6.20-1.2944 also hangs during boot. For a while all
goes well, and then during the startup of some service it totally locks up.

Comment 7 Mykola Lyakhovych 2007-04-30 11:02:56 UTC
Looks like a duplicate of bug #238403

Comment 8 Rolf Fokkens 2007-05-04 17:17:21 UTC
For those who care: kernel-xen-2.6.20-1.2948 also locks up. I'm personally very
happy with KVM as it gives some alternative.

Comment 9 Jan ONDREJ 2007-05-21 07:41:04 UTC
I think this problem is still not solved. ;-(
And the kernel-xen-2.6.19 packages were removed from the Fedora updates
directory, so I can't install new guests with this kernel. :-(
Can somebody fix this bug or put kernel-xen-2.6.19 back into updates?
Is this bug also in Fedora 7?

Is kernel-xen-2.6.19 working for anybody?

I am attaching dmesg output from another machine of mine. The machine is
working well, but applications sometimes hang.

Comment 10 Jan ONDREJ 2007-05-21 07:42:41 UTC
Created attachment 155074
xen dmesg

Comment 11 Jan ONDREJ 2007-05-21 07:43:14 UTC
Created attachment 155075
domain0 dmesg output

Comment 12 Jan ONDREJ 2007-05-21 07:43:36 UTC
Created attachment 155076
host dmesg output

Comment 13 Eduardo Habkost 2007-05-21 13:47:20 UTC
The problem was fixed on FC7, and it is being fixed for FC6.

I am marking this ticket as a duplicate of bug 234008, as bug 234008 is being
used as the main ticket for the do_iret_error() and evtchn_upcall() Oopses
on FC6.

*** This bug has been marked as a duplicate of 234008 ***