Bug 230894 - BUG: soft lockup detected on CPU#0 ... and CPU#1
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 6
Hardware: x86_64
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Brian Brock
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2007-03-04 11:17 UTC by Siim Käba
Modified: 2007-12-01 07:04 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2007-12-01 07:04:32 UTC
Type: ---
Embargoed:



Description Siim Käba 2007-03-04 11:17:09 UTC
Description of problem:
/dev/vmmon[4094]: Module vmmon: registered with major=10 minor=165
/dev/vmmon[4094]: Module vmmon: initialized
/dev/vmmon[4098]: Module vmmon: unloaded
/dev/vmmon[4487]: Module vmmon: registered with major=10 minor=165
/dev/vmmon[4487]: Module vmmon: initialized
/dev/vmnet: open called by PID 4516 (vmnet-bridge)
/dev/vmnet: hub 0 does not exist, allocating memory.
/dev/vmnet: port on hub 0 successfully opened
bridge-eth0: enabling the bridge
bridge-eth0: up
bridge-eth0: already up
bridge-eth0: attached
/dev/vmnet: open called by PID 4528 (vmnet-natd)
/dev/vmnet: hub 8 does not exist, allocating memory.
/dev/vmnet: port on hub 8 successfully opened
/dev/vmnet: open called by PID 4540 (vmnet-netifup)
/dev/vmnet: hub 1 does not exist, allocating memory.
/dev/vmnet: port on hub 1 successfully opened
BUG: soft lockup detected on CPU#0!

Call Trace:
 [<ffffffff8026999a>] show_trace+0x34/0x47
 [<ffffffff802699bf>] dump_stack+0x12/0x17
 [<ffffffff802b6d9b>] softlockup_tick+0xdb/0xf6
 [<ffffffff80293cdd>] update_process_times+0x42/0x68
 [<ffffffff802749e7>] smp_local_timer_interrupt+0x34/0x55
 [<ffffffff8027509b>] smp_apic_timer_interrupt+0x51/0x69
 [<ffffffff8025ccf6>] apic_timer_interrupt+0x66/0x70
 [<ffffffff8028c6b0>] vprintk+0x2af/0x2ef
 [<ffffffff8028c742>] printk+0x52/0xbd
 [<ffffffff88467138>] :vmnet:VNetFileOpOpen+0x15c/0x174
 [<ffffffff80247929>] chrdev_open+0x149/0x198
 [<ffffffff8021e831>] __dentry_open+0xd9/0x1df
 [<ffffffff80227b55>] do_filp_open+0x2a/0x38
 [<ffffffff80219a1d>] do_sys_open+0x44/0xbe
 [<ffffffff8025f433>] sysenter_do_call+0x1b/0x67
 [<00000000ffffe410>]

BUG: soft lockup detected on CPU#1!

Call Trace:
 [<ffffffff8026999a>] show_trace+0x34/0x47
 [<ffffffff802699bf>] dump_stack+0x12/0x17
 [<ffffffff802b6d9b>] softlockup_tick+0xdb/0xf6
 [<ffffffff80293cdd>] update_process_times+0x42/0x68
 [<ffffffff802749e7>] smp_local_timer_interrupt+0x34/0x55
 [<ffffffff8027509b>] smp_apic_timer_interrupt+0x51/0x69
 [<ffffffff8025ccf6>] apic_timer_interrupt+0x66/0x70
 [<ffffffff88004ba5>] :uhci_hcd:uhci_irq+0x38/0x154
 [<ffffffff803ddb9e>] usb_hcd_irq+0x24/0x52
 [<ffffffff80210abc>] handle_IRQ_event+0x25/0x53
 [<ffffffff802b81b8>] handle_fasteoi_irq+0x92/0xd1
 [<ffffffff8026aba8>] do_IRQ+0x10e/0x15f
 [<ffffffff8025c641>] ret_from_intr+0x0/0xa
 [<ffffffff80210ab1>] handle_IRQ_event+0x1a/0x53
 [<ffffffff802b80df>] handle_edge_irq+0xed/0x134
 [<ffffffff8026aba8>] do_IRQ+0x10e/0x15f
 [<ffffffff8025c641>] ret_from_intr+0x0/0xa
 [<ffffffff80262c03>] _spin_unlock_irq+0xb/0xc
 [<ffff81007eba91c0>]
DWARF2 unwinder stuck at 0xffff81007eba91c0

Leftover inexact backtrace:

 [<ffffffff8029c410>] autoremove_wake_function+0x0/0x2e
 [<ffffffff802fcf4c>] kmsg_read+0x3a/0x44
 [<ffffffff8020b226>] vfs_read+0xcb/0x170
 [<ffffffff80211731>] sys_read+0x45/0x6e
 [<ffffffff8025c11e>] system_call+0x7e/0x83

/dev/vmnet: open called by PID 4541 (vmnet-netifup)
/dev/vmnet: port on hub 8 successfully opened
/dev/vmnet: open called by PID 4568 (vmnet-dhcpd)
/dev/vmnet: port on hub 1 successfully opened
/dev/vmnet: open called by PID 4564 (vmnet-dhcpd)
/dev/vmnet: port on hub 8 successfully opened
/dev/vmmon[4663]: Module vmmon: unloaded
bridge-eth0: down
bridge-eth0: detached
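
When reading soft lockup traces like the two above, the frames down to apic_timer_interrupt belong to the watchdog's own reporting path; the first frame after it shows where the CPU was when the timer tick fired. The following is a minimal Python sketch for pulling that frame out of a pasted trace. The helper name interrupted_frame and the regular expression are illustrative assumptions, not part of any kernel tooling, and the frame it returns is only where the CPU was executing, not necessarily the cause of the lockup.

```python
import re

# Matches x86_64-style frames from this report, for example
#   " [<ffffffff8028c742>] printk+0x52/0xbd"
#   " [<ffffffff88467138>] :vmnet:VNetFileOpOpen+0x15c/0x174"
# The ":module:" prefix is optional and only present for module code.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?::(?P<module>\w+):)?"
    r"(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def interrupted_frame(trace_lines, marker="apic_timer_interrupt"):
    """Return (module, function) for the first frame after `marker`.

    Returns None if the marker or a following frame is not found.
    (Hypothetical helper written for this illustration.)
    """
    seen_marker = False
    for line in trace_lines:
        match = FRAME_RE.search(line)
        if match is None:
            continue  # not a stack frame line
        if seen_marker:
            return match.group("module"), match.group("func")
        if match.group("func") == marker:
            seen_marker = True
    return None


# Excerpt of the CPU#1 trace from the description.
cpu1_trace = """\
 [<ffffffff802b6d9b>] softlockup_tick+0xdb/0xf6
 [<ffffffff8027509b>] smp_apic_timer_interrupt+0x51/0x69
 [<ffffffff8025ccf6>] apic_timer_interrupt+0x66/0x70
 [<ffffffff88004ba5>] :uhci_hcd:uhci_irq+0x38/0x154
 [<ffffffff803ddb9e>] usb_hcd_irq+0x24/0x52
""".splitlines()

print(interrupted_frame(cpu1_trace))  # ('uhci_hcd', 'uhci_irq')
```

Applied to the CPU#0 trace, the same helper would report vprintk with no module, called from VNetFileOpOpen in the vmnet module.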


Version-Release number of selected component (if applicable):
2.6.19-1.2911.6.4.fc6

How reproducible:
Unknown so far; this is the first time it has happened. I will keep an eye on it.

Steps to Reproduce:
1.
2.
3.
  
Actual results:


Expected results:


Additional info:
If some fields are empty or look unusual you may have an old version.
Compare to the current minimal requirements in Documentation/Changes.
 
Linux nokkel.int 2.6.19-1.2911.6.4.fc6 #1 SMP Sat Feb 24 14:03:48 EST 2007
x86_64 x86_64 x86_64 GNU/Linux
 
Gnu C                  4.1.1
Gnu make               3.81
binutils               2.17.50.0.6-2.fc6
util-linux             2.13-pre7
mount                  2.13-pre7
module-init-tools      3.3-pre1
e2fsprogs              1.39
quota-tools            3.13.
PPP                    2.4.4
isdn4k-utils           3.9
Linux C Library        > libc.2.5
Dynamic linker (ldd)   2.5
Procps                 3.2.7
Net-tools              1.60
Kbd                    1.12
Sh-utils               5.97
udev                   095
Modules Loaded         appletouch autofs4 eeprom hci_usb hidp rfcomm l2cap
bluetooth ip_conntrack_netbios_ns ipt_REJECT xt_state ip_conntrack nfnetlink
xt_tcpudp iptable_filter ip_tables x_tables cpufreq_ondemand nls_utf8 hfsplus
loop dm_multipath video sbs i2c_ec button battery asus_acpi ac fglrx lp parport
snd_hda_intel sg snd_hda_codec snd_seq_dummy snd_seq_oss snd_seq_midi_event
snd_seq snd_seq_device snd_pcm_oss shpchp snd_mixer_oss ide_cd snd_pcm sky2
i2c_i801 pcspkr iTCO_wdt i2c_core snd_timer cdrom snd soundcore snd_page_alloc
ohci1394 ieee1394 dm_snapshot dm_zero dm_mirror dm_mod ata_piix libata sd_mod
scsi_mod ext3 jbd ehci_hcd ohci_hcd uhci_hcd

Comment 1 Jan ONDREJ 2007-03-07 11:31:36 UTC
I am seeing a similar problem:

SCSI subsystem initialized
ACPI: PCI Interrupt 0000:03:01.0[A] -> GSI 28 (level, low) -> IRQ 18
BUG: soft lockup detected on CPU#0!
 [<c0404fc8>] dump_trace+0x69/0x1b6
 [<c040512d>] show_trace_log_lvl+0x18/0x2c
 [<c0405728>] show_trace+0xf/0x11
 [<c040581c>] dump_stack+0x15/0x17
 [<c044e231>] softlockup_tick+0xad/0xc4
 [<c042f717>] update_process_times+0x39/0x5c
 [<c0418f5a>] smp_apic_timer_interrupt+0x95/0xb3
 [<c0404a07>] apic_timer_interrupt+0x1f/0x24
 [<c04ed47d>] delay_tsc+0x9/0x13
 [<c04ed4b0>] __delay+0x6/0x7
 [<f88707f1>] ips_init_morpheus+0x84/0x2f1 [ips]
 [<f886d6cd>] ips_reset_morpheus+0x91/0xcd [ips]
 [<f886fb68>] ips_insert_device+0x79c/0x848 [ips]
 [<c04f5e95>] pci_device_probe+0x36/0x57
 [<c0559724>] really_probe+0x39/0xda
 [<c055995b>] __driver_attach+0x73/0xab
 [<c0558e34>] bus_for_each_dev+0x37/0x59
 [<c0559647>] driver_attach+0x16/0x18
 [<c0559105>] bus_add_driver+0x61/0x165
 [<c04f5fed>] __pci_register_driver+0x6f/0x89
 [<f8802016>] ips_module_init+0x16/0x158 [ips]
 [<c0441bdd>] sys_init_module+0x1806/0x19b1
 [<c0403fef>] syscall_call+0x7/0xb
 [<08051e9e>] 0x8051e9e
 =======================
scsi0 : IBM PCI ServeRAID 7.12.05  Build 761 <ServeRAID 6i>
...
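
To see which loadable modules appear in a 32-bit trace like the one above, where module frames carry a trailing "[module]" tag, something along these lines could be used. It is only a sketch: the function name module_frames and the regular expression are assumptions made for illustration, and a module appearing in the trace does not by itself prove that module is at fault.

```python
import re

# Matches i386-style frames from this comment, for example
#   " [<c04ed4b0>] __delay+0x6/0x7"
#   " [<f88707f1>] ips_init_morpheus+0x84/0x2f1 [ips]"
# The trailing "[module]" tag is optional; core kernel frames have none.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>\w+)\])?"
)


def module_frames(trace_lines):
    """Map each module name to its functions, in the order they appear.

    Frames without a module tag (core kernel code) are skipped.
    (Hypothetical helper written for this illustration.)
    """
    by_module = {}
    for line in trace_lines:
        match = FRAME_RE.search(line)
        if match and match.group("module"):
            by_module.setdefault(match.group("module"), []).append(
                match.group("func")
            )
    return by_module


# Excerpt of the trace from this comment.
trace = """\
 [<c04ed47d>] delay_tsc+0x9/0x13
 [<c04ed4b0>] __delay+0x6/0x7
 [<f88707f1>] ips_init_morpheus+0x84/0x2f1 [ips]
 [<f886d6cd>] ips_reset_morpheus+0x91/0xcd [ips]
 [<f886fb68>] ips_insert_device+0x79c/0x848 [ips]
 [<c04f5e95>] pci_device_probe+0x36/0x57
 [<f8802016>] ips_module_init+0x16/0x158 [ips]
 [<08051e9e>] 0x8051e9e
""".splitlines()

print(module_frames(trace))
# {'ips': ['ips_init_morpheus', 'ips_reset_morpheus', 'ips_insert_device', 'ips_module_init']}
```

Here every module frame belongs to ips, and the interrupted code is a __delay call made from ips_init_morpheus during driver initialization.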

This happens on an IBM eServer x226.

After this message the server boots fine, but after some time (1-24 hours) it
crashes. I am unable to send a crash report at the moment.
Can this problem crash my machine?


Comment 2 Jan ONDREJ 2007-04-14 06:34:19 UTC
My problem appears to be solved by:
  - upgrading the ServeRAID 6i firmware to 7.12.12
  - using kernel 2.6.20-1.2307.fc5smp

My system has been up for 6 days now. Maybe this will be useful to the original
reporter of this bug too.


Comment 3 Jan ONDREJ 2007-12-01 07:04:32 UTC
There have been no problems since then.
Closing this bug as solved.


